A portable compiler-runtime approach for scalability prediction

Authors: Stawinoga, Nicolai; Lal, Sohan; Cosenza, Biagio; Salzmann, Philip; Thoman, Peter; Fahringer, Thomas
Type: Journal Article
Citation: Future Generation Computer Systems 179: 108337 (2026)
Journal: Future generation computer systems (ISSN 0167-739X), 2025, Elsevier BV
Record dates: 2026-01-14; 2026-01-14; 2025-12-25
Language: en
Handle: https://hdl.handle.net/11420/60806
DOI: 10.1016/j.future.2025.108337; https://doi.org/10.15480/882.16463
License: https://creativecommons.org/licenses/by/4.0/
Subjects: Computer Science, Information and General Works::004: Computer Sciences; Computer Science, Information and General Works::005: Computer Programming, Programs, Data and Security

Abstract: Highly scalable parallel applications can efficiently solve expensive computational problems when run on a large number of compute nodes. However, selecting the optimal number of nodes for a compute job of a given size is non-trivial, and allocating too few or too many nodes may not yield the expected performance. Knowing the scaling behavior of an application in advance enables us, for example, to make optimal use of the available hardware resources. We introduce a novel, portable approach to predict the scalability of parallel applications written in modern high-level programming models. We propose a predictive compiler-runtime framework based on Celerity, a task-based distributed runtime system that enables executing SYCL codes on clusters. The framework targets a broad range of computing systems, from CPU to GPU clusters, and proposes a model that combines machine learning, communication modeling and DAG heuristics. Experimental results on two large-scale clusters, JUWELS and Marconi-100, show accurate scalability prediction of unseen single and multi-task applications.