SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure

Published: April 3, 2025 | arXiv ID: 2504.02364v1

By: Apurv Deepak Kulkarni, Siavash Ghiasvand

Potential Business Impact:

Tests how fast computers can handle lots of information.

Business Areas:
Big Data, Data and Analytics

Recent advancements in data stream processing frameworks have improved real-time data handling; however, scalability remains a significant challenge affecting throughput and latency. While studies have explored this issue on local machines and cloud clusters, research on modern high performance computing (HPC) infrastructures remains limited due to the lack of scalable measurement tools. This work presents SProBench, a novel benchmark suite designed to evaluate the performance of data stream processing frameworks in large-scale computing systems. Building on best practices, SProBench incorporates a modular architecture, offers native support for SLURM-based clusters, and seamlessly integrates with popular stream processing frameworks such as Apache Flink, Apache Spark Streaming, and Apache Kafka Streams. Experiments conducted on HPC clusters demonstrate its exceptional scalability, delivering throughput that surpasses existing benchmarks by more than tenfold. The distinctive features of SProBench, including complete customization options, built-in automated experiment management tools, seamless interoperability, and an open-source license, distinguish it as an innovative benchmark suite tailored to meet the needs of modern data stream processing frameworks.

Country of Origin
🇩🇪 Germany

Page Count
14 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing