Bringing Computation to the data: Interoperable serverless function execution for astrophysical data analysis in the SRCNet
By: Manuel Parra-Royón, Julián Garrido-Sánchez, Susana Sánchez-Expósito and more
Potential Business Impact:
Lets telescopes analyze huge amounts of data faster.
Serverless computing is a paradigm in which the underlying infrastructure is fully managed by the provider, enabling applications and services to be executed with elastic resource provisioning and minimal operational overhead. A core model within this paradigm is Function-as-a-Service (FaaS), where lightweight functions are deployed and triggered on demand, scaling seamlessly with workload. FaaS offers flexibility, cost-effectiveness, and fine-grained scalability, qualities particularly relevant for large-scale scientific infrastructures where data volumes are too large to centralise and computation must increasingly occur close to the data. The Square Kilometre Array Observatory (SKAO) exemplifies this challenge. Once operational, it will generate about 700 PB of data products annually, distributed across the SKA Regional Centre Network (SRCNet), a federation of international centres providing storage, computing, and analysis services. In such a context, FaaS offers a mechanism to bring computation to the data. We studied the principles of serverless and FaaS computing and explored their application to radio astronomy workflows. Representative functions for astrophysical data analysis were developed and deployed, including micro-functions derived from existing libraries and wrappers around domain-specific applications. In particular, a Gaussian convolution function was implemented and integrated within the SRCNet ecosystem. The use case demonstrates that FaaS can be embedded into the existing SRCNet ecosystem of services, allowing functions to run directly at sites where data replicas are stored. This reduces latency, minimises transfers, and improves efficiency, aligning with federated, data-proximate computation. The results show that serverless models provide a scalable and efficient pathway to address the data volumes of the SKA era.
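To make the idea of a Gaussian convolution micro-function more concrete, the following is a minimal Python sketch of what such a stateless function could look like. It is not the authors' implementation: the function names (`gaussian_kernel`, `convolve_1d`, `gaussian_convolve`, `handle`), the JSON payload fields (`image`, `sigma`), the 3-sigma kernel truncation, and the edge-clamping policy are all assumptions made for illustration. It uses only the standard library, whereas a real deployment would more likely wrap an array or astronomy library and read image data from the local replica rather than from the request body.

```python
import json
import math


def gaussian_kernel(sigma, radius=None):
    """Return a normalised 1-D Gaussian kernel as a list of floats."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        # Assumption: truncate the kernel at roughly 3 sigma.
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_1d(values, kernel):
    """Convolve one row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # repeat the edge pixel
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_convolve(image, sigma):
    """Separable 2-D Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [convolve_1d(row, kernel) for row in image]
    cols = [convolve_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def handle(request_body):
    """Hypothetical FaaS entry point: JSON string in, JSON string out."""
    payload = json.loads(request_body)
    smoothed = gaussian_convolve(payload["image"], float(payload["sigma"]))
    return json.dumps({"image": smoothed})
```

A caller would invoke the handler with a body such as `json.dumps({"image": [[0, 0, 0], [0, 1, 0], [0, 0, 0]], "sigma": 1.0})`. Because the function keeps no state between calls, a platform could in principle start as many copies as needed at whichever site holds the data replica, which is the property the abstract relies on.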
Similar Papers
Bringing computation to the data: A MOEA-driven approach for optimising data processing in the context of the SKA and SRCNet
Distributed, Parallel, and Cluster Computing
Moves computer work to where the telescope data is.
Characterizing FaaS Workflows on Public Clouds: The Good, the Bad and the Ugly
Distributed, Parallel, and Cluster Computing
Helps cloud programs run faster and cheaper.
Dynamic Function Configuration and its Management in Serverless Computing: A Taxonomy and Future Directions
Software Engineering
Helps cloud programs run faster and cheaper.