Gaia: Hybrid Hardware Acceleration for Serverless AI in the 3D Compute Continuum

Published: November 1, 2025 | arXiv ID: 2511.13728v1

By: Maximilian Reisecker, Cynthia Marcelino, Thomas Pusztai and more

Potential Business Impact:

Makes AI run faster and cheaper everywhere.

Business Areas:

IaaS Software

Serverless computing offers elastic scaling and pay-per-use execution, making it well-suited for AI workloads. As these workloads run in heterogeneous environments such as the Edge-Cloud-Space 3D Continuum, they often require intensive parallel computation, which GPUs can perform far more efficiently than CPUs. However, current platforms struggle to manage hardware acceleration effectively: static user-device assignments fail to ensure SLO compliance under varying loads or placements, and one-time dynamic selections often lead to suboptimal or cost-inefficient configurations. To address these issues, we present Gaia, a GPU-as-a-service model and architecture that makes hardware acceleration a platform concern. Gaia combines (i) a lightweight Execution Mode Identifier that inspects function code at deploy time to emit one of four execution modes, and (ii) a Dynamic Function Runtime that continuously reevaluates user-defined SLOs to promote or demote functions between CPU and GPU backends. Our evaluation shows that Gaia seamlessly selects the best hardware acceleration for the workload, reducing end-to-end latency by up to 95%. These results indicate that Gaia enables SLO-aware, cost-efficient acceleration for serverless AI across heterogeneous environments.
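
To make the promote/demote idea concrete, here is a minimal Python sketch of an SLO-driven backend switch. It is not Gaia's implementation: the class name `SloController`, the rolling-window mean, the `demote_margin` threshold, and the `"cpu"`/`"gpu"` labels are all assumptions chosen for illustration, since the abstract does not describe the runtime's actual policy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SloController:
    """Hypothetical controller that switches a function between CPU and GPU."""

    slo_ms: float                # user-defined latency target (assumed unit: ms)
    window: int = 20             # number of recent invocations to average over
    demote_margin: float = 0.5   # demote only when mean latency < margin * SLO
    backend: str = "cpu"
    _latencies: deque = field(default_factory=deque, init=False, repr=False)

    def record(self, latency_ms: float) -> str:
        """Record one invocation latency and return the backend to use next."""
        self._latencies.append(latency_ms)
        if len(self._latencies) > self.window:
            self._latencies.popleft()
        if len(self._latencies) < self.window:
            return self.backend  # not enough samples to decide yet

        mean = sum(self._latencies) / len(self._latencies)
        if self.backend == "cpu" and mean > self.slo_ms:
            # SLO violated on CPU: promote to the GPU backend.
            self.backend = "gpu"
            self._latencies.clear()
        elif self.backend == "gpu" and mean < self.demote_margin * self.slo_ms:
            # Ample headroom on GPU: try the cheaper CPU backend again.
            self.backend = "cpu"
            self._latencies.clear()
        return self.backend
```

With `SloController(slo_ms=100.0, window=3)`, three consecutive 150 ms invocations promote the function to the GPU backend, and three subsequent 10 ms invocations demote it again. The demotion rule is a deliberate simplification: low GPU latency does not guarantee that the CPU backend would meet the SLO, so a real policy would need a CPU latency estimate or some hysteresis to avoid oscillating between backends.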

HAS-GPU: Efficient Hybrid Auto-scaling with Fine-grained GPU Allocation for SLO-aware Serverless Inferences

Distributed, Parallel, and Cluster Computing

Makes AI run faster and cheaper.

4 May 2025 1

88%

Serverless GPU Architecture for Enterprise HR Analytics: A Production-Scale BDaaS Implementation

Distributed, Parallel, and Cluster Computing

Makes computer analysis faster, cheaper, and trustworthy.

22 Oct 2025 0

87%

AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs

Distributed, Parallel, and Cluster Computing

Lets old computers run smart AI programs.

6 Nov 2025 0

Country of Origin

🇦🇹 Austria

Page Count

10 pages
