Serverless Approach to Running Resource-Intensive STAR Aligner
By: Piotr Kica, Michał Orzechowski, Maciej Malawski
Potential Business Impact:
Makes computer programs run faster and cheaper.
Applying serverless computing to the alignment of RNA sequences can improve many existing bioinformatics workflows by reducing operational costs and execution times. This work analyzes the applicability of serverless services for running the STAR aligner, which is known for its accuracy and its large memory requirement. That memory requirement presents a challenge, as serverless services were designed for light, short tasks. Nevertheless, we successfully deploy a STAR-based pipeline on the AWS ECS service, propose multiple optimizations, and perform an experiment with 17 TB of data. The results are compared against a standard virtual machine (VM) based solution, showing that serverless is a valid alternative for small-scale batch processing. However, at large scale, where efficiency matters most, VMs are still recommended.
Similar Papers
Accelerating Cloud-Based Transcriptomics: Performance Analysis and Optimization of the STAR Aligner Workflow
Distributed, Parallel, and Cluster Computing
Makes reading genetic code faster and cheaper.
Analysis of cost-efficiency of serverless approaches
Software Engineering
Saves money by using computers smarter.
Combining Serverless and High-Performance Computing Paradigms to support ML Data-Intensive Applications
Distributed, Parallel, and Cluster Computing
Lets computers process big data faster without big machines.