From Prompt to Pipeline: Large Language Models for Scientific Workflow Development in Bioinformatics
By: Khairul Alam, Banani Roy
Potential Business Impact:
AI helps scientists build DNA analysis tools faster.
Scientific Workflow Systems such as Galaxy and Nextflow are essential for scalable, reproducible, and automated bioinformatics analyses. However, developing and understanding scientific workflows remains challenging for many domain scientists due to the complexity of tool/module selection, infrastructure requirements, and limited programming expertise. This study explores whether state-of-the-art Large Language Models such as GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3 can assist in generating accurate, complete, and usable bioinformatics workflows. We evaluate a set of representative workflows covering tasks such as RNA-seq, SNP analysis, and DNA methylation across both Galaxy (graphical) and Nextflow (script-based) platforms. To simulate realistic usage, we adopt a tiered prompting strategy: each workflow is first generated using an instruction-only prompt; if the output is incomplete or incorrect, we escalate to a role-based prompt, and finally to chain-of-thought prompting if needed. The generated workflows are evaluated against community-curated baselines from the Galaxy Training Network and nf-core, using criteria including correctness, completeness, tool appropriateness, and executability. Results show that LLMs exhibit strong potential in workflow development. Gemini 2.5 Flash produced the most accurate and user-friendly workflows in Galaxy, while DeepSeek-V3 excelled in Nextflow pipeline generation. GPT-4o performed nicely with structured prompts. Prompting strategy significantly influenced output quality, with role-based and chain-of-thought prompts enhancing correctness and completeness. Overall, LLMs can reduce the cognitive and technical barriers to workflow development, making SWSs more accessible to novice and expert users. This work highlights the practical utility of LLMs and provides actionable insights for integrating them into real-world bioinformatics workflow design.
Similar Papers
From Prompt to Pipeline: Large Language Models for Scientific Workflow Development in Bioinformatics
Software Engineering
Helps scientists build complex data tools easily.
Benchmarking Large Language Models on Multiple Tasks in Bioinformatics NLP with Prompting
Computation and Language
Tests AI's ability to solve biology problems.
Towards LLM-Powered Task-Aware Retrieval of Scientific Workflows for Galaxy
Software Engineering
Finds the right science tools for any job.