Comparing Open-Source and Commercial LLMs for Domain-Specific Analysis and Reporting: Software Engineering Challenges and Design Trade-offs
By: Theo Koraag, Niklas Wagner, Felix Dobslaw and more
Potential Business Impact:
Helps computers write financial reports automatically.
Context: Large Language Models (LLMs) enable automation of complex natural language processing across domains, but research on domain-specific applications such as finance remains limited.
Objectives: This study explored open-source and commercial LLMs for financial report analysis and commentary generation, focusing on the software engineering challenges of implementation.
Methods: Following Design Science Research methodology, an exploratory case study iteratively designed and evaluated two LLM-based systems: one running local open-source models in a multi-agent workflow, the other using the commercial GPT-4o. Both were assessed through expert evaluation of real-world financial reporting use cases.
Results: LLMs demonstrated strong potential for automating financial reporting tasks, but integration presented significant challenges. Iterative development revealed issues with prompt design, contextual dependency, and implementation trade-offs. Cloud-based models offered superior fluency and usability but raised data privacy and external dependency concerns. Local open-source models provided better data control and compliance but required substantially more engineering effort to reach comparable reliability and usability.
Conclusion: LLMs show strong potential for financial reporting automation, but successful integration requires careful attention to architecture, prompt design, and system reliability. Implementation success depends on addressing domain-specific challenges through tailored validation mechanisms and engineering strategies that balance accuracy, control, and compliance.
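The abstract does not say what the "tailored validation mechanisms" look like. As an illustration only, one plausible check for generated financial commentary is numeric grounding: every figure quoted in the text should match a value in the source data. The sketch below is an assumption and not the authors' method; the function names, the regular expression, and the tolerance are all hypothetical.

```python
import re

# Hypothetical pattern: a number with optional thousands separators and decimals.
# The lookbehind skips digits glued to a word, e.g. the "3" in "Q3".
NUMBER_PATTERN = re.compile(r"(?<![\w.,])-?\d+(?:,\d{3})*(?:\.\d+)?")


def extract_numbers(text: str) -> set[float]:
    """Return every numeric literal in the text as a float."""
    return {float(m.replace(",", "")) for m in NUMBER_PATTERN.findall(text)}


def unsupported_numbers(
    commentary: str, source_values: set[float], tol: float = 1e-9
) -> list[float]:
    """List numbers in the commentary that match no source value within tol."""
    return sorted(
        n
        for n in extract_numbers(commentary)
        if not any(abs(n - s) <= tol for s in source_values)
    )


if __name__ == "__main__":
    # Illustrative values only; not taken from the study.
    source = {1250.5, 12.0}
    draft = "Revenue rose 12% to 1,250.5 in Q3, beating the 1,100 forecast."
    print(unsupported_numbers(draft, source))
```

A check of this kind works the same way whether the commentary comes from a local model or a cloud API. It only flags figures for review and cannot tell whether a correctly quoted number is used in the right context.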
Similar Papers
Evaluating Large Language Models (LLMs) in Financial NLP: A Comparative Study on Financial Report Analysis
Computation and Language
Tests AI to find best answers for money news.
Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models
Computation and Language
Helps check if money papers follow rules.
Toward Automated and Trustworthy Scientific Analysis and Visualization with LLM-Generated Code
Software Engineering
AI writes code for scientists' data.