Towards Automated Regulatory Compliance Verification in Financial Auditing with Large Language Models

Published: July 22, 2025 | arXiv ID: 2507.16642v1

By: Armin Berger, Lars Hillebrand, David Leonhard and more

Potential Business Impact:

Helps check whether financial documents follow the rules.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The auditing of financial documents, historically a labor-intensive process, stands on the precipice of transformation. AI-driven solutions have made inroads into streamlining this process by recommending pertinent text passages from financial reports to align with the legal requirements of accounting standards. However, a glaring limitation remains: these systems commonly fall short in verifying whether the recommended excerpts actually comply with the specific legal mandates. Hence, in this paper, we probe the efficiency of publicly available Large Language Models (LLMs) in the realm of regulatory compliance across different model configurations. We place particular emphasis on comparing cutting-edge open-source LLMs, such as Llama-2, with their proprietary counterparts, such as OpenAI's GPT models. This comparative analysis leverages two custom datasets provided by our partner PricewaterhouseCoopers (PwC) Germany. We find that the open-source Llama-2 70 billion parameter model demonstrates outstanding performance in detecting non-compliance, i.e., true negative occurrences, outperforming all of its proprietary counterparts. Nevertheless, proprietary models such as GPT-4 perform best across a broad variety of scenarios, particularly in non-English contexts.

Page Count
10 pages

Category
Computer Science:
Computation and Language