On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools
By: Shivani Upadhyay, Messiah Ataey, Syed Shariyar Murtaza, and more
Potential Business Impact:
Helps computers understand complex charts and tables.
The proliferation of complex structured data in hybrid sources, such as PDF documents and web pages, presents unique challenges for current Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) in providing accurate answers. Despite recent advancements, MLLMs still often falter when interpreting intricately structured information, such as nested tables and multi-dimensional plots, leading to hallucinations and erroneous outputs. This paper explores the capabilities of LLMs and MLLMs in understanding and answering questions about complex data structures found in PDF documents by leveraging industrial and open-source tools as part of a pre-processing pipeline. Our findings indicate that GPT-4o, a popular MLLM, achieves an accuracy of 56% on multi-structured documents when the documents are fed to it directly, and that integrating pre-processing tools raises accuracy to 61.3% for GPT-4o and 76% for GPT-4, while also lowering overall cost. The code is publicly available at https://github.com/OGCDS/FinancialQA.
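As a rough illustration of the kind of pre-processing pipeline the abstract describes, the sketch below extracts text and tables from a PDF with pdfplumber and passes the serialized content to GPT-4o through the OpenAI chat API. The library choices, helper names, and prompt wording are assumptions for illustration only; they are not the specific industrial and open-source tools evaluated in the paper.

```python
# Minimal sketch of a pre-processing + LLM question-answering pipeline.
# Assumptions: pdfplumber for extraction and the OpenAI chat API for answering;
# the paper's actual toolchain and prompts may differ.
import pdfplumber
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def pdf_to_text(path: str) -> str:
    """Flatten each page's text and tables into a plain-text context string."""
    chunks = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            chunks.append(page.extract_text() or "")
            # Serialize any detected tables row by row so the model sees structure.
            for table in page.extract_tables():
                rows = [" | ".join(cell or "" for cell in row) for row in table]
                chunks.append("\n".join(rows))
    return "\n\n".join(chunks)


def answer_question(path: str, question: str, model: str = "gpt-4o") -> str:
    """Ask the model a question grounded in the pre-processed document text."""
    context = pdf_to_text(path)
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Answer strictly from the provided financial document."},
            {"role": "user",
             "content": f"Document:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content


# Hypothetical usage:
# print(answer_question("report.pdf", "What was total revenue in 2022?"))
```

The design point illustrated here is simply that converting nested tables into a linear textual form before querying the model is what the paper's pre-processing step aims to achieve, rather than relying on the model to parse the raw PDF layout itself.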
Similar Papers
Information Extraction From Fiscal Documents Using LLMs
Computation and Language
Lets computers understand government money reports.
The Effectiveness of Large Language Models in Transforming Unstructured Text to Standardized Formats
Artificial Intelligence
Turns messy text into organized lists.
Multi-Modal Vision vs. Text-Based Parsing: Benchmarking LLM Strategies for Invoice Processing
Computation and Language
Helps computers understand invoices better.