Uni-Parser Technical Report
By: Xi Fang, Haoyi Tao, Shuwen Yang and more
Potential Business Impact:
Reads science papers and patents super fast.
This technical report introduces Uni-Parser, an industrial-grade document parsing engine tailored for scientific literature and patents, delivering high throughput, robust accuracy, and cost efficiency. Unlike pipeline-based document parsing methods, Uni-Parser employs a modular, loosely coupled multi-expert architecture that preserves fine-grained cross-modal alignments across text, equations, tables, figures, and chemical structures, while remaining easily extensible to emerging modalities. The system incorporates adaptive GPU load balancing, distributed inference, dynamic module orchestration, and configurable modes that support either holistic or modality-specific parsing. Optimized for large-scale cloud deployment, Uni-Parser achieves a processing rate of up to 20 PDF pages per second on 8 x NVIDIA RTX 4090D GPUs, enabling cost-efficient inference across billions of pages. This level of scalability facilitates a broad spectrum of downstream applications, ranging from literature retrieval and summarization to the extraction of chemical structures, reaction schemes, and bioactivity data, as well as the curation of large-scale corpora for training next-generation large language models and AI4Science models.
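To put the reported throughput in perspective, the following back-of-the-envelope sketch converts the quoted peak rate (up to 20 pages per second on 8 GPUs) into a per-GPU rate and a wall-clock estimate for a billion-page corpus. The function names, the 30-day deadline, and the assumption that throughput scales linearly across identical 8-GPU nodes are illustrative and not taken from the report; real throughput will likely vary with document content and deployment details.

```python
import math

# Figures quoted in the abstract (a peak rate, "up to").
PAGES_PER_SECOND = 20.0   # pages/s on one 8-GPU node
NUM_GPUS = 8              # 8 x NVIDIA RTX 4090D
SECONDS_PER_DAY = 86_400


def per_gpu_rate(pages_per_second=PAGES_PER_SECOND, num_gpus=NUM_GPUS):
    """Average pages per second contributed by each GPU."""
    return pages_per_second / num_gpus


def days_to_process(total_pages, pages_per_second=PAGES_PER_SECOND):
    """Wall-clock days for one node to parse total_pages at a constant rate."""
    return total_pages / pages_per_second / SECONDS_PER_DAY


def nodes_needed(total_pages, deadline_days, pages_per_second=PAGES_PER_SECOND):
    """Number of identical nodes needed to finish within deadline_days.

    Assumption (hypothetical): throughput scales linearly with node count.
    """
    return math.ceil(days_to_process(total_pages, pages_per_second) / deadline_days)


if __name__ == "__main__":
    one_billion = 1_000_000_000
    print(f"Per-GPU rate: {per_gpu_rate():.2f} pages/s")
    print(f"One node, 1B pages: {days_to_process(one_billion):.1f} days")
    print(f"Nodes for a 30-day deadline: {nodes_needed(one_billion, 30)}")
```

Under these assumptions a single 8-GPU node would need roughly 579 days for one billion pages, so billion-page workloads imply running many such nodes in parallel, which is consistent with the report's emphasis on distributed, cloud-scale deployment.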