Accelerating End-to-End PDF to Markdown Conversion Through Assisted Generation
By: Changxu Duan
Converting data from machine-unreadable formats such as PDF into Markdown has the potential to make scientific research more accessible. Existing end-to-end decoder transformer models can transform screenshots of PDFs into Markdown, offering more flexibility than pipeline-based methods. Yet decoding text token by token from scratch is inefficient, especially when dense text can be copied directly from the PDF. To address this challenge, this paper modifies Prompt Lookup Decoding (PLD) to extract candidate sequences directly from PDF files, leveraging the high n-gram overlap between PDFs and their Markdown equivalents. A new method, Copy Lookup Decoding (CLD), is introduced to enhance PLD's candidate generation mechanism. Experiments demonstrate that CLD can accelerate the conversion process by up to 1.70$\times$ while preserving the original output quality. The codebase for this paper is open-source on GitHub (https://github.com/Fireblossom/CopyLookup).
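The lookup idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's CLD implementation or the code in the linked repository; it only shows the general PLD-style pattern of matching the most recently generated n-gram against text extracted from the PDF and proposing the tokens that follow as draft candidates. The function names (`lookup_candidates`, `accept_prefix`), the parameters `max_ngram` and `num_candidates`, and the toy whitespace tokenization are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def lookup_candidates(
    generated: Sequence[str],
    source: Sequence[str],
    max_ngram: int = 3,
    num_candidates: int = 5,
) -> List[str]:
    """Propose draft tokens by matching the generated suffix against the source.

    Tries the longest suffix first and falls back to shorter ones.
    Hypothetical helper: names and defaults are illustrative only.
    """
    for n in range(min(max_ngram, len(generated)), 0, -1):
        suffix = list(generated[-n:])
        for start in range(len(source) - n + 1):
            if list(source[start:start + n]) == suffix:
                continuation = list(source[start + n:start + n + num_candidates])
                if continuation:
                    return continuation
    return []


def accept_prefix(
    candidates: Sequence[str],
    next_token: Callable[[List[str]], str],
    context: Sequence[str],
) -> List[str]:
    """Keep draft tokens only while they agree with the model's greedy choice.

    A real system verifies all candidates in one forward pass; the
    sequential calls here are a simplification for readability.
    """
    accepted: List[str] = []
    ctx = list(context)
    for tok in candidates:
        if next_token(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)
    return accepted


# Toy example: "source" stands in for text extracted from the PDF,
# "target" for the Markdown the model would produce on its own.
source = "the model copies dense text from the pdf".split()
target = "the model copies dense text in markdown".split()
generated = target[:3]

# Stand-in for a greedy decoder: returns the next token of `target`.
toy_model = lambda ctx: target[len(ctx)]

draft = lookup_candidates(generated, source, num_candidates=3)
accepted = accept_prefix(draft, toy_model, generated)
```

Because drafted tokens are kept only where they match the model's own prediction, the output is the same as plain greedy decoding; the potential speed-up comes from accepting several tokens per verification step when the PDF text and the Markdown overlap.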