Towards Scalable Web Accessibility Audit with MLLMs as Copilots
By: Ming Gu, Ziwei Wang, Sicen Lai and more
Potential Business Impact:
Helps make websites work for everyone.
Ensuring web accessibility is crucial for advancing social welfare, justice, and equality in digital spaces, yet the vast majority of website user interfaces remain non-compliant, due in part to the resource-intensive and unscalable nature of current auditing practices. While WCAG-EM offers a structured methodology for site-wide conformance evaluation, it demands substantial human effort and lacks practical support for execution at scale. In this work, we present an auditing framework, AAA, which operationalizes WCAG-EM through a human-AI partnership model. AAA is anchored by two key innovations: GRASP, a graph-based multimodal sampling method that ensures representative page coverage via learned embeddings of visual, textual, and relational cues; and MaC, a multimodal large language model-based copilot that supports auditors through cross-modal reasoning and intelligent assistance in high-effort tasks. Together, these components enable scalable, end-to-end web accessibility auditing, giving human auditors AI-enhanced assistance for real-world impact. We further contribute four novel datasets designed for benchmarking core stages of the audit pipeline. Extensive experiments demonstrate the effectiveness of our methods and show that small-scale language models, when fine-tuned, can serve as capable experts.
Similar Papers
A11YN: aligning LLMs for accessible web UI code generation
Software Engineering
Makes websites work for everyone, not just some.
No-Human in the Loop: Agentic Evaluation at Scale for Recommendation
Artificial Intelligence
Tests AI to judge other AI fairly.
Who Gets Left Behind? Auditing Disability Inclusivity in Large Language Models
Computers and Society
Helps AI give better advice to all people.