Scalable Offline ASR for Command-Style Dictation in Courtrooms
By: Kumarmanas Nethil, Vaibhav Mishra, Kriti Anandan, and more
Potential Business Impact:
Lets many people talk to computers at once.
We propose an open-source framework for command-style dictation that addresses the gap between resource-intensive online systems and high-latency batch processing. Our approach uses Voice Activity Detection (VAD) to segment audio and transcribes these segments in parallel using Whisper models, enabling efficient multiplexing across audio streams. Unlike proprietary systems such as SuperWhisper, this framework is compatible with most ASR architectures, including widely used CTC-based models. Our multiplexing technique maximizes compute utilization in real-world settings, as demonstrated by its deployment in around 15% of India's courtrooms. Evaluations on live data show consistent latency reduction as user concurrency increases, compared to sequential batch processing. The live demonstration will showcase our open-sourced implementation and allow attendees to interact with it in real time.
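The pipeline described above can be sketched in miniature: cut each user's audio into speech segments with a VAD, then transcribe the segments from all users through one shared worker pool, so compute is multiplexed across concurrent streams rather than serialized per user. This is an illustrative sketch, not the authors' released implementation: the energy-threshold VAD, the `transcribe` stub (standing in for a real Whisper or CTC model call), and all names and parameters here are assumptions for demonstration.

```python
from concurrent.futures import ThreadPoolExecutor

FRAME = 160  # samples per frame (10 ms at 16 kHz) -- illustrative choice

def vad_segments(samples, threshold=0.01):
    """Toy energy-based VAD: return (start, end) sample ranges of speech.

    A production system would use a trained VAD model instead; the
    frame-energy threshold here is only a stand-in.
    """
    segments, start = [], None
    n_frames = len(samples) // FRAME
    for i in range(n_frames):
        frame = samples[i * FRAME:(i + 1) * FRAME]
        energy = sum(x * x for x in frame) / FRAME
        if energy > threshold:
            if start is None:          # speech onset
                start = i * FRAME
        elif start is not None:        # speech offset
            segments.append((start, i * FRAME))
            start = None
    if start is not None:              # speech runs to end of audio
        segments.append((start, n_frames * FRAME))
    return segments

def transcribe(segment):
    # Placeholder for an ASR call (e.g. a Whisper or CTC model on the segment).
    return f"[{len(segment)} samples]"

def multiplex(streams, workers=4):
    """Segment every user's audio, then transcribe all segments through one
    shared pool, interleaving work from concurrent users."""
    jobs = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for user, samples in streams.items():
            for s, e in vad_segments(samples):
                jobs.append((user, pool.submit(transcribe, samples[s:e])))
        return [(user, fut.result()) for user, fut in jobs]
```

Because segments rather than whole recordings are the unit of work, short utterances from one user can be transcribed while another user's longer segment is still in flight, which is the source of the latency reduction the abstract reports under rising concurrency.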
Similar Papers
WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
Sound
Lets phones understand your voice super fast.
LibriVAD: A Scalable Open Dataset with Deep Learning Benchmarks for Voice Activity Detection
Sound
Helps computers hear talking in loud places.
AS-ASR: A Lightweight Framework for Aphasia-Specific Automatic Speech Recognition
Audio and Speech Processing
Helps people with speech problems talk to computers.