Efficient Multilingual Dialogue Processing via Translation Pipelines and Distilled Language Models
By: Santiago Martínez Novoa, Nicolás Rozo Fajardo, Diego Alejandro González Vargas, et al.
This paper presents team Kl33n3x's multilingual dialogue summarization and question answering system developed for the NLPAI4Health 2025 shared task. The approach employs a three-stage pipeline: forward translation from Indic languages to English, multitask text generation using a 2.55B-parameter distilled language model, and reverse translation back to the source languages. By leveraging knowledge distillation, this work demonstrates that compact models can achieve highly competitive performance across nine languages. The system achieved strong win rates across the competition's tasks, with particularly robust performance on Marathi (86.7% QnA win rate), Tamil (86.7% QnA), and Hindi (80.0% QnA), showing the effectiveness of translation-based approaches for low-resource language processing without task-specific fine-tuning.
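To make the three-stage design concrete, the following is a minimal sketch of such a translate-generate-back-translate pipeline using Hugging Face transformers. The specific checkpoints are assumptions, not the paper's: NLLB-200 stands in for the translation stages, and flan-t5-large is a placeholder for the unnamed 2.55B-parameter distilled generator; the Hindi language codes are illustrative.

from transformers import pipeline

# Stage 1: forward translation, Indic source language -> English.
# "hin_Deva" is the NLLB code for Hindi; swap it per source language.
to_english = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="hin_Deva",
    tgt_lang="eng_Latn",
)

# Stage 2: multitask text generation (summarization or QnA) in English.
# Placeholder checkpoint; the paper's distilled model is not public here.
generate = pipeline("text2text-generation", model="google/flan-t5-large")

# Stage 3: reverse translation, English -> the original language.
to_source = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",
    tgt_lang="hin_Deva",
)

def summarize_dialogue(dialogue: str) -> str:
    """Run the full translate -> generate -> back-translate pipeline."""
    english = to_english(dialogue)[0]["translation_text"]
    summary = generate("summarize: " + english)[0]["generated_text"]
    return to_source(summary)[0]["translation_text"]

The same skeleton covers the QnA task by changing the Stage 2 prompt (e.g. prefixing the question and translated context instead of "summarize: "), which is what makes a single generator multitask.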
Similar Papers
TiME: Tiny Monolingual Encoders for Efficient NLP Pipelines
Computation and Language
Makes computer language tasks faster and uses less power.
Efficient Speech Translation through Model Compression and Knowledge Distillation
Computation and Language
Makes translation apps smaller and faster.
Distilling Multilingual Vision-Language Models: When Smaller Models Stay Multilingual
Computation and Language
Makes AI understand many languages better, even when the models are smaller.