An Empirical Study of On-Device Translation for Real-Time Live-Stream Chat on Mobile Devices
By: Jeiyoon Park, Daehwan Lee, Changmin Yeo and more
Potential Business Impact:
Makes AI work well on your phone.
Despite the efficiency of on-device AI models, there has been little research on the practical aspects of deploying them in the real world, such as the device's CPU utilization and thermal conditions. In this paper, through extensive experiments, we investigate two key issues that must be addressed to deploy on-device models in real-world services: (i) the selection of on-device models and the resource consumption of each model, and (ii) the capability and potential of on-device models for domain adaptation. To this end, we focus on the task of translating live-stream chat messages and manually construct LiveChatBench, a benchmark consisting of 1,000 Korean-English parallel sentence pairs. Experiments on five mobile devices demonstrate that, although serving a large and heterogeneous user base requires careful consideration of highly constrained deployment settings and model selection, the proposed approach nevertheless achieves performance comparable to commercial models such as GPT-5.1 on this well-targeted task. We expect that our findings will provide meaningful insights to the on-device AI community.