Make a Video Call with LLM: A Measurement Campaign over Five Mainstream Apps
By: Jiayang Xu, Xiangjie Huang, Zijie Li, and more
Potential Business Impact:
Tests how well AI video chats work.
In 2025, Large Language Model (LLM) services launched a new feature -- AI video chat -- allowing users to interact with AI agents via real-time video communication (RTC), much like chatting with real people. Despite its significance, no systematic study has characterized the performance of existing AI video chat systems. To address this gap, this paper proposes a comprehensive benchmark with carefully designed metrics across four dimensions: quality, latency, internal mechanisms, and system overhead. Using custom testbeds, we further evaluate five mainstream AI video chatbots with this benchmark. This work provides the research community with a baseline of real-world performance and identifies unique system bottlenecks. Meanwhile, our benchmarking results also open up several research questions for future optimizations of AI video chatbots.
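The abstract does not detail the benchmark's metrics, only its four dimensions (quality, latency, internal mechanisms, system overhead). As a purely illustrative sketch, with all names hypothetical and not taken from the paper, a per-session measurement record plus a simple wall-clock latency probe might look like:

```python
import time
from dataclasses import dataclass, field


@dataclass
class ChatbotMeasurement:
    """Hypothetical per-session record covering the paper's four dimensions."""
    app_name: str
    video_quality_mos: float = 0.0       # quality (e.g., a mean opinion score)
    response_latency_s: float = 0.0      # latency: end of user utterance -> start of agent reply
    pipeline_stages: dict = field(default_factory=dict)  # internal mechanisms (stage -> seconds)
    cpu_percent: float = 0.0             # system overhead on the client device


def time_response(send_fn) -> float:
    """Measure the wall-clock duration of one request/response exchange."""
    start = time.perf_counter()
    send_fn()  # in a real testbed this would drive an RTC round trip
    return time.perf_counter() - start


# Illustrative usage with a stand-in delay instead of a real video-chat session.
latency = time_response(lambda: time.sleep(0.05))
m = ChatbotMeasurement(app_name="demo-bot", response_latency_s=latency)
```

Real testbeds would of course derive these numbers from captured media streams and system counters rather than a sleep call; the sketch only shows one way the four dimensions could be organized per session.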
Similar Papers
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI
Networking and Internet Architecture
Makes AI video chats feel like talking to a person.
ChatBench: From Static Benchmarks to Human-AI Evaluation
Computation and Language
Tests how well people and AI work together.
AI Idea Bench 2025: AI Research Idea Generation Benchmark
Artificial Intelligence
Tests AI's best new ideas for science.