
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Published: December 31, 2025 | arXiv ID: 2512.24618v1

By: Junru Lu, Jiarui Qin, Lingfeng Qiao and more

We introduce Youtu-LLM, a lightweight yet powerful language model that harmonizes high computational efficiency with native agentic intelligence. Unlike typical small models that rely on distillation, Youtu-LLM (1.96B) is pre-trained from scratch to systematically cultivate reasoning and planning capabilities. The key technical advancements are as follows:

(1) Compact Architecture with Long-Context Support: Built on a dense Multi-Latent Attention (MLA) architecture with a novel STEM-oriented vocabulary, Youtu-LLM supports a 128k context window. This design enables robust long-context reasoning and state tracking within a minimal memory footprint, making it ideal for long-horizon agent and reasoning tasks.

(2) Principled "Commonsense-STEM-Agent" Curriculum: We curated a massive corpus of approximately 11T tokens and implemented a multi-stage training strategy. By progressively shifting the pre-training data distribution from general commonsense to complex STEM and agentic tasks, we ensure the model acquires deep cognitive abilities rather than superficial alignment.

(3) Scalable Agentic Mid-training: Specifically for the agentic mid-training, we employ diverse data construction schemes to synthesize rich and varied trajectories across math, coding, and tool-use domains. This high-quality data enables the model to internalize planning and reflection behaviors effectively.

Extensive evaluations show that Youtu-LLM sets a new state-of-the-art for sub-2B LLMs. On general benchmarks, it achieves competitive performance against larger models, while on agent-specific tasks, it significantly surpasses existing SOTA baselines, demonstrating that lightweight models can possess strong intrinsic agentic capabilities.

Unified Mind Model: Reimagining Autonomous Agents in the LLM Era

Artificial Intelligence

Builds smart robot helpers that learn and think.

5 Mar 2025 0

88%

From Language to Action: A Review of Large Language Models as Autonomous Agents and Tool Users

Computation and Language

AI learns to think, plan, and improve itself.

24 Aug 2025 0

88%

PustakAI: Curriculum-Aligned and Interactive Textbooks Using Large Language Models

Computation and Language

Helps AI answer school questions accurately.

13 Nov 2025 0

