Ministral 3
By: Alexander H. Liu, Kartik Khandelwal, Sandeep Subramanian, and more
Potential Business Impact:
Computers understand pictures and solve hard problems.
We introduce the Ministral 3 series, a family of parameter-efficient dense language models designed for compute- and memory-constrained applications, available in three model sizes: 3B, 8B, and 14B parameters. For each model size, we release three variants: a pretrained base model for general-purpose use, an instruction-finetuned model, and a reasoning model for complex problem-solving. In addition, we present our recipe for deriving the Ministral 3 models through Cascade Distillation, an iterative technique that alternates pruning with continued training under distillation. Each model comes with image understanding capabilities, all under the Apache 2.0 license.
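The abstract describes Cascade Distillation only at a high level: repeatedly prune the model, then continue training the pruned model against the larger teacher. As a minimal toy sketch of that loop, assuming magnitude pruning and a simple MSE distillation loss on a linear student (the function names and hyperparameters here are illustrative, not from the paper):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude entries of w (one pruning step)."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.sort(np.abs(w), axis=None)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned

def distill_step(student_w, teacher_w, x, lr=0.1):
    """One gradient step pulling student outputs toward teacher outputs."""
    y_teacher = x @ teacher_w   # teacher predictions serve as soft targets
    y_student = x @ student_w
    grad = x.T @ (y_student - y_teacher) / len(x)
    return student_w - lr * grad

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4))
student = teacher.copy()        # start the student from the teacher's weights
x = rng.normal(size=(64, 8))

# Cascade: alternate pruning with continued training under distillation,
# increasing sparsity at each stage and keeping pruned weights at zero.
for sparsity in (0.25, 0.5):
    student = magnitude_prune(student, sparsity)
    mask = (student != 0).astype(float)
    for _ in range(200):
        student = distill_step(student, teacher, x) * mask

gap = float(np.mean((x @ student - x @ teacher) ** 2))
print(f"final sparsity: {np.mean(student == 0):.2f}, distillation gap: {gap:.4f}")
```

The staged schedule (25% then 50% sparsity) mirrors the "cascade" idea: each pruning step removes capacity gradually, and the distillation phase between steps lets the remaining weights recover before the next cut.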