Two new machine translation models are shaking up the field. HY-MT1.5-1.8B and HY-MT1.5-7B combine efficiency with strong performance, setting new benchmarks for their size and questioning whether ultra-large models need to dominate the market.
The Story
Machine translation powers global communication, breaking language barriers every day. Larger models have long led the pack, demanding heavy compute and resources. The HY-MT1.5 series flips this script, delivering nearly the same quality with far fewer parameters and cutting both cost and resource use.
These models outperform bigger open-source options like Tower-Plus-72B and Qwen3-32B, plus commercial APIs such as Microsoft Translator and Doubao Translator. According to recent research (arXiv:2512.24092v1), they hit about 90% of the performance of ultra-large proprietary models like Gemini-3.0-Pro. They also excel on benchmarks like WMT25 and tests involving Mandarin-minority languages.
The Context
The HY-MT1.5 models were built by a team including Mao Zheng, Zheng Li, Tao Chen, Mingyang Song, and Di Wang. With 1.8 billion and 7 billion parameters, these models deliver strong results without the massive compute costs of their larger rivals. They use a multi-stage training process: general and machine translation pre-training, supervised fine-tuning, on-policy distillation, and reinforcement learning.
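The ordering of those stages matters: each one starts from the checkpoint the previous stage produced. A minimal sketch of that sequencing is below; the stage names and the `run_pipeline` interface are illustrative only, not the authors' actual training code.

```python
# Illustrative sketch of the multi-stage training order described in the
# paper summary. Stage names and data structures are assumptions for
# illustration, not a real training implementation.

TRAINING_STAGES = [
    "general_pretraining",
    "mt_pretraining",            # machine-translation-specific pre-training
    "supervised_finetuning",
    "on_policy_distillation",
    "reinforcement_learning",
]

def run_pipeline(model_state: dict) -> dict:
    """Apply each stage in sequence; each stage consumes the checkpoint
    produced by the previous one. Here we only record completion order."""
    for stage in TRAINING_STAGES:
        completed = model_state.get("completed", []) + [stage]
        model_state = {**model_state, "completed": completed}
    return model_state
```

The key design point the sketch captures is that distillation and reinforcement learning are applied last, on top of an already fine-tuned translation model, rather than in place of supervised training.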
The 1.8B model shows exceptional efficiency, beating much larger models in Chinese-foreign and English-foreign translation tasks. The 7B model sets a new standard for its size, reaching 95% of Gemini-3.0-Pro’s score on the Flores-200 benchmark and outperforming it on WMT25 and Mandarin-minority language tests.
Beyond translation quality, these models handle advanced features like terminology control, context-aware translation, and format preservation. This versatility suits both everyday and specialized translation needs.
Key Takeaways
- Parameter Efficiency: HY-MT1.5 models deliver top performance with fewer parameters, lowering costs and resource demands.
- Competitive Edge: They outperform larger open-source baselines and some commercial APIs, setting new size-class records.
- Industry Shift: These models challenge the trend toward ever-larger proprietary systems, hinting at a new direction in machine translation.
- Advanced Capabilities: Support for terminology intervention and context-aware translation broadens their use cases.
- Research Leadership: The work shows that a carefully staged training recipe, from pre-training through distillation and reinforcement learning, can close most of the gap to far larger systems.
The HY-MT1.5-1.8B and HY-MT1.5-7B models mark a turning point. They prove you don’t need to build bigger to build better. This could reshape machine translation’s future, making high-quality AI-driven translation more accessible and efficient.