It is well known that increasing both the amount of data and the size of AI models can make them much smarter. However, researchers and companies still don’t have much experience in scaling up extremely large models, whether they are traditional (dense) models or Mixture-of-Experts (MoE) models. Recently, DeepSeek V3 revealed some important details about this process.
At the same time, we have been developing Qwen2.5-Max, a powerful MoE model trained on over 20 trillion words. We also improved it using Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Today, we are excited to share its performance and announce that the Qwen2.5-Max API is now available through Alibaba Cloud. You can also try it out on Qwen Chat!
How Well Does Qwen2.5-Max Perform?
We tested Qwen2.5-Max against top AI models, both private and open-source, using different benchmarks:
- MMLU-Pro – Tests knowledge using college-level questions.
- LiveCodeBench – Measures coding skills.
- LiveBench – Tests general AI abilities.
- Arena-Hard – Evaluates how well the AI aligns with human preferences.
Key Results:
- Qwen2.5-Max performed better than DeepSeek V3 in Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond.
- It also showed strong results in MMLU-Pro.
When comparing base models (without extra fine-tuning), we tested Qwen2.5-Max against leading open-source models, such as:
- DeepSeek V3 (a top MoE model).
- Llama-3.1-405B (largest open-weight dense model).
- Qwen2.5-72B (one of the best open-weight dense models).
The results show that Qwen2.5-Max is among the best models available today. With further improvements in training, future versions will be even better.
Try Qwen2.5-Max
You can now chat with Qwen2.5-Max on Qwen Chat! It supports chatting, searching, and interacting with different features.
If you want to use the Qwen2.5-Max API, follow these steps:
- Create an Alibaba Cloud account.
- Activate Alibaba Cloud Model Studio.
- Get an API key from the console.
Since Qwen’s API is compatible with OpenAI’s, you can use it in Python just like OpenAI’s models. Here’s an example:
Python:
pythonCopyEditfrom openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-max-2025-01-25",
messages=[
{'role': 'system', 'content': 'You are a helpful assistant.'},
{'role': 'user', 'content': 'Which number is larger, 9.11 or 9.8?'}
]
)
print(completion.choices[0].message)
Looking Ahead
Scaling AI models is not just about making them bigger; it’s about making them smarter. We are working on advanced reinforcement learning techniques to improve AI’s reasoning and problem-solving skills. Our goal is to push AI beyond human intelligence and explore new frontiers of knowledge.