AIME 2026 Results: Both Open and Closed Models Score Above 90%

The AIME 2026 (American Invitational Mathematics Examination) results are out, and both closed and open AI models are now scoring above 90% on this challenging mathematical reasoning benchmark.
Key Highlights
- Both proprietary (closed) and open-source models exceed 90% accuracy
- DeepSeek V3.2 can run the entire test for approximately bash.09 in API costs
- This represents a significant milestone in mathematical reasoning capabilities
What This Means
AIME is traditionally one of the most challenging high school mathematics competitions, featuring problems that require sophisticated mathematical reasoning. AI models achieving 90%+ accuracy demonstrates remarkable progress in complex reasoning abilities.
Cost Efficiency
The fact that DeepSeek V3.2 can achieve competitive results at just bash.09 for the entire test highlights the rapidly decreasing cost of advanced AI capabilities, making sophisticated reasoning more accessible.
Why This Matters
The achievement of over 90% accuracy by both closed and open AI models signifies a pivotal moment in the evolution of AI technologies. It showcases the potential for AI to assist not only in educational contexts but also in real-world applications where complex problem-solving is required. This advancement may encourage further investment and development in AI systems, particularly in areas that require high-level cognitive functions.
Key Takeaways
- The performance of AI models in AIME 2026 indicates a leap in their mathematical reasoning capabilities.
- Both proprietary and open-source models are reaching similar levels of accuracy, promoting healthy competition and innovation in the AI space.
- Cost-effective solutions like DeepSeek V3.2 are making advanced AI tools more accessible to a broader audience.
- This progress could inspire educational institutions to integrate AI tools into their curricula, enhancing learning experiences.
Getting Started
For those interested in leveraging AI for mathematical reasoning or other complex tasks, starting with tools like DeepSeek V3.2 is straightforward. Users can sign up for an API key on the DeepSeek website, enabling them to access the model's capabilities. Once registered, developers can integrate the API into their applications or use it for personal projects, allowing for experimentation with AI-driven problem-solving.
Full results: matharena.ai
📖 Read the full source: r/LocalLLaMA
👀 See Also

ACP Bug Investigation: Protocol Mismatch Causes 'metadata is missing' Error with Local Ollama
A confirmed bug in the ACP/OpenClaw integration prevents acpx spawn commands from working with local Ollama models due to a protocol mismatch where acpx expects JSON but receives text output.

Revolutionize API Monitoring Across Providers with onWatch
Discover how onWatch, a powerful new tool, streamlines tracking your AI API quota usage across multiple providers, ensuring you stay within limits and optimize resource allocation.

Google Quietly Buying Play Store Code to Train AI Coding Tools
Google is emailing Android developers offering to pay for their app codebases to train AI coding tools, as part of a confidential pilot program.

The AI Dependency Trap: Why Over-Reliance on LLMs May Erode Core Skills
A contrarian take arguing that heavy reliance on AI chatbots will lead to atrophy of critical thinking, writing, research, and learning abilities.