MiniMax M2.7 Beats GPT 5.4 & Gemini 3.1 Pro as Coding Agent

MiniMax M2.7 Model Performance Details

The MiniMax M2.7 model was recently announced as the company's first model that "deeply participated in its own evolution," achieving an 88% win-rate against the previous M2.5 version.

Key Performance Metrics

SWE Performance: State-of-the-art results on SWE-Pro (56.22%) and Terminal Bench 2 (57.0%)
Production Readiness: Reduced intervention-to-recovery time for online incidents to 3 minutes in certain cases
Agentic Abilities: Trained for agent teams and tool search tool functionality, with 97% skill adherence across 40+ complex skills
Professional Workspace: State-of-the-art in professional knowledge, supporting multi-turn, high-fidelity Office file editing
OpenClaw Comparison: On par with Sonnet 4.6 in OpenClaw performance

User Testing Results

A developer who previously used Opus and Sonnet as their main agents tested M2.7 against several models. In their benchmarks comparing MiniMax M2.7 with GPT 5.4, Gemini 3.1 Pro, and other models, MiniMax delivered the fastest working results.

The developer created specific tooling challenges that models often struggle with, including:

Connecting to a system (finding IP, credentials)
Grabbing a config file requiring sudo access
Comparing it with another similar file on a local system
Reporting the differences

MiniMax M2.7 succeeded in this multi-step tool chain where some models failed completely, and was the fastest performer.

After approximately 5 hours of active usage with extensive tooling and system troubleshooting (though no coding tasks), the developer reported not missing Sonnet or Opus once.

The developer noted that while MiniMax pricing is approximately 10x the cost of Anthropic models, the performance made it an interesting alternative to consider.

📖 Read the full source: r/openclaw

MiniMax M2.7 Model Shows Strong Performance as AI Coding Agent

MiniMax M2.7 Model Performance Details

Key Performance Metrics

User Testing Results

👀 See Also

Study: AI Agents Express Marxist Views Under Repetitive Workloads

GitHub Claude-Code v2.1.27 Release: Key Updates and Fixes

Anthropic blocks third-party harnesses from Claude subscription limits, workaround available

AWS Bedrock Silently Kills Claude Opus 4.7 Quota: A Warning for Production AI Workflows