Mistral Medium 3.5 128B Released: Dense Model with Configurable Reasoning and Vision

Mistral AI has released Mistral Medium 3.5 (128B), a dense transformer model that replaces Mistral Medium 3.1 and Magistral in Le Chat, and Devstral 2 in their coding agent Vibe. It's a single set of weights handling instruction-following, reasoning, and coding.
Key Features
- Dense 128B parameters — not Mixture of Experts.
- 256k context window for long inputs.
- Multimodal input: accepts text and images; outputs text only. Vision encoder trained from scratch to handle variable sizes and aspect ratios.
- Configurable reasoning effort: toggle per request between instant reply (
none) and deep reasoning (high). - Native function calling and JSON output for agentic workflows.
- Multilingual: supports English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and others.
- Strong system prompt adherence.
Recommended Settings
- Reasoning effort:
nonefor quick replies;highfor complex prompts and agentic usage (e.g.,reasoning_effort="high"). - Temperature: 0.7 with
highreasoning; 0.0–0.7 withnonedepending on desired creativity.
License
Released under a Modified MIT License — open-source for commercial and non-commercial use, with exceptions for large revenue companies.
GGUF Quantizations Available
Unsloth has published a GGUF version on Hugging Face: unsloth/Mistral-Medium-3.5-128B-GGUF
This model is relevant for developers running local AI coding agents, particularly those needing high-quality instruction following, reasoning, and vision in a single dense model with a large context window.
📖 Read the full source: r/LocalLLaMA
👀 See Also

Anthropic Doubles Claude Code Rate Limits, Removes Peak Throttling for Paid Plans
Anthropic has doubled 5-hour rate limits for Claude Code across Pro, Max, Team, and Enterprise plans, removed peak-hour throttling, and boosted API rate limits for Opus models.

Claude Code Generates Python Script That Finds 10,069-Digit Emirp Record
Anthropic's Claude Opus 4.6 generated a Python script that discovered a 10,069-digit emirp (reversible prime) in about one day of CPU time, breaking the previous world record. The script uses four tiers of prime sieves including a CUDA kernel for fast random number generation.

Anthropic's Business Strategy: API Revenue Drives Consumer Tier Limitations
Anthropic's consumer subscription tiers operate at a loss, subsidized to build AI mindshare, while their API business generates revenue. The $20 Pro tier is intentionally limited to filter users toward higher-value Max subscriptions.

SDNY Court Rules AI-Generated Legal Documents Not Protected by Privilege
Judge Jed S. Rakoff ruled that 31 documents generated using Anthropic's Claude AI tool were not protected by attorney-client privilege or work product doctrine, marking the first such court decision on AI-generated legal materials.