Sarvam AI releases 30B and 105B open-source LLMs with Indian training infrastructure

Model specifications and architecture
Sarvam 30B and Sarvam 105B are reasoning models trained from scratch on large-scale, high-quality datasets curated in-house across pre-training, supervised fine-tuning, and reinforcement learning stages. Training was conducted entirely in India on compute provided under the IndiaAI mission.
Both models use a Mixture-of-Experts (MoE) Transformer backbone with sparse expert routing, which scales total parameter count without a proportional increase in compute per token, since only a subset of experts is active for each token. The architecture supports long-context inputs through rotary positional embeddings, RMSNorm-based stabilization, and attention designs optimized for efficient KV-cache usage during inference.
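To make the total-versus-active distinction concrete, here is a minimal arithmetic sketch for a single MoE feed-forward layer. Only the expert count of 128 comes from the release. The hidden size, expert width, top-k value, and the gated three-matrix expert layout are illustrative assumptions, not Sarvam's published configuration.

```python
def moe_ffn_params(d_model, d_expert, num_experts, top_k):
    """Return (total, active_per_token) parameter counts for one MoE FFN layer.

    Assumes each expert is a gated feed-forward block with three weight
    matrices (gate, up, down) and that the router is a single linear layer.
    Both are assumptions made for illustration.
    """
    per_expert = 3 * d_model * d_expert      # gate + up + down projections
    router = d_model * num_experts           # one routing logit per expert
    total = num_experts * per_expert + router
    active = top_k * per_expert + router     # only top_k experts run per token
    return total, active


# Hypothetical dimensions; only num_experts=128 is taken from the article.
total, active = moe_ffn_params(d_model=4096, d_expert=1024, num_experts=128, top_k=8)
print(f"total: {total / 1e6:.1f}M, active per token: {active / 1e6:.1f}M")
print(f"total/active ratio: {total / active:.1f}x")
```

Under these assumed numbers, stored parameters grow with the number of experts while per-token compute grows only with the number of experts selected.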
Sarvam 30B uses Grouped Query Attention (GQA) to reduce KV-cache memory while maintaining performance. Sarvam 105B extends the architecture with greater depth and Multi-head Latent Attention (MLA), a compressed attention formulation that reduces memory requirements for long-context inference. Both models use sparse expert feedforward layers with 128 experts but differ in expert capacity and routing configuration.
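The KV-cache saving from GQA can be estimated with simple arithmetic: the cache stores one key and one value vector per KV head, per layer, per token, so sharing KV heads across groups of query heads shrinks it proportionally. The sketch below uses hypothetical layer counts, head counts, and context length, not the actual Sarvam 30B configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV-cache size in bytes for one sequence.

    The leading factor of 2 covers keys and values. bytes_per_elem=2
    assumes 16-bit storage (fp16 or bf16).
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical configuration chosen for illustration only.
layers, query_heads, head_dim, seq_len = 32, 32, 128, 32768

mha = kv_cache_bytes(layers, num_kv_heads=query_heads, head_dim=head_dim, seq_len=seq_len)
gqa = kv_cache_bytes(layers, num_kv_heads=8, head_dim=head_dim, seq_len=seq_len)

print(f"MHA cache: {mha / 2**30:.1f} GiB")
print(f"GQA cache (8 KV heads): {gqa / 2**30:.1f} GiB")
print(f"reduction: {mha // gqa}x")
```

MLA, used in Sarvam 105B, takes a different route by caching a compressed latent representation instead of full per-head keys and values, so this formula does not apply to it directly.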
Training and data details
The 30B model was trained on 16T tokens, while the 105B model was trained on 12T tokens. Pre-training data spans code, general web data, specialized knowledge corpora, mathematics, and multilingual content with substantial allocation to the 10 most-spoken Indian languages.
Training used sigmoid-based routing scores rather than traditional softmax gating, which improves expert load balancing and reduces routing collapse. An expert-bias term stabilizes routing dynamics and encourages more uniform expert utilization across training steps.
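The routing idea can be sketched in a few lines of plain Python. Each expert gets an independent sigmoid score, so experts do not compete through a shared softmax normalization, and a per-expert bias shifts which experts are selected. The release does not say how the bias is applied or updated. The version here, where the bias affects selection only and is nudged according to observed load, is an assumption modeled on auxiliary-loss-free balancing schemes, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def route_token(logits, bias, top_k):
    """Pick top_k experts for one token and return {expert_index: weight}.

    Assumption: the bias changes which experts are selected but not the
    mixing weights, which are the renormalized sigmoid scores.
    """
    scores = [sigmoid(z) for z in logits]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(scores[i] for i in chosen)
    return {i: scores[i] / norm for i in chosen}


def update_bias(bias, load, step=0.01):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, tokens_routed in zip(bias, load):
        if tokens_routed > mean_load:
            new_bias.append(b - step)
        elif tokens_routed < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias


# Toy example with 4 experts and top-2 routing.
logits = [2.0, 1.5, -0.5, 0.1]
print(route_token(logits, bias=[0.0, 0.0, 0.0, 0.0], top_k=2))
print(route_token(logits, bias=[-1.0, 0.0, 0.0, 0.5], top_k=2))
print(update_bias([0.0, 0.0, 0.0, 0.0], load=[10, 2, 4, 4]))
```

In the second call, the negative bias on expert 0 and the positive bias on expert 3 change the selected pair even though the token's logits are identical, which is how a bias term can steer traffic away from overused experts.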
Pre-training was conducted in three phases: long-horizon pre-training, mid-training, and a long-context extension phase. The 105B model overtook the 30B model on benchmarks early in training, suggesting efficient scaling behavior.
Performance and deployment
Sarvam 105B performs well on reasoning, programming, and agentic tasks across benchmarks. Sarvam 30B is optimized for real-time deployment with strong performance on real-world conversational use cases. Both models achieve state-of-the-art results on Indian language benchmarks, outperforming significantly larger models.
Sarvam 30B powers Samvaad, Sarvam's conversational agent platform. Sarvam 105B powers Indus, their AI assistant built for complex reasoning and agentic workflows.
Access and implementation
Weights can be downloaded from AI Kosh (30B, 105B) and Hugging Face (30B, 105B). For local inference with Transformers, vLLM, or SGLang, refer to the Hugging Face model pages for sample implementations. Both models are also available through Sarvam's API, which is accessed from the API dashboard.