Gemini 3.1 Flash Live: Google's latest audio model with improved benchmarks and watermarking

What's new in Gemini 3.1 Flash Live
Google has released Gemini 3.1 Flash Live, its highest-quality audio and voice model, designed for real-time dialogue. The model delivers faster responses and a more natural conversational rhythm for voice-first AI applications.
Key technical details
- Benchmark scores: 90.8% on ComplexFuncBench Audio (multi-step function calling with constraints) and 36.1% on Scale AI's Audio MultiChallenge (complex instruction following, with "thinking" enabled)
- Improved capabilities: Better tonal understanding, recognition of acoustic nuances like pitch and pace, and dynamic adjustment to user frustration or confusion
- Watermarking: All generated audio includes a SynthID watermark so that AI-generated content can be detected
- Multilingual support: The model is multilingual and available in over 200 countries and territories
Availability and access
- For developers: Available in preview via the Gemini Live API in Google AI Studio
- For enterprises: Included in Gemini Enterprise for Customer Experience
- For general users: Accessible via Search Live and Gemini Live
The model lets developers build voice-ready agents that handle complex tasks in noisy environments and sustain longer conversation threads.
📖 Read the full source: HN AI Agents