Anam Cara-3: Advancements in Interactive AI Avatars

Anam has released its latest model, Cara-3, for generating interactive avatars. The model uses a two-stage pipeline: a diffusion transformer first converts audio into motion embeddings (head position, eye gaze, lip shape, and expression), and those embeddings are then applied to a reference image to generate video frames. This design allows any face to be animated without retraining.
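To make the data flow concrete, here is a minimal Python sketch of a two-stage pipeline of this shape. It is not Anam's code: the names `MotionEmbedding`, `audio_to_motion`, and `render_frames` are hypothetical, and both stages are toy stand-ins (an amplitude average in place of the diffusion transformer, a text label in place of rendered pixels).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionEmbedding:
    """Hypothetical per-frame motion vector; the four fields mirror the text."""
    head_position: float
    eye_gaze: float
    lip_shape: float
    expression: float


def audio_to_motion(audio_chunks: List[List[float]]) -> List[MotionEmbedding]:
    """Stage 1 stand-in. A real system would run a diffusion transformer here;
    this toy maps each chunk's mean absolute amplitude to lip opening."""
    motions = []
    for chunk in audio_chunks:
        energy = sum(abs(s) for s in chunk) / max(len(chunk), 1)
        motions.append(MotionEmbedding(0.0, 0.0, energy, 0.0))
    return motions


def render_frames(reference_image: str, motions: List[MotionEmbedding]) -> List[str]:
    """Stage 2 stand-in. A real renderer would synthesize pixels;
    this toy returns a text description of each frame."""
    return [f"{reference_image} lip={m.lip_shape:.2f}" for m in motions]


# The same motion sequence drives two different faces with no retraining step.
motions = audio_to_motion([[0.0, 0.0], [0.5, -0.5]])
frames_a = render_frames("face_a.png", motions)
frames_b = render_frames("face_b.png", motions)
print(frames_a)
print(frames_b)
```

The point of the split is that the first stage never sees the face, so swapping the reference image changes only the second stage's input.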
Notably, Cara-3 achieves a time-to-first-frame of approximately 70 ms on an H200, which supports many concurrent avatar sessions on a single GPU. This speed is partly due to a novel flow matching variant used for the audio-to-motion stage, adopted because conventional techniques proved unstable.
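The details of Anam's variant are not given here, so the sketch below shows only textbook flow matching sampling: start from noise and integrate a velocity field from t=0 to t=1 with a few Euler steps. Everything in it is an assumption for illustration, including the `TARGET` vector and the closed-form `toy_velocity`, which stands in for a trained, audio-conditioned network.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 x0: Vector, num_steps: int) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy target: every sample should end at this (hypothetical) motion vector.
TARGET = [0.2, -0.1, 0.7, 0.0]


def toy_velocity(x: Vector, t: float) -> Vector:
    """Exact velocity field for straight-line paths ending at TARGET.
    In a real model, a trained network conditioned on audio predicts this."""
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(TARGET, x)]


noise = [1.5, -0.8, 0.3, 2.0]  # stands in for a Gaussian noise draw
sample = euler_sample(toy_velocity, noise, num_steps=8)
print(sample)
```

In general, each integration step costs one evaluation of the velocity model, so the number of steps is one of the knobs that trades sample quality against latency.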
In an independent blind evaluation, Cara-3 outperformed competitors including HeyGen, Tavus, and D-ID, scoring 24% higher on average across the evaluated metrics. The evaluation also found that responsiveness correlates more strongly with overall user experience (Spearman correlation of 0.697) than visual quality does (0.473).
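For readers unfamiliar with the statistic, Spearman correlation is the Pearson correlation of the ranks of two variables, so it measures how consistently one rating rises with another. The sketch below computes it from scratch; the two rating lists are invented for illustration and are not the evaluation's data.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented ratings from five hypothetical raters (not the evaluation's data).
responsiveness = [1, 2, 3, 4, 5]
overall_experience = [2, 1, 4, 3, 5]
rho = spearman(responsiveness, overall_experience)
print(round(rho, 3))  # 0.8
```

A value near 1 means raters who scored one attribute higher almost always scored the other higher too, which is the sense in which 0.697 indicates a stronger link than 0.473.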
Anam has also open-sourced Metaxy, the backbone of its training data pipeline, which supports iterative development without rerunning costly steps.
📖 Read the full source: HN AI Agents