Claude Code vs Codex: 36 vs 28 files, $2.50 vs $2.04, infinite loop caught — real-world comparison
A user on r/ClaudeAI ran a head-to-head comparison of Claude Code and Codex (via Cursor) on two practical tasks, using the same prompts, the same MCP setup (GitHub + Slack), and the same machine. No synthetic benchmarks, just real builds.
Tasks
- Task 1: PR triage bot. Read open PRs, score by complexity (files ×2, lines/10, +3 for no labels, +5 for no reviewers), write a markdown report, post Slack alerts for high scores. Required retries, error logging, strict TypeScript, no `any`.
- Task 2: Real-time code review UI. React + TypeScript, WebSockets, inline comment threads, optimistic updates with rollback, virtualized diff viewer, WS reconnect with exponential backoff. No UI libraries.
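The Task 1 scoring rule is simple enough to sketch directly. The snippet below is only an illustration of the formula as stated in the task, not code from either agent's output: the `PullRequestSummary` shape and its field names are hypothetical, "lines" is assumed to mean total changed lines, and the division by 10 is assumed to be unrounded.

```typescript
// Hypothetical PR shape; the field names are assumptions, not from the post.
interface PullRequestSummary {
  filesChanged: number;
  linesChanged: number; // assumed: additions + deletions
  labels: string[];
  reviewers: string[];
}

// Complexity score as described in the task:
// files ×2, lines/10, +3 for no labels, +5 for no reviewers.
function complexityScore(pr: PullRequestSummary): number {
  let score = pr.filesChanged * 2 + pr.linesChanged / 10;
  if (pr.labels.length === 0) score += 3;
  if (pr.reviewers.length === 0) score += 5;
  return score;
}

const example: PullRequestSummary = {
  filesChanged: 4,
  linesChanged: 120,
  labels: [],
  reviewers: ["alice"],
};

// 4 * 2 + 120 / 10 + 3 = 23
console.log(complexityScore(example)); // prints 23
```

The post does not say what counts as a "high" score for the Slack alert, so no threshold is shown here.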
Claude Code results
- Ran `/mcp` to verify tools before writing code
- Built 36 files in ~12 minutes
- Wrote an unprompted two-client WebSocket smoke test (broadcast: 3ms)
- Zero `any`, passed typecheck first try
- UI worked immediately
Codex (via Cursor) results
- Failed Task 1: GitHub MCP wasn't reachable through Cursor's execution path. Handled it cleanly (retried 3x, logged errors, didn't crash), but no delivery.
- Task 2: Shipped a working UI in ~15 minutes, smoke test passed at 5ms
- Hit TypeScript errors on first compile and an infinite React loop (`useEffect` calling hydrate repeatedly). Needed a ref guard patch.
- 28 files, more compact architecture
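The post does not show the actual patch, so the following is only a model of why a ref guard stops this kind of loop. It deliberately avoids importing React so it runs on its own: `simulateRenders` is a hypothetical stand-in for React re-running an effect after each state change, and `hasHydrated` mimics the `{ current }` object that `useRef` returns. In a real component the same check would sit at the top of the `useEffect` callback.

```typescript
// Same shape as the object returned by React's useRef.
interface MutableRef<T> {
  current: T;
}

// Hypothetical render loop: re-renders while the effect reports a state change.
// The cap exists only so the unguarded case terminates in this demo.
function simulateRenders(effect: () => boolean, maxRenders = 25): number {
  let renders = 0;
  let stateChanged = true; // initial mount
  while (stateChanged && renders < maxRenders) {
    renders += 1;
    stateChanged = effect();
  }
  return renders;
}

// Unguarded: every effect run "hydrates" and changes state, so it never settles.
let unguardedCalls = 0;
const unguardedRenders = simulateRenders(() => {
  unguardedCalls += 1;
  return true;
});

// Guarded: the ref remembers that hydration already happened.
let guardedCalls = 0;
const hasHydrated: MutableRef<boolean> = { current: false };
const guardedRenders = simulateRenders(() => {
  if (hasHydrated.current) return false; // guard: skip repeat hydration
  hasHydrated.current = true;
  guardedCalls += 1;
  return true; // the one hydration triggers a single follow-up render
});

console.log(unguardedRenders, unguardedCalls); // prints 25 25 (hit the cap)
console.log(guardedRenders, guardedCalls); // prints 2 1
```

A ref works here because mutating `current` does not itself trigger a re-render, unlike tracking the same flag in component state.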
Cost (estimated, both tasks)
- Claude: ~$2.50
- Codex: ~$2.04
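- Difference: ~$0.46, or ~18-23% depending on the baseline (Claude cost ~23% more than Codex; Codex cost ~18% less than Claude)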
Takeaways
Neither agent “won”. Claude feels like pairing with someone who verifies everything before touching the keyboard. Codex feels like a senior dev who wants to ship and move on. Both got WebSocket broadcast under 10ms, which wasn't a given six months ago. No `any` leaks, no hallucinated tool names.
📖 Read the full source: r/ClaudeAI