Autonoma rewrites its 18-month-old codebase: lessons on testing, tech debt, and Server Actions

Why a successful product needed a complete rewrite
Autonoma pivoted several times (enterprise search, documentation generation, a coding agent, and finally a QA testing platform). Over more than 1.5 years the company developed its product, closed clients, raised funding from a major industry player, and hired a team of 14. Despite this traction, the team decided to throw away the entire codebase and start over.
The no-tests era and its consequences
Initially, the team used a TypeScript monorepo with strict mode turned off and no tests. This worked while 2 engineers each owned large portions of the codebase, but it became disastrous once the team grew. The codebase developed null issues, undefined behavior, and bad error handling; bugs appeared "out of the blue", and the problems even cost the company a client. The founder had initially prohibited tests to preserve a culture of shipping fast, but later realized that the ban was hurting both product quality and productivity.
Technical decisions driving the rewrite
The original product was built during the GPT-4 era (the original GPT-4, not GPT-4o), when models required extensive guardrails. The team built sophisticated Playwright and Appium wrappers with complex inspections and 7 clicking strategies that would self-heal on the fly. With newer models, this level of inspection is no longer necessary, which makes the legacy codebase, along with its tech debt, less valuable.
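The source does not describe how the 7 clicking strategies were implemented. As a rough illustration of the general idea, a self-healing click can be sketched as an ordered list of strategies tried until one succeeds. Every name below (`ClickStrategy`, `clickWithFallback`, the three stand-in strategies) is hypothetical and is not taken from Autonoma's code, and the strategies only inspect a string rather than calling Playwright or Appium.

```typescript
// Hypothetical sketch: none of these names come from Autonoma's codebase.
type ClickStrategy = {
  name: string;
  // Returns true if the click succeeded, false if this strategy does not apply.
  attempt: (target: string) => boolean;
};

type ClickResult =
  | { ok: true; strategy: string }
  | { ok: false; tried: string[] };

// Try each strategy in order and stop at the first one that succeeds.
function clickWithFallback(target: string, strategies: ClickStrategy[]): ClickResult {
  const tried: string[] = [];
  for (const strategy of strategies) {
    tried.push(strategy.name);
    try {
      if (strategy.attempt(target)) {
        return { ok: true, strategy: strategy.name };
      }
    } catch {
      // A throwing strategy counts as a miss; fall through to the next one.
    }
  }
  return { ok: false, tried };
}

// Stand-in strategies; a real wrapper would drive a browser or device here.
const strategies: ClickStrategy[] = [
  { name: "by-test-id", attempt: (t) => t.startsWith("#") },
  {
    name: "by-text",
    attempt: (t) => {
      if (t === "") throw new Error("empty target");
      return t.includes("Submit");
    },
  },
  { name: "by-coordinates", attempt: () => true },
];
```

With these stand-ins, `clickWithFallback("#login", strategies)` succeeds on the first strategy, while `clickWithFallback("", strategies)` falls through to `by-coordinates` after the first strategy misses and the second throws. Each extra layer like this is code that has to be maintained, which is the cost the team no longer considers worth paying.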
Dropping Next.js and Server Actions
The team is moving away from Next.js and Server Actions, citing several issues:
- Server Actions are async, so calling them from React requires useEffect blocks or manual state handling
- They are hard to test: tests need Prisma objects backed by an in-memory database, or mocking
- They offer no dependency injection
- They execute sequentially and globally, which the team describes as a "manufactured Python Global Interpreter Lock but in TypeScript"
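To make the testing and dependency injection complaints concrete, here is a minimal sketch of the alternative: business logic that receives its data store as a parameter, so a test can pass an in-memory fake instead of a database client. The `TestRun` and `TestRunStore` types and both functions are assumptions made for illustration; the source does not show what the new codebase looks like.

```typescript
// Hypothetical domain types; not taken from the original codebase.
interface TestRun {
  id: string;
  status: "passed" | "failed";
}

// The only thing the business logic needs to know about storage.
interface TestRunStore {
  findByProject(projectId: string): Promise<TestRun[]>;
}

// The store is injected as an argument instead of being imported as a global client.
async function countFailures(store: TestRunStore, projectId: string): Promise<number> {
  const runs = await store.findByProject(projectId);
  return runs.filter((run) => run.status === "failed").length;
}

// In a test, an in-memory fake stands in for the database-backed store.
function createFakeStore(data: Record<string, TestRun[]>): TestRunStore {
  return {
    async findByProject(projectId) {
      return data[projectId] ?? [];
    },
  };
}
```

A production implementation of `TestRunStore` could wrap a database client, while tests call `countFailures(createFakeStore({ ... }), "p1")` with no database or mocking library involved.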
The new implementation is built with tests from the ground up and uses TypeScript's strictest mode.
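As a small illustration of what strict mode buys, consider an optional field. With `"strict": true` in `tsconfig.json` (which enables `strictNullChecks`), the compiler refuses to call a method on a possibly undefined value until the missing case is handled. The `Client` type and `contactDomain` function are hypothetical examples, not code from the source.

```typescript
// Hypothetical type for illustration only.
interface Client {
  name: string;
  contactEmail?: string; // optional, so its type is string | undefined
}

// With strict mode off, `client.contactEmail.toLowerCase()` compiles and can
// throw at runtime. With strict mode on, the compiler rejects it until the
// undefined case is handled explicitly, as done here.
function contactDomain(client: Client): string | null {
  if (client.contactEmail === undefined) {
    return null;
  }
  const at = client.contactEmail.lastIndexOf("@");
  return at === -1 ? null : client.contactEmail.slice(at + 1).toLowerCase();
}
```

Here `contactDomain({ name: "Acme" })` returns `null` instead of crashing, and `contactDomain({ name: "Acme", contactEmail: "Ops@Example.COM" })` returns `"example.com"`.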
📖 Read the full source: HN AI Agents