Tolan's AI-Enabled Engineering Interview Process

Tolan has redesigned their engineering interview process to reflect how engineers actually work with AI coding agents. Instead of traditional algorithmic questions, they focus on practical skills that matter when AI writes most production code.
The Interview Structure
Candidates spend a morning at Tolan's San Francisco office working on a small problem that Tolan's own team has already solved. The problem comes from a bare-bones Figma file or short spec, typically representing a simple flow or lightweight feature that would normally take a day or two to build.
Candidates have just a few hours to work on the problem, which isn't enough time to create a polished product. The constraint is intentional: Tolan wants to see how candidates work within limitations.
AI Tools Encouraged
Candidates are explicitly encouraged to use AI to solve the problem. Tolan provides licenses for Claude, Codex, Cursor, or Gemini if needed. The key expectation is that candidates must balance LLM-generated code against their own judgment—even if they aren't writing the code, they own the output.
What interviewers are looking for:
- How candidates approach the problem
- How they structure a solution
- How they think through constraints
- How they decide what actually matters
Evaluation Criteria
After the work session, there's a 20–30 minute conversation about what the candidate built. Interviewers ask what candidates would improve if they had more time, what they'd change before sending the work for review, and what they'd change before shipping.
Red flags include:
- Candidates who use LLMs to think through how the project should be completed (like screenshotting Figma and asking Claude to solve it)
- Candidates who don't question unclear specs
- Candidates who say "I'm still not sure what this part does" but wouldn't change anything before human review
Positive signals include:
- Clarifying problem statements and exploring edge cases
- Recognizing tradeoffs
- Pointing out when something feels weird or doesn't seem right
- Showing creativity (like building a mini-game to entertain users during LLM response waits)
- Knowing when work isn't good enough and how to improve it
The core philosophy: In a world where implementation is getting easier, what matters most is judgment. Working code isn't the finish line—understanding and maintaining it is.
📖 Read the full source: HN AI Agents