Boost Decompilation with LLMs: 25% to 75% Success

This article covers the challenges and strategies of LLM-assisted decompilation, specifically using Claude to decompile Nintendo 64 games such as Snowboard Kids 2. One-shot decompilation initially raised matched code from around 25% to 58%. Progress then slowed, and a change in approach was needed to eventually reach ~75% matched functions.

A key strategy was prioritizing which unmatched functions to tackle. At first, a logistic regression model estimated each function's difficulty from features such as instruction count and control-flow complexity. When this approach plateaued, exploring function similarity via text embeddings of assembly instructions proved fruitful: finding already-matched functions that resemble the target gave Claude useful references and improved its decompilation performance.
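The difficulty-based prioritization can be sketched as follows. This is a minimal illustration, not the article's actual model: the two features (scaled instruction count and branch count), the training data, and the function names are all hypothetical, and the logistic regression is fitted with plain gradient descent so the example has no dependencies.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def match_probability(x, weights, bias):
    """Estimated probability that a function with features x can be matched."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical history of earlier attempts:
# (instruction_count / 100, branch_count / 10) -> 1 if matched, 0 if not.
attempted = [
    ((0.2, 0.1), 1), ((0.3, 0.2), 1), ((0.5, 0.3), 1),
    ((1.5, 1.2), 0), ((2.0, 1.5), 0), ((1.2, 0.9), 0),
]
X = [x for x, _ in attempted]
y = [label for _, label in attempted]
weights, bias = fit_logistic(X, y)

# Hypothetical unmatched functions, ordered easiest-first for the next run.
unmatched = {"func_a": (0.4, 0.2), "func_b": (1.8, 1.4), "func_c": (0.9, 0.6)}
queue = sorted(
    unmatched,
    key=lambda name: match_probability(unmatched[name], weights, bias),
    reverse=True,
)
```

With toy data like this, short functions with few branches sort to the front of the queue. A real setup would presumably use more features and retrain as new results come in.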

To compute similarity, the article considers vector embeddings, which are often used in RAG systems for fast retrieval. With only a few thousand candidates in the project, however, computing similarity exactly was feasible. A composite similarity score combining normalized instruction n-grams, control-flow patterns, memory access offsets, and structural metrics was used at first. Later, Coddog's simpler method, a bounded Levenshtein distance over opcode sequences, proved just as effective while reducing complexity.
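A bounded Levenshtein distance over opcode sequences can be sketched as below. This is an illustration of the general technique, not Coddog's implementation: the `opcodes` helper, the `max_ratio` threshold, and the normalization into a 0-to-1 score are assumptions made for the example, as is the `#` comment syntax in the assembly input.

```python
def opcodes(asm_text):
    """Hypothetical helper: keep only the mnemonic of each instruction line."""
    ops = []
    for line in asm_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line and not line.endswith(":"):   # skip blanks and labels
            ops.append(line.split()[0])
    return ops

def bounded_levenshtein(a, b, max_dist):
    """Edit distance between sequences a and b, or None if it exceeds max_dist."""
    if abs(len(a) - len(b)) > max_dist:
        return None  # the length gap alone already exceeds the bound
    prev = list(range(len(b) + 1))
    for i, op_a in enumerate(a, start=1):
        cur = [i]
        for j, op_b in enumerate(b, start=1):
            cost = 0 if op_a == op_b else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        if min(cur) > max_dist:
            return None  # row minima never decrease, so stop early
        prev = cur
    return prev[-1] if prev[-1] <= max_dist else None

def most_similar(target_ops, matched, max_ratio=0.5, top_k=3):
    """Rank matched functions by 1 - distance / longest_length, best first."""
    scored = []
    for name, ops in matched.items():
        longest = max(len(target_ops), len(ops))
        if longest == 0:
            continue
        dist = bounded_levenshtein(target_ops, ops, int(longest * max_ratio))
        if dist is not None:
            scored.append((1.0 - dist / longest, name))
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]

# Hypothetical target and already-matched functions.
target = opcodes("""
func_target:
    addiu $sp, $sp, -24   # prologue
    sw    $ra, 20($sp)
    jal   func_x
    nop
    lw    $ra, 20($sp)
    jr    $ra
    addiu $sp, $sp, 24
""")
matched = {
    "near": ["addiu", "sw", "lui", "jal", "nop", "lw", "jr", "addiu"],
    "far": ["mtc1", "mul.s", "add.s", "swc1", "jr", "nop"],
}
references = most_similar(target, matched)
```

The bound keeps the comparison cheap: candidates that are clearly too different are rejected without filling the whole table, which is enough when there are only a few thousand functions to compare exactly.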

Specialized tooling such as gfxdis.f3dex2 and decomp-permuter also improved Claude's performance. In particular, the F3DEX2 tooling made the N64's Reality Display Processor (RDP) microcode more manageable and avoided the need for custom reverse engineering.

📖 Read the full source: HN LLM Tools

LLM-Assisted Decompilation: Evolving Strategies and Tools
