Using Claude to Extract Data from a 1997 Football Manager Game

Ben Nuttall reverse-engineered FIFA Soccer Manager 97 (FSM97) using Claude as a data extraction assistant. Pointed at the Wine-installed game directory, Claude located SM97.DAT and parsed player names, stats, club assignments, stadiums, and manager data, all mapped to real-world football entities. The goal was reproducibility: the entire pipeline now lives as Python scripts on GitHub.
Key Details
- Source file: SM97.DAT contains all game data, including player full names (e.g., "David Beckham" vs. in-game "D. Beckham"), stadium names, club nicknames, and even managers never shown in game.
- Initial query: Claude answered simple questions (biggest stadium, highest-rated player) by reading the binary file directly.
- CSV export: Claude generated CSV files for all data; column names for player stats were unknown, so Nuttall launched the game to map them (e.g., using David Seaman's stats as a calibration reference).
- Site generation: After fixing a few hallucinated data points, Claude built an interlinked HTML site at fsm.bennuttall.com showing players, clubs, stadiums, and trivia.
- Data corrections: Abbreviated team names like "Sheffield W" were expanded to "Sheffield Wednesday". Stadium typos (e.g., "Bramall Lane Ground") were fixed without altering gameplay data.
- Easter eggs discovered: Olympic champion Daley Thompson appears as a player at Mansfield Town. Player-manager relationships were also detected (e.g., a player also listed as manager of the same club).
- Shared stadiums: Clubs like Crystal Palace and Wimbledon both use Selhurst Park; the data now tracks this.
- Top stats pages: Generated lists of top-rated players, best players by age group, top player-managers, and stadiums by capacity. Notably, the "EA All Stars" club contains fictional highly-rated players.
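The cleanup and cross-referencing steps in the list above can be sketched roughly as follows. This is a minimal illustration, not Nuttall's published code: the function names, the dictionary keys (`name`, `club`, `stadium`), and the placeholder rows are all assumptions, and the real CSV layout may differ.

```python
from collections import defaultdict

# Illustrative correction table; the single entry comes from the write-up.
CLUB_NAME_FIXES = {"Sheffield W": "Sheffield Wednesday"}


def expand_club_name(name):
    """Return the full club name, or the input unchanged if no fix is known."""
    cleaned = name.strip()
    return CLUB_NAME_FIXES.get(cleaned, cleaned)


def shared_stadiums(clubs):
    """Map each stadium used by more than one club to its sorted club names."""
    by_stadium = defaultdict(list)
    for club in clubs:
        by_stadium[club["stadium"]].append(club["name"])
    return {s: sorted(names) for s, names in by_stadium.items() if len(names) > 1}


def player_managers(players, managers):
    """Return players who are also listed as manager of the same club."""
    manager_keys = {(m["name"], m["club"]) for m in managers}
    return [p for p in players if (p["name"], p["club"]) in manager_keys]


# Hypothetical rows (assumed shape); "Example ..." entries are placeholders.
clubs = [
    {"name": "Crystal Palace", "stadium": "Selhurst Park"},
    {"name": "Wimbledon", "stadium": "Selhurst Park"},
    {"name": "Example Town", "stadium": "Example Ground"},
]
players = [
    {"name": "Example Player", "club": "Example Town"},
    {"name": "Other Player", "club": "Example Town"},
]
managers = [{"name": "Example Player", "club": "Example Town"}]

print(expand_club_name("Sheffield W"))
print(shared_stadiums(clubs))
print([p["name"] for p in player_managers(players, managers)])
```

Keeping the name fixes in a lookup table, separate from the extracted rows, is one way to correct display names without touching the gameplay data itself.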
Reproducibility
All Python code to extract data and build the website is open source on GitHub. Others can run the same pipeline without needing Claude or any AI tool by using the published CSVs and scripts.
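As a rough idea of what working from the published CSVs without any AI tool might look like, the sketch below ranks players by a rating column using only the standard library. The column names (`name`, `club`, `rating`) and the inline sample rows are assumptions for illustration; check the actual files in the repository for the real headers.

```python
import csv
import io


def load_rows(handle):
    """Read CSV rows from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


def top_rated(rows, n=3, column="rating"):
    """Return the n rows with the highest integer value in the given column."""
    return sorted(rows, key=lambda r: int(r[column]), reverse=True)[:n]


# Placeholder data standing in for a published players CSV (assumed layout).
SAMPLE = """name,club,rating
Player A,Club X,71
Player B,Club Y,88
Player C,Club X,79
"""

rows = load_rows(io.StringIO(SAMPLE))
best = top_rated(rows, n=2)
print([r["name"] for r in best])
```

To run this against a real file, `load_rows` can be passed a handle from `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.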
📖 Read the full source: HN AI Agents