Claude Sleuth: A 56-Task Investigation Workflow for Claude AI

What Claude Sleuth Does
Claude Sleuth is a 6-phase, 56-task workflow for Claude AI that structures complex investigations. The phases are Operational Direction, Intelligence Collection, Collation & Entity Resolution, Chronological & Relational Processing, Hypothesis & Reasoning, and Final Report. It provides templates for every step and reference files for each task, which task_runner.py outputs as each gate is completed. The system works on all Claude platforms, including mobile, not only the CLI.
Core Architecture
The system persists investigation state across sessions in Cloudflare D1, storing entities, relationships, timelines, evidence, grades, and the Investigation Notebook. It also includes a 16-section Cognitive Surrogate Profile built from documentary evidence and advanced whenever subject information is synthesized, plus a 12-technique reasoning framework with a diagnose function for impasses, competing framings, or stuck points.
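A minimal sketch of what such persistent state might look like, assuming a SQLite-compatible schema (Cloudflare D1 is built on SQLite). The table names, column names, and helper functions are hypothetical and not taken from the Claude Sleuth source, and the example uses Python's local `sqlite3` module in place of the D1 API.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entities (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- e.g. person, object, location, event
    name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS relationships (
    id                INTEGER PRIMARY KEY,
    source_id         INTEGER NOT NULL REFERENCES entities(id),
    target_id         INTEGER NOT NULL REFERENCES entities(id),
    relationship_type TEXT NOT NULL
);
"""


def open_state(path=":memory:"):
    """Open (or create) the investigation database and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_entity(conn, kind, name):
    """Insert one entity and return its row id."""
    cur = conn.execute(
        "INSERT INTO entities (kind, name) VALUES (?, ?)", (kind, name)
    )
    conn.commit()
    return cur.lastrowid


def add_relationship(conn, source_id, target_id, relationship_type):
    """Record a directed edge from source to target."""
    conn.execute(
        "INSERT INTO relationships (source_id, target_id, relationship_type) "
        "VALUES (?, ?, ?)",
        (source_id, target_id, relationship_type),
    )
    conn.commit()


def outgoing(conn, entity_id):
    """Return (relationship_type, target name) pairs for edges leaving an entity."""
    rows = conn.execute(
        "SELECT r.relationship_type, e.name FROM relationships r "
        "JOIN entities e ON e.id = r.target_id "
        "WHERE r.source_id = ? ORDER BY r.id",
        (entity_id,),
    )
    return rows.fetchall()
```

Passing a file path to `open_state` in place of the in-memory default keeps the state between runs, which is the property the workflow relies on across sessions.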
Analytical Frameworks
- Admiralty 6x6: Grades source reliability (A–F) and credibility (1–6) independently before any claim enters the record
- ACH (Analysis of Competing Hypotheses): Derives conclusions via the Inconsistency Principle: the surviving hypotheses are those with the least evidence against them
- ICD 203: Maps every probabilistic statement to a 7-tier scale, prohibiting vague qualifiers
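The first two frameworks above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the function names, the `C`/`I`/`N` rating codes (consistent, inconsistent, neutral), and the matrix layout are all hypothetical.

```python
# Admiralty scale: reliability A (completely reliable) to F (cannot be judged),
# credibility 1 (confirmed) to 6 (cannot be judged). The two are rated independently.
RELIABILITY = set("ABCDEF")
CREDIBILITY = set(range(1, 7))


def admiralty_grade(reliability, credibility):
    """Validate the two ratings separately and combine them into a code such as 'B2'."""
    if reliability not in RELIABILITY:
        raise ValueError(f"reliability must be A-F, got {reliability!r}")
    if credibility not in CREDIBILITY:
        raise ValueError(f"credibility must be 1-6, got {credibility!r}")
    return f"{reliability}{credibility}"


def rank_hypotheses(matrix):
    """Rank hypotheses by how much evidence contradicts them (fewest first).

    `matrix` maps each hypothesis to {evidence_id: rating}, where the rating is
    'C' (consistent), 'I' (inconsistent), or 'N' (neutral). Only 'I' counts:
    consistent evidence does not raise a hypothesis in the ranking.
    """
    scores = {
        hypothesis: sum(1 for rating in ratings.values() if rating == "I")
        for hypothesis, ratings in matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


matrix = {
    "H1": {"E1": "C", "E2": "I", "E3": "I"},
    "H2": {"E1": "C", "E2": "N", "E3": "C"},
    "H3": {"E1": "I", "E2": "C", "E3": "C"},
}
ranking = rank_hypotheses(matrix)
```

In this example `ranking` places H2 first because nothing contradicts it, even though H3 has just as much consistent evidence.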
Output Conventions
- Timestamps: ISO 8601, normalized to UTC
- Entity records: POLE schema with mandatory source, date_observed, analyst_id, and confidence fields
- Network edges: source_node, target_node, relationship_type, evidence_ref; edges are directed (source → target)
- Evidence custody: SHA-256 hash, capture timestamp, analyst ID, storage location
- Probability language: ICD 203 7-tier scale
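As a rough sketch of the timestamp and evidence-custody conventions above, using only the Python standard library. The function names and dictionary keys are assumptions made for illustration; the source lists the required fields but not their exact names.

```python
import hashlib
from datetime import datetime, timezone


def to_utc_iso(timestamp):
    """Parse an ISO 8601 timestamp with an explicit offset and normalize it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # A naive timestamp cannot be normalized without guessing its zone.
        raise ValueError("timestamp needs an explicit UTC offset")
    return parsed.astimezone(timezone.utc).isoformat()


def custody_record(content, captured_at, analyst_id, storage_location):
    """Build a custody entry: SHA-256 hash, capture time, analyst ID, storage location."""
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "captured_at": to_utc_iso(captured_at),
        "analyst_id": analyst_id,
        "storage_location": storage_location,
    }


record = custody_record(
    b"<html>captured page</html>",
    "2024-03-01T14:30:00+02:00",
    "analyst-01",              # hypothetical analyst ID
    "evidence/0001.html",      # hypothetical storage path
)
```

Here `record["captured_at"]` is `2024-03-01T12:30:00+00:00`. Rejecting naive timestamps is a design choice in this sketch: it forces the offset to be recorded at capture time instead of being inferred later.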
Script Reference
- task_runner.py: Drives the 56-task pipeline (next, done, status, jump, peek, notebook, reset)
- template_builder.py: Assembles Markdown working documents from templates/ by phase, step, or task ID
- source_grader.py: Admiralty 6x6 source reliability and credibility grading with action recommendations
- entity_resolver.py: Fellegi-Sunter probabilistic record linkage; deterministic matching on unique identifiers
- corporate_intel.py: Aggregates company data from UK Companies House, SEC EDGAR, GLEIF LEI, and ICIJ Offshore Leaks
- domain_intel.py: Domain reconnaissance via DNS, RDAP, crt.sh, Shodan InternetDB; zero authentication required
- username_enum.py: Async username enumeration across social platforms using Maigret, Sherlock, or WhatsMyName
- sanctions_screen.py: Fuzzy name matching against OFAC SDN, UK HMT, and other public sanctions lists
- evidence_preservation.py: Forensic web capture: screenshots, HTML, WARC, Wayback submission, SHA-256 chain of custody
- content_archiver.py: Async media download and cataloguing via yt-dlp, gallery-dl, and Playwright with manifest generation
- chronological_matrix.py: UTC-normalized timeline construction; gap detection, source conflict flagging, CSV export
- network_graph.py: Directed POLE relationship graph; in/out-degree, PageRank, community detection, HTML/GEXF export
- geolocation.py: EXIF GPS extraction, solar position/shadow analysis, historical weather correlation, reverse geocoding
- financial_analysis.py: SEC EDGAR financial anomaly detection: Benford's Law, YoY variance, Altman Z-Score
- report_generator.py: ICD 203-compliant briefings and findings memos via Jinja2 templates; optional WeasyPrint PDF export
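To give a sense of one technique named above, here is a minimal sketch of a Benford's Law check of the kind financial_analysis.py is described as performing. It is not the script's code: the function names are hypothetical, and the chi-square statistic is one common way to measure the deviation, chosen here as an assumption.

```python
import math
from collections import Counter

# Benford's Law: the leading digit d occurs with probability log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.5e-03' -> 4.
    return int(f"{abs(value):.15e}"[0])


def benford_chi_square(values):
    """Chi-square statistic of observed leading digits against Benford's Law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


# Powers of two follow Benford's Law closely; evenly spread values do not.
natural = benford_chi_square([2 ** n for n in range(100)])
uniform = benford_chi_square(list(range(100, 1000)))
```

With nine digit classes there are 8 degrees of freedom, so a statistic above roughly 15.5 rejects conformity at the 5% level. In this example `natural` falls well below that threshold and `uniform` far above it.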
Who It's For
This workflow is designed for developers and analysts using Claude AI for structured investigations, intelligence gathering, or complex research projects requiring standardized methodologies and persistent state management.
📖 Read the full source: r/ClaudeAI