llm-hasher: Local PII Detection and Tokenization for Hybrid LLM Workflows

✍️ OpenClawRadar📅 Published: April 15, 2026🔗 Source
llm-hasher: Local PII Detection and Tokenization for Hybrid LLM Workflows
Ad

llm-hasher addresses a specific security gap in hybrid LLM workflows: when you run local LLMs but still call external services like OpenAI, Claude, or Gemini for certain tasks, your PII still leaves your infrastructure in plaintext. This tool runs PII detection entirely locally using Ollama, so no data leaves your systems during the detection phase.

How It Works

The process follows three steps: detect PII locally, tokenize it before external LLM calls, then restore the original values after processing. This prevents sensitive data from being exposed to third-party services.

Detection Approach

The detection system uses a hybrid approach:

  • Regex patterns for structured data types: credit cards, IBAN numbers, email addresses, and IPv4 addresses
  • Ollama with llama3.2:3b (by default) for contextual detection of unstructured PII: names, addresses, national IDs, passports, and dates of birth

Technical Implementation

Mappings between original PII and tokens are stored in an AES-256-GCM encrypted SQLite vault. Deployment is simplified with Docker Compose, which spins up both Ollama and the llm-hasher service with a single command.

📖 Read the full source: r/LocalLLaMA

Ad

👀 See Also