OpenObscure: Open-Source On-Device Privacy Firewall for AI Agents

What OpenObscure Does
OpenObscure is an open-source, on-device privacy firewall that sits between your AI agent and the LLM provider. Unlike tools that redact PII by replacing it with placeholders, which breaks LLM reasoning, OpenObscure uses FF1 format-preserving encryption (FPE) with AES-256 to encrypt PII values before the request leaves your device. The LLM receives realistic-looking ciphertext in the same format as the original but with fake values. On the response side, those values are automatically decrypted before your agent sees them. Integration requires only changing the agent's base_url to point at the local proxy.
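To make the "same format, fake values" idea concrete, here is a minimal sketch of a format-preserving round trip. It is not FF1 and not OpenObscure's code: it is a toy Feistel network over decimal digits with an HMAC-SHA256 round function, whereas real FF1 (NIST SP 800-38G) uses an AES-based round function and is the construction to use in practice. The names `transform`, `_feistel`, and `ROUNDS`, and the demo key, are all hypothetical.

```python
import hashlib
import hmac
import string

ROUNDS = 10  # even, so the two halves end in their original order


def _round_value(key: bytes, round_no: int, half: str, width: int) -> int:
    """Keyed pseudo-random number in [0, 10**width) derived from one half."""
    digest = hmac.new(key, bytes([round_no]) + half.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 10 ** width


def _feistel(digits: str, key: bytes, decrypt: bool) -> str:
    """Toy Feistel network on a digit string; output has the same length."""
    if len(digits) < 2:
        raise ValueError("need at least two digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    rounds = range(ROUNDS)
    for i in (reversed(rounds) if decrypt else rounds):
        m = u if i % 2 == 0 else v  # width of the half modified in round i
        if decrypt:
            c = (int(b) - _round_value(key, i, a, m)) % 10 ** m
            a, b = str(c).zfill(m), a
        else:
            c = (int(a) + _round_value(key, i, b, m)) % 10 ** m
            a, b = b, str(c).zfill(m)
    return a + b


def transform(value: str, key: bytes, decrypt: bool = False) -> str:
    """Encrypt or decrypt only the ASCII digits, leaving separators in place."""
    digits = "".join(ch for ch in value if ch in string.digits)
    out = iter(_feistel(digits, key, decrypt))
    return "".join(next(out) if ch in string.digits else ch for ch in value)


if __name__ == "__main__":
    key = b"demo-key-not-from-a-keychain"  # assumption: a real key would come from the OS keychain
    token = transform("123-45-6789", key)
    print(token)                                # still shaped like NNN-NN-NNNN
    print(transform(token, key, decrypt=True))  # original value restored
```

Because the ciphertext keeps the digit count and separator positions, a downstream model still sees something shaped like the original identifier, and the proxy can reverse the mapping when the same token comes back in the response.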
Key Features
- PII detection: Uses a regex + CRF + TinyBERT NER ensemble with a reported 99.7% recall across 15+ PII types
- FF1/AES-256 FPE: Keys stored in OS keychain, nothing transmitted
- Cognitive firewall: Scans every LLM response for persuasion techniques across 7 categories using a 250-phrase dictionary + TinyBERT cascade, aligning with EU AI Act Article 5 requirements on prohibited manipulation
- Image pipeline: Face redaction (SCRFD + BlazeFace), OCR text scrubbing, NSFW filter
- Voice processing: Keyword spotting in transcripts for PII trigger phrases
- Platform support: Rust core, runs as Gateway sidecar on macOS/Linux/Windows or embedded in iOS/Android via UniFFI Swift/Kotlin bindings
- Auto hardware tier detection: Full/Standard/Lite modes depending on device capabilities
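The cheapest stages of the two scanning features listed above, regex matching for PII and dictionary lookup for persuasion phrases, can be sketched in a few lines. This is only an illustration under stated assumptions: the patterns, category names, and phrases below are invented for the example and are not OpenObscure's actual rules, and the CRF and TinyBERT stages are omitted entirely.

```python
import re

# Hypothetical patterns; the project's real rule set is not shown in the source.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical categories and phrases standing in for the 250-phrase dictionary.
PERSUASION_PHRASES = {
    "urgency": ["act now", "limited time"],
    "authority": ["experts agree"],
}


def find_pii(text: str) -> list[tuple[str, int, int, str]]:
    """Return (label, start, end, value) for every regex hit, in text order."""
    hits = [
        (label, m.start(), m.end(), m.group())
        for label, pattern in PII_PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(hits, key=lambda hit: hit[1])


def flag_persuasion(text: str) -> dict[str, list[str]]:
    """Return the dictionary phrases found in the text, grouped by category."""
    lowered = text.lower()
    found = {}
    for category, phrases in PERSUASION_PHRASES.items():
        matched = [p for p in phrases if p in lowered]
        if matched:
            found[category] = matched
    return found
```

In a cascade like the one described, a cheap pass of this kind would typically run first, with the heavier model consulted only on the text it flags or cannot decide.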
Technical Details
The project is dual-licensed under MIT/Apache-2.0, with no telemetry and no cloud dependency. It was developed with the Claude AI assistant. The repository is available at https://github.com/openobscure/openobscure, with a demo at https://youtu.be/wVy_6CIHT7A and a website at https://openobscure.ai.
📖 Read the full source: r/ClaudeAI