Local Fine-Tuning of Llama 3.2-1B for Secret Detection Surpasses Wiz's Model

A developer has documented their successful local fine-tuning of Llama 3.2-1B for secret detection in code, surpassing the metrics of a similar model from Wiz. The project was conducted entirely with local AI tools, avoiding proprietary APIs.
Key Results and Approach
The developer aimed to replicate or beat Wiz's results of 86% precision and 82% recall. After a few weekends of work, they achieved 88% precision and 84.4% recall simultaneously with a fine-tuned Llama 3.2-1B model. They also benchmarked Qwen 3.5-2B and 4B models, which outperformed the 1B model at the cost of higher VRAM usage and longer inference times.
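For readers less familiar with the two metrics, here is a minimal sketch of how precision and recall are derived from detection counts. The function name and the counts in the usage line are hypothetical and chosen only to show the arithmetic; they are not figures from the developer's evaluation.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from raw detection counts.

    tp: secrets correctly flagged
    fp: non-secrets wrongly flagged
    fn: secrets the model missed
    """
    # Guard against division by zero when nothing was flagged or nothing exists.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
```

Precision penalizes false alarms and recall penalizes missed secrets, which is why the source reports that both improved at the same time.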
Dataset and Training Process
The work started from publicly available data only. Because that data proved insufficient, the developer used procedural generation to augment and improve the dataset. All labeling was done locally using the Qwen3-Coder-Next model. A key training objective was to have the models output structured JSON. Before fine-tuning, the untrained Llama and Qwen models scored 0% on schema compliance; after training, compliance rose to 98-100%.
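A schema-compliance score of this kind could be measured roughly as follows. This is a sketch under stated assumptions: the source does not describe the actual output schema, so the field names `has_secret` and `secrets` and both helper functions are hypothetical.

```python
import json

# Hypothetical schema: the source does not specify the real fields.
REQUIRED_FIELDS = {"has_secret": bool, "secrets": list}


def is_schema_compliant(raw_output: str) -> bool:
    """True if the model output is a JSON object with the expected typed fields."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False  # free-form text or truncated JSON
    if not isinstance(parsed, dict):
        return False
    return all(
        key in parsed and isinstance(parsed[key], expected_type)
        for key, expected_type in REQUIRED_FIELDS.items()
    )


def compliance_rate(outputs: list[str]) -> float:
    """Fraction of outputs that satisfy the schema (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(is_schema_compliant(o) for o in outputs) / len(outputs)
```

An untuned model that answers in prose, or wraps its JSON in extra commentary, fails `json.loads` outright, which is one plausible way a base model ends up at 0% on such a check.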
Challenges and Learnings
The developer encountered several issues during the process:
- The dataset initially included a high-entropy class that proved detrimental to training; it was identified and removed.
- 4,500 of the 'negative' samples in the dataset turned out to contain real-world passwords, meaning the model was being trained to ignore secrets. Fixing these labels improved recall on passwords.
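Mislabeled negatives like these could be surfaced with a simple audit pass over the dataset. The sketch below is an assumption, not the developer's method: the source does not say how the passwords were found, and the regular expression, the `"negative"` label string, and the function name are all hypothetical.

```python
import re

# Illustrative pattern only: catches assignments such as password = "..."
# or pwd: '...' with a quoted value of at least six characters.
PASSWORD_ASSIGNMENT = re.compile(
    r"(?:password|passwd|pwd)\s*[:=]\s*[\"']([^\"']{6,})[\"']",
    re.IGNORECASE,
)


def flag_suspicious_negatives(samples):
    """Return indices of (text, label) pairs labeled negative that match the pattern."""
    return [
        index
        for index, (text, label) in enumerate(samples)
        if label == "negative" and PASSWORD_ASSIGNMENT.search(text)
    ]


# Hypothetical samples for illustration only.
samples = [
    ('db_password = "hunter2!x"', "negative"),  # suspicious: looks like a real secret
    ("retries = 3", "negative"),
    ("pwd: 'abcdefgh'", "positive"),
]
suspicious = flag_suspicious_negatives(samples)
```

A heuristic like this only produces candidates for manual review; it will miss secrets that do not follow the assignment pattern and will flag placeholder values.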
The developer has published a full technical write-up with training stats, examples, and a step-by-step breakdown of the process.
📖 Read the full source: r/LocalLLaMA