Using Claude Code to Automate AI Research Experiments for 12 Hours

Automated AI Research with Claude Code
A developer documented using Claude Code to automate AI research experiments for 12 hours straight. The project focused on CLaaS, a real-time continual learning framework that moves context into weights using self-distillation.
Experimental Setup
The goal was to tune self-distillation training runs to maximize a model's compliance with different preference verifiers, such as "concise responses" and "no emojis". Experiments ran locally on an RTX 5090 overnight.
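To make "compliance with a preference verifier" concrete, here is a minimal sketch of what such verifiers might look like. It is not taken from the CLaaS repository: the function names, the 50-word conciseness threshold, and the emoji character ranges are all assumptions chosen for illustration. Compliance is computed as the fraction of responses that pass a verifier, which matches the 0.000 to 1.000 scale reported in the results.

```python
import re

# Hypothetical threshold: the source does not say how "concise" is measured.
MAX_WORDS = 50

# Rough emoji detector covering common Unicode emoji blocks (not exhaustive).
EMOJI_PATTERN = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]"
)


def is_concise(response: str, max_words: int = MAX_WORDS) -> bool:
    """Pass if the response has at most max_words whitespace-separated words."""
    return len(response.split()) <= max_words


def has_no_emoji(response: str) -> bool:
    """Pass if no character falls inside the emoji ranges above."""
    return EMOJI_PATTERN.search(response) is None


def compliance_rate(responses, verifier) -> float:
    """Fraction of responses that pass the verifier (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(1 for r in responses if verifier(r)) / len(responses)
```

A training run would then be scored by generating a batch of responses and calling, for example, `compliance_rate(responses, has_no_emoji)`.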
System Architecture
The repository was built to be highly configurable:
- Every tunable parameter exposed via CLI using Hydra config management
- HTML dashboards for every training step and evaluation run
- Metrics, inputs, and outputs made observable through dashboards
- Claude Code could query dashboards via curl requests to check progress
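The value of exposing every parameter on the command line is that an agent can change exactly one setting per run without editing code. The sketch below imitates Hydra-style `key=value` overrides using only the Python standard library. It is a stand-in for illustration, not Hydra itself and not the repository's actual config. The field names and default values in `TrainConfig` are hypothetical.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class TrainConfig:
    # Hypothetical fields and defaults; the real config schema is not in the source.
    learning_rate: float = 1e-5
    grad_steps_per_batch: int = 1
    verifier: str = "no_emoji"


def apply_overrides(config: TrainConfig, overrides: list[str]) -> TrainConfig:
    """Return a new config with 'key=value' overrides applied.

    Each value is cast to the type of the field's current value, and
    unknown keys are rejected so that typos fail loudly.
    """
    valid = {f.name for f in fields(config)}
    updates = {}
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep or key not in valid:
            raise ValueError(f"bad override: {item!r}")
        caster = type(getattr(config, key))
        updates[key] = caster(raw)
    return replace(config, **updates)
```

For example, `apply_overrides(TrainConfig(), ["learning_rate=3e-5", "grad_steps_per_batch=4"])` yields a config with the two values the experiments eventually settled on, leaving every other field at its default.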
Experiment Management
The workflow was controlled by a local EXPERIMENTS.md file with specific rules:
- Each experiment could change at most one variable or make one code change
- Between experiments, the model had to either accept or revert the previous change based on results
- Any new code changes had to be exposed via config for later tuning
- The model recorded all progress, hypotheses, and outcomes in the file as a running journal
- The agent ran in a "Ralph Wiggum loop", with the goal of maximizing preference compliance
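The accept-or-revert rule and the running journal can be sketched as follows. This is a hypothetical reconstruction of the rules listed above, not the contents of the actual EXPERIMENTS.md: the `Experiment` fields, the "strictly better than the best so far" acceptance criterion, and the journal entry layout are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    # Hypothetical record; one experiment changes at most one thing.
    name: str
    change: str
    hypothesis: str
    compliance: float


def decide(best_so_far: float, result: float) -> str:
    """Accept a change only if it strictly improves on the best compliance so far."""
    return "accept" if result > best_so_far else "revert"


def journal_entry(exp: Experiment, best_so_far: float) -> str:
    """Format one journal entry recording the hypothesis, result, and decision."""
    decision = decide(best_so_far, exp.compliance)
    return "\n".join(
        [
            f"## {exp.name}",
            f"- Change: {exp.change}",
            f"- Hypothesis: {exp.hypothesis}",
            f"- Compliance: {exp.compliance:.3f} (previous best {best_so_far:.3f})",
            f"- Decision: {decision}",
        ]
    )
```

Appending each entry to the journal file before starting the next run gives the agent a persistent record to read at the start of every loop iteration, so an accepted or reverted change is never silently forgotten.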
Results
Over 12 hours, the system ran 9 experiments:
- Found and fixed a model collapse bug on the first run
- Tuned gradient steps per batch to 4
- Tuned learning rate to 3e-5
- Compliance improved from 0.000 to 1.000
- Token usage was surprisingly low because most time was spent waiting for training runs between experiments
The same task was also run with Codex for 2 hours using a plain prompt, and it independently converged on the same hyperparameters.
Project repository: https://github.com/kfallah/CLaaS
📖 Read the full source: r/ClaudeAI