Off Grid Mobile App Adds On-Device AI Tool Use with 3x Speed Improvement

Off Grid, an on-device AI mobile app, has been updated to add tool use capabilities and significant performance improvements. The app now allows AI models to call tools offline without requiring API keys, servers, or cloud functions.
Key Features and Performance
The update introduces automatic tool loops for web search, calculator, date/time functions, and device information access. According to the developer, this bridges the gap between "local toy" and "useful assistant" by enabling 3B parameter models to reason, call tools, and synthesize results directly on your phone.
Performance improvements come from configurable KV cache options. Users can now choose between three KV cache types:
f16q8_0q4_0
With q4_0 cache, models that previously generated 10 tokens/second now reach 30 tokens/second. The app includes a performance nudge feature that suggests faster settings after the first generation.
Model Support and Platform Availability
Off Grid supports GGUF format models, including:
- Qwen 3
- Llama 3.2
- Gemma 3
- Phi-4
- Other GGUF-compatible models
The app is now available on both major app stores without sideloading requirements. It can be installed directly from the App Store and Google Play.
Core Functionality and Philosophy
What hasn't changed in this update:
- MIT licensed and fully open source
- Zero data leaves the device (no analytics, telemetry, or anonymous usage data)
- Offline capabilities including text generation (15-30 tokens/second), image generation (5-10 seconds on NPU), vision AI, voice transcription, and document analysis
The developer states the project is motivated by the belief that "the phone in your pocket should be the most private computer you own — not the most surveilled."
📖 Read the full source: HN AI Agents
👀 See Also

AutoDream: 11-hook memory system for Claude Code with safety features
AutoDream is an open-source tool that adds project memory persistence and command safety to Claude Code. It uses 11 hooks across 6 events to inject context, block dangerous commands, and survive the /compact operation.

Super Claude browser extension makes Claude.ai UI fully customizable
A developer built a browser extension that lets users customize every aspect of Claude.ai's interface — colors, fonts, layout, plus usage tracking and token counting. The extension works on Chrome and Firefox and was developed using Claude itself.

Claude Code Routines Tunes CLI Performance 2.4x in 20+ PRs
Using Claude Code's Routines on a 2-hour cron to autonomously tune an open-source CLI (Repomix), resulting in 20+ auto-generated PRs and a 2.4x runtime improvement.

DELIGHT: Local Orchestrator Uses Multiple ChatGPT Sessions as Coordinated Agents
DELIGHT is a local orchestrator that runs multiple hidden ChatGPT browser sessions simultaneously, coordinating them like a team of agents without requiring API keys or GPU resources. It connects to OpenClaw as an action layer to apply changes to real files and run tests.