Single-page chatbot interface for locally running Gemma 4 26B A4B

A developer has built a chatbot interface for a locally running Gemma 4 26B A4B model. The entire interface lives in a single HTML file and talks to the model through LM Studio's API.
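LM Studio's local server exposes an OpenAI-compatible HTTP API, so connecting a page to it mostly means posting a chat-completions request and decoding the streamed reply. The sketch below is not taken from the project. It assumes the server's default address `http://localhost:1234/v1`, and the names `buildChatRequest` and `extractDeltas`, as well as the choice of `temperature` as the only sampling parameter, are illustrative.

```typescript
// Assumed default address of LM Studio's local server; adjust to your setup.
const BASE_URL = "http://localhost:1234/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the URL and JSON body for a streaming chat-completions request.
// The parameter set here is a placeholder, not the project's actual settings.
function buildChatRequest(model: string, messages: ChatMessage[], temperature: number) {
  return {
    url: `${BASE_URL}/chat/completions`,
    body: { model, messages, temperature, stream: true },
  };
}

// Decode the complete server-sent-event lines in `buffer`.
// Returns the new text, the unconsumed tail, and whether "[DONE]" was seen.
function extractDeltas(buffer: string): { text: string; rest: string; done: boolean } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // the last piece may be an incomplete line
  let text = "";
  let done = false;
  for (const raw of lines) {
    const line = raw.trim();
    if (!line.startsWith("data:")) continue; // skip blank separator lines
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return { text, rest, done };
}
```

In a browser, the bytes would typically come from `response.body` decoded with a `TextDecoder`, with each decoded piece appended to the leftover `rest` before calling `extractDeltas` again. Passing an `AbortController` signal to `fetch` is one common way to let the user stop a stream partway through.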
Technical Implementation
The system runs Gemma 4 26B A4B locally with a 32K context window and generates 50-65 tokens per second. The model is split across two GPUs: a 7900 XT and a 3060 Ti.
Interface Features
- Full streaming support for real-time responses
- Markdown rendering for formatted output
- Model selector for switching between available models
- Six parameter sliders for adjusting generation settings
- Message editing with history branching capabilities
- Regenerate button for requesting a new response to the same prompt
- Abort button to stop generation mid-stream
- System prompt support for custom instructions
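Message editing with branching, and regeneration as well, can both be modeled as a tree in which every turn points at its parent. The following is a minimal sketch of that idea and not the project's actual code: `Turn`, `BranchingHistory`, and the method names are hypothetical.

```typescript
interface Turn {
  id: number;
  parent: number | null; // null for the first message of a conversation
  role: "user" | "assistant";
  content: string;
}

class BranchingHistory {
  private turns = new Map<number, Turn>();
  private nextId = 1;

  // Append a turn under `parent` and return its id.
  add(parent: number | null, role: Turn["role"], content: string): number {
    const id = this.nextId++;
    this.turns.set(id, { id, parent, role, content });
    return id;
  }

  // Editing never mutates: the new text becomes a sibling of the original turn,
  // so the old branch stays reachable.
  edit(id: number, content: string): number {
    const original = this.turns.get(id);
    if (!original) throw new Error(`unknown turn ${id}`);
    return this.add(original.parent, original.role, content);
  }

  // Walk from a leaf back to the root to get the message list for one branch,
  // optionally prefixed with a system prompt.
  path(leaf: number, systemPrompt?: string): { role: string; content: string }[] {
    const out: { role: string; content: string }[] = [];
    let cur = this.turns.get(leaf);
    while (cur) {
      out.unshift({ role: cur.role, content: cur.content });
      cur = cur.parent === null ? undefined : this.turns.get(cur.parent);
    }
    if (systemPrompt) out.unshift({ role: "system", content: systemPrompt });
    return out;
  }
}
```

Under this model, regenerating a reply is the same operation as editing one: a new assistant turn is added under the same parent, and the interface decides which sibling to display. The array returned by `path` has the shape a chat-completions `messages` field expects.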
Development Details
According to the developer, Claude fixed two DOM bugs that Gemma could not resolve, and Gemma 4 did all of the remaining development work. The project is available on GitHub.
This type of single-page interface is particularly useful for developers working with local LLMs who want a lightweight, customizable chat interface without the overhead of complex web applications. The integration with LM Studio's API makes it compatible with various local models beyond just Gemma.
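Because the API is model-agnostic, a model selector only needs the identifiers the server reports. Here is a small sketch, assuming the OpenAI-style `GET /v1/models` response shape that LM Studio's server follows. The function name `modelIds` is made up for illustration.

```typescript
// Shape of the part of the /v1/models response this sketch relies on.
interface ModelList {
  data: { id: string }[];
}

// Pull the model identifiers out of a /v1/models response body,
// sorted so a dropdown lists them in a stable order.
function modelIds(json: string): string[] {
  const parsed = JSON.parse(json) as ModelList;
  return parsed.data.map((m) => m.id).sort();
}
```

Whichever identifier the user picks would then be sent as the `model` field of each chat request.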
📖 Read the full source: r/LocalLLaMA