Ghostbar: A ~5MB native macOS Swift AI client that hides from screen sharing

Ghostbar is a native Swift macOS menu bar AI client that hides from screen recorders, including Zoom, Teams, OBS, QuickTime, and Cmd+Shift+5, by setting window.sharingType = .none on its window. According to the project, this removes the window from macOS's display compositor before any capture pipeline touches it. This is a public, documented AppKit API, not a hack. It has been tested on modern macOS; recorders that still rely on the legacy CGDisplayStream API may still capture the window on systems older than macOS 14.
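To make the mechanism concrete, here is a minimal sketch of how an AppKit window can opt out of capture through the `sharingType` property. This is not Ghostbar's actual code: the function name `makeCaptureExcludedPanel`, the panel size, and the style mask are assumptions chosen for illustration. Only `NSWindow.sharingType` and its `.none` case come from the description above.

```swift
import AppKit

// Hypothetical helper (not from the Ghostbar source): builds a small panel
// and asks the window server to leave its contents out of screen capture.
// Call this on the main thread, as with any AppKit window work.
func makeCaptureExcludedPanel() -> NSPanel {
    let panel = NSPanel(
        contentRect: NSRect(x: 0, y: 0, width: 360, height: 480), // illustrative size
        styleMask: [.titled, .nonactivatingPanel],                // illustrative style
        backing: .buffered,
        defer: true
    )
    // The documented AppKit property the article refers to:
    // .none means the window's contents are not shared with other processes.
    panel.sharingType = .none
    return panel
}
```

Whether a given recorder honors this setting depends on the capture API it uses, which matches the caveat above about legacy capture paths on older systems.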
Key features
- Works with any OpenAI-compatible backend: local (Ollama, LM Studio, llama.cpp, vLLM — point at server IP) or cloud (NVIDIA NIM free tier, OpenAI, Anthropic, OpenRouter as fallback).
- On-device voice input via whisper-cpp.
- Screenshot analysis — model sees your screen, recorder doesn't.
- ~5MB download, menu bar resident.
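As a rough illustration of what "OpenAI-compatible backend" means in practice, the sketch below builds a chat completions request against a configurable base URL. It is not taken from the Ghostbar source: the types `ChatMessage` and `ChatRequest`, the function `makeChatRequest`, and the model name in the usage note are hypothetical. The `/chat/completions` path and the `model` and `messages` fields follow the OpenAI chat API shape that these servers mirror.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Hypothetical request types mirroring the OpenAI chat completions body.
struct ChatMessage: Codable, Equatable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

// Builds a POST request for `<baseURL>/chat/completions`.
// `apiKey` is optional because local servers typically do not require one.
func makeChatRequest(baseURL: URL, apiKey: String?, model: String, prompt: String) throws -> URLRequest {
    let endpoint = baseURL
        .appendingPathComponent("chat")
        .appendingPathComponent("completions")
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    if let apiKey = apiKey {
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    }
    let body = ChatRequest(model: model, messages: [ChatMessage(role: "user", content: prompt)])
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

Switching between a local and a cloud backend then comes down to changing the base URL and key. For example, a default Ollama install exposes its OpenAI-compatible API at `http://localhost:11434/v1`, so `makeChatRequest(baseURL: URL(string: "http://localhost:11434/v1")!, apiKey: nil, model: "llama3.1", prompt: "hi")` would target it, where the model name is a placeholder for whatever model the server has loaded.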
The entire project is on GitHub: github.com/rbc33/Ghostbar. Currently at 50 stars. The developer is active in the Reddit thread for Q&A.
This is a practical tool for developers who run local models during work calls and don't want their AI client visible on screen share. No Electron bloat, no cloud dependency.
📖 Read the full source: r/LocalLLaMA