WebClaw: Open-Source MCP Server for Web Extraction with Claude

✍️ OpenClawRadar | 📅 Published: March 23, 2026

WebClaw is an MCP server built in Rust that adds web extraction capabilities to Claude Desktop and Claude Code. It targets a common failure: Claude's built-in web_fetch gets blocked on most real websites and returns 403 Forbidden errors, Cloudflare challenges, or empty responses.

Technical Solution

The server impersonates Chrome's TLS fingerprint in its HTTP client, so websites see a connection that looks like a real Chrome browser instead of a bot. In a test against 10 popular sites, Claude's built-in web_fetch failed on all 10, while WebClaw extracted content from 9 of them.
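
To make the idea of a TLS fingerprint concrete, here is a minimal sketch of JA3, one widely used fingerprinting scheme. The source does not say which scheme the blocking sites use or how WebClaw shapes its handshake, so this is only an illustration of why the handshake identifies a client. The numeric values are made up for the example and are not captured from Chrome or from WebClaw.

```python
import hashlib


def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Build a JA3 string from ClientHello fields and return (string, md5 hex).

    JA3 joins the values inside each field with "-" and the five fields
    with ",", then hashes the resulting string with MD5.
    """
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()


# Illustrative values only (assumption): two clients that offer the same
# cipher suites in a different order.
client_a = ja3_fingerprint(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
client_b = ja3_fingerprint(771, [4866, 4865, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])

print(client_a[0])
print(client_a[1] == client_b[1])
```

Because the order of cipher suites and extensions feeds into the hash, an HTTP library with default settings produces a different fingerprint from a browser even when both send identical headers. That is why matching the fingerprint has to happen in the TLS handshake and not in the request headers.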

Features

  • scrape: Extract clean content from any URL
  • crawl: Recursive site crawling
  • extract: Structured data extraction using JSON schema or natural language prompts
  • summarize: Page summaries
  • brand: Extract colors, fonts, logos from any site
  • diff: Track content changes
  • map, batch, search, research tools
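
Since WebClaw is an MCP server, a client reaches these tools through MCP's JSON-RPC `tools/call` method. The sketch below builds such a request for the extract tool. The envelope follows the MCP convention, but the argument names (`url`, `schema`) and the example schema are assumptions for illustration and are not taken from WebClaw's documentation.

```python
import json


def build_tool_call(request_id, tool, arguments):
    """Return a JSON-RPC 2.0 request in the shape MCP uses for tools/call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments: WebClaw's real parameter names may differ.
request = build_tool_call(
    1,
    "extract",
    {
        "url": "https://example.com/product",
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "price": {"type": "number"},
            },
            "required": ["title"],
        },
    },
)

print(json.dumps(request, indent=2))
```

In practice Claude Desktop or Claude Code constructs this message itself, so the sketch mainly shows what a schema-driven extraction request could look like on the wire.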

Claude Code Development

The extraction pipeline was implemented with Claude Code, including:

  • Scoring algorithm based on text density, semantic tags, and link ratio penalties
  • Noise filter that strips navigation, ads, and cookie banners without misclassifying Tailwind utility classes as noise
  • Multiple rounds of refinement for edge cases
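
The scoring and filtering ideas in the list above can be sketched roughly as follows. This is not WebClaw's implementation (which is written in Rust): the weights, the noise tokens, and the `Block` structure are all assumptions chosen for illustration. The one design point it tries to show is matching whole class tokens instead of substrings, so that a Tailwind class such as `shadow-md` is not flagged just because it contains "ad".

```python
from dataclasses import dataclass

# Hypothetical weights and tokens, not taken from WebClaw.
SEMANTIC_BONUS = {"article": 2.0, "main": 2.0, "section": 0.5,
                  "nav": -2.0, "aside": -1.0, "footer": -1.5}
NOISE_TOKENS = {"ad", "ads", "advert", "cookie-banner", "navbar", "sidebar"}


@dataclass
class Block:
    tag: str
    class_attr: str
    text_chars: int    # visible text characters
    markup_chars: int  # total characters including tags
    link_chars: int    # text characters inside <a> elements


def is_noise(block):
    """Flag a block only when a whole class token is a known noise token."""
    return any(token in NOISE_TOKENS for token in block.class_attr.split())


def score(block):
    """Combine text density, a link-ratio penalty, and a semantic-tag bonus."""
    if block.markup_chars == 0 or block.text_chars == 0:
        return 0.0
    density = block.text_chars / block.markup_chars
    link_ratio = block.link_chars / block.text_chars
    return density * (1.0 - link_ratio) + SEMANTIC_BONUS.get(block.tag, 0.0)


blocks = [
    Block("article", "prose max-w-2xl shadow-md", 1800, 2400, 90),
    Block("nav", "flex gap-4", 120, 900, 110),
    Block("div", "ad rounded p-4", 200, 600, 40),
]

kept = [b for b in blocks if not is_noise(b)]
best = max(kept, key=score)
print(best.tag)
```

Here the `div` with the `ad` class token is dropped by the filter, the link-heavy `nav` scores poorly, and the `article` block is selected as the main content.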

Setup and Usage

Setup requires one command:

npx create-webclaw

The installer detects Claude Desktop and Claude Code automatically and writes the configuration for each. No API key is needed for 8 of the 10 tools, and everything runs locally.
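
The source does not show what the installer writes. As a rough sketch, Claude Desktop reads MCP servers from an `mcpServers` object in its `claude_desktop_config.json`, so the step presumably amounts to merging one entry into that object. The server name `webclaw` and the command `webclaw-mcp` below are placeholders, not confirmed names.

```python
import json


def add_mcp_server(config, name, command, args=None):
    """Return a copy of a Claude Desktop config with one MCP server entry added.

    Existing entries under "mcpServers" are preserved.
    """
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args or [])}
    return {**config, "mcpServers": servers}


# A config that already contains another (hypothetical) server.
existing = {
    "mcpServers": {
        "other-server": {"command": "npx", "args": ["-y", "some-other-server"]}
    }
}

# Placeholder name and command (assumption).
updated = add_mcp_server(existing, "webclaw", "webclaw-mcp")
print(json.dumps(updated, indent=2))
```

Merging into a copy keeps any servers that were already configured, which is what an installer needs to do to be safe to run on an existing setup.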

Performance Benefits

The output is optimized for Claude's context window. A typical news article goes from 4,820 tokens as raw HTML to 1,590 tokens in WebClaw's LLM format, a 67% reduction with the same content preserved.
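
The 67% figure follows directly from the two token counts quoted above. A quick check, using only those numbers:

```python
def reduction_percent(before, after):
    """Percentage of tokens removed when going from `before` to `after`."""
    return (before - after) / before * 100


raw_html_tokens = 4820    # raw HTML, from the article
llm_format_tokens = 1590  # WebClaw's LLM format, from the article

saved = raw_html_tokens - llm_format_tokens
percent = reduction_percent(raw_html_tokens, llm_format_tokens)
print(f"{saved} tokens saved, {percent:.0f}% reduction")
# prints "3230 tokens saved, 67% reduction"
```

Put another way, roughly three pages in this format fit in the context budget that one raw HTML page would use.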

WebClaw is free and open source under the MIT license, available at https://github.com/0xMassi/webclaw.

📖 Read the full source: r/ClaudeAI
