Cloudflare's AI Platform: Unified Inference Layer for AI Agents

What Cloudflare's AI Platform Offers
Cloudflare has expanded its AI capabilities into a unified inference layer designed specifically for AI agents. The platform targets two problems: AI models change rapidly, and agentic workflows often need different models for different tasks.
Key Features and Implementation
The core offering is a single API for accessing any AI model from any provider. Workers users can call third-party models through the same AI.run() binding they already use for Workers AI, and switching between providers requires only a one-line code change.
    const response = await env.AI.run('@cf/moonshotai/kimi-k2.5', {
      prompt: 'What is AI Gateway?'
    }, {
      metadata: {
        "teamId": "AI",
        "userId": 12345
      }
    });

The platform provides access to 70+ models across 12+ providers including Alibaba Cloud, AssemblyAI, Bytedance, Google, InWorld, MiniMax, OpenAI, Pixverse, Recraft, Runway, and Vidu. Model offerings now include image, video, and speech models for building multimodal applications.
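To make the one-line provider switch concrete, here is a minimal sketch. It assumes the binding accepts the same three arguments as in the example above. The `ask` helper, the `MODELS` table, and the `stubEnv` object are hypothetical names introduced for illustration, and the third-party model ID is a placeholder because the source does not give the naming scheme for non-Workers AI models.

```javascript
// Hypothetical lookup table: the model ID is the only thing that changes
// when switching providers.
const MODELS = {
  // Taken from the example in the article.
  workersAi: '@cf/moonshotai/kimi-k2.5',
  // Placeholder ID (assumption): the real third-party naming scheme
  // is not given in the source.
  thirdParty: 'example-provider/example-model',
};

// Hypothetical helper: the call shape stays the same for every provider.
async function ask(env, model, prompt) {
  return env.AI.run(
    model,
    { prompt },
    { metadata: { teamId: 'AI', userId: 12345 } }
  );
}

// Stub binding (assumption) so the sketch runs outside Workers.
// It echoes its arguments instead of calling a model.
const stubEnv = {
  AI: {
    async run(model, inputs, options) {
      return { model, inputs, options };
    },
  },
};
```

Inside a Worker, the real `env` would be passed in place of `stubEnv`, and moving from `MODELS.workersAi` to `MODELS.thirdParty` is the single line that changes.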
Cost Management and BYOM Support
All AI spend can be managed in one place through AI Gateway. By including custom metadata with requests, you can get cost breakdowns by attributes like free vs. paid users, individual customers, or specific workflows.
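The idea behind metadata-based cost breakdowns can be sketched as a simple group-and-sum. The record shape below (`cost` plus a `metadata` object) is an assumption made for illustration and is not AI Gateway's actual log schema. The `costByAttribute` function, the `plan` attribute, and the sample values are likewise made up.

```javascript
// Sum cost per value of one metadata attribute.
// Assumed record shape: { cost: number, metadata?: { [key]: string | number } }.
function costByAttribute(records, attribute) {
  const totals = new Map();
  for (const record of records) {
    const raw = record.metadata ? record.metadata[attribute] : undefined;
    // Requests sent without the attribute are grouped separately.
    const key = raw === undefined ? '(untagged)' : String(raw);
    totals.set(key, (totals.get(key) || 0) + record.cost);
  }
  return Object.fromEntries(totals);
}

// Illustrative sample data only, not real billing figures.
const sampleRecords = [
  { cost: 0.25, metadata: { plan: 'free', userId: 1 } },
  { cost: 0.5, metadata: { plan: 'paid', userId: 2 } },
  { cost: 0.25, metadata: { plan: 'paid', userId: 2 } },
  { cost: 0.125 },
];

const byPlan = costByAttribute(sampleRecords, 'plan');
const byUser = costByAttribute(sampleRecords, 'userId');
```

Grouping by `plan` separates free from paid usage, while grouping by `userId` gives a per-customer view. Both come from the same tagged requests.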
For custom model needs, Cloudflare is working on letting users bring their own models to Workers AI using Replicate's Cog technology. This involves containerizing machine learning models with a cog.yaml file and Python inference code, abstracting away CUDA dependencies, Python versions, and weight loading.
Recent Updates and Availability
Recent additions include zero-setup default gateways, automatic retries on upstream failures, and more granular logging controls. REST API support for non-Workers users is expected in the coming weeks.
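For readers unfamiliar with the pattern, retrying on upstream failure generally looks like the sketch below. This is a generic client-side illustration with exponential backoff, not Cloudflare's implementation. The `withRetries` name, the attempt count, and the delay values are all assumptions.

```javascript
// Generic retry-with-backoff wrapper (illustration only).
// `call` is any function returning a promise, such as a model request.
async function withRetries(call, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Wait baseDelayMs, then 2x, then 4x, and so on before retrying.
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

When the gateway handles retries itself, application code does not need a wrapper like this for transient upstream errors.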
📖 Read the full source: HN AI Agents