AI Agents Abandon Natural Language for Structured Queries in MCP Test

The team at Cala recently shipped a Model Context Protocol (MCP) server that gives AI agents three distinct ways to access its knowledge graph: natural language queries, a structured query language, and direct entity/relationship traversal.
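To make the three access methods concrete, here is a rough sketch of how such a server might describe its tools. MCP tool listings carry a name, a description, and a JSON Schema under inputSchema; everything else here (the tool names ask, query, and traverse, their parameters, and the missing_arguments helper) is hypothetical and not taken from Cala's server.

```python
# Hypothetical tool descriptors for the three access methods.
# Tool names and parameters are illustrative assumptions, not Cala's actual API.
TOOLS = [
    {
        "name": "ask",
        "description": "Answer a free-form natural language question.",
        "inputSchema": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
    {
        "name": "query",
        "description": "Run a statement in the structured query language.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "traverse",
        "description": "List the entities directly related to a given entity.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "entity_id": {"type": "string"},
                "relation": {"type": "string"},  # optional filter
            },
            "required": ["entity_id"],
        },
    },
]


def missing_arguments(tool, arguments):
    """Return the required argument names that a call did not supply."""
    return [name for name in tool["inputSchema"]["required"] if name not in arguments]
```

From the agent's side, all three look like ordinary tool calls. The difference is in what happens to the input after it arrives: the first tool has to interpret a sentence, while the other two take arguments with a fixed meaning.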

Unexpected Agent Behavior

The expectation was that agents would default to the natural language interface, since language is the typical strength of LLMs. Instead, most agents abandoned natural language queries within minutes. Without any prompting or nudging, they switched on their own to structured queries and graph traversal.

Why This Makes Sense

The source's explanation is that LLMs are not explicitly trained to be "efficient"; reinforcement learning from human feedback (RLHF) trains them to be correct. Efficiency then emerges as a side effect: agents learn to take the shortest reliable path to a solution. A natural language interface adds an interpretation layer that introduces uncertainty, while a structured query returns deterministic results.

When presented with three access methods, agents consistently chose the option that minimized uncertainty rather than the most "natural" interface.
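The interpretation-layer argument can be illustrated with a toy example. This is a minimal sketch under stated assumptions: the graph data, the Edge type, and both query functions are invented for illustration, and the keyword-based interpreter is a deliberately crude stand-in for whatever a real natural language endpoint does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


# Invented toy data, not from the source.
GRAPH = [
    Edge("Acme", "acquired", "Widgets Inc"),
    Edge("Globex", "acquired", "Acme"),
    Edge("Acme", "headquartered_in", "Berlin"),
]


def structured_query(source=None, relation=None, target=None):
    """Deterministic: return the edges matching every field that was given."""
    return [
        e for e in GRAPH
        if (source is None or e.source == source)
        and (relation is None or e.relation == relation)
        and (target is None or e.target == target)
    ]


def natural_language_query(question):
    """Toy interpretation layer: guess filters from keywords, then query."""
    text = question.lower()
    relation = "acquired" if "acquire" in text or "bought" in text else None
    entity = next((n for n in ("Acme", "Globex") if n.lower() in text), None)
    # The guess: the entity is always treated as the source (the buyer),
    # even when the question is about the target.
    return structured_query(source=entity, relation=relation)


# The agent states exactly which role Acme plays.
exact = structured_query(relation="acquired", target="Acme")

# The interpreter has to infer the role and, in this toy, infers it wrongly.
guessed = natural_language_query("Who acquired Acme?")
```

Here exact contains the single edge from Globex to Acme, while guessed contains the edge from Acme to Widgets Inc, which answers a different question. An agent that is rewarded for correct answers has a reason to avoid the path where this kind of silent misreading can happen.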

Key Questions Raised

Are we over-indexing on natural language interfaces for agent tooling?
Should MCP servers prioritize structured/graph-based access patterns over natural language by default?
If agents prefer deterministic paths, how should this influence tool design?

The Reddit post asks others building agent tooling whether they have observed similar patterns.

📖 Read the full source: r/LocalLLaMA
