CLI Design Patterns for AI Agents: Unix Help & Safety

CLI Interface Protocol Clarification

The biggest misconception from Part 1 was that "CLI" meant giving an LLM a Linux terminal. CLI is actually an interface protocol: text command in → text result out. Implementation can happen in two ways:

As a binary or script on the shell's PATH: it becomes a CLI tool that runs in a real shell.
As a command parser inside your code: when the LLM outputs run(command="weather --city Tokyo"), you parse the string and execute it directly in your application code, with no shell involved.

The key is making the LLM feel like it's using a CLI. In the author's system, most commands never touch the OS — they're Go functions dispatched by a command router. Only commands that genuinely need a real OS (running scripts, installing packages) go to an isolated micro-VM. The agent doesn't know and doesn't care which layer handles its command.
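
The in-process route can be sketched as follows. This is a minimal illustration in Python, not the author's Go router: the HANDLERS registry, the weather stub, and the error wording are all assumptions made for the example.

```python
import shlex
from typing import Callable, Dict, List

# Hypothetical handler: a real system would call a weather service here.
def weather(args: List[str]) -> str:
    if len(args) == 2 and args[0] == "--city":
        return f"weather for {args[1]}: (stub data)"
    return "[error] weather: usage: weather --city <name>"

# Registry mapping command names to in-process handlers (assumed layout).
HANDLERS: Dict[str, Callable[[List[str]], str]] = {"weather": weather}

def run(command: str) -> str:
    """Parse a command string and dispatch it without invoking a shell."""
    try:
        argv = shlex.split(command)  # shell-like quoting, but no shell is started
    except ValueError as exc:  # e.g. an unbalanced quote
        return f"[error] parse: {exc}"
    if not argv:
        return "[error] empty command"
    handler = HANDLERS.get(argv[0])
    if handler is None:
        known = ", ".join(sorted(HANDLERS))
        return f"[error] {argv[0]}: command not found. Available: {known}"
    return handler(argv[1:])
```

Because every command returns plain text, a handler could later be moved to a sandboxed VM without the agent noticing any change in the interface.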

Agent-Friendly CLI Design Principles

Two Core Philosophies

Philosophy 1: Unix-Style Help Design

tool --help → list of top-level commands
tool <command> --help → specific parameters and usage for that subcommand

This allows the agent to discover capabilities on demand without stuffing all documentation into context upfront.
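
A rough sketch of two-level help, assuming a hypothetical COMMANDS table; the command names reuse examples from this post, while the summaries and usage strings are invented for illustration:

```python
from typing import Dict, List, Tuple

# Hypothetical command table: name -> (one-line summary, detailed usage).
COMMANDS: Dict[str, Tuple[str, str]] = {
    "dns": ("manage DNS records",
            "usage: tool dns update --zone <zone> --record <type> --value <value> [--confirm]"),
    "pay": ("send a payment",
            "usage: tool pay --amount <n> --to <recipient> --reason <text>"),
}

def help_text(argv: List[str]) -> str:
    """Top-level help for ['--help'], per-command help for ['<cmd>', '--help']."""
    if argv == ["--help"]:
        lines = ["Commands:"]
        for name, (summary, _usage) in sorted(COMMANDS.items()):
            lines.append(f"  {name:<6}{summary}")
        lines.append("Run 'tool <command> --help' for details.")
        return "\n".join(lines)
    if len(argv) == 2 and argv[1] == "--help" and argv[0] in COMMANDS:
        summary, usage = COMMANDS[argv[0]]
        return f"{argv[0]}: {summary}\n{usage}"
    return "[error] unknown request. Try: tool --help"
```

The top-level listing stays short regardless of how many flags each subcommand has, so the agent only pays for the detail it asks for.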

Philosophy 2: Tips Thinking

Every response — especially errors — should include guidance that reduces unnecessary exploration.

Bad example:

> cat photo.png
[error] binary file

Good example:

> cat photo.png
[error] cat: binary file detected (image/png, 182KB).
Use: see photo.png (view image)
Or: cat -b photo.png (base64 encode)

Why this matters: fruitless exploration wastes tokens. In multi-turn conversations the waste accumulates, because every failed attempt stays in context and consumes attention and inference resources on every subsequent turn. A single helpful hint can save significant tokens across the rest of the conversation.
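
One way an error path with built-in hints might look. This is only a sketch: the NUL-byte binary check is a simplifying assumption, and see and cat -b are the commands from the example above, not real Unix tools.

```python
import mimetypes

def cat(path: str) -> str:
    """Return file text, or an error that names the next commands to try."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return f"[error] cat: {path}: no such file. Check the path and retry."
    # Heuristic (an assumption): a NUL byte means the file is binary.
    if b"\x00" in data:
        mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
        size_kb = max(1, len(data) // 1024)
        options = []
        if mime.startswith("image/"):
            options.append(f"see {path} (view image)")
        options.append(f"cat -b {path} (base64 encode)")
        # First option is labelled "Use", any further ones "Or".
        hints = [f"{'Use' if i == 0 else 'Or'}: {opt}" for i, opt in enumerate(options)]
        return "\n".join(
            [f"[error] cat: binary file detected ({mime}, {size_kb}KB)."] + hints
        )
    return data.decode("utf-8", errors="replace")
```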

Safe CLI Design

When CLI commands involve dangerous or irreversible operations, the tool itself should provide safety mechanisms.

Dry-Run / Change Preview — Preventing Mistakes

Use this for operations that are within the agent's authority but have hard-to-reverse consequences. The goal is to let the agent (or human) see what will happen before committing.

> dns update --zone example.com --record A --value 1.2.3.4
⚠ DRY RUN:
A record for example.com: 5.6.7.8 → 1.2.3.4
Propagation: ~300s. Not instantly reversible.
To execute: add --confirm

The preview should clearly show the current state and what it will change to. Nothing is modified until the agent re-runs the same command with --confirm.
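
A minimal sketch of a dry-run default, assuming an in-memory RECORDS dict as a stand-in for the DNS provider (no real API is called, and the function name is hypothetical):

```python
from typing import Dict

# In-memory stand-in for a DNS provider (hypothetical).
RECORDS: Dict[str, str] = {"example.com": "5.6.7.8"}

def dns_update(zone: str, value: str, confirm: bool = False) -> str:
    """Preview the change by default; apply it only when confirm is True."""
    current = RECORDS.get(zone)
    if current is None:
        return f"[error] dns: zone {zone} not found."
    if not confirm:
        # Dry run: report current and proposed state, change nothing.
        return (f"DRY RUN: A record for {zone}: {current} -> {value}\n"
                "Not instantly reversible. To execute: add --confirm")
    RECORDS[zone] = value
    return f"Updated A record for {zone}: {current} -> {value}"
```

Making the preview the default, rather than an opt-in flag, means a careless first call can never cause the irreversible change.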

Human Authorization — Operations Beyond the Agent's Autonomy

Use this for operations that require human judgment or approval: no matter how confident the agent is, it cannot complete them on its own.

Approach 1: Blocking Push Approval

> pay --amount 500 --to vendor --reason "office supplies for Q2"
⏳ Approval required. Notification sent to your device.
Waiting for response...
✓ Approved. Payment of $500 completed.
[exit:0 | 7.2s]

This works like Apple's device login verification: the CLI sends a push notification directly to the human's device with full context (amount, recipient, reason). The CLI blocks until the human approves or rejects, then returns the result to the agent.
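
The blocking flow could be sketched like this. The ask_human callback is a hypothetical hook standing in for whatever push-notification service is used; the post does not name one, and the payment step is stubbed out.

```python
from typing import Callable

# Hypothetical hook: sends the context to a human's device and blocks
# until they answer. Returns True for approve, False for reject.
Approver = Callable[[str], bool]

def pay(amount: int, to: str, reason: str, ask_human: Approver) -> str:
    """Request human approval with full context before paying."""
    context = f"Pay ${amount} to {to}. Reason: {reason}"
    if not ask_human(context):  # blocks here until the human responds
        return "[error] pay: rejected by approver. Payment not sent."
    # A real implementation would call the payment backend at this point.
    return f"Approved. Payment of ${amount} completed."
```

The agent never holds the approval itself; it only sees the final text result, so it has no way to skip the human step.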

Approach 2: Verification Code / 2FA

> transfer --from savings --to checking --amount 10000
⚠ This operation requires 2FA verification.
Reason: transferring $10,000 between accounts.
A code has been sent to your authenticator.
Re-run with: --otp <code>
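
A sketch of the two-step re-run pattern. The expected_otp parameter is a placeholder for a real authenticator check, which is an assumption here; the post does not say how codes are verified.

```python
import hmac
from typing import Optional

def transfer(src: str, dst: str, amount: int,
             otp: Optional[str] = None, expected_otp: str = "") -> str:
    """First call explains what is needed; the re-run with a code executes."""
    if otp is None:
        return ("This operation requires 2FA verification.\n"
                f"Reason: transferring ${amount:,} from {src} to {dst}.\n"
                "Re-run with: --otp <code>")
    # Constant-time comparison avoids leaking how many characters matched.
    valid = bool(expected_otp) and hmac.compare_digest(
        otp.encode(), expected_otp.encode()
    )
    if not valid:
        return "[error] transfer: invalid code. Re-run with: --otp <code>"
    return f"Transferred ${amount:,} from {src} to {dst}."
```

Unlike the blocking approach, the command returns immediately, so the agent has to relay the request to the human and wait for the code to be supplied.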

📖 Read the full source: r/LocalLLaMA