For agents.

The A in AOI is the part everything else is for.

Tool authors get a contract that makes their CLI agent‑readable. The standards spec is for them. The conformance badge is for them.

The reason any of this matters is what it changes for the agent on the other side. The agent is the only audience that can't compensate for a bad tool interface by reading between the lines. Either the interface is legible, or the agent is hallucinating, scraping, retrying blind, and burning context.

Here are thirty‑two concrete things that change for the agent when the tool on the other end is AOI‑conforming.

§ 01

Stops being a parsing problem

Comprehension & predictability.

One vocabulary across every conforming tool.
The agent learns aoi:meta, hit, match, aoi:summary, aoi:error, aoi:warning, aoi:check, aoi:plan, aoi:heartbeat — once. Every conforming tool speaks them. Per‑tool prompt engineering shrinks dramatically.
Discovery before invocation.
Running tool schema and tool capabilities tells the agent what the tool can do, what events it emits, and what flags it accepts, without invoking any real operation. No more "let me run --help and parse the ASCII table."
Typed events with a discriminator.
The agent dispatches on event.type, never on regex against prose. "Is this line a result, a warning, or a status update?" is a typed answer, not an inference.
Versioned schemas.
schema_version in aoi:meta tells the agent which event shape to expect. When a tool updates, the agent knows whether its existing parser still applies — or whether to opt into a new major version explicitly. No silent breakage.
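
As a rough illustration of the four points above, here is a minimal consumer sketch. It assumes each output line is one JSON object with a type field, and that schema_version is a dotted string such as "1.4.0"; the exact field layout and the SUPPORTED_MAJOR constant are assumptions for illustration, not taken from the specification.

```python
import json

SUPPORTED_MAJOR = 1  # assumption: this parser was written against major version 1


def schema_major(meta):
    """Extract the major version from aoi:meta (assumes a dotted string like "1.4.0")."""
    return int(str(meta["schema_version"]).split(".")[0])


def group_by_type(lines):
    """Parse JSONL lines and bucket events by their type discriminator."""
    buckets = {}
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        kind = event["type"]  # dispatch key; no regex over prose
        if kind == "aoi:meta" and schema_major(event) != SUPPORTED_MAJOR:
            raise RuntimeError(f"unsupported schema_version: {event['schema_version']}")
        buckets.setdefault(kind, []).append(event)
    return buckets
```

The version check runs as soon as aoi:meta arrives, so an incompatible stream is rejected before any result is interpreted.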

§ 02

LLM-specific

Context‑window economics.

Bounded results with cursors.
--limit 50 returns at most 50 events. The agent's context budget isn't blown by an inadvertent list that dumps a hundred thousand rows. If more results remain, a cursor lets the agent fetch the next page deliberately.
Compact, line‑delimited JSON.
No banners, ASCII art, color codes, or progress bars eating tokens. Every byte the agent reads is data, not chrome.
Streaming arrival.
The agent starts reasoning over the first result while the rest are still being produced. Time‑to‑first‑useful‑token drops, and long operations remain observable.
Selective consumption.
The agent can jq -c 'select(.type=="hit")' the stream before ingesting it — only paying token cost for the event types it cares about.
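
The jq filter above has a straightforward in-process equivalent. The sketch below assumes the same one-JSON-object-per-line stream and reads lazily, so the agent stops pulling from the tool once it has what it needs. The function names are illustrative.

```python
import json
from itertools import islice


def select_types(lines, wanted):
    """Lazily yield only events whose type is in `wanted`."""
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        if event.get("type") in wanted:
            yield event


def first_hits(lines, n):
    """Take at most n hit events, reading no more of the stream than needed."""
    return list(islice(select_types(lines, {"hit"}), n))
```

Because lines can be any iterator, including a subprocess's stdout, the same code works on a stream that is still being produced.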

§ 03

Stops being a forensic exercise

Verification.

Terminal summary.ok=true is the only honest success signal.
The agent stops trusting exit code 0 alone — which lies often enough to matter (buffered writes, partial completion, silent crashes). "The work finished" and "the process exited" become distinguishable.
EOF without summary is the crash signal.
A consumer can detect a crashed tool even when its language runtime swallowed the exit code. Cross‑language. Cross‑platform.
Structured errors with category + retryable.
The agent's retry policy is tool‑agnostic. category:"rate_limited" and retryable:true mean the same thing in every conforming tool. No per‑tool exception parsing, no scraping error messages for hints.
Audit records on every mutation.
Each created / updated / deleted event carries a stable ID. The agent confirms the action took effect and hands the IDs to memory or downstream audit instead of assuming.
doctor for pre‑flight readiness.
Before doing real work, the agent asks the tool "are credentials valid, is the endpoint reachable, is config loaded?" and gets structured aoi:check events back. No destructive call to find out.
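
The verification rules in this section can be sketched in a few lines. The sketch assumes the summary is the last event on the stream with type aoi:summary and a boolean ok field, and that aoi:error events carry a boolean retryable field; the status labels "ok", "failed", and "crashed" are this example's own.

```python
import json


def verify_run(lines):
    """Return ("ok" | "failed" | "crashed", errors) for one tool run."""
    last = None
    errors = []
    for line in lines:
        if not line.strip():
            continue
        last = json.loads(line)
        if last.get("type") == "aoi:error":
            errors.append(last)
    # EOF with no terminal summary is treated as a crash, whatever the exit code.
    if last is None or last.get("type") != "aoi:summary":
        return "crashed", errors
    return ("ok" if last.get("ok") is True else "failed"), errors


def should_retry(errors):
    """Retry only when every reported error is marked retryable."""
    return bool(errors) and all(e.get("retryable") is True for e in errors)
```

Nothing here depends on which tool produced the stream, which is the point of a shared error vocabulary.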

§ 04

Beyond prompt engineering

Safety the agent can steer.

Read‑only by default.
The agent operating under uncertainty defaults to non‑destructive behavior. Mutation requires explicit intent.
--dry-run returns typed aoi:plan events.
The agent (or its supervisor) can preview what an action would do, in machine‑readable form, before committing. The model reasons over a plan instead of trusting itself blindly.
--confirm and --confirm-count N.
Bulk destructive operations refuse to run unless the agent passes the count it expected. A wrong invocation produces a refusal, not a disaster.
--idempotency-key.
Safe retry of writes. The agent can re‑issue a failed create without doubling the effect — the tool recognizes the key and returns the prior result.
Secret redaction in meta.
args_redacted:true confirms the tool didn't echo the agent's API key back into its own output stream. The agent doesn't have to scrub before logging.
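
One way an agent might chain these safeguards is to count the aoi:plan events from a dry run and pass that count to the real call. The sketch below assumes one aoi:plan event per planned action and that --confirm, --confirm-count, and --idempotency-key can be passed together. Both are assumptions, and the tool name in the usage note is hypothetical.

```python
import json
import uuid


def count_planned(dry_run_lines):
    """Count aoi:plan events in the output of a --dry-run invocation."""
    return sum(
        1
        for line in dry_run_lines
        if line.strip() and json.loads(line).get("type") == "aoi:plan"
    )


def commit_argv(base_argv, planned, idempotency_key=None):
    """Build the argv for the real call from the previewed count."""
    key = idempotency_key or str(uuid.uuid4())  # reuse the same key on retry
    return [
        *base_argv,
        "--confirm",
        "--confirm-count", str(planned),
        "--idempotency-key", key,
    ]
```

For example, commit_argv(["sometool", "delete"], count_planned(preview_lines)) yields a command that the tool should refuse if the live count no longer matches the preview.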

§ 05

Sources, transformers, sinks

Composition.

Three roles, one contract.
Every conforming tool plays one of three pipeline roles — a source emits JSONL, a transformer reads JSONL and emits JSONL, a sink reads JSONL and performs side effects. The agent composes them through ordinary shell pipes; the envelope is the same across all three.
JSONL → JSONL by construction.
No regex‑bridging column 3 of tool A's ASCII table to feed it to tool B. The interface holds across the chain. The same shell ecosystem that already handles structured streams (jq, tee, pipefail) works unchanged.
Pipefail‑safe chains.
With set -o pipefail and a terminal‑summary check, the agent composes five tools and trusts the final exit code and summary. Upstream failure short‑circuits the whole pipeline.
Domain vs control events differentiated.
A downstream transformer or sink ignores upstream aoi:meta / aoi:summary (control) and consumes only hit / match (data) — automatically, without configuration.
Heartbeats distinguish quiet from dead.
For long‑running operations, the agent's supervisor knows whether to keep waiting or to kill and retry. Liveness is part of the contract, not a TCP keepalive hack.
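
A transformer stage that honors the domain/control split could look roughly like this. The sketch assumes control events are exactly those whose type starts with "aoi:", which matches the examples above but is not confirmed as the specification's rule.

```python
import json


def is_control(event):
    """Assumption: control events carry the "aoi:" prefix in their type."""
    return str(event.get("type", "")).startswith("aoi:")


def transform(lines, fn):
    """Apply fn to each upstream domain event and emit compact JSONL."""
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        if is_control(event):
            continue  # upstream meta, summary, heartbeat: not this stage's data
        yield json.dumps(fn(event), separators=(",", ":"))
```

A real stage would presumably also emit its own aoi:meta and aoi:summary around this output; that part is omitted here.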

§ 06

At agent altitude

Planning & reasoning.

Capabilities manifest is queryable.
The agent asks "which commands are destructive? bounded? support cursor / dry‑run / idempotency?" — and gets a typed answer. Reasoning happens at plan time, not at trial‑and‑error time.
Cost surface visible at plan time.
Read‑only vs mutating, bounded vs unbounded, idempotent vs not — all declared, all inspectable before any call. The model picks the lowest‑risk path that satisfies the goal.
Hallucination‑resistant flags.
The agent doesn't have to guess "does this tool support --recursive?" It asks capabilities and gets the truth. Wrong flags fail loudly with category:"usage", not silently with wrong behavior.
Reproducibility.
Same args + same tool_version + same schema_version produce the same event shape. The agent can cache, replay, diff, and detect drift.
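
Drift detection under the reproducibility guarantee can be as simple as comparing field names per event type between a cached run and a fresh one. This is a minimal sketch that looks only at top-level keys; it would flag optional fields as drift, so treat it as a starting point.

```python
import json


def event_shapes(lines):
    """Map each event type to the set of top-level field names seen for it."""
    shapes = {}
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        shapes.setdefault(event["type"], set()).update(event.keys())
    return shapes


def shape_drift(baseline_lines, current_lines):
    """Return {type: (missing_fields, new_fields)} for types whose shape changed."""
    before, after = event_shapes(baseline_lines), event_shapes(current_lines)
    drift = {}
    for kind in before.keys() | after.keys():
        old, new = before.get(kind, set()), after.get(kind, set())
        if old != new:
            drift[kind] = (sorted(old - new), sorted(new - old))
    return drift
```

An empty result means the two runs agree on shape, so cached parsing logic can be reused as-is.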

§ 07

The boring primitives that matter

Operational control.

Structured cancellation.
An aborted long operation produces, when possible, a summary event with ok=false, reason:"interrupted", and partial:true. The agent that cancelled knows how much was done, not just that it stopped.
Universal pagination model.
One --cursor pattern, every tool. The agent learns it once, applies it everywhere. No per‑tool pagination implementations to memorize.
Truncation is explicit.
summary.truncated:true with summary.next_cursor tells the agent there's more — no parsing the last line for "..." or "and 47 more results."
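
The pagination model reduces to one loop that works for any conforming tool. In the sketch below, run_page is a hypothetical caller-supplied function that invokes the tool with --cursor (or without it for the first page) and returns the parsed events. The "aoi:" prefix test for control events and the max_pages guard are assumptions of this example.

```python
def fetch_all(run_page, max_pages=100):
    """Collect domain events across pages by following summary.next_cursor."""
    results, cursor = [], None
    for _ in range(max_pages):
        summary = None
        for event in run_page(cursor):
            if event.get("type") == "aoi:summary":
                summary = event
            elif not str(event.get("type", "")).startswith("aoi:"):
                results.append(event)  # domain event such as hit or match
        if summary is None:
            raise RuntimeError("page ended without aoi:summary")
        if not summary.get("truncated"):
            return results  # explicit signal that nothing more remains
        cursor = summary["next_cursor"]
    raise RuntimeError("max_pages exceeded")
```

The loop never inspects the last data line for hints; it continues only when the summary says the result was truncated.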

§ 08

The agent doesn't manage OAuth

Environment & auth.

Auth happens beneath the call.
The agent invokes a tool and the user's existing credential ecosystem (keychain, AWS credential chain, kubeconfig, ssh agent, gpg keyring, Vault token, gh PAT, corporate SSO) is already wired. The agent doesn't manage OAuth flows or token refresh. Auth is deliberately out of scope.
Environment isolation per invocation.
Each batch‑mode call is a clean process. The agent doesn't carry state leaks across operations. (In session mode, the agent gets the inverse benefit: persistent state across requests when it actually wants that.)

§ closing

Every item above is a property an agent can rely on at plan time rather than discover at run time. That shift — from inference to contract — is what makes autonomous tool use tractable at all.

The corresponding work for tool authors is small. The conformance contract is § 18 of the AOI‑CLI specification — typically an afternoon of work to retrofit an existing CLI. See Adopt for working reference implementations in five languages.