essay · why

Why a
standard.

Why agent‑operable tools need a contract, why now, and what changes when the interface between agents and the world becomes typed.

§ 01

The world outside the model.

Today’s models can think, write, plan, and converse. They cannot yet operate. Every real action — every file written, ticket created, deployment shipped, sensor read, transaction approved — happens in the world outside the model. To reach it, an agent has to use a tool.

The tools an agent has today are the ones humans built, for humans, in an era when no autonomous system was ever going to read their output. They print tables, color, prompts, progress bars, ambiguous errors, and pipe exit codes that carry centuries of POSIX convention but no contract anything machine‑facing could call. An agent calling such a tool is a blindfolded actor performing in a theater it cannot see.

§ 02

What scraping terminals costs.

The current shape of agent‑tool integration is brittle in five specific ways, and each one is a category of failure that is happening, right now, in production agentic systems:

Read failures. An agent reads a tool’s output by scraping a table. The tool ships a new release that changes a column header. The agent silently mis‑parses every row and acts on ghost data.
Write failures. An agent invokes a write command. The process exits 0. The change did not actually happen — the tool buffered, the network blipped, the upstream rolled back. Without a terminal summary, exit 0 is just "the process finished," not "the work succeeded."
Verify failures. An agent issues a mutation. It cannot tell, after the fact, whether the mutation produced the intended outcome, a partial outcome, or a duplicate. So it either re‑issues (and doubles the effect) or assumes (and hallucinates the audit).
Bound failures. An agent runs a list command. The tool dumps fifty thousand rows. The agent now holds context full of irrelevant data, has timed out, has overrun budget, or has crashed the orchestrator.
Discovery failures. An agent doesn’t know what a tool can do until it runs it. It guesses flags from the help text. It hallucinates options that don’t exist. It cannot reason about the cost or risk of a call without making it.

Each of these is a class of bug that an agentic system cannot fully solve through prompting alone. They are interface bugs. They live in the contract between the agent and the tool.

§ 03

Interfaces, not improvements.

The instinct of every individual tool team is to make their tool better — faster, prettier, more reliable. That is necessary work and it does not fix the problem. A thousand individually excellent tools with a thousand different output conventions is still fragmentation.

A standard is not the same as a quality bar. A standard is what makes a thousand improvements compose. It is the difference between every railway laying its own gauge and a continent that can move a freight car from any port to any depot without unloading it. The improvement to any single tool is local. The improvement of a standard is systemic.

Machine Mode is that systemic improvement. It does not tell any tool what to do better. It tells every tool to expose the same shape of interface to the same kind of caller.

§ 04

Moments when standards happened.

Every layer of software has had a moment when an interface convention became the standard the whole layer organized around:

POSIX (1988).: Operating systems agreed on a process model, a file abstraction, a signal vocabulary. Everything from shell scripts to multi‑million‑line codebases became portable across machines that had previously spoken mutually unintelligible dialects.
HTTP (1991).: Distributed systems agreed on a request envelope, a status taxonomy, a header model. The web was not a technology before this. It was a standard.
OpenAPI (2011).: Web APIs agreed on a schema language for endpoints, types, and authentication. Automated SDK generation, contract testing, and cross‑language interop stopped being craft work.
MCP (2024).: Agent runtimes agreed on a transport and invocation layer for tools. Discovery, invocation, and response shapes became something every agent framework could speak in common, instead of every vendor inventing its own tool‑calling glue. The agentic ecosystem began to compose.

Each was contested before it was settled. Each one looked, at the time, like overkill for the size of the problem it was solving. Each one became invisible — the precondition for everything that followed. Agentic computing is at the same moment. Tools are the layer. Machine Mode is the standard.

§ 05

Why not just OpenAPI?

OpenAPI is a great answer for HTTP services. It is mature, widely adopted, well‑tooled, and most modern remote APIs already publish one. If your tool is a remote service, OpenAPI gives you most of what an "agent‑operable HTTP interface" would require.

The reason this site is about CLIs first is that CLIs are the surface OpenAPI was never designed for, and they are the surface most agent failures actually live on. Running kubectl logs my‑pod, git rebase, terraform apply, find . -newer x, or docker exec is not "an HTTP API I haven't built yet." These are local‑process operations against local state. The operational primitives differ in kind:

Three I/O channels, not one. Stdin, stdout, and stderr each carry different signals — request data, structured results, diagnostics. HTTP has one request body and one response body.
Exit codes and signals. CLI completion is a two‑level system: process exit code plus terminal summary event. HTTP collapses both into a status code.
Pipes and process composition. Shell pipelines are a real and durable composition primitive. There is no equivalent in OpenAPI — gRPC streaming approximates it for services, but not for the local boundary.
Local state. Filesystem, environment variables, working directory, terminal session. These are the inputs and outputs of most CLI work. None of them have an OpenAPI vocabulary.
No network. Many useful tools must work disconnected, in containers without egress, or against local sockets. OpenAPI assumes a network endpoint exists.
Auth that already works. A CLI runs as the user, with the user's keychain, env, AWS credential chain, kubeconfig, ssh agent, gpg keyring, Vault token, gh PAT, corporate SSO session already in place. Each of these ecosystems has years of investment in MFA, role assumption, cert pinning, browser‑based flows, and renewal. CLI tools inherit all of it for free. HTTP and RPC protocols that try to centralize auth at the wire layer keep running into the wall of real‑world federated identity — MCP's evolving OAuth story is the current example. AOI doesn't standardize auth at all; it standardizes the interface the tool exposes once auth is already in place.
Tools that aren't services at all. ffmpeg transcodes a video. pandoc converts a document. jq reshapes a JSON stream. gpg signs a file. rustc compiles a program. These are transformations and computations, not services — they take local inputs and produce local outputs, and the value is the work, not state. Wrapping them as HTTP services would either run them in long‑lived processes (changing what they are) or require expensive file shipping over a network. The CLI is the right surface for them. AOI covers it natively.

The honest framing isn't AOI versus OpenAPI. It is Machine Mode is the standard; OpenAPI is its de facto profile for the HTTP surface; AOI‑CLI is its first formal profile for the process surface. The ten factors apply to both — a well‑designed OpenAPI service already satisfies most of them. The AOI‑HTTP profile, when it lands, will reference OpenAPI as the wire format and add the supplementary requirements (idempotency, terminal completion, dry‑run, structured error retryability) as a profile document or vendor extension.

What this site is building, then, is the profile for the surface nobody else is standardizing — and the meta‑framework that places it alongside OpenAPI rather than against it.

§ 06

What changes for the agent.

The agent is the only audience that can't compensate for a bad tool interface by reading between the lines. A human at a terminal sees a typo in the output and infers the intent. A downstream script regexes its way past a renamed column. An agent does neither — it hallucinates a column that isn't there, retries blindly, burns context on chrome it can't even tell is chrome, and eventually does the wrong thing with confidence.

A typed, discoverable, bounded, safe interface is what lets the agent stop guessing. Six categories of agent failure shift from "application problem the model has to solve" to "platform guarantee the interface already provides":

Comprehension stops being a parsing problem. Typed events with a type discriminator mean the agent dispatches on shape, not on regex against prose. The envelope is the same across every conforming tool.
Planning stops being trial‑and‑error. Capability and schema discovery let the agent reason over what a tool can do before calling it. Cost, risk, and side‑effect surface become legible at plan time.
Action stops being a hope. Dry‑run plans, confirmed counts, and idempotency keys move from "team best practice" to "behavior the interface demands." Retry stops being a foot‑gun. Bulk destruction stops being one wrong flag away.
Verification stops being a forensic exercise. Terminal summary.ok is unambiguous. Audit records are structured. The agent doesn't reconstruct what happened — the interface told it.
Composition stops needing glue script. One tool's output is another tool's input by construction. Agents chain operations the way humans chain shell pipelines, with predictable failure semantics across the chain.
Context budget stops being burned on chrome. Bounded reads, compact line‑delimited JSON, and selective consumption mean every token the agent reads is data. Long lists stop blowing the window. Streaming arrival drops time‑to‑first‑useful‑token.

The full enumeration — thirty‑two specific things that change for the agent on the other end of a conforming tool — lives at For agents. It is the central reason this standard exists; the rest of the site is in service to it.

§ 07

Why it has to be small.

A standard succeeds when it is small enough to remember and strict enough to enforce. Machine Mode is intentionally narrow. It does not dictate transport, framework, language, packaging, or operating system. It dictates the shape of the contract at the process boundary: typed events, declared schema, bounded reads, explicit writes, terminal completion, structured error.

Everything beyond that is the implementor’s business. That is the property that lets a standard cross ecosystems instead of inviting a battle inside one.

§ 08

What this site is.

This site is the public home of Machine Mode. The foundations enumerates the ten factors. The AOI‑CLI specification is the first formal profile. The conformance badge is for tools that already implement it.

Future profiles will cover other surfaces — HTTP services, message queues, embedded devices, physical actuators, other agents. The factors are the same. The wire format is not.

If you build a tool that an autonomous system might call, or you build an autonomous system that calls tools, you are in the audience for this standard. The work of getting there is collective. Pull requests welcome.