Token & context economy
Every token a tool emits is a token the agent pays for and must reason around. This is the page most tied to current model capability, so each rule is paired with its assumption. It refines invariant I8.
Bound output by default
Section titled “Bound output by default”- List operations MUST have a sane default limit (e.g. 50) and MUST signal truncation
(a note to stderr, or a
nextCursorin JSON) — never silently cut. - Provide field projection (
--select a,b.c) so the agent fetches only what it needs. - Provide pagination via an opaque cursor; emit
nextCursorand treat its absence as end-of-results. The caller must never have to construct or parse the cursor. - Offer a verbosity control (
--concisedefault,--detailed) so the agent can opt into more.
Return handles, not haystacks
Section titled “Return handles, not haystacks”- Prefer returning IDs / paths / stable references and letting the agent drill down on demand, over dumping every nested object. Summarize, then fetch.
- Resolve opaque identifiers to human-readable names where feasible (a
uuid→ a name); it improves the agent’s precision and cuts hallucination. Drop low-signal cruft (mime_type,256px_image_url) from default output.
Consolidate, don’t mirror the API
Section titled “Consolidate, don’t mirror the API”Don’t expose one command per upstream endpoint. Build few, task-shaped commands that return
the high-signal result for a real job (a search_X that returns the relevant lines with context,
not a read_all_X the agent must then filter). Mirroring an API one-to-one is an
antipattern.
The honest part: what depends on today’s limits
Section titled “The honest part: what depends on today’s limits”These rules exist because context is currently scarce and model recall degrades as it fills (“context rot”). That premise is real today and drives concrete anchors — e.g. capping a single tool response on the order of ~25k tokens is a reasonable default now.
But the premise is not permanent, and this standard says so out loud:
- If context windows grow by orders of magnitude and recall stays flat across them, the default limits should rise — the principle (bound by default; let the agent ask for more) stays; the numbers move.
- If models get materially better at ignoring irrelevant tokens, aggressive projection/pagination becomes a smaller win.
- What will not change soon: emitting a 10,000-row table by default is hostile regardless of window size, because it crowds out the agent’s reasoning budget and its own prior context.
Treat every specific number on this page as a tunable tied to a capability assumption, tracked in the capability-assumptions table. Treat “bounded, projectable, paginated, summarize-then-drill” as durable.