Trust tiers

When does a tool run automatically vs. require confirmation? The decision is a small pure function of the tool's tier and the notebook's trust level.

Trust tiers are the model behind whether the workbench runs a tool automatically, asks first, or requires an explicit OK.

The decision is a small pure function with two inputs:

  1. The tool's tier — how impactful is this operation?
  2. The notebook's trust level — how much have you authorized this notebook to do without supervision?

The output is one of three actions: auto, gated, or confirm.

Tool tiers

Every tool in the MCP registry declares its tier:

| Tier | Examples | Impact |
| --- | --- | --- |
| safe | goldencheck.profile, infermap.map_schema | Read-only. Doesn't write to the data store. |
| mutating | goldenmatch.dedupe, goldenflow.transform | Produces persistent state: events, postflight reports, golden records. |
| destructive | corrections.merge, corrections.split | Hard to reverse without rolling back the audit log. |

A tool's tier is declared in mcp_registry.py and never changes at runtime.
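
As a rough illustration of what a fixed, declared tier can look like, here is a minimal sketch. The tool names come from the tier table above; `ToolTier`, `TOOL_TIERS`, and `tier_of` are hypothetical names, and the real layout of mcp_registry.py may differ.

```python
from enum import Enum
from types import MappingProxyType


class ToolTier(Enum):
    SAFE = "safe"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"


# Hypothetical registry layout (an assumption, not the actual mcp_registry.py).
# MappingProxyType exposes a read-only view, so a tier cannot be reassigned
# at runtime through this mapping.
TOOL_TIERS = MappingProxyType({
    "goldencheck.profile": ToolTier.SAFE,
    "infermap.map_schema": ToolTier.SAFE,
    "goldenmatch.dedupe": ToolTier.MUTATING,
    "goldenflow.transform": ToolTier.MUTATING,
    "corrections.merge": ToolTier.DESTRUCTIVE,
    "corrections.split": ToolTier.DESTRUCTIVE,
})


def tier_of(tool_name: str) -> ToolTier:
    """Return the declared tier for a tool; unknown tools raise KeyError."""
    return TOOL_TIERS[tool_name]
```

Attempting `TOOL_TIERS["goldencheck.profile"] = ToolTier.DESTRUCTIVE` raises `TypeError`, which is one simple way to enforce the "never changes at runtime" rule.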

Notebook trust levels

Every notebook you create starts at cautious and you can ratchet it up as you build confidence:

| Level | Meaning |
| --- | --- |
| cautious | Default. Mutating and destructive tools require explicit confirmation. |
| trusted | You've validated this notebook's behavior. Mutating tools run automatically; destructive tools still confirm. |
| autonomous | Everything runs automatically. Use for production pipelines that have a track record. |

You change the trust level via the notebook header. The change is logged in the audit trail.

The matrix

This is the decision table — the same one encoded in trust_gate.py:gate_action():

| Tool tier ↓ / Notebook trust → | cautious | trusted | autonomous |
| --- | --- | --- | --- |
| safe | auto | auto | auto |
| mutating | confirm | auto | auto |
| destructive | confirm | confirm | auto |

  • auto — tool dispatches immediately, event is appended, postflight is rendered when done.
  • confirm — UI surfaces a "Run this?" prompt with the rendered parameters. You explicitly approve before dispatch.
  • gated — reserved (currently equivalent to confirm, kept in the type for future fine-grained gating).

Why a pure function

A pure function with two inputs and three possible outputs is easy to test, easy to render in docs, and easy to reason about. There's no hidden context, no "well, it depends on the user's plan, and whether this is business hours, and …".

If you want a different policy — say, destructive-on-autonomous still needs confirm — you change the function in one place and every dispatch in the system picks it up. The trust matrix is intentionally a small surface.
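
To make the "one place" claim concrete, the sketch below puts the current policy next to the stricter variant mentioned above and lists the cells that differ. It uses plain strings and a hypothetical `lookup` helper to stay short. In practice you would edit the single table behind gate_action() rather than keep two policies around; they appear side by side here only so they can be compared.

```python
from itertools import product

TIERS = ("safe", "mutating", "destructive")
TRUST_LEVELS = ("cautious", "trusted", "autonomous")

# Current policy, transcribed from the matrix: one row per tier,
# columns ordered as in TRUST_LEVELS.
CURRENT_POLICY = {
    "safe": ("auto", "auto", "auto"),
    "mutating": ("confirm", "auto", "auto"),
    "destructive": ("confirm", "confirm", "auto"),
}

# Stricter variant: destructive tools confirm even on autonomous notebooks.
STRICT_POLICY = {
    **CURRENT_POLICY,
    "destructive": ("confirm", "confirm", "confirm"),
}


def lookup(policy: dict, tier: str, trust: str) -> str:
    """Hypothetical helper: read one cell of a policy table."""
    return policy[tier][TRUST_LEVELS.index(trust)]


# Walk all nine cells and collect the ones where the two policies disagree.
changed = [
    (tier, trust)
    for tier, trust in product(TIERS, TRUST_LEVELS)
    if lookup(CURRENT_POLICY, tier, trust) != lookup(STRICT_POLICY, tier, trust)
]
print(changed)  # [('destructive', 'autonomous')]
```

Exactly one cell changes, and every other dispatch decision stays the same.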

Cross-references

If you want to extend trust gating (e.g., introduce a tier-4 "regulatory-impact" gate that requires a co-signature), the existing table is the place to start — add a row, add tests, ship.