Trust tiers

When does a tool run automatically vs. require confirmation? The decision is a small pure function of the tool's tier and the notebook's trust level.

Trust tiers are the model behind whether the workbench runs a tool automatically, asks first, or requires an explicit OK.

The decision is a small pure function with two inputs:

The tool's tier — how impactful is this operation?
The notebook's trust level — how much have you authorized this notebook to do without supervision?

The output is one of three actions: auto, gated, or confirm.

Tool tiers

Every tool in the MCP registry declares its tier:

Tier	Examples	Impact
`safe`	`goldencheck.profile`, `infermap.map_schema`	Read-only. Doesn't write to the data store.
`mutating`	`goldenmatch.dedupe`, `goldenflow.transform`	Produces persistent state — events, postflight reports, golden records.
`destructive`	`corrections.merge`, `corrections.split`	Hard to reverse without rolling back the audit log.

A tool's tier is declared in mcp_registry.py and never changes at runtime.
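
As a rough illustration of what "declared once, fixed at runtime" could look like, here is a minimal sketch. The names `ToolTier`, `ToolSpec`, `REGISTRY`, and `tier_of` are assumptions made for this example and are not taken from mcp_registry.py; only the tool names and tiers come from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from types import MappingProxyType


class ToolTier(str, Enum):
    SAFE = "safe"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)  # frozen: a spec's tier cannot be reassigned after creation
class ToolSpec:
    name: str
    tier: ToolTier


# Hypothetical registry contents, using the example tools from the table above.
# MappingProxyType gives a read-only view, so entries cannot be swapped either.
REGISTRY = MappingProxyType({
    spec.name: spec
    for spec in (
        ToolSpec("goldencheck.profile", ToolTier.SAFE),
        ToolSpec("infermap.map_schema", ToolTier.SAFE),
        ToolSpec("goldenmatch.dedupe", ToolTier.MUTATING),
        ToolSpec("goldenflow.transform", ToolTier.MUTATING),
        ToolSpec("corrections.merge", ToolTier.DESTRUCTIVE),
        ToolSpec("corrections.split", ToolTier.DESTRUCTIVE),
    )
})


def tier_of(tool_name: str) -> ToolTier:
    """Look up a tool's declared tier; unknown tools raise KeyError."""
    return REGISTRY[tool_name].tier
```

Keeping the tier on an immutable record means the gate can treat it as a constant input rather than something to re-check per call.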

Notebook trust levels

Every notebook you create starts at cautious and you can ratchet it up as you build confidence:

Level	Meaning
`cautious`	Default. Mutating and destructive tools require explicit confirmation.
`trusted`	You've validated this notebook's behavior. Mutating tools run automatically; destructive tools still confirm.
`autonomous`	Everything runs automatically. Use for production pipelines that have a track record.

You change the trust level via the notebook header. The change is logged in the audit trail.

The matrix

This is the decision table — the same one encoded in trust_gate.py:gate_action():

Tool tier ↓ / Notebook trust →	cautious	trusted	autonomous
safe	`auto`	`auto`	`auto`
mutating	`confirm`	`auto`	`auto`
destructive	`confirm`	`confirm`	`auto`

auto — tool dispatches immediately, event is appended, postflight is rendered when done.
confirm — UI surfaces a "Run this?" prompt with the rendered parameters. You explicitly approve before dispatch.
gated — reserved (currently equivalent to confirm, kept in the type for future fine-grained gating).
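
The table above can be sketched as a plain lookup. This is an illustrative version only: the enum names and the `_ROWS` layout are assumptions, and the actual `gate_action()` in trust_gate.py may be structured differently.

```python
from enum import Enum


class ToolTier(str, Enum):
    SAFE = "safe"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"


class TrustLevel(str, Enum):
    CAUTIOUS = "cautious"
    TRUSTED = "trusted"
    AUTONOMOUS = "autonomous"


class Action(str, Enum):
    AUTO = "auto"
    GATED = "gated"  # reserved; no cell of the matrix returns it today
    CONFIRM = "confirm"


# One row per tool tier; columns follow TrustLevel definition order:
# cautious, trusted, autonomous.
_ROWS = {
    ToolTier.SAFE:        (Action.AUTO,    Action.AUTO,    Action.AUTO),
    ToolTier.MUTATING:    (Action.CONFIRM, Action.AUTO,    Action.AUTO),
    ToolTier.DESTRUCTIVE: (Action.CONFIRM, Action.CONFIRM, Action.AUTO),
}
_COLUMNS = tuple(TrustLevel)


def gate_action(tier: ToolTier, trust: TrustLevel) -> Action:
    """Pure lookup: the same (tier, trust) pair always gives the same action."""
    return _ROWS[tier][_COLUMNS.index(trust)]


print(gate_action(ToolTier.MUTATING, TrustLevel.CAUTIOUS).value)       # confirm
print(gate_action(ToolTier.DESTRUCTIVE, TrustLevel.AUTONOMOUS).value)  # auto
```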

Why a pure function

A pure function with two inputs and three possible outputs is easy to test, easy to render in docs, and easy to reason about. There's no hidden context, no "well, it depends on the user's plan, and whether this is business hours, and …".

If you want a different policy — say, destructive-on-autonomous still needs confirm — you change the function in one place and every dispatch in the system picks it up. The trust matrix is intentionally a small surface.
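
To make the "one place" point concrete, here is a sketch of the stricter policy mentioned above, where destructive tools confirm even on autonomous notebooks. Plain strings stand in for the tier, trust, and action types to keep it short, and `dispatch` is a hypothetical call site, not a function from the codebase.

```python
def gate_action(tier: str, trust: str) -> str:
    """Stricter variant of the matrix: destructive tools always confirm."""
    if tier == "safe":
        return "auto"
    if tier == "mutating":
        return "confirm" if trust == "cautious" else "auto"
    if tier == "destructive":
        # The default matrix returns "auto" when trust == "autonomous".
        # This single line is the whole policy change.
        return "confirm"
    raise ValueError(f"unknown tool tier: {tier!r}")


def dispatch(tool_name: str, tier: str, trust: str) -> str:
    """Hypothetical call site: it asks the gate and never re-implements policy."""
    action = gate_action(tier, trust)
    return f"{tool_name}: {action}"


print(dispatch("corrections.merge", "destructive", "autonomous"))
# corrections.merge: confirm
```

Because callers only consume the returned action, none of them need to change when the policy does.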

Cross-references

Implementation: backend/app/services/trust_gate.py
Tool metadata + tiers: backend/app/services/mcp_registry.py
Tests: 17 parametrized cases in backend/tests/test_trust_gate.py cover every tier × trust combination, plus edge cases.

If you want to extend trust gating (e.g., introduce a tier-4 "regulatory-impact" gate that requires a co-signature), the existing table is the place to start — add a row, add tests, ship.
