Trust tiers
When does a tool run automatically vs. require confirmation? The decision is a small pure function of the tool's tier and the notebook's trust level.
Trust tiers are the model behind whether the workbench runs a tool automatically, asks first, or requires an explicit OK.
The decision is a small pure function with two inputs:
- The tool's tier — how impactful is this operation?
- The notebook's trust level — how much have you authorized this notebook to do without supervision?
The output is one of three actions: auto, gated, or confirm.
Tool tiers
Every tool in the MCP registry declares its tier:
| Tier | Examples | Impact |
|---|---|---|
| safe | goldencheck.profile, infermap.map_schema | Read-only. Doesn't write to the data store. |
| mutating | goldenmatch.dedupe, goldenflow.transform | Produces persistent state: events, postflight reports, golden records. |
| destructive | corrections.merge, corrections.split | Hard to reverse without rolling back the audit log. |
A tool's tier is declared in mcp_registry.py and never changes at runtime.
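As a minimal sketch of what "declared once, never changes at runtime" could look like, the snippet below freezes each tool's metadata and exposes the registry through a read-only mapping. The names ToolTier, ToolSpec, REGISTRY, and tier_of are hypothetical and are not taken from mcp_registry.py; only the tool names and their tiers come from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from types import MappingProxyType


class ToolTier(Enum):
    SAFE = "safe"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)  # frozen: assigning to .tier after creation raises an error
class ToolSpec:
    name: str
    tier: ToolTier


# Tool names and tiers as listed in the table above.
_TOOLS = (
    ToolSpec("goldencheck.profile", ToolTier.SAFE),
    ToolSpec("infermap.map_schema", ToolTier.SAFE),
    ToolSpec("goldenmatch.dedupe", ToolTier.MUTATING),
    ToolSpec("goldenflow.transform", ToolTier.MUTATING),
    ToolSpec("corrections.merge", ToolTier.DESTRUCTIVE),
    ToolSpec("corrections.split", ToolTier.DESTRUCTIVE),
)

# Read-only view: entries cannot be added, replaced, or removed through it.
REGISTRY = MappingProxyType({spec.name: spec for spec in _TOOLS})


def tier_of(tool_name: str) -> ToolTier:
    """Look up the declared tier of a registered tool (hypothetical helper)."""
    return REGISTRY[tool_name].tier
```

With this shape, an attempt to reassign a tier or to overwrite a registry entry fails loudly instead of silently changing gating behavior.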
Notebook trust levels
Every notebook you create starts at cautious, and you can ratchet it up as you build confidence:
| Level | Meaning |
|---|---|
| cautious | Default. Mutating and destructive tools require explicit confirmation. |
| trusted | You've validated this notebook's behavior. Mutating tools run automatically; destructive tools still confirm. |
| autonomous | Everything runs automatically. Use for production pipelines that have a track record. |
You change the trust level via the notebook header. The change is logged in the audit trail.
The matrix
This is the decision table — the same one encoded in trust_gate.py:gate_action():
| Tool tier ↓ / Notebook trust → | cautious | trusted | autonomous |
|---|---|---|---|
| safe | auto | auto | auto |
| mutating | confirm | auto | auto |
| destructive | confirm | confirm | auto |
- auto: the tool dispatches immediately, an event is appended, and the postflight is rendered when done.
- confirm: the UI surfaces a "Run this?" prompt with the rendered parameters. You explicitly approve before dispatch.
- gated: reserved (currently equivalent to confirm, kept in the type for future fine-grained gating).
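The matrix can be written as a plain lookup table. The sketch below is one way to do that, not a copy of trust_gate.py: the enum names ToolTier, TrustLevel, and Action and the internal layout are assumptions, and only the function name gate_action and the table values come from this page.

```python
from enum import Enum


class ToolTier(Enum):
    SAFE = "safe"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"


class TrustLevel(Enum):
    CAUTIOUS = "cautious"
    TRUSTED = "trusted"
    AUTONOMOUS = "autonomous"


class Action(Enum):
    AUTO = "auto"
    GATED = "gated"      # reserved; no cell of the matrix returns it
    CONFIRM = "confirm"


A, C = Action.AUTO, Action.CONFIRM

# Column order matches the table: cautious, trusted, autonomous.
_COLUMNS = (TrustLevel.CAUTIOUS, TrustLevel.TRUSTED, TrustLevel.AUTONOMOUS)

# One row per tool tier, copied cell by cell from the matrix.
_ROWS = {
    ToolTier.SAFE:        (A, A, A),
    ToolTier.MUTATING:    (C, A, A),
    ToolTier.DESTRUCTIVE: (C, C, A),
}


def gate_action(tier: ToolTier, trust: TrustLevel) -> Action:
    """Pure lookup: the same (tier, trust) pair always yields the same action."""
    return _ROWS[tier][_COLUMNS.index(trust)]


print(gate_action(ToolTier.MUTATING, TrustLevel.CAUTIOUS).value)  # confirm
```

Keeping the rows visually aligned with the documentation table makes it easy to check the code against the docs by eye.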
Why a pure function
A pure function with two inputs and three possible outputs is easy to test, easy to render in docs, and easy to reason about. There's no hidden context, no "well, it depends on the user's plan, and whether this is business hours, and …".
If you want a different policy — say, destructive-on-autonomous still needs confirm — you change the function in one place and every dispatch in the system picks it up. The trust matrix is intentionally a small surface.
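To illustrate the "one place" claim, here is a hedged sketch of the stricter policy mentioned above, where destructive tools still confirm on autonomous notebooks. The string-keyed POLICY table and the dispatch function are hypothetical stand-ins for the real call sites; the point is only that callers ask gate_action and never inspect the trust level themselves.

```python
# Hypothetical policy table; identical to the documented matrix except for one cell.
POLICY = {
    "safe":        {"cautious": "auto",    "trusted": "auto",    "autonomous": "auto"},
    "mutating":    {"cautious": "confirm", "trusted": "auto",    "autonomous": "auto"},
    # Changed cell: destructive on autonomous is "confirm" instead of "auto".
    "destructive": {"cautious": "confirm", "trusted": "confirm", "autonomous": "confirm"},
}


def gate_action(tier: str, trust: str) -> str:
    """Single source of truth for the gating decision."""
    return POLICY[tier][trust]


def dispatch(tool_name: str, tier: str, trust: str) -> str:
    """Hypothetical call site: it only reacts to the action it is given."""
    action = gate_action(tier, trust)
    if action == "auto":
        return f"running {tool_name}"
    return f"awaiting approval for {tool_name}"


print(dispatch("corrections.merge", "destructive", "autonomous"))
# awaiting approval for corrections.merge
```

Because dispatch contains no policy logic of its own, editing the single cell is enough to change behavior everywhere the gate is consulted.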
Cross-references
- Implementation: backend/app/services/trust_gate.py
- Tool metadata + tiers: backend/app/services/mcp_registry.py
- Tests: 17 parametrized cases in backend/tests/test_trust_gate.py cover every combination of tier × trust × edge case.
If you want to extend trust gating (e.g., introduce a tier-4 "regulatory-impact" gate that requires a co-signature), the existing table is the place to start — add a row, add tests, ship.