
MCP Tool Risk Contract

What this adds. SecurityRecipes now treats MCP tool metadata as risk vocabulary, not enforcement. The contract lets an agent host or MCP gateway use annotations safely while still relying on deterministic scope, authorization, sandbox, network, approval, and output controls.

MCP tools can now declare behavior with annotations such as readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. That is valuable, but the MCP specification is clear: clients must treat annotations as untrusted unless they come from a trusted server. The MCP Tool Risk Contract turns that reality into a buyer-ready control surface.
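A minimal sketch of how a host might apply that rule, assuming the annotation defaults given in the MCP schema (readOnlyHint false, destructiveHint true, idempotentHint false, openWorldHint true) double as the worst-case values. The function name `effective_annotations` and the `server_trusted` flag are illustrative and are not taken from the SecurityRecipes scripts.

```python
# Worst-case values, assumed here to match the MCP schema defaults.
WORST_CASE = {
    "readOnlyHint": False,
    "destructiveHint": True,
    "idempotentHint": False,
    "openWorldHint": True,
}


def effective_annotations(declared, server_trusted):
    """Return the hints a policy engine may rely on (hypothetical helper).

    Hints from an untrusted server are ignored entirely. Hints from a
    trusted server override the worst case only when they are booleans
    for a known annotation key.
    """
    if not server_trusted:
        return dict(WORST_CASE)
    usable = {
        key: value
        for key, value in declared.items()
        if key in WORST_CASE and isinstance(value, bool)
    }
    return {**WORST_CASE, **usable}
```

With this shape, an untrusted server that claims `readOnlyHint: true` still gets evaluated as a potentially destructive, open-world tool, so the annotation can raise friction but never lower it.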

The core policy is simple: before a tool call runs, decide whether the session combines private data, untrusted content, and an external or state-changing capability in the same execution path. If it does, the call is denied unless an explicit approval or control path exists. This makes tool risk easy for enterprise teams to reason about without pretending the model can reliably separate user instructions from attacker-controlled content.
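The combination check can be sketched as follows. This is an illustration under stated assumptions, not the logic of scripts/evaluate_mcp_tool_risk_decision.py: the `SessionState` class and its field names are hypothetical, and mapping an approved risky session to `allow_with_confirmation` is one plausible reading of the policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionState:
    """Hypothetical session facts; field names are illustrative only."""
    has_private_data: bool
    has_untrusted_content: bool
    has_external_or_state_change: bool
    human_approval_id: Optional[str] = None


def session_combination_decision(state: SessionState) -> str:
    """Evaluate only the three-way combination rule described above."""
    risky_combination = (
        state.has_private_data
        and state.has_untrusted_content
        and state.has_external_or_state_change
    )
    if not risky_combination:
        return "allow_tool_call"
    # Assumption: a recorded approval downgrades the denial to a confirmation.
    if state.human_approval_id:
        return "allow_with_confirmation"
    return "deny_session_exfiltration_path"
```

The check is deliberately independent of what the model says about its own inputs: it looks only at facts the host can record deterministically about the session.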

Generated artifacts

  • Profile: data/assurance/mcp-tool-risk-contract-profile.json
  • Generator: scripts/generate_mcp_tool_risk_contract.py
  • Runtime evaluator: scripts/evaluate_mcp_tool_risk_decision.py
  • Evidence pack: data/evidence/mcp-tool-risk-contract.json
  • MCP tools: recipes_mcp_tool_risk_contract and recipes_evaluate_mcp_tool_risk_decision

Regenerate and validate:

python3 scripts/generate_mcp_tool_risk_contract.py
python3 scripts/generate_mcp_tool_risk_contract.py --check

Evaluate one proposed tool call:

python3 scripts/evaluate_mcp_tool_risk_decision.py \
  --workflow-id vulnerable-dependency-remediation \
  --namespace repo.contents \
  --tool-name repo.contents.patch \
  --requested-access-mode write_branch \
  --agent-id sr-agent::vulnerable-dependency-remediation::codex \
  --run-id run-ci \
  --session-id session-ci \
  --correlation-id corr-ci \
  --server-trusted \
  --read-only-hint false \
  --destructive-hint false \
  --idempotent-hint false \
  --open-world-hint true \
  --human-approval-id approval-ci \
  --expect-decision allow_with_confirmation

Decision model

| Decision | Meaning |
| --- | --- |
| allow_tool_call | The call fits workflow scope, trusted annotations, and session-combination policy. |
| allow_with_confirmation | The call can proceed only with a durable human approval or confirmation record. |
| hold_for_tool_risk_review | Evidence is missing, annotations are untrusted for the risk level, or the tool is sensitive. |
| deny_annotation_contradiction | Runtime request contradicts the tool annotations, such as read-only metadata on a write call. |
| deny_session_exfiltration_path | The session combines private data, untrusted content, and external or state-changing capability without approval. |
| deny_scope_drift | Namespace, connector, access mode, or workflow is outside the generated contract. |
| kill_session_on_tool_risk_signal | A kill signal appeared: secret-bearing arguments/results, tool-list drift after approval, private-network destination, or approval bypass. |
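One way to read the decision model is as a first-match precedence list, with the most severe outcome checked first. The sketch below assumes that ordering; the source does not state how the evaluator orders its checks, and the `Signals` class and its fields are hypothetical inputs that a real evaluator would derive from the contract and the request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical pre-computed facts about one proposed tool call."""
    kill_signal: bool = False
    in_scope: bool = True
    annotation_contradiction: bool = False
    exfiltration_path: bool = False
    needs_review: bool = False
    needs_confirmation: bool = False
    has_approval: bool = False


def decide(s: Signals) -> str:
    """Return the first matching decision, most severe first (assumed order)."""
    if s.kill_signal:
        return "kill_session_on_tool_risk_signal"
    if not s.in_scope:
        return "deny_scope_drift"
    if s.annotation_contradiction:
        return "deny_annotation_contradiction"
    if s.exfiltration_path and not s.has_approval:
        return "deny_session_exfiltration_path"
    if s.needs_review:
        return "hold_for_tool_risk_review"
    if s.needs_confirmation or s.exfiltration_path:
        # Assumption: a call that needs confirmation but has no approval
        # record is held for review rather than denied outright.
        if s.has_approval:
            return "allow_with_confirmation"
        return "hold_for_tool_risk_review"
    return "allow_tool_call"
```

Checking kill signals and scope before annotations keeps the deterministic controls ahead of anything the tool metadata claims.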

What gets scored

The generator reads the MCP connector trust pack, authorization conformance pack, workflow manifest, and gateway policy. It produces a profile for every MCP namespace with:

  • trusted vs untrusted annotation source
  • suggested standard annotations
  • risk tier
  • private-data, untrusted-content, exfiltration, state-change, and approval-required factors
  • authorization conformance state
  • workflow-level combination risk

The pack is intentionally conservative. Open-world tools taint the session; untrusted annotations never reduce friction for sensitive tools; write and non-idempotent calls need approval; tool-list changes after approval are kill signals.
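Two of these conservative rules, session taint and tool-list drift, can be sketched as below. The `Session` class, its method names, and the fingerprinting scheme are assumptions made for illustration; the source does not describe how the generated pack detects drift or stores taint.

```python
import hashlib
import json


def tool_list_fingerprint(tools):
    """Order-insensitive SHA-256 over tool definitions (hypothetical scheme)."""
    canonical = json.dumps(sorted(tools, key=lambda t: t["name"]), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Session:
    """Tracks sticky taint and the tool list that was approved."""

    def __init__(self, approved_tools):
        self.tainted = False
        self.approved_fingerprint = tool_list_fingerprint(approved_tools)

    def record_call(self, open_world_hint, server_trusted):
        # An untrusted or missing hint is treated as open-world (worst case),
        # so untrusted metadata can never clear or avoid taint.
        if server_trusted and isinstance(open_world_hint, bool):
            open_world = open_world_hint
        else:
            open_world = True
        # Taint is sticky: once set, later closed-world calls do not reset it.
        self.tainted = self.tainted or open_world

    def tool_list_drifted(self, current_tools):
        """True when the tool list differs from the one seen at approval."""
        return tool_list_fingerprint(current_tools) != self.approved_fingerprint
```

A host following this sketch would treat a `True` result from `tool_list_drifted` as a kill signal, and would consult `tainted` when evaluating the session-combination policy.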

Source anchors

See also