MCP Tool Risk Contract
MCP tools can now declare behavior with annotations such as
readOnlyHint, destructiveHint, idempotentHint, and
openWorldHint. That is valuable, but the MCP specification is clear:
clients must treat annotations as untrusted unless they come from a
trusted server. The MCP Tool Risk Contract turns that reality into a
buyer-ready control surface.
The core policy is simple: before a tool call runs, decide whether the session has private data, untrusted content, and an external or state-changing capability in the same execution path. If it does, the call is denied unless there is an explicit approval/control path. This makes tool risk easy for enterprise teams to reason about without pretending the model can reliably separate user instructions from attacker-controlled content.
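The combination rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/evaluate_mcp_tool_risk_decision.py: the `SessionState` class, its field names, and the `check_session_combination` function are hypothetical, and mapping an approved risky session to `allow_with_confirmation` is an assumption based on the decision names used elsewhere in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionState:
    """Hypothetical per-session flags; the real evaluator's inputs may differ."""
    has_private_data: bool
    has_untrusted_content: bool
    has_external_or_state_changing_capability: bool
    approval_id: Optional[str] = None  # durable approval record, if one exists


def check_session_combination(state: SessionState) -> str:
    """Apply the three-way combination rule and return a decision string."""
    risky_combination = (
        state.has_private_data
        and state.has_untrusted_content
        and state.has_external_or_state_changing_capability
    )
    if not risky_combination:
        # At most two of the three factors are present in this execution path.
        return "allow_tool_call"
    if state.approval_id:
        # Assumption: an explicit approval downgrades the denial to a confirmation.
        return "allow_with_confirmation"
    return "deny_session_exfiltration_path"
```

The check deliberately looks at the session as a whole rather than at the single tool call, since the risk comes from the three factors meeting in one execution path.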
Generated artifact
- Profile: data/assurance/mcp-tool-risk-contract-profile.json
- Generator: scripts/generate_mcp_tool_risk_contract.py
- Runtime evaluator: scripts/evaluate_mcp_tool_risk_decision.py
- Evidence pack: data/evidence/mcp-tool-risk-contract.json
- MCP tools: recipes_mcp_tool_risk_contract and recipes_evaluate_mcp_tool_risk_decision
Regenerate and validate:
python3 scripts/generate_mcp_tool_risk_contract.py
python3 scripts/generate_mcp_tool_risk_contract.py --check
Evaluate one proposed tool call:
python3 scripts/evaluate_mcp_tool_risk_decision.py \
--workflow-id vulnerable-dependency-remediation \
--namespace repo.contents \
--tool-name repo.contents.patch \
--requested-access-mode write_branch \
--agent-id sr-agent::vulnerable-dependency-remediation::codex \
--run-id run-ci \
--session-id session-ci \
--correlation-id corr-ci \
--server-trusted \
--read-only-hint false \
--destructive-hint false \
--idempotent-hint false \
--open-world-hint true \
--human-approval-id approval-ci \
--expect-decision allow_with_confirmation
Decision model
| Decision | Meaning |
|---|---|
| allow_tool_call | The call fits workflow scope, trusted annotations, and session-combination policy. |
| allow_with_confirmation | The call can proceed only with a durable human approval or confirmation record. |
| hold_for_tool_risk_review | Evidence is missing, annotations are untrusted for the risk level, or the tool is sensitive. |
| deny_annotation_contradiction | Runtime request contradicts the tool annotations, such as read-only metadata on a write call. |
| deny_session_exfiltration_path | The session combines private data, untrusted content, and external or state-changing capability without approval. |
| deny_scope_drift | Namespace, connector, access mode, or workflow is outside the generated contract. |
| kill_session_on_tool_risk_signal | A kill signal appeared: secret-bearing arguments/results, tool-list drift after approval, private-network destination, or approval bypass. |
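
Two rows of the table, the annotation contradiction and the hold on untrusted annotations, lend themselves to a short sketch. The function name `check_annotations`, its parameters, and the `WRITE_ACCESS_MODES` set are hypothetical; only `write_branch` appears in the example command above, and the real contract presumably defines its own list of access modes. Treating any write from an untrusted server as "untrusted for the risk level" is likewise an assumption.

```python
from typing import Optional

# Assumption: access modes that change state. Only "write_branch" is taken
# from the example command; the real contract may list others.
WRITE_ACCESS_MODES = frozenset({"write_branch"})


def check_annotations(
    requested_access_mode: str,
    read_only_hint: bool,
    server_trusted: bool,
) -> Optional[str]:
    """Return a blocking decision, or None if the annotations raise no objection."""
    is_write = requested_access_mode in WRITE_ACCESS_MODES
    if read_only_hint and is_write:
        # The tool claims to be read-only but the request wants to write.
        return "deny_annotation_contradiction"
    if is_write and not server_trusted:
        # Annotations from an untrusted server cannot vouch for a write call.
        return "hold_for_tool_risk_review"
    return None
```

Returning `None` rather than `allow_tool_call` keeps this check composable: other checks, such as scope drift or the session-combination rule, can still block the call.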
What gets scored
The generator reads the MCP connector trust pack, authorization conformance pack, workflow manifest, and gateway policy. It produces a profile for every MCP namespace with:
- trusted vs untrusted annotation source
- suggested standard annotations
- risk tier
- private-data, untrusted-content, exfiltration, state-change, and approval-required factors
- authorization conformance state
- workflow-level combination risk
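
As a rough picture of what one per-namespace profile entry might hold, the fields listed above can be written as a Python dataclass. This shape is hypothetical: the class name, field names, and value types are illustrative assumptions, and the actual JSON schema in data/assurance/mcp-tool-risk-contract-profile.json may be organized differently.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict


@dataclass
class NamespaceRiskProfile:
    """Hypothetical shape of one per-namespace entry; not the real schema."""
    namespace: str
    annotation_source_trusted: bool
    # Suggested standard annotations, keyed by MCP hint name.
    suggested_annotations: Dict[str, bool] = field(default_factory=dict)
    risk_tier: str = ""  # tier labels are defined by the generator, not here
    # Factors named in the list above, each recorded as present or absent.
    factors: Dict[str, bool] = field(default_factory=dict)
    authorization_conformance_state: str = ""
    workflow_combination_risk: str = ""


# Illustrative values only; hints mirror the example command in this document.
profile = NamespaceRiskProfile(
    namespace="repo.contents",
    annotation_source_trusted=True,
    suggested_annotations={
        "readOnlyHint": False,
        "destructiveHint": False,
        "idempotentHint": False,
        "openWorldHint": True,
    },
    factors={"state_change": True, "approval_required": True},
)
profile_as_dict = asdict(profile)  # plain dict, ready for json.dumps
```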
The pack is intentionally conservative. Open-world tools taint the session; untrusted annotations never reduce friction for sensitive tools; write and non-idempotent calls need approval; tool-list changes after approval are kill signals.
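
Two of these conservative rules are easy to illustrate: the approval requirement and the tool-list drift kill signal. The sketch below is an assumption-laden example, not the pack's implementation. The function names are hypothetical, fingerprinting the tool list with SHA-256 is one possible way to detect drift, and "write and non-idempotent calls need approval" is read here as "either property alone is enough".

```python
import hashlib
import json
from typing import Dict, List, Optional


def needs_approval(is_write: bool, idempotent_hint: bool) -> bool:
    """Assumed reading: a write call or a non-idempotent call needs approval."""
    return is_write or not idempotent_hint


def tool_list_fingerprint(tools: List[Dict]) -> str:
    """Stable SHA-256 over tool definitions, independent of list order."""
    canonical = json.dumps(
        sorted(tools, key=lambda tool: tool["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drift_decision(approved_fingerprint: str, current_tools: List[Dict]) -> Optional[str]:
    """Return the kill decision if the tool list changed since approval."""
    if tool_list_fingerprint(current_tools) != approved_fingerprint:
        return "kill_session_on_tool_risk_signal"
    return None
```

A caller would record the fingerprint at approval time and recompute it before each later call, so that a server adding or redefining tools mid-session is caught before the next tool call runs.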
Source anchors
- MCP Tools specification
- MCP Tool Annotations as Risk Vocabulary
- MCP Authorization specification
- MCP Security Best Practices
- MCP Elicitation specification
- OWASP Top 10 for Agentic Applications 2026
- NIST AI RMF
- NIST AI RMF Generative AI Profile