# Context Poisoning Guard
## The product bet
SecurityRecipes is positioned as the secure context layer for agentic AI. The strongest enterprise version of that idea is not a prompt library. It is a controlled context supply chain:
- registered source roots,
- owners and trust tiers,
- retrieval decisions,
- source hashes,
- poisoning controls,
- and deterministic inspection before context reaches an agent.
The Context Poisoning Guard adds that inspection layer. It scans every registered context root from the Secure Context Registry and produces an evidence pack that records whether each source passes, contains only documented adversarial examples, should be held for review, or should be blocked until fixed.
## What was added

- Source profile: `data/assurance/context-poisoning-guard-profile.json`
- Generator: `scripts/generate_context_poisoning_guard_pack.py`
- Evidence pack: `data/evidence/context-poisoning-guard-pack.json`
- MCP tool: `recipes_context_poisoning_guard_pack`
Regenerate and validate the pack:

```shell
python3 scripts/generate_context_poisoning_guard_pack.py
python3 scripts/generate_context_poisoning_guard_pack.py --check
```

## What it scans
| Rule | Severity | Why it matters |
|---|---|---|
| Direct instruction override | Critical | Detects text that asks an agent to ignore or override higher-priority instructions. |
| Secret exfiltration request | Critical | Detects transfer language near secrets, tokens, credentials, private keys, or environment dumps. |
| Approval bypass request | High | Detects requests to skip, bypass, remove, or disable review, approval, policy, CI, or guardrails. |
| Hidden HTML instruction | High | Detects hidden HTML/comment patterns that may evade human review but remain visible to models. |
| External callback instruction | High | Detects send/post/upload/callback language near external URLs. |
| Encoded payload | Medium | Detects long base64-like strings that may hide instructions or data. |
| Zero-width control | Medium | Detects zero-width and bidirectional controls that can hide or reorder text. |
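The table above can be sketched as a small rule engine. This is a minimal illustration, not the shipped profile: the rule IDs and severities come from the table, but the regex patterns here are simplified assumptions (the real patterns live in `data/assurance/context-poisoning-guard-profile.json`).

```python
import re

# Illustrative subset of the scan rules. Patterns are assumptions for
# demonstration; severities and rule IDs mirror the table above.
RULES = [
    ("direct-instruction-override", "critical",
     re.compile(r"ignore (?:all|any|previous|prior) instructions", re.I)),
    ("zero-width-control", "medium",
     re.compile(r"[\u200b\u200c\u200d\u2060\u202a-\u202e]")),
    ("encoded-payload", "medium",
     re.compile(r"[A-Za-z0-9+/]{80,}={0,2}")),
]

def scan(text, path):
    """Return per-line findings with path, line, rule_id, and severity."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_id, severity, pattern in RULES:
            if pattern.search(line):
                findings.append({
                    "path": path,
                    "line": line_no,
                    "rule_id": rule_id,
                    "severity": severity,
                })
    return findings
```

Because each finding carries its rule ID and location, the same records can feed both the evidence pack and the routing decision described next.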
The guard is intentionally conservative. It does not pretend regexes can solve prompt injection. It creates evidence and routing:
- `pass` when no markers are detected.
- `allow_with_adversarial_examples` when markers appear only in documented red-team, threat-model, or defensive examples.
- `hold_for_context_review` when normal guidance contains high-risk markers.
- `block_until_removed` when critical actionable findings appear outside approved examples.
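The four decisions above reduce to a short routing function. This is a minimal sketch under one assumption: the caller supplies a predicate that marks findings sitting inside documented adversarial examples.

```python
def route(findings, in_documented_example):
    """Map scan findings to one of the four guard decisions.

    `in_documented_example` is an assumed caller-supplied predicate that
    returns True for findings inside labeled red-team, threat-model, or
    defensive examples.
    """
    if not findings:
        return "pass"
    actionable = [f for f in findings if not in_documented_example(f)]
    if not actionable:
        return "allow_with_adversarial_examples"
    if any(f["severity"] == "critical" for f in actionable):
        return "block_until_removed"
    return "hold_for_context_review"
```

Note that severity alone does not decide the outcome: a critical finding inside a labeled example still routes to `allow_with_adversarial_examples`, which is what keeps red-team material publishable.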
## Why this is enterprise-grade
This feature makes AI easier for buyers because it turns a hard question into a simple artifact:
Can this context be returned to an agent?
An MCP server, AI platform intake workflow, or procurement reviewer can ask the guard pack for source-level decisions and findings instead of reading every page manually. The answer carries source ID, path, line, rule ID, severity, disposition, and source hash.
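A consumer of the generated pack can filter to exactly the records it needs without reading source pages. This sketch assumes a pack layout with a top-level `findings` array whose entries carry the fields named above (`source_id`, `severity`, and an `actionable` flag); the exact schema of `data/evidence/context-poisoning-guard-pack.json` may differ.

```python
import json

def critical_actionable(pack_path, source_id):
    """Return actionable critical findings for one source.

    Field names (`findings`, `source_id`, `severity`, `actionable`)
    are assumptions mirroring the fields described in the text.
    """
    with open(pack_path) as fh:
        pack = json.load(fh)
    return [
        f for f in pack.get("findings", [])
        if f.get("source_id") == source_id
        and f.get("severity") == "critical"
        and f.get("actionable", False)
    ]
```

The same filter shape backs the MCP examples below: decision, severity, rule, and source filters are all projections over one evidence artifact.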
The generated pack supports:
- prompt-library publication review,
- MCP server intake,
- quarterly secure-context recertification,
- red-team replay planning,
- acquisition diligence,
- and future hosted context monitoring.
## MCP examples
Get the portfolio-level summary:

```json
{}
```

Get all sources held for context review:

```json
{
  "decision": "hold_for_context_review"
}
```

Get actionable critical findings for one source:

```json
{
  "source_id": "prompt-library-recipes",
  "severity": "critical",
  "actionable_only": true
}
```

Get all direct instruction override matches:

```json
{
  "rule_id": "direct-instruction-override"
}
```

## Industry alignment
The guard follows current agentic AI and MCP security guidance:
- OpenAI guidance on prompt injection resistance for treating prompt injection as an impact-limiting problem, not only a string-filtering problem.
- OWASP MCP Tool Poisoning for the risk of hidden or malicious instructions in MCP tool metadata and runtime context.
- OWASP Agentic AI Threats and Mitigations for agent threat models around autonomy, tools, delegation, and retrieved context.
- MCP Security Best Practices for scoped access, token-safety, confused-deputy prevention, and auditability.
- NIST AI RMF Generative AI Profile and CISA AI Data Security guidance for AI data provenance, integrity, monitoring, and lifecycle controls.
## See also
- Secure Context Trust Pack for registered context roots and hashes.
- Secure Context Firewall for runtime retrieval decisions.
- Context Egress Boundary for outbound data-boundary decisions after retrieval.
- Agentic Red-Team Drill Pack for adversarial examples that should stay labeled as test payloads.
- Agentic Threat Radar for source-backed prioritization.