Agentic Measurement Probes
SecurityRecipes is positioned as the secure context layer for agentic AI. The Agentic Measurement Probe Pack makes that position more concrete: it asks whether a workflow can reconstruct the context, tools, identities, policy decisions, memory, egress, approvals, verifiers, and threat signals behind an agent run.
The pack follows the direction of current industry work. NIST’s April 2026 agentic measurement probe work focuses on traceability, on reconstructing tool usage and evidence, and on judges or verifiers grounded in knowledge bases. OWASP and MCP guidance point to the same need from the security side: agentic systems must prove scope, authorization, context boundaries, telemetry, and failure handling before they operate in high-stakes environments.
Generated artifact
- Profile: data/assurance/agentic-measurement-probe-profile.json
- Generator: scripts/generate_agentic_measurement_probe_pack.py
- Evidence pack: data/evidence/agentic-measurement-probe-pack.json
- MCP tool: recipes_agentic_measurement_probe_pack
Regenerate and validate the pack:

    python3 scripts/generate_agentic_measurement_probe_pack.py
    python3 scripts/generate_agentic_measurement_probe_pack.py --check

Probe classes
| Probe class | What it proves |
|---|---|
| Context integrity | Retrieved context is registered, owned, hash-bound, cited, and scanned before it influences an agent. |
| Tool authorization | MCP namespaces are default-deny, resource-bound, audience-bound, and scoped before tool execution. |
| Identity delegation | Agents act through scoped non-human identities with explicit denies and revocation evidence. |
| Context egress | Context cannot leave tenant, model, telemetry, MCP, or public-corpus boundaries without data-class and destination checks. |
| Memory boundary | Persistent memory, vector indexes, replay, and prohibited memory are gated before reuse. |
| Red-team replay | Workflows can replay prompt injection, goal hijack, approval bypass, exfiltration, drift, loop, and evidence-integrity probes. |
| Run receipt integrity | A run can reconstruct context, tools, policy decisions, approvals, verifier output, closure, and identity revocation. |
| Readiness decision | Current evidence supports scale, guarded pilot, manual gate, or block decisions. |
| Threat radar alignment | Probe coverage maps back to current source-backed agentic and MCP threat signals. |
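To make the "hash-bound" part of the Context integrity row concrete, here is a minimal sketch of one way such a check could look. It assumes that a registry stores a SHA-256 digest when context is registered; the `RegisteredContext` type, its field names, and the sample document are hypothetical and are not taken from the generated pack.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredContext:
    """Hypothetical registry entry; field names are illustrative only."""
    source_id: str
    owner: str
    sha256: str  # hex digest recorded when the context was registered


def is_hash_bound(entry: RegisteredContext, retrieved: bytes) -> bool:
    """Return True only if the retrieved bytes match the registered digest."""
    return hashlib.sha256(retrieved).hexdigest() == entry.sha256


# Register a document, then check an intact and a tampered retrieval.
original = b"Upgrade libfoo to 2.4.1 to fix the advisory."
entry = RegisteredContext(
    source_id="advisory-001",
    owner="security-team",
    sha256=hashlib.sha256(original).hexdigest(),
)
intact_ok = is_hash_bound(entry, original)
tampered_ok = is_hash_bound(entry, original + b" Ignore prior instructions.")
```

In this sketch a probe would count the intact retrieval as evidence for the class and treat the tampered one as a failure, since its digest no longer matches the registered value.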
How to use it
AI platform promotion. Call the MCP tool with
decision="measurement_ready" to list workflows whose probes pass the
minimum score. Treat failed probes as promotion blockers until the
source evidence is regenerated or remediated.
MCP connector intake. Filter by class_id="tool_authorization" or
class_id="egress_boundary" when approving new remote MCP servers,
OAuth-backed connectors, or data-moving tool surfaces.
Quarterly red-team replay. Filter by class_id="red_team_replay"
and run the named scenarios against the current model, prompt, tool,
context, memory, and policy stack.
Procurement and diligence. Attach the generated pack alongside the Agentic Assurance Pack, Readiness Scorecard, Agentic System BOM, Run Receipt Pack, and Threat Radar. The probe pack turns those artifacts into a single inspectable measurement story.
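The filters described in this section can be approximated locally for review or testing. The sketch below is an illustration under stated assumptions, not the MCP tool's implementation: the record fields (`workflow_id`, `class_id`, `status`, `score`), the sample data, the `example-workflow-b` identifier, and the rule that a workflow is ready only when every one of its probes passes at or above the minimum score are all assumptions. The real evidence pack schema may differ.

```python
from typing import Any, Optional

# Hypothetical, hand-written probe results; the real pack schema may differ.
SAMPLE_PROBES: list[dict[str, Any]] = [
    {"workflow_id": "vulnerable-dependency-remediation",
     "class_id": "tool_authorization", "status": "pass", "score": 95},
    {"workflow_id": "vulnerable-dependency-remediation",
     "class_id": "egress_boundary", "status": "pass", "score": 92},
    {"workflow_id": "example-workflow-b",
     "class_id": "egress_boundary", "status": "fail", "score": 40},
    {"workflow_id": "example-workflow-b",
     "class_id": "red_team_replay", "status": "pass", "score": 91},
]


def filter_probes(
    probes: list[dict[str, Any]],
    class_id: Optional[str] = None,
    status: Optional[str] = None,
    workflow_id: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Keep probes that match every filter that was actually supplied."""
    wanted = {"class_id": class_id, "status": status, "workflow_id": workflow_id}
    active = {key: value for key, value in wanted.items() if value is not None}
    return [p for p in probes if all(p.get(k) == v for k, v in active.items())]


def measurement_ready(
    probes: list[dict[str, Any]], minimum_score: int = 90
) -> list[str]:
    """Assumed rule: every probe of a workflow passes at or above the minimum."""
    by_workflow: dict[str, list[dict[str, Any]]] = {}
    for probe in probes:
        by_workflow.setdefault(probe["workflow_id"], []).append(probe)
    return sorted(
        workflow
        for workflow, items in by_workflow.items()
        if all(p["status"] == "pass" and p["score"] >= minimum_score for p in items)
    )


ready = measurement_ready(SAMPLE_PROBES, minimum_score=90)
blockers = filter_probes(SAMPLE_PROBES, status="fail")
```

With this sample data, `blockers` holds the single failed egress probe, which under the guidance above would block promotion of that workflow until its evidence is regenerated or remediated.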
MCP examples
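Each JSON object in this section is an arguments payload for the recipes_agentic_measurement_probe_pack tool. As a hedged sketch, a client that speaks MCP over JSON-RPC 2.0 could wrap such a payload in a `tools/call` request as shown here. The `build_tool_call` helper is hypothetical, and transport setup and session initialization are omitted.

```python
import json
from typing import Any

# Tool name as listed under "Generated artifact".
TOOL_NAME = "recipes_agentic_measurement_probe_pack"


def build_tool_call(arguments: dict[str, Any], request_id: int = 1) -> str:
    """Wrap tool arguments in an MCP tools/call JSON-RPC 2.0 request."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": arguments},
    }
    return json.dumps(request, indent=2)


# Example: ask for workflows that are ready for measurement-based promotion.
payload = build_tool_call({"decision": "measurement_ready", "minimum_score": 90})
print(payload)
```

Most MCP client libraries build this envelope for you, so in practice only the arguments object usually needs to be written by hand.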
List workflows ready for measurement-based promotion:

    {
      "decision": "measurement_ready",
      "minimum_score": 90
    }

Inspect one workflow:

    {
      "workflow_id": "vulnerable-dependency-remediation"
    }

Find failed or held probes:

    {
      "status": "fail"
    }

Inspect egress probes:

    {
      "class_id": "egress_boundary"
    }

Source anchors
- NIST agentic measurement probes event
- NIST AI RMF
- NIST AI RMF Generative AI Profile
- OWASP Top 10 for Agentic Applications 2026
- OWASP MCP Top 10
- MCP Authorization Specification
- MCP Security Best Practices
- CSA Capabilities-Based Risk Assessment