Agentic Readiness Scorecard

Why this page exists. Enterprise buyers do not need another static maturity label. They need a decision surface: which agentic remediation workflows can scale now, which remain pilot-only, which require manual approval, and which are blocked by missing evidence.

The product bet

SecurityRecipes is strongest when it is the secure context layer that makes agentic remediation easy to approve. The Workflow Control Plane declares what a workflow may do. The MCP Gateway Policy Pack turns that scope into runtime decisions. The Agentic Assurance Pack explains the control story. The readiness scorecard turns all of that evidence into an adoption decision.

That matters for an enterprise or acquirer because agentic AI programs are moving from pilots to platform rollout. The hard question is no longer “can an agent fix this?” It is “which agentic workflows can we scale without inventing new governance every time?”

What was added

The readiness layer lives in source-controlled and generated artifacts:

  • data/assurance/agentic-readiness-model.json - the scoring model, weights, scale gates, blockers, and industry references.
  • scripts/generate_agentic_readiness_scorecard.py - a deterministic generator with --check mode for CI drift detection.
  • data/evidence/agentic-readiness-scorecard.json - the generated scale, pilot, gate, or block decision artifact.
  • recipes_agentic_readiness_scorecard - the MCP tool that exposes readiness decisions to agents, AI platform portals, and internal control dashboards.

Run it locally from the repo root:

python3 scripts/generate_agentic_readiness_scorecard.py
python3 scripts/generate_agentic_readiness_scorecard.py --check
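Conceptually, the --check gate regenerates the scorecard in memory and compares it with the checked-in artifact, failing CI on any drift. The function below is an illustrative sketch of that comparison, not the generator's actual code; the function name and path handling are assumptions.

```python
import json
import sys
from pathlib import Path


def check_scorecard(generated: dict, path: Path) -> int:
    """Return 0 if the checked-in scorecard matches `generated`, 1 if stale."""
    current = json.loads(path.read_text())
    if current != generated:
        # Drift means someone changed inputs without regenerating the artifact.
        print(f"stale scorecard: {path} drifts from generated output", file=sys.stderr)
        return 1
    return 0
```

In --check mode the exit code is the whole contract: CI treats a nonzero return as a failed build, forcing the artifact to be regenerated and committed alongside the change that invalidated it.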

What is inside the scorecard

The generated scorecard includes:

  • readiness_summary - Workflow counts, average score, decision counts, pilot connector dependencies, and failure count.
  • workflow_readiness - Per-workflow decision, score, dimension scores, blockers, connector status, identity count, drill count, and next actions.
  • score_dimensions - The weighted model used to score control plane, gateway policy, identity, connector trust, adversarial eval, evidence chain, and maturity.
  • decision_contract - The thresholds for scale_ready, pilot_guarded, manual_gate, and blocked.
  • scale_plan - 30- and 90-day operating recommendations for enterprise rollout.
  • source_artifacts - Hashes for every artifact used to produce the decision.

The generated pack sorts every workflow into one of four adoption lanes:

  • scale_ready - Ready for controlled enterprise expansion with standard change controls.
  • pilot_guarded - Approved for bounded use, but broad rollout waits on maturity, pilot connector promotion, or exit metrics.
  • manual_gate - A human program owner must approve before use.
  • blocked - Do not run until blockers are remediated.
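A minimal sketch of how the decision contract might resolve a workflow into one of those lanes, assuming illustrative thresholds; the real thresholds live in data/assurance/agentic-readiness-model.json, and the score cutoff here is an assumption.

```python
def readiness_decision(score: float, blockers: list[str],
                       needs_human_gate: bool) -> str:
    """Map a weighted readiness score to one of the four adoption lanes.

    Blockers always win, then the human gate, then the score threshold.
    """
    if blockers:
        return "blocked"
    if needs_human_gate:
        return "manual_gate"
    if score >= 90:  # assumed scale_ready threshold, not the real contract value
        return "scale_ready"
    return "pilot_guarded"
```

The ordering is the point: a high score never overrides an open blocker or a required human approval, which keeps the decision conservative by construction.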

Why this is industry aligned

The scorecard is mapped to current primary industry references.

How to use it

For an AI platform review, start with readiness_summary. It tells the platform which workflows can scale and which remain pilot guarded.

For a workflow owner, inspect workflow_readiness[*].next_actions. It turns a high-level score into the concrete promotion work: graduate pilot connectors, keep crawl-stage workflows inside a pilot cohort, or resolve blockers.
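As a sketch of that workflow-owner view, the helper below collects next_actions for every workflow that is not yet scale_ready. The field names mirror the scorecard sections described above, but treat them as assumptions until checked against the generated file.

```python
def promotion_work(scorecard: dict) -> dict[str, list[str]]:
    """Map each non-scale_ready workflow to its concrete promotion steps."""
    return {
        wf["workflow_id"]: wf.get("next_actions", [])
        for wf in scorecard.get("workflow_readiness", [])
        if wf.get("decision") != "scale_ready"
    }
```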

For a procurement or diligence review, attach the scorecard alongside the assurance pack, gateway policy, identity ledger, and red-team drill pack. The scorecard gives reviewers the adoption decision; the source artifact hashes show where the decision came from.
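The hash check a diligence reviewer would run over source_artifacts can be sketched as follows; the path and sha256 field names inside each entry are assumptions about the artifact's shape.

```python
import hashlib
from pathlib import Path


def verify_source_artifacts(scorecard: dict) -> list[str]:
    """Recompute SHA-256 for each listed artifact; return paths that no longer match."""
    mismatches = []
    for entry in scorecard.get("source_artifacts", []):
        digest = hashlib.sha256(Path(entry["path"]).read_bytes()).hexdigest()
        if digest != entry["sha256"]:
            mismatches.append(entry["path"])
    return mismatches
```

An empty return list means every input the decision was built from is still the input on disk, which is exactly the provenance claim the review needs.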

For MCP consumers, call:

recipes_agentic_readiness_scorecard(decision="scale_ready")
recipes_agentic_readiness_scorecard(workflow_id="vulnerable-dependency-remediation")
recipes_agentic_readiness_scorecard(minimum_score=95)

CI contract

The generator fails if:

  • The readiness model weights do not sum to 100.
  • Generated source hashes drift from the workflow manifest.
  • Gateway policy no longer defaults to deny.
  • Any generated evidence pack reports validation failures.
  • A checked-in scorecard is stale in --check mode.
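The first of those gates is simple enough to sketch directly. This is an illustrative version, not the generator's code; the function name is assumed.

```python
def validate_weights(dimensions: dict[str, float]) -> None:
    """Fail fast if the readiness model weights do not sum to 100."""
    total = sum(dimensions.values())
    if total != 100:
        raise ValueError(f"readiness weights sum to {total}, expected 100")
```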

That is the enterprise-ready bar: scale decisions cannot drift from the evidence that justifies them.

See also