Agentic Security Remediation
Agentic automation is most valuable in places where risk reduction is measurable, the fix shape is narrow, and the blast radius of a bad change is small enough that a tight guardrail can catch it. The workflows below fit that profile. Everything else, a security team runs manually — or hands to a human with a checklist — until the automation is demonstrably safer than a hurried engineer at 11 p.m.
How we decide what to automate
Before a workflow lands here, it has to satisfy four tests:
- Bounded scope. The agent can only touch files in a pre-declared allowlist (e.g. lockfiles, a specific YAML) — never arbitrary source.
- Reversible output. The agent’s output is always a PR, never a merge. A human reviewer remains the last line of defense.
- Measurable outcome. We can tell whether the fix actually moved risk, not just whether a PR landed.
- Clean failure mode. When the agent can’t fix something, it writes a structured triage note and stops — it does not guess.
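The four tests above can be expressed as a mechanical gate. A minimal sketch, assuming a hypothetical candidate record and allowlist (the field names and patterns here are illustrative, not a real schema):

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Optional

# Hypothetical description of a proposed automation run.
@dataclass
class Candidate:
    files_touched: List[str]    # paths the agent wants to edit
    output_kind: str            # "pr" or "merge"
    metric: Optional[str]       # how risk movement will be measured
    on_failure: str             # "triage_note" or anything else

# Pre-declared allowlist: lockfiles and specific config files, never arbitrary source.
ALLOWLIST = ["package-lock.json", "poetry.lock", ".github/workflows/*.yml"]

def passes_four_tests(c: Candidate) -> bool:
    bounded    = all(any(fnmatch(p, pat) for pat in ALLOWLIST) for p in c.files_touched)
    reversible = c.output_kind == "pr"           # output is always a PR, never a merge
    measurable = c.metric is not None            # a risk metric must exist up front
    clean_fail = c.on_failure == "triage_note"   # stop and report, don't guess
    return bounded and reversible and measurable and clean_fail
```

A workflow that wants to edit `src/app.py` fails the bounded-scope check immediately, before any agent runs.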
Active workflows
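To give a flavor of how narrow these fixes are: the base-image workflow reduces, at its core, to rewriting FROM lines against a map of patched tags. A minimal sketch, with a hypothetical patched-tag map standing in for a real advisory feed:

```python
import re

# Illustrative mapping from a vulnerable base-image tag to its patched successor.
PATCHED = {"debian:12.4": "debian:12.5"}

def bump_from_lines(dockerfile: str) -> str:
    """Rewrite FROM lines to patched tags, including multi-stage 'AS' builds."""
    def repl(m):
        image = m.group(2)
        return m.group(1) + PATCHED.get(image, image) + m.group(3)
    # Matches "FROM <image> [AS <stage>]" at the start of each line.
    return re.sub(r"(?m)^(FROM\s+)(\S+)(.*)$", repl, dockerfile)
```

Because the edit is this mechanical, the guardrail is equally mechanical: rebuild the image and re-run the scanner before a PR opens.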
- Bump FROM lines, refresh OS-package layers, and rebuild derived multi-stage images on OS-level CVEs.
- Harden XML parsing against external entity resolution — XXE and friends.

Program operations
The workflows above are how the program acts. The pages below are how it’s run — the measurement, review, rollout, gating, runtime, and compliance layers every program needs whether it has one workflow or ten.
Per-CVE recipes
When a finding is a named CVE — Log4Shell, xz-utils, Heartbleed, regreSSHion, the headline supply-chain story of the month — generic workflows are not always enough. The CVE has its own blast radius, its own quirks, and its own “remediation that looks right and isn’t.” See CVE Recipes for per-CVE prompts that an agent can run end-to-end without breaking the code around the fix.
On deck
Candidate workflows that teams expect to scope next sit in a shared backlog. The same four tests apply — bounded scope, reversible output, measurable outcome, clean failure mode — before any of them move to “Active.”
- More to come. As the orchestration spine matures, new workflows land where the cost/benefit math is clearly in agents' favor. If you have a candidate, see Contribute.
How orchestration fits together
Every agentic remediation workflow here shares one orchestration spine:
- Intake — a finding lands in the risk system (CVE feed, DLP scanner, SAST, manual report).
- Dispatch — the orchestrator decides whether the finding is eligible for an agent (scope, blast radius, cost caps).
- Run — an agent attempts the remediation inside a sandbox with a strict tool allowlist.
- Verify — tests + guardrail checks run; if anything fails, the agent stops and writes a triage note.
- Review — a human reviewer (and the owning team) approve before merge.
The orchestrator is intentionally boring — a queue, a dispatcher, and a reviewer loop. What changes over time are the three inputs the orchestrator feeds into each step: the prompt (as we learn what works), the model (as better models ship), and the tools / MCP connectors (as we connect new sources of context). See any of the per-agent pages under Agents for a worked example of that separation of concerns.
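That separation of concerns — a boring spine, three swappable inputs — can be made concrete as an immutable config object. The field names and values below are illustrative:

```python
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class AgentConfig:
    prompt_version: str       # evolves as we learn what works
    model: str                # swapped as better models ship
    tools: Tuple[str, ...]    # MCP connectors / tool allowlist

# Upgrading the model is a new config object; the orchestrator loop is untouched.
v1 = AgentConfig("dep-bump/3", "model-a", ("git", "package-registry"))
v2 = replace(v1, model="model-b")
```

Keeping the config frozen means a run can be reproduced later from the exact prompt/model/tools triple that produced it.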
What engineers will see
- PRs tagged with an auto-remediation label (`sec-auto-remediation` is the illustrative example used throughout this site — rename to your org’s convention) with a reviewer from the security team on the review line.
- Triage tickets when the agent stops — these are not asks for engineers to debug the agent; they’re asks for a human fix.
- A changelog on each workflow page below, so readers can see when its behaviour changed and why.
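For a sense of what a triage ticket carries, here is an illustrative shape for the structured note the agent leaves when it stops — the field names are this sketch’s, not a fixed schema:

```python
import json

def triage_note(finding_id: str, reason: str, owner: str) -> str:
    """Serialize the note an agent writes when it stops instead of guessing."""
    return json.dumps({
        "finding": finding_id,
        "status": "needs-human",   # the agent stopped; it did not attempt a fix
        "reason": reason,
        "suggested_owner": owner,
    }, indent=2)
```

The point of the structure is routing: a ticket with a `suggested_owner` lands with the right team instead of back on the security queue.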
What this section is not
- A mandate to run these workflows in your own repos. The Prompt Library is where you’d pick up recipes to run yourself.
- A promise that automation will catch everything. Every workflow lists what it won’t catch — read those sections before leaning on it.
See also
- Prompt Library — the recipes security and engineering teams share
- Agents — per-tool orchestration recipes
- Contribute — suggest a new workflow