Ruby unsafe deserialization — `Marshal.load` / `YAML.load`

Marshal.load and permissive YAML loaders are long-standing Ruby security traps. If untrusted bytes reach these APIs, attacker payloads can instantiate arbitrary classes and trigger dangerous code paths. This pattern appears in Rails jobs, cache/session layers, signed-cookie migrations, and background workers.

Pattern

Marshal.load(payload) where payload crosses trust boundaries (HTTP params, Redis, MQ, DB rows editable by users).
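YAML.load / Psych.load on untrusted YAML under Psych 3.x and earlier (the default through Ruby 3.0), where load instantiates tagged Ruby objects.
YAML.unsafe_load / Psych.unsafe_load under Psych 4+ (bundled with Ruby 3.1 and later), where plain load is safe by default and unsafe_load restores the old behavior.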
Indirect wrappers that decode serialized data before model/job processing.

Why it matters

Unsafe Ruby deserialization can escalate to remote code execution (RCE) via gadget chains in application or gem classes. Even without direct execution, attackers can tamper with object state to bypass authorization checks, poison jobs, or trigger SSRF or file operations.

Mitigation — safe loader with strict class policy

For YAML, switch to safe_load with explicit permitted classes:

parsed = YAML.safe_load(payload, permitted_classes: [Date, Time], aliases: false)
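
To make the allowlist behavior concrete, here is a minimal sketch of a wrapper around YAML.safe_load. The helper name `load_untrusted_yaml` and the constant `PERMITTED_YAML_CLASSES` are hypothetical, and re-raising as ArgumentError is an assumption made for illustration rather than something this playbook prescribes.

```ruby
require "yaml"
require "date"

# Hypothetical allowlist: keep it as small as the payload format allows.
PERMITTED_YAML_CLASSES = [Date, Time].freeze

# Hypothetical helper: parses untrusted YAML and fails closed on anything
# outside the allowlist, on aliases, and on malformed input.
def load_untrusted_yaml(payload)
  YAML.safe_load(payload, permitted_classes: PERMITTED_YAML_CLASSES, aliases: false)
rescue Psych::DisallowedClass, Psych::BadAlias, Psych::SyntaxError => e
  # Assumption: callers prefer a single error type for every rejection.
  raise ArgumentError, "rejected YAML payload: #{e.class}"
end
```

With this in place, a plain document such as `due: 2024-01-15` parses into a Hash containing a Date, while a document tagged `!ruby/object:OpenStruct` is rejected before any object of that class is built.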

For Marshal paths that cannot be removed immediately, verify payload provenance before decoding (for example, an HMAC checked against a server-side key) and reject anything unverified at ingress. Treat this as a temporary bridge, not a steady state.
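
One possible shape for such a bridge is sketched below: the payload is authenticated with an HMAC before any bytes reach Marshal.load. The module name `LegacyMarshalDecoder`, the `on_legacy_decode` telemetry hook, and the key handling are all hypothetical, and the sketch assumes Ruby 3.0 or later for `OpenSSL.secure_compare`. A valid signature proves only that the producer held the key; it does not make Marshal safe if that key leaks.

```ruby
require "openssl"

# Hypothetical legacy decoder, isolated so its usage is easy to find and remove.
module LegacyMarshalDecoder
  class UntrustedPayload < StandardError; end

  # Producer side: returns [payload_bytes, hex_signature].
  def self.sign(object, key)
    bytes = Marshal.dump(object)
    [bytes, OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, bytes)]
  end

  # Consumer side: verify the signature BEFORE deserializing anything.
  def self.load(bytes, signature, key, on_legacy_decode: nil)
    expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, bytes)
    unless signature.is_a?(String) && OpenSSL.secure_compare(expected, signature)
      raise UntrustedPayload, "signature mismatch; refusing to deserialize"
    end

    # Optional telemetry hook (assumption): count how often the legacy path runs.
    on_legacy_decode&.call
    # TODO(owner, removal date): delete once all producers emit JSON.
    Marshal.load(bytes)
  end
end
```

Passing a counter lambda as `on_legacy_decode` gives a cheap signal for when the legacy path has gone quiet and can be deleted.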

Uplift — move to JSON + explicit coercion

Preferred uplift:

Replace Marshal/YAML object payloads with JSON hashes/arrays.
Perform explicit coercion into value objects/DTOs.
Validate required fields and types before business logic.
Keep temporary legacy decode only where required, with telemetry and a removal date.
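
The first three steps above can be sketched as follows. `ReportJob`, `InvalidPayload`, and `decode_report_job` are hypothetical names, and the field set is invented for illustration. The sketch uses JSON.parse, which by default returns only hashes, arrays, strings, numbers, booleans, and nil, so every richer type has to be constructed explicitly.

```ruby
require "json"
require "date"

# Hypothetical value object for a job payload.
ReportJob = Struct.new(:account_id, :run_on, keyword_init: true)

class InvalidPayload < StandardError; end

# Decodes a JSON payload and coerces it into a ReportJob, rejecting
# anything whose shape or types do not match before business logic runs.
def decode_report_job(raw)
  data = JSON.parse(raw)
  raise InvalidPayload, "expected a JSON object" unless data.is_a?(Hash)

  account_id = data["account_id"]
  raise InvalidPayload, "account_id must be an Integer" unless account_id.is_a?(Integer)

  run_on = data["run_on"]
  raise InvalidPayload, "run_on must be a String" unless run_on.is_a?(String)

  # Explicit coercion: the Date is built here, never by the parser.
  ReportJob.new(account_id: account_id, run_on: Date.iso8601(run_on))
rescue JSON::ParserError, ArgumentError => e
  # Date.iso8601 signals bad input with an ArgumentError subclass.
  raise InvalidPayload, "malformed payload: #{e.class}"
end
```

Because the decoder names every field it accepts, unknown keys are ignored and wrong types are rejected, which is the property Marshal and permissive YAML cannot offer.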

Inputs

Call sites — every Marshal.load, YAML.load, Psych.load, and unsafe_load usage.
Data provenance — where each payload originates.
Compatibility needs — which historical payloads must continue to decode during migration.
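
A rough way to gather the call-site input is a text scan like the sketch below. `UNSAFE_CALL`, `Finding`, `scan_source`, and `scan_tree` are hypothetical names. The pattern covers the four APIs listed above plus their `_file` variants, which is an addition beyond this playbook's list. It is a coarse text match: it will miss aliased or metaprogrammed calls, so treat its output as a starting point for review.

```ruby
# Coarse pattern for the call sites listed above. The optional _file suffix
# is an assumption added here; YAML.safe_load is deliberately not matched.
UNSAFE_CALL = /\b(Marshal\.load|(?:YAML|Psych)\.(?:unsafe_)?load(?:_file)?)\b/

# One row of the inventory; trust is filled in later by a reviewer.
Finding = Struct.new(:path, :line, :api, :trust)

# Scans one file's text and returns a Finding per matched call.
def scan_source(path, text)
  text.each_line.with_index(1).flat_map do |line, number|
    next [] if line.lstrip.start_with?("#") # skip whole-line comments
    line.scan(UNSAFE_CALL).map do |(api)|
      Finding.new(path, number, api, "unknown")
    end
  end
end

# Walks a source tree; scrub guards against files with invalid byte sequences.
def scan_tree(root)
  Dir.glob(File.join(root, "**", "*.rb")).sort.flat_map do |path|
    scan_source(path, File.read(path).scrub)
  end
end
```

Every finding starts with trust set to "unknown", which matches the rule that unclassified sites are handled as untrusted until provenance is established.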

The prompt

You are remediating unsafe Ruby deserialization call sites.
Output a PR or a TRIAGE.md.

## Step 0 — Inventory

1. Search for `Marshal.load`, `YAML.load`, `Psych.load`, and
   `unsafe_load`.
2. Classify each by trust boundary: trusted-only internal,
   external/untrusted, or unknown.
3. Map legacy payload producers/consumers.

## Step 1 — Choose remediation per site

- **Untrusted or unknown:** uplift to JSON + explicit coercion.
- **Trusted-only temporary compatibility path:** mitigate with
  strict guards and bounded lifespan.

## Step 2 — Implement

For YAML sites:
- Replace with `YAML.safe_load` and minimal
  `permitted_classes` list.
- Disable aliases unless explicitly required.

For Marshal sites:
- Replace with JSON decode + schema/type validation.
- Remove `Marshal.load` from runtime paths handling external
  input.

For temporary compat paths:
- Isolate in a clearly named legacy decoder module.
- Add telemetry counters for legacy decode usage.
- Add TODO with owner and removal date.

## Step 3 — Tests

Add behavior-preservation tests:

- Valid legacy payloads decode to equivalent domain values.
- Untrusted crafted payloads are rejected.
- Unknown class tags and alias abuse fail closed.

## Step 4 — Open the PR

- Branch: `remediate/ruby-deser-<module-slug>`.
- Title: `[Security][Ruby] remove unsafe deserialize in <module>`.
- Body: inventory, trust classification, uplift/mitigation
  decisions, compatibility plan, test evidence.
- Label: `sec-auto-remediation`.

## Stop conditions

- Trust boundary cannot be determined.
- Required migration spans multiple services with no staged
  rollout plan.
- Critical path lacks tests and cannot be safely instrumented.

## Scope

- Do not ship unrelated refactors.
- Do not introduce broad `permitted_classes` catch-alls.
- Do not retain legacy decode paths without explicit expiry.

Watch for

Rails cookie/session migrations where old serializers are still enabled.
Background job payload formats shared across deploy waves.
aliases: true in YAML parsers, which reopens alias-based vectors such as expansion bombs when the parsed structure is later traversed or re-serialized.
Monkey patches in initializers that re-enable unsafe loading globally.
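
For the alias item above, a small regression check along these lines can confirm that aliases stay disabled. The `ALIASED_YAML` fixture and the `aliases_rejected?` helper are hypothetical. Rescuing Psych::BadAlias covers both older Psych releases and Psych 4, where the raised Psych::AliasesNotEnabled is a subclass of it.

```ruby
require "yaml"

# Hypothetical fixture: one anchor, one alias pointing back at it.
ALIASED_YAML = <<~YAML
  base: &base
    role: admin
  copy: *base
YAML

# Returns true when the loader refuses the document because of an alias.
def aliases_rejected?(payload)
  YAML.safe_load(payload, aliases: false)
  false
rescue Psych::BadAlias
  true
end
```

Running the same check in CI after initializers have loaded is one way to notice a monkey patch that quietly re-enables aliases.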

Output contract

PR or TRIAGE.md only; no multi-service payload migration unless explicitly authorized.
Inventory lists every Marshal.load, YAML.load, Psych.load, unsafe_load, serializer initializer, and background-job/session decoder in scope.
Each call site is classified as untrusted, trusted-only temporary compatibility, unknown, or framework-owned.
Each fix is labeled as JSON/coercion uplift, YAML.safe_load mitigation, legacy decoder isolation, or triage.
Tests prove valid legacy payloads still decode, unsafe classes/tags are rejected, and YAML aliases fail closed unless explicitly approved.

Verification

Before opening the PR or final triage note, verify that:

untrusted input cannot reach Marshal.load or unsafe YAML/Psych loaders;
permitted_classes lists are minimal and justified by local domain objects;
any legacy decoder has telemetry, owner, removal date, and rollback notes;
Rails cookie/session and background-job serializers are checked for deploy compatibility;
no production serialized payloads, secrets, or customer data are committed as fixtures.

Guardrails

Do not add broad Object, Symbol, or application namespace catch-alls to permitted classes.
Do not enable YAML aliases unless the owner explicitly accepts the risk.
Do not remove shared job/session formats without staged rollout planning.
Do not mix unrelated Rails initializer or serialization refactors into the same PR.

See also

Classic Vulnerable Defaults — workflow context.
PyYAML yaml.load — analogous unsafe YAML default in Python.