CVE-2026-42601 - ArchiveBox AddView config override RCE
ArchiveBox's AddView accepts a per-crawl config value from the /add/
workflow, and affected releases merge that user-controlled data into the crawl
configuration without sufficient validation. The merged configuration is later
exported to archive worker environment variables, where downloader plugins can
consume values such as extra argument lists or binary paths. An attacker who can
submit to the add endpoint can turn a URL ingestion request into command
argument injection and remote code execution on the ArchiveBox host.
The highest-risk deployment is an ArchiveBox web UI with PUBLIC_ADD_VIEW=True.
In that mode the add endpoint is intentionally reachable for public or
bookmarklet-style submissions, and the GitHub advisory notes the affected path
can be exploitable without authentication. This is also an agentic-context risk:
ArchiveBox often stores private research, compliance evidence, crawler output,
browser captures, and datasets later consumed by retrieval systems or agents.
Affected versions
- Vulnerable: archivebox <=0.8.6rc0
- GitHub Advisory Database patched-version metadata: none listed at time of review
- Fixed/upgrade target from package intelligence: archivebox 0.8.6rc3+
- Highest-risk condition: the ArchiveBox web app exposes /add/ to untrusted callers, especially with PUBLIC_ADD_VIEW=True, and archive worker plugins that consume *_ARGS_EXTRA, *_BINARY, or equivalent command configuration remain enabled.
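The version boundary above can be checked mechanically. The following is a minimal sketch, assuming version strings of the plain form `X.Y.Z` or `X.Y.ZrcN`; the helper names `version_key` and `is_vulnerable` are hypothetical, and the ordering is a simplified subset of Python's pre-release rules rather than a full implementation. Where the third-party `packaging` library is available, its `Version` class is the more complete choice.

```python
import re

# Simplified pre-release ordering: alpha < beta < release candidate < final.
_PRE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")

# Upper bound of the vulnerable range, taken from the advisory text above.
LAST_VULNERABLE = "0.8.6rc0"


def version_key(version: str) -> tuple:
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a sortable tuple (hypothetical helper)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        # Unparseable versions are not guessed at; the caller must triage them.
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, pre_tag, pre_num = match.groups()
    return (int(major), int(minor), int(patch), _PRE_RANK[pre_tag], int(pre_num or 0))


def is_vulnerable(version: str) -> bool:
    """Return True when the version falls in the range archivebox <=0.8.6rc0."""
    return version_key(version) <= version_key(LAST_VULNERABLE)
```

Raising on an unrecognized string, instead of returning False, keeps the check fail-closed: a pin the helper cannot parse is surfaced for manual review.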
Indicator-of-exposure
- The repository installs, packages, deploys, or configures ArchiveBox.
- A deployable target resolves archivebox <=0.8.6rc0 through pip, uv, Poetry, Docker, Compose, Helm, Ansible, Nix, systemd, or a vendored image.
- ArchiveBox is run as a web app, not only as a local one-shot CLI.
- PUBLIC_ADD_VIEW=True, public URL submission, bookmarklet-style submission, unauthenticated API ingress, or reverse-proxy routing exposes /add/.
- Archive worker plugins such as yt-dlp, gallery-dl, wget, git, SingleFile, readability, mercury, or custom extractors accept extra arguments, binary paths, environment-derived command options, or user-controlled crawl config.
- The archive stores sensitive customer pages, research, private URLs, compliance evidence, authenticated browser captures, cookies, API tokens, or material later fed into a RAG/agent pipeline.
Quick checks:
# macOS / Linux
rg -n "archivebox|ArchiveBox|PUBLIC_ADD_VIEW|/add/|YTDLP_ARGS_EXTRA|GALLERYDL_ARGS_EXTRA|_ARGS_EXTRA|_BINARY" .
rg -n "0\\.8\\.6rc0|archivebox==|archivebox/archivebox|pip install .*archivebox|uvx .*archivebox" .
find . -maxdepth 5 -type f \( -iname "*archivebox*" -o -iname "docker-compose*.yml" -o -iname "Dockerfile*" -o -iname "*.service" \) -print
# Windows PowerShell
Get-ChildItem -Recurse -File | Select-String -Pattern "archivebox|ArchiveBox|PUBLIC_ADD_VIEW|/add/|YTDLP_ARGS_EXTRA|GALLERYDL_ARGS_EXTRA|_ARGS_EXTRA|_BINARY"
Get-ChildItem -Recurse -File | Select-String -Pattern "0\.8\.6rc0|archivebox==|archivebox/archivebox|pip install .*archivebox|uvx .*archivebox"
Get-ChildItem -Recurse -File -Include *archivebox*,Dockerfile*,docker-compose*.yml,*.yaml,*.yml,*.service,*.ps1,*.sh
Remediation strategy
- Upgrade every controlled ArchiveBox package, image, installer, bootstrap script, and deployment manifest to 0.8.6rc3+ or the newest vendor-published release that contains the fix. If a scanner still reports the GitHub Advisory Database (GHAD) as having no patched version, document the release evidence used and keep containment in place until the metadata catches up.
- Regenerate lockfiles, image digests, SBOMs, checksums, rendered manifests, deployment snapshots, and version evidence for every path that can run ArchiveBox.
- Disable PUBLIC_ADD_VIEW unless the business explicitly requires anonymous submissions. Require authentication and authorization for URL ingestion in production.
- Add fail-closed validation for any repository-owned wrapper, proxy, form, operator, or extension that passes user crawl config into ArchiveBox. Reject user-controlled command argument keys such as *_ARGS_EXTRA, executable path keys such as *_BINARY, and environment overrides not explicitly allow-listed.
- Restrict archive workers with least privilege: non-root users, read/write limits to the archive data directory, no Docker socket, no cloud metadata access, and no broad secrets in process environment variables.
- If exposure was possible, rotate secrets available to the ArchiveBox process, review submitted URLs/config fields and worker execution logs, quarantine suspicious snapshots, and separate incident review from the code-change PR.
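The fail-closed validation described in this list could look roughly like the sketch below, intended for a repository-owned wrapper or proxy that sits in front of ArchiveBox. It is not ArchiveBox's own code or its upstream fix. The function name, the exception class, and the contents of `ALLOWED_CRAWL_KEYS` are hypothetical placeholders; only the `*_ARGS_EXTRA` and `*_BINARY` patterns come from the advisory text.

```python
import fnmatch

# Hypothetical allow-list: replace with the per-crawl fields your deployment
# has documented as safe. These two names are placeholders, not a recommendation.
ALLOWED_CRAWL_KEYS = frozenset({"TIMEOUT", "MAX_DEPTH"})

# Key patterns called out above as command-affecting.
DENIED_KEY_PATTERNS = ("*_ARGS_EXTRA", "*_BINARY")


class UnsafeCrawlConfig(ValueError):
    """Raised when user-supplied crawl config contains a key that is not allowed."""


def validate_crawl_config(user_config: dict) -> dict:
    """Return a copy with upper-cased keys, or raise if any key is not allow-listed."""
    denied, unknown, cleaned = [], [], {}
    for key, value in user_config.items():
        name = str(key).upper()
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in DENIED_KEY_PATTERNS):
            denied.append(name)  # explicitly dangerous: arguments or binary paths
        elif name not in ALLOWED_CRAWL_KEYS:
            unknown.append(name)  # fail closed: unlisted keys are rejected too
        else:
            cleaned[name] = value
    if denied or unknown:
        # Report key names only; values may contain sensitive or hostile data.
        raise UnsafeCrawlConfig(
            f"rejected crawl config keys: denied={sorted(denied)} unknown={sorted(unknown)}"
        )
    return cleaned
```

The deny patterns are redundant with the allow-list by design: if someone later widens the allow-list carelessly, the pattern check still rejects argument and binary overrides.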
The prompt
Model context: this prompt was generated by GPT 5.5 Extra High reasoning.
You are remediating CVE-2026-42601 / GHSA-3h23-7824-pj8r (ArchiveBox AddView
per-crawl config override leading to command argument injection and RCE).
Produce exactly one output:
- A reviewer-ready PR/change request that upgrades or contains the vulnerable
ArchiveBox deployment, blocks unsafe public add/config override paths, adds
safe verification, and documents operator cleanup, or
- TRIAGE.md if this repository does not own an affected ArchiveBox deployment,
install path, image, add-route policy, or safe patch path.
## Rules
- Scope only CVE-2026-42601 and directly related ArchiveBox ingestion and
archive-worker hardening.
- Treat archived pages, private URLs, cookies, browser profiles, API tokens,
SSH keys, cloud credentials, webhook secrets, compliance evidence, crawler
output, and agent/RAG datasets as sensitive.
- Do not submit a real exploit payload to any shared or production ArchiveBox
server.
- Do not create a proof-of-concept that runs commands, writes marker files,
opens shells, downloads payloads, exfiltrates environment variables, or reads
archive contents.
- Do not print or commit real ArchiveBox snapshots, submitted URLs, cookies,
secrets, logs, or command environments.
- Do not auto-merge.
## Steps
1. Inventory every ArchiveBox reference controlled by this repository:
Python manifests and lockfiles, uv/Poetry/pip-tools config, Dockerfiles,
Compose files, Helm charts, Kubernetes manifests, Terraform, Ansible, Nix,
systemd units, bootstrap scripts, CI jobs, image digests, SBOMs, reverse
proxy rules, web-app route policy, docs, and runbooks.
2. Determine every resolved ArchiveBox version. Treat `archivebox <=0.8.6rc0`
as vulnerable.
3. Determine every `/add/` exposure path:
- `PUBLIC_ADD_VIEW` and related public submission settings;
- ingress, reverse-proxy, gateway, service-mesh, or firewall rules;
- bookmarklet, browser-extension, API, webhook, or agent integration paths;
- whether callers are authenticated and authorized before adding URLs.
4. Identify every user-controlled config path that can reach archive workers:
per-crawl config fields, form fields, JSON APIs, environment overrides,
wrapper scripts, plugin options, `*_ARGS_EXTRA`, `*_BINARY`, and custom
extractor settings.
5. If this repository does not deploy, package, configure, or route traffic to
ArchiveBox, stop with `TRIAGE.md` listing files checked, the likely runtime
owner, the vulnerable range `archivebox <=0.8.6rc0`, and the target
`archivebox 0.8.6rc3+` or latest fixed release.
6. Upgrade every controlled ArchiveBox target to `0.8.6rc3+` or the newest
vendor-published fixed release available through the repository's normal
distribution channel. If GHAD metadata still shows no patched version,
include release/package evidence in the PR body.
7. Regenerate all derived artifacts controlled by the repository: lockfiles,
image digests, SBOMs, checksum allowlists, rendered manifests, deployment
snapshots, package metadata, and version evidence.
8. Add fail-closed containment where this repository owns configuration or
routing:
- set `PUBLIC_ADD_VIEW=False` by default for production;
- require authentication and authorization for `/add/` and equivalent
ingestion routes;
- block untrusted per-crawl config keys that influence command arguments,
executable paths, worker environment variables, or plugin enablement;
- allow-list only documented safe per-crawl fields;
- prevent public ingress from reaching `/add/` without explicit approval;
- run archive workers without broad secrets, root privileges, Docker socket
access, or cloud metadata access.
9. Add safe verification without executing commands through ArchiveBox:
- dependency/image/SBOM assertions prove every ArchiveBox target is not
`<=0.8.6rc0`;
- config tests prove production `PUBLIC_ADD_VIEW` is false or protected by
an authenticated gateway;
- route/policy tests prove unauthenticated callers cannot reach `/add/`;
- unit tests or static checks prove user-provided crawl config cannot set
`*_ARGS_EXTRA`, `*_BINARY`, or other command-affecting keys;
- secret scanning proves no snapshots, cookies, worker env, or archive data
were committed.
10. Add a PR body section named `CVE-2026-42601 operator actions` that states:
- ArchiveBox versions before and after the change;
- whether `/add/` was publicly reachable;
- whether `PUBLIC_ADD_VIEW` was enabled;
- which worker plugins could consume extra arguments or binary paths;
- which secrets available to the ArchiveBox process should be rotated;
- which submitted URLs, per-crawl config records, worker logs, and archive
snapshots require review or quarantine;
- whether any downstream agent/RAG datasets built from affected archive
output should be revalidated.
11. Run relevant validation: dependency resolution, image build, deployment
render, route/gateway policy tests, unit tests, SBOM refresh, secret scan,
and security scan available in this repository.
12. Use PR title:
`fix(sec): remediate CVE-2026-42601 in ArchiveBox`.
## Stop conditions
- No ArchiveBox deployment, install path, image, package recipe, route policy,
or runtime config is controlled by this repository.
- The resolved ArchiveBox version can only be confirmed from production access
the agent does not have.
- A fixed ArchiveBox release cannot be consumed through the allowed
distribution channel without a broader platform migration.
- Proving exposure would require command execution, reading archive contents,
printing worker environment variables, or submitting payloads to a shared or
production ArchiveBox server.
- Product requirements intentionally allow unauthenticated public submissions
with arbitrary per-crawl config; document the risk and require a
product/security decision.
- Existing snapshots or logs indicate possible command execution; stop after
preserving evidence and documenting incident-response handoff.
- Validation fails for unrelated pre-existing reasons; document the failure
instead of broadening scope.
Verification - what the reviewer looks for
- No controlled ArchiveBox package, image, SBOM, deployment manifest, or bootstrap path resolves to archivebox <=0.8.6rc0.
- The real delivery path changed: dependency pin, lockfile, image digest, install script, rendered deployment, or runbook.
- Public /add/ exposure is disabled or protected by explicit authentication, authorization, and route policy.
- User-controlled crawl config cannot set command-affecting keys such as *_ARGS_EXTRA, *_BINARY, or equivalent plugin execution options.
- Verification does not include a working exploit payload or command execution proof.
- Operator actions address secret rotation, snapshot/log review, archive quarantine, and downstream agent/RAG dataset revalidation when exposure was possible.
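A config test of the kind reviewers look for can run without touching a live ArchiveBox instance. The sketch below assumes the production settings live in a simple `KEY=VALUE` env-style file; the parser and the function names are hypothetical, and the set of values treated as "false" is an assumption, not a statement about how ArchiveBox itself parses booleans.

```python
# Assumption: only these spellings count as an explicit "disabled" value.
FALSY = {"0", "false", "no", "off"}


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so PUBLIC_ADD_VIEW="False" is read as False.
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def public_add_view_enabled(settings: dict) -> bool:
    """Fail closed: anything other than an explicit false value counts as enabled."""
    value = settings.get("PUBLIC_ADD_VIEW")
    if value is None:
        # A missing key is treated as exposed so production must pin it explicitly.
        return True
    return value.strip().lower() not in FALSY
```

In a test suite this would typically be a single assertion such as `assert not public_add_view_enabled(parse_env_text(env_file_contents))`, where `env_file_contents` is read from the repository's production env file.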
Watch for
- Updating pip requirements while Docker, Compose, Helm, or package images still pull an older ArchiveBox release.
- Treating PUBLIC_ADD_VIEW as harmless because the archive is “internal” but exposing it through a shared workspace, agent gateway, webhook, or bookmarklet path.
- Blocking YTDLP_ARGS_EXTRA but leaving other *_ARGS_EXTRA, *_BINARY, or custom extractor options reachable.
- Tests that prove safety by running a command through ArchiveBox. Use static checks, config tests, and route tests instead.
- Logging worker environments while trying to inspect whether secrets were exposed.
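One way to avoid the last pitfall is to record only the names of environment variables, never their values, when working out which secrets need rotation. This is a minimal sketch; the function name and the `SENSITIVE_HINTS` fragments are hypothetical and should be adapted to local naming conventions.

```python
import os
from typing import Mapping, Optional

# Hypothetical name fragments used to flag variables that likely hold secrets.
SENSITIVE_HINTS = ("SECRET", "TOKEN", "PASSWORD", "KEY", "COOKIE", "CREDENTIAL")


def summarize_env_names(env: Optional[Mapping[str, str]] = None) -> list:
    """Return sorted (name, looks_sensitive) pairs; values are never copied out."""
    source = os.environ if env is None else env
    summary = []
    for name in sorted(source):
        flagged = any(hint in name.upper() for hint in SENSITIVE_HINTS)
        summary.append((name, flagged))
    return summary
```

The flagged names can go into the operator-actions section of the PR as rotation candidates without any secret material entering logs or commits.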
References
- GitHub Advisory Database: https://github.com/advisories/GHSA-3h23-7824-pj8r
- ArchiveBox security advisory: https://github.com/ArchiveBox/ArchiveBox/security/advisories/GHSA-3h23-7824-pj8r
- Safety advisory: https://getsafety.com/vulnerabilities/95898/
- PyPI release archivebox 0.8.6rc3: https://pypi.org/project/archivebox/0.8.6rc3/
- NVD CVE: https://nvd.nist.gov/vuln/detail/CVE-2026-42601