Issue #13 — June 2026 | AI Security Weekly

On June 8, 2026, CISA added CVE-2026-42271 — a command-injection flaw in BerriAI’s LiteLLM proxy — to the Known Exploited Vulnerabilities catalog with a federal remediation deadline of June 22.¹ Two days later, the same agency issued Binding Operational Directive 26-04, retiring CVSS-driven prioritization in favor of a risk-based model with 3-day / 14-day / 60-day remediation clocks — the first wholesale revision to the federal vulnerability-management posture since BOD 22-01.² In the same window, the npm and PyPI registries absorbed a coordinated worm wave — Miasma, Phantom Gyp, IronWorm, and a cross-over to PyPI via malicious wheels — with 448 total artifacts across the two registries and 73 Microsoft repositories disabled in a 105-second cascade on June 5.³ The AI Security Incident Catalog at AI REKT now reads 339 incidents across the Feb–Jun 2026 window, with a $292M KelpDAO / LayerZero entry at the top and a still-unfolding “Mini Shai-Hulud” cascade flagged for monitoring.⁴ Each of these is a routine event in its own track. Read together, they describe a single question that is now on the desk of every CISO, every underwriter, and every conformity auditor with an AI estate: what exactly is in your AI stack, and can you prove it?

The Software Bill of Materials — SBOM — was the answer the open-source community gave to the equivalent question for traditional software stacks. Executive Order 14028, NTIA minimum-elements guidance, and a decade of supply-chain hardening produced a substrate where a deployer could, in principle, enumerate every dependency, transitive dependency, and version pin shipping into production. That substrate was never built for the AI stack. Model weights, training datasets, fine-tuning corpora, inference runtimes, MCP connectors, agent frameworks, embedding stores, vector indexes, retrieval pipelines, and the proxy gateways that mediate all of them are dependencies the SBOM specification was not drafted to describe. The June 2026 worm wave is the first quantitative evidence that the cost of not having a stack-level inventory is now being paid in real artifacts — 411 npm packages, 37 PyPI wheels, 73 Microsoft repositories — not in hypothetical exposure. The ASI Market Index reads 37.7 for Week 24, holding against the W23 close — a composite-level reading that obscures considerable motion in every constituent track.

“The SBOM question for AI stacks is not whether we can borrow the traditional-software answer. It is whether we can finish describing the dependency surface before the next worm wave, the next KEV addition inside an AI gateway, or the next risk-based remediation directive forces us to do it under deadline. The June 2026 window made that timeline visible.”

— ASI Intelligence Team observation, W24 2026

This edition examines the SBOM question for AI stacks as it reaches the underwriting file, the June worm wave as the first quantitative stress test of the AI-stack supply chain, the LiteLLM KEV addition as the gateway-level analogue of the agent-runtime CVEs covered in Issue #12, CISA BOD 26-04 as the federal posture shift that re-prices every patch decision in the country, the regulatory track converging on the same August 2 effective date, the W24 Market Index reading and the editorial-integrity pause behind the catch-up cycle, and the five operational moves a high-risk deployer should be making before the SBOM question lands on the intake form.

Section 01

The SBOM Question Reaches the AI Stack

Traditional SBOM Was Never Drafted for the AI Stack

The Software Bill of Materials specification, in its NTIA minimum-elements form, describes seven baseline attributes: supplier, component name, version, unique identifier, dependency relationship, author, and timestamp. The schema was drafted against a software taxonomy where every component is a versioned package, every dependency is enumerable from a manifest, and every upstream supplier is traceable through a code-signing chain. The AI stack does not fit this shape. A model weight is not a versioned package in the same sense: a weight file with the same name and version label, produced by a different training run, is a different artifact, yet the NTIA fields have no slot for “the provenance of the training corpus this weight was fitted against,” and the machine-learning extensions in newer formats (model cards in CycloneDX 1.5 and later, the AI and Dataset profiles in SPDX 3.0) are optional fields that are only as complete as what the upstream supplier records. A fine-tuning checkpoint is a dependency, but the deployer’s upstream is rarely the entity that pinned that checkpoint to a hash. An MCP connector and the tools it exposes are dependencies in the operational sense, but the dependency graph the SBOM describes does not extend through the network boundary the agent crosses at run time.
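One way to picture the gap is an inventory record that keeps the seven NTIA attributes and adds the fields an AI artifact needs. The sketch below is illustrative only: the `kind`, `content_sha256`, and `training_data_refs` fields, the `ComponentKind` categories, and every example value are assumptions made for this issue, not part of the NTIA guidance, SPDX, or CycloneDX.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List, Optional


class ComponentKind(str, Enum):
    """Illustrative categories only; not an NTIA, SPDX, or CycloneDX vocabulary."""
    PACKAGE = "package"
    MODEL_WEIGHT = "model-weight"
    FINE_TUNING_CHECKPOINT = "fine-tuning-checkpoint"
    MCP_CONNECTOR = "mcp-connector"


@dataclass
class InventoryRecord:
    # The seven NTIA minimum-element attributes named in the text.
    supplier: str
    component_name: str
    version: str
    unique_identifier: str
    dependency_relationship: str
    author: str
    timestamp: str
    # Hypothetical AI-stack extensions (assumptions, not a standard).
    kind: ComponentKind = ComponentKind.PACKAGE
    content_sha256: Optional[str] = None
    training_data_refs: List[str] = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    """Hash the artifact bytes so two training runs never share an identity."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes stand in for a real weight file.
weights = b"placeholder weight bytes"
record = InventoryRecord(
    supplier="example-model-vendor",
    component_name="example-classifier",
    version="run-0042",
    unique_identifier="sha256:" + sha256_hex(weights),
    dependency_relationship="loaded-by example-inference-service",
    author="platform-team",
    timestamp="2026-06-19T00:00:00Z",
    kind=ComponentKind.MODEL_WEIGHT,
    content_sha256=sha256_hex(weights),
    training_data_refs=["dataset:example-corpus@2026-05"],
)
print(json.dumps(asdict(record), indent=2))
```

Keying identity to a content hash rather than a version label is the design choice that matters here: it is what lets the record distinguish two weight files that carry the same name.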

What the Underwriter and the Regulator Are Both Asking

The treaty market has begun, in Q2 2026, to ask cedents for AI-stack inventories as part of the supply-side disclosure for portfolios with significant agent exposure. The European Commission’s Article 50 transparency guidance, whose consultation closed June 3, requires high-risk deployers to maintain machine-readable provenance for AI-generated content and the technical documentation supporting it — a documentation requirement that, in practice, can only be answered if the deployer knows what is in the stack.⁵ The two requests are asking the same question from opposite ends of the same form. An underwriter wants to know what stack components are in your portfolio so the residual risk is priceable. A regulator wants to know so the conformity narrative is auditable. The deployer who can produce a single inventory artifact that satisfies both is the deployer whose intake form will not be the rate-limiter on the renewal cycle.

Section 02

The June 2026 Worm Wave as the First Quantitative Stress Test

A coordinated worm wave moved through the npm and PyPI registries across the first two weeks of June 2026, with documented sub-campaigns including a Red Hat-scoped npm compromise (Miasma), a binding.gyp-based execution trick (Phantom Gyp), a self-propagating package (IronWorm), a repository-level pivot via .mcp.json and IDE-configuration changes, and a cross-registry jump to PyPI via malicious wheels using .pth startup hooks.³ Public counts attribute 411 npm packages and 37 PyPI wheels to the wave (448 total artifacts) and report 73 Microsoft repositories disabled in a 105-second cascade on June 5. The wave is significant not because the absolute volume is unprecedented — the npm ecosystem has absorbed larger-volume events — but because the propagation surfaces moved deliberately through the configuration substrate the AI stack now depends on: .mcp.json for agent connector wiring, binding.gyp for native-extension builds, IDE configuration for the developer workstation that increasingly serves as the agent’s control plane.

448

Total artifacts attributed to the June 2026 worm wave across npm (411 packages) and PyPI (37 wheels): the first quantitative stress test of the AI-stack supply chain³

73

Microsoft repositories disabled in a 105-second cascade on June 5, 2026: the configuration-substrate propagation rate the worm wave demonstrated against a hardened operator³

The cross-registry hop is the structural finding. A self-propagating campaign that lands in npm via a binding.gyp trick and then jumps to PyPI via wheel-installation hooks has crossed the boundary the SBOM tooling assumed was a partition. Most enterprise SBOM workflows enumerate dependencies per registry and per language ecosystem. An attacker who reads the AI-stack dependency graph as a single surface — node modules, Python wheels, model files, vector-store binaries, MCP connector definitions — has the structural insight the defender’s tooling does not yet operationalize. The wave does not need to compromise a majority of the stack to be effective; it needs the SBOM to be inaccurate enough that the deployer’s detection logic misses the right component.
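The PyPI leg of the hop relies on a documented interpreter behavior: at startup, Python's `site` module executes any line in a `.pth` file that begins with `import` followed by a space or tab. A minimal sketch of a scanner for such lines follows. The function name and the demo directory are hypothetical, and a hit is a candidate for review rather than a verdict, since legitimate packages also ship executable `.pth` lines.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def executable_pth_lines(root: Path) -> List[Tuple[str, str]]:
    """Return (relative path, line) for every .pth line Python would execute."""
    findings = []
    for pth in sorted(root.rglob("*.pth")):
        for line in pth.read_text(errors="replace").splitlines():
            # site.py executes .pth lines that start with "import" followed
            # by a space or tab; other non-comment lines are path entries.
            if line.startswith(("import ", "import\t")):
                findings.append((str(pth.relative_to(root)), line))
    return findings


# Demo against a throwaway directory standing in for site-packages.
with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp)
    (site_packages / "plain.pth").write_text("/opt/example/lib\n")
    (site_packages / "hook.pth").write_text("import os; os.environ.get('HOME')\n")
    demo_findings = executable_pth_lines(site_packages)

print(demo_findings)
```

In practice the same scan would run against each environment's real site-packages directory, with the findings joined to the package inventory so that a startup hook can be traced to the wheel that installed it.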

Configuration-as-attack-surface is the second structural finding. The .mcp.json pivot reported in the campaign timeline is a configuration file whose contents the AI stack reads at agent-initialization time to determine which MCP servers to invoke and with what authorization. The traditional SBOM does not enumerate configuration files; the AI-stack SBOM has to. A deployer whose inventory captures the Python wheels and Node packages but not the .mcp.json manifests pinned in production is carrying the gap the worm wave demonstrated.
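Capturing those manifests can start small. The sketch below walks a checkout, hashes each `.mcp.json` file, and lists the servers it declares. It assumes the manifest keeps its server definitions under a top-level `mcpServers` object, a common convention that should be checked against the MCP client actually in use; the function name and demo contents are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def inventory_mcp_manifests(root: Path) -> List[Dict[str, object]]:
    """Hash every .mcp.json under root and list the servers it declares."""
    records = []
    for manifest in sorted(root.rglob(".mcp.json")):
        raw = manifest.read_bytes()
        try:
            parsed = json.loads(raw)
        except ValueError:
            parsed = None  # unparseable manifests are still hashed and listed
        # Assumption: server definitions live under a top-level "mcpServers" key.
        servers = parsed.get("mcpServers") if isinstance(parsed, dict) else None
        records.append({
            "path": manifest.relative_to(root).as_posix(),
            "sha256": hashlib.sha256(raw).hexdigest(),
            "servers": sorted(servers) if isinstance(servers, dict) else [],
        })
    return records


# Demo against a throwaway repository layout.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "service").mkdir()
    (repo / "service" / ".mcp.json").write_text(json.dumps(
        {"mcpServers": {"tickets": {"command": "example-mcp-server", "args": []}}}
    ))
    demo_inventory = inventory_mcp_manifests(repo)

print(demo_inventory)
```

Comparing the recorded hashes between two runs is what surfaces the repository-level pivot described above: a manifest whose hash changed without a reviewed commit is the event worth investigating.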

Section 03

The Gateway CVE Becomes the Federal Clock

CISA added CVE-2026-42271 to the Known Exploited Vulnerabilities catalog on June 8, 2026, with a federal remediation deadline of June 22.¹ The vulnerability is a command-injection flaw in BerriAI’s LiteLLM — one of the most widely deployed open-source proxies for routing inference traffic across model providers — specifically in the POST /mcp-rest/test/connection and POST /mcp-rest/test/tools/list endpoints, where request fields could spawn a subprocess. The fix is LiteLLM v1.83.7. Defensive mitigations published by Help Net Security include blocking the test endpoints at the proxy, restricting network access to the LiteLLM admin surface, and rotating any credentials the proxy stored on behalf of upstream providers.⁶

Operator Takeaway — Diagnostic Surfaces in AI Gateways Are Production Surfaces

The compromised LiteLLM endpoints were explicitly labelled “test” endpoints in the codebase — designed for connection-validation workflows. The exploitation pattern reinforces a doctrine point: any diagnostic or admin surface in an AI gateway is, in practice, a high-risk production surface. The gateway sits in the trust path between the deployer’s production environment and every model provider the gateway routes to; a successful exploit of the gateway is a successful supply-chain compromise of every upstream and downstream the proxy mediates. The defensive response is to treat the AI gateway as a network-segmented control plane with its own access policy — not as a developer-tier component that happened to land in production.
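The “block the test endpoints at the proxy” mitigation reduces to a path check placed in front of the gateway. A minimal sketch follows, assuming the reverse proxy hands the raw request target to a hypothetical `should_block` hook. It is not a LiteLLM feature, and it denies every method on the two named paths as a conservative assumption, although the advisory concerns POST requests.

```python
import posixpath
from urllib.parse import unquote

# The two endpoints named in the KEV entry.
BLOCKED_PATHS = frozenset({
    "/mcp-rest/test/connection",
    "/mcp-rest/test/tools/list",
})


def normalize_path(raw_target: str) -> str:
    """Strip query and fragment, decode percent-escapes, collapse slashes and dots."""
    path = raw_target.split("?", 1)[0].split("#", 1)[0]
    path = unquote(path)
    return posixpath.normpath("/" + path.lstrip("/"))


def should_block(raw_target: str) -> bool:
    """True when the request target resolves to one of the blocked endpoints."""
    return normalize_path(raw_target) in BLOCKED_PATHS


print(should_block("/mcp-rest/test/connection"))            # True
print(should_block("/mcp-rest//test/tools/list/?probe=1"))  # True
print(should_block("/some/other/path"))                     # False
```

Normalizing before matching is the point of the sketch: a denylist that compares raw strings can be sidestepped with a doubled slash or a percent-encoded character. The check is case-sensitive, so a deployment whose router matches paths case-insensitively would need to lower-case the target first.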

The KEV addition is the federal validation that the agent-gateway surface is the new vulnerability-management priority. CISA’s KEV catalog has historically been dominated by classical exploitation classes: CMS plugins, network-edge devices, enterprise productivity stacks. An AI-gateway entry in the catalog is a signal that the federal exploitation-evidence pipeline has begun reading the AI stack as a peer to those classical surfaces. The June 22 deadline is the second-order effect: every federal agency with an AI deployment that uses LiteLLM had a 14-day clock from KEV listing to remediation, which means the federal posture moved from awareness to operational mitigation faster than it did for the agent-runtime CVEs from W22–W23.¹
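The clock arithmetic is simple enough to automate against a KEV feed. A minimal sketch using the two dates reported above follows; the constant and function names are illustrative.

```python
from datetime import date

KEV_ADDED = date(2026, 6, 8)          # date CVE-2026-42271 entered the catalog
FEDERAL_DEADLINE = date(2026, 6, 22)  # remediation deadline set by CISA


def days_remaining(today: date, deadline: date = FEDERAL_DEADLINE) -> int:
    """Days left on the clock; negative values mean the deadline has passed."""
    return (deadline - today).days


window_days = (FEDERAL_DEADLINE - KEV_ADDED).days
print(window_days)                        # 14
print(days_remaining(date(2026, 6, 19)))  # 3
```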

The OpenClaw CVE is the parallel high-severity entry. Published June 12, 2026 with a CVSS base score of 9.8, CVE-2026-53838 affects OpenClaw versions prior to 2026.5.27 with a state-mutation flaw in node-pairing reconnection that can confuse approval-scope decisions and bypass approval restrictions.⁷ The functional impact — approval-scope bypass — is the same class of failure as the agent-runtime CVEs in Issue #12: the connector substrate that delegates authorization is the substrate being targeted. Together, the LiteLLM and OpenClaw entries describe a vulnerability pattern that is no longer about a single product line; it is about the layer that mediates trust between an AI agent and the systems it acts upon.
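Both advisories give a version threshold, so an inventory can be checked against them mechanically. A minimal sketch follows, assuming purely numeric dotted version strings; pre-release or local suffixes would raise an error and need a proper version parser. The `FIXED_IN` table and function names are illustrative.

```python
from typing import Dict, Tuple

# First unaffected versions as reported in the two advisories above.
FIXED_IN: Dict[str, Tuple[int, ...]] = {
    "litellm": (1, 83, 7),
    "openclaw": (2026, 5, 27),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.83.7' or 'v1.83.7' into (1, 83, 7). Numeric parts only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def needs_upgrade(component: str, installed: str) -> bool:
    """True when the installed version is older than the first fixed release."""
    return parse_version(installed) < FIXED_IN[component]


print(needs_upgrade("litellm", "1.83.6"))      # True
print(needs_upgrade("litellm", "v1.83.7"))     # False
print(needs_upgrade("openclaw", "2026.5.26"))  # True
```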

Section 04

CISA BOD 26-04 Re-Prices the Patch Decision Nationwide

On June 10, 2026, CISA issued Binding Operational Directive 26-04, replacing CVSS-driven prioritization with a risk-based model and superseding BOD 19-02 and BOD 22-01.² The new directive uses four operational inputs — asset exposure, KEV status, exploit automation, and technical impact — to assign remediation timelines into four tiers: 3 days for the highest-risk vulnerabilities, 14 days for the next tier, 60 days for the third, and “fix on system upgrade” for the lowest. The 3-day tier carries a forensic-triage expectation that goes beyond patching: agencies meeting that clock are expected to produce a brief artifact describing whether the affected asset showed signs of compromise during the exposure window.

The structural significance of BOD 26-04 is not the clocks — the federal vulnerability-management posture has had clocks since BOD 19-02. The structural significance is that CVSS, as a prioritization metric, has been formally retired at the federal level in favor of an exposure-and-exploitability composite. Every patch decision a federal agency makes is now re-priced against the new tier definitions. Private-sector vulnerability management programs that took CVSS as the canonical prioritization input are now operating against a federal posture that explicitly does not. The treaty market reads this as a structural input: the underwriting question “what is the deployer’s patch discipline?” can no longer be answered purely with a CVSS-tier policy. The deployer needs an exposure-aware policy that maps onto the new BOD 26-04 tiers, or the answer to the underwriting question is “our policy is now misaligned with the federal posture.”

3 / 14 / 60

Day clocks for the three accelerated tiers in CISA BOD 26-04, replacing CVSS-only prioritization with an exposure-and-exploitability composite — the federal posture pivot the regulatory track has now formalized²

339

Documented incidents in the AI REKT AI Security Incident Catalog across the Feb–Jun 2026 window, organized by attack type and including a supply-chain cascade view⁴

The directive is the regulatory track’s answer to the gateway CVE. A LiteLLM admin surface exposed to the internet on a federal asset is now, under BOD 26-04, a 3-day clock once the CVE has KEV status. The directive operationalizes the assumption that exploited gateway components are the same kind of emergency as exploited network-edge components. The regulatory track, the supply-chain track, and the vulnerability track all collapsed into the same operational tempo this week — not as a coincidence, but because the federal posture is now structured around exposure-and-exploitability, which is exactly the dimension along which the AI stack now creates the largest residual risk.
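For a private-sector program re-tiering its own policy, the four inputs can be expressed as a small decision function. The sketch below is hypothetical: only the first rule (internet-exposed and KEV-listed means a 3-day clock) comes from the example above. The 14-day and 60-day rules are placeholder assumptions, not the directive's decision logic, and should be replaced with the published tier definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """The four BOD 26-04 inputs, simplified to booleans for illustration."""
    internet_exposed: bool
    in_kev: bool
    exploit_automatable: bool
    high_technical_impact: bool


def remediation_days(finding: Finding) -> Optional[int]:
    """Return the clock in days, or None for 'fix on system upgrade'."""
    # Grounded in the text: exposed asset + KEV status puts it on the 3-day clock.
    if finding.internet_exposed and finding.in_kev:
        return 3
    # ASSUMPTION: KEV-listed, or exposed and automatable, lands in the 14-day tier.
    if finding.in_kev or (finding.internet_exposed and finding.exploit_automatable):
        return 14
    # ASSUMPTION: any single remaining risk signal lands in the 60-day tier.
    if (finding.internet_exposed or finding.exploit_automatable
            or finding.high_technical_impact):
        return 60
    return None


exposed_gateway = Finding(internet_exposed=True, in_kev=True,
                          exploit_automatable=True, high_technical_impact=True)
print(remediation_days(exposed_gateway))  # 3
```

Note that no CVSS score appears among the inputs, which is the shift the directive formalizes: a policy function with this shape cannot be fed from a severity feed alone, because it needs the asset inventory to answer the exposure question.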

Section 05

Article 50, the Convergence Point, and the August 2 Effective Date

The European Commission’s Article 50 transparency consultation closed June 3, 2026.⁵ Final operational language is scheduled to land before the August 2 high-risk effective date that triggers the obligations for in-scope deployers. The substantive transparency obligations — user notice when interacting with AI, labelling of synthetic content, machine-readable provenance for deepfakes — have been covered in earlier issues. What changes this week is the convergence: the same August 2 date now serves as the operational deadline for three separate hardening tracks running in parallel.

The regulatory track sets the labelling and disclosure obligations. Article 50’s requirements compound with the broader EU AI Act conformity narrative the high-risk track has been preparing against since W19. The convergence is that the labelling pipeline a deployer builds to satisfy Article 50 is, in technical terms, the same pipeline that satisfies machine-readable provenance for AI-generated artifacts — which is, in turn, the same pipeline that an SBOM workflow for AI stacks would query to determine what model produced what content at what time. The three workflows are not separate disciplines; they are different views of the same dependency graph.
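The shared dependency graph can be pictured as a ledger keyed by content hash, which both a labelling pipeline and an inventory query could read. The sketch below is an assumption-laden illustration: the `ProvenanceEntry` fields and the in-memory `ProvenanceLedger` are hypothetical and are not drawn from Article 50 guidance or from any provenance standard.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class ProvenanceEntry:
    content_sha256: str
    model_id: str
    model_sha256: str   # ties the content back to an inventoried weight file
    generated_at: str   # ISO 8601 timestamp, UTC


class ProvenanceLedger:
    """In-memory stand-in for whatever store a deployer actually uses."""

    def __init__(self) -> None:
        self._by_content: Dict[str, ProvenanceEntry] = {}

    def record(self, content: bytes, model_id: str, model_sha256: str,
               generated_at: Optional[datetime] = None) -> ProvenanceEntry:
        when = generated_at or datetime.now(timezone.utc)
        entry = ProvenanceEntry(
            content_sha256=hashlib.sha256(content).hexdigest(),
            model_id=model_id,
            model_sha256=model_sha256,
            generated_at=when.isoformat(),
        )
        self._by_content[entry.content_sha256] = entry
        return entry

    def lookup(self, content: bytes) -> Optional[ProvenanceEntry]:
        """Answer 'what model produced this content, and when?'"""
        return self._by_content.get(hashlib.sha256(content).hexdigest())


ledger = ProvenanceLedger()
ledger.record(b"generated summary text", "example-model", "0" * 64,
              datetime(2026, 6, 19, 12, 0, tzinfo=timezone.utc))
hit = ledger.lookup(b"generated summary text")
miss = ledger.lookup(b"text the deployment never produced")
print(hit, miss)
```

The `model_sha256` field is the join key: it is the same hash an AI-stack inventory would carry for the weight file, which is what makes the labelling record and the SBOM two views of one graph.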

The software supply-chain track sets the SBOM expectations. The August 2 effective date for the EU AI Act high-risk obligations falls inside the window during which the federal vulnerability-management posture (BOD 26-04) and the registry-ecosystem hardening (npm/PyPI worm-wave response) are still adjusting. A deployer with high-risk EU exposure and US federal supply-chain exposure is operating against three policy surfaces simultaneously — and the SBOM artifact is the one document each of those surfaces can be reconciled against.³

The talent / model-supply track sets the operator-level evidence requirements. The model provider’s security disclosure record — CVD program, KEV-eligible vulnerability submissions, model-card transparency — is the supply-side counterpart of the SBOM. A deployer building on top of a provider with a functioning disclosure substrate carries different evidence than a deployer whose providers do not publish a disclosure record. The August 2 date does not name this dimension explicitly, but the conformity narrative that satisfies the date will be read by an auditor who knows the difference.

Section 06

Market Index — W24 Reading and the Editorial-Integrity Pause

Market Index Reading — W24

37.7 for W24, against the W23 close of 37.7. A composite-level reading taken with every track moving in the same window. The vulnerability track absorbed the LiteLLM KEV addition (CVE-2026-42271) and the OpenClaw approval-scope bypass (CVE-2026-53838). The threat track absorbed the Google fraud advisory documenting AITM, Quishing, ClickFix, and calendar-phishing TTPs at scale. The regulatory track absorbed CISA BOD 26-04. The software supply-chain track absorbed the June worm wave (448 artifacts across npm + PyPI). The talent / model-supply track absorbed continued model-provider disclosure-program scaling. The research / publication track absorbed the AI REKT incident-catalog crossing 339 documented entries. Signal of the Week: practitioner / industry signal — the deterministic ranker selected the industry-context signal at score 0.7188, reflecting a week in which structural context outweighed any single headline event.

The ASI Market Index reads 37.7 for Week 24, against the W23 close of 37.7. The Signal of the Week is the practitioner / industry signal, selected by the deterministic ranker at score 0.7188 — a week-classification that reflects the absence of a single dominant headline and the presence of motion across every constituent track. The composite-level reading is the signature of a system whose subsystems are all moving by roughly equivalent magnitudes against the same week: not a quiet week, but the steady-state of a system actively absorbing pressure on every dimension simultaneously.

The per-signal readings for W24: VSS 55.3, TSS 48.1, AIRS 38.7 on the public signals; the regulatory track at 66.0, the software supply-chain track at 39.2, the talent / model-supply track at 47.6, and the research / publication track at 42.2 on the proprietary side. The vulnerability track held its reading on the strength of the LiteLLM KEV addition and the OpenClaw advisory.¹,⁷ The threat track absorbed the Google fraud and scams advisory documenting AITM, Quishing, ClickFix, and calendar-phishing TTPs at production scale.⁸ The regulatory track held at 66.0 on the issuance of CISA BOD 26-04 and the proximity of the August 2 EU AI Act date.²,⁵ The software supply-chain track absorbed the June worm wave.³ The research / publication track absorbed the AI REKT catalog crossing 339 incidents.⁴ The full index page carries the W24 per-signal audit.

A note on the editorial-integrity pause. This issue ships mid-cycle. The Issue #12 cadence would have placed Issue #13 in the Jun 15 publication slot. The ASI Intelligence Team held the issue back one cycle while we completed a methodology pass on the value-loss-per-incident input that feeds the AIRS scoring layer — an internal posture point: a publication that scores risk to a treaty market does not ship before its underlying numbers can be defended at the record level. The pause produced no change to W24’s composite reading, but it produced a cleaner audit substrate behind the per-incident impact figures the reader sees on the public surfaces. Weekly cadence resumes Monday, June 22, with Issue #14 against its scheduled horizon title.

Section 07

The Bottom Line — Five Moves Before the SBOM Question Lands

Watchlist — Operational Moves Before the Stack-Inventory Question Reaches the Intake Form

June 19, 2026

Inventory the AI-stack dependency graph as a single surface, not per-registry

The June worm wave demonstrated that an attacker reads npm packages, Python wheels, MCP connector definitions, and IDE configuration files as one dependency graph. The deployer’s SBOM workflow has to do the same. Enumerate model weights, fine-tuning checkpoints, embedding indexes, vector stores, MCP server manifests (.mcp.json), and proxy gateway versions alongside the classical SBOM fields. The inventory is the artifact the underwriter, the regulator, and the incident responder will all ask for — in some cases in the same week.³

Segment every AI gateway as a network-control plane, not a developer-tier service

The LiteLLM KEV addition (CVE-2026-42271) is a representative entry: the test endpoints were never intended to be production-exposed, but the gateway-as-a-developer-tool deployment pattern often leaves them reachable. A deployer who runs LiteLLM, OpenClaw, or any equivalent gateway should treat the admin surface as a network-segmented control plane with its own access policy, audit logging, and credential-rotation schedule. The remediation deadline was 14 days for federal agencies; the underwriting consequence is longer-lived.¹,⁷

Re-tier the internal vulnerability-management policy against BOD 26-04

CVSS-only prioritization is now misaligned with the federal posture. A private-sector vulnerability-management program that still reads CVSS as the canonical input is operating against an outdated reference standard. Map the deployer’s policy onto the new 3-day / 14-day / 60-day tier definitions, calibrated against the four BOD 26-04 inputs (asset exposure, KEV status, exploit automation, technical impact). The treaty market will read the alignment — or its absence — in the next renewal cycle.²

Reconcile the SBOM artifact against the Article 50 conformity narrative

The labelling pipeline a deployer builds to satisfy Article 50 transparency obligations is, in technical terms, the same pipeline that feeds machine-readable provenance for AI-generated artifacts, and the same pipeline that an SBOM workflow can query for model-to-content lineage. Build the artifacts once, against a unified schema, and reference them from both the conformity narrative and the underwriting file. The roughly six weeks between this issue and August 2 are build time, not planning time.⁵

Treat the AI-incident catalog as a market-evidence document

The AI REKT catalog crossing 339 documented incidents (Feb–Jun 2026) — with a $292M KelpDAO/LayerZero entry at the top and a still-unfolding “Mini Shai-Hulud” cascade flagged for monitoring — is now operating at the scale where a treaty market or a regulatory counterparty can ask: what does your deployment’s incident-history posture look like against this catalog? Maintain a quarterly review of the catalog against the deployer’s own incident classification, and ensure the alignment is documentable. Defensibility precedes display.⁴