All Issues

AI Security Weekly

Issue #5 — April 2026

The Retrieval Attack Surface

Published April 13, 2026 12 min read 5 Sections

Top Market Developments

01

Langflow Exploited in 20 Hours — No Proof-of-Concept Required

CVE-2026-33017 — unauthenticated remote code execution in Langflow, the open-source framework for building RAG pipelines — was exploited within 20 hours of its disclosure advisory. Sysdig Threat Research Team observed this in the wild: no public proof-of-concept existed; attackers built working exploits directly from the advisory description alone. Six unique source IPs were identified in the initial exploitation wave, with observed post-exploitation activity including credential theft, database connection string harvesting, and supply chain compromise attempts.1 This represents the convergence of two acceleration curves: the speed at which RAG infrastructure is being deployed into production, and the speed at which adversaries weaponize disclosed vulnerabilities against it. The 20-hour window between advisory and exploit is not anomalous — it is the new baseline. Organizations deploying RAG pipeline orchestrators without runtime detection are operating within an exploitation window that closes before most security teams finish their first morning standup.

02

Five Documents. Ninety Percent Success. The RAG Poisoning Arithmetic.

PoisonedRAG, presented at USENIX Security 2025, demonstrated that just five carefully crafted documents inserted among millions achieve a 90% attack success rate in retrieval-augmented generation systems.2 CorruptRAG extended this finding further — a single poisoned text can be sufficient to compromise a RAG system.3 SQ Magazine's March 2026 data poisoning analysis confirms: RAG systems can be compromised with as few as 5–10 malicious documents, with multi-modal RAG poisoning achieving over 80% success rates under similar conditions.4 The arithmetic is categorical: the retrieval layer trusts what it finds. If what it finds has been poisoned, the generation layer will faithfully amplify the poison into authoritative-sounding output. Enterprise RAG deployments report increased misinformation rates of 20%+ under active poisoning conditions. The attack surface is not the model — it is the knowledge base. And the knowledge base, by design, is connected to the outside world.

03

Zero-Click RAG Exfiltration Goes Platform-Agnostic

The Morris II worm scenario and GeminiJack exploit crystallize a pattern that moved from theoretical to production in 2025–2026: malicious instructions embedded in enterprise data sources — emails, documents, calendar invites — are retrieved by RAG pipelines and executed by the LLM without any user interaction.5 EchoLeak (CVE-2025-32711, CVSS 9.3) demonstrated this attack class against Microsoft 365 Copilot. GeminiJack proved the pattern platform-agnostic against Google Gemini Enterprise. CVE-2026-28788 in Open WebUI showed that IDOR vulnerabilities allow attackers to overwrite RAG knowledge base files, with poisoned content served directly to the LLM and presented as authoritative to all users who query related topics.6 Indirect prompt injection now constitutes over 55% of observed attacks in 2026, with 20–30% higher success rates than direct injection.7 The implication for enterprise RAG is structural: every document your system retrieves is a potential instruction to your model. Traditional DLP catches only 12% of prompt-injection probes.8

04

OWASP Codifies the RAG Threat — Two New Top 10 Entries

The OWASP Top 10 for LLM Applications 2025 made two structural additions that formalize what the research community has been demonstrating for 18 months. LLM04 (Data and Model Poisoning) was expanded from a training-data focus to explicitly include RAG knowledge base poisoning at inference time.9 LLM08 (Vector and Embedding Weaknesses) was added as an entirely new entry — addressing the security risks in how vectors and embeddings are generated, stored, and retrieved in RAG systems.10 The message is unambiguous: RAG is no longer a supplementary enhancement to LLM deployments — it is a distinct attack surface that requires dedicated security controls, access management, and continuous monitoring. Cisco's State of AI Security 2026 report quantified the readiness gap directly: 83% of organizations plan to deploy agentic AI, but only 29% feel prepared to secure it. Only 34.7% have deployed dedicated prompt injection defenses.11

Vendor Spotlight

Protecto AI

Spotlight
Type RAG Pipeline Data Security
Headquarters San Francisco, CA
PII Detection Accuracy 99.9% (F1 benchmark)
Time to Production < 1 week
Compliance SOC 2 Type II, ISO 27001, HIPAA, GDPR
Scale 3,000+ enterprise clients

Most AI security vendors focus on the model layer — guardrails around what the LLM can say. Protecto AI focuses on the data layer — what the LLM can see. Built specifically for RAG pipelines, Protecto operates between enterprise data and every AI system, scanning structured tables, unstructured documents, and free-text fields to detect PII and PHI across 50+ entity types without manual configuration. The platform then masks sensitive data using consistent pseudonymization that preserves semantic relationships — meaning the LLM reasons correctly over masked data with no accuracy trade-off. In independent F1 benchmarks, Protecto outperforms AWS Comprehend and Microsoft Presidio across all entity types. The platform enforces role-based access control at inference time: sales agents cannot access support data, analysts see anonymized aggregates, supervisors unmask when authorized. Every scan, mask, and unmask event is logged for compliance and forensic visibility. On-premises or SaaS deployment, synchronous APIs for low-latency prompt filtering, asynchronous APIs for high-volume batch ingestion. The production timeline is measured in days, not quarters — under one week to deployment versus three to six months for generic tooling. For regulated industries building RAG on sensitive data, Protecto represents the infrastructure layer that makes compliant retrieval possible without rearchitecting the pipeline.12

Why It Matters

RAG pipelines are system-of-record adjacent — they retrieve from the same repositories that hold customer records, medical histories, financial data, and trade secrets. The retrieval layer is the compliance boundary. If sensitive data surfaces in an LLM response because the retrieval layer lacked access controls or masking, the resulting exposure is a data breach under HIPAA, GDPR, and CCPA regardless of whether the model was "instructed" to reveal it. Protecto's approach — enforce data protection at the retrieval layer, not the generation layer — maps directly to the controls that underwriters will increasingly require for cyber insurance eligibility.

The Retrieval Attack Surface

$3.33B

Projected RAG market in 2026, growing at 42.7% CAGR toward $81.5B by 203513

90%

RAG poisoning success rate with just 5 crafted documents among millions (PoisonedRAG, USENIX Security 2025)2

The retrieval attack surface is expanding at the intersection of two forces: explosive enterprise RAG adoption and a fundamental architectural vulnerability that no model-layer defense can fully address. RAG systems trust what they retrieve. The embedding layer converts documents into vector representations without semantic understanding of whether the content is legitimate or adversarial. The retrieval layer ranks results by similarity, not by provenance or integrity. And the generation layer synthesizes whatever the retrieval layer delivers into responses that carry the authority of the enterprise's own knowledge base. This trust chain — from ingestion to embedding to retrieval to generation — is the attack surface. Every stage presents a distinct exploitation opportunity: poisoned documents at ingestion, adversarial embeddings that manipulate similarity rankings, access control failures that expose sensitive content to unauthorized queries, and indirect prompt injections hidden in retrieved context that hijack the model's behavior. The research is converging on a structural conclusion: model-layer defenses alone are insufficient. SD-RAG (January 2026) proposes enforcing selective disclosure and sanitization at the retrieval layer itself, achieving a 58% improvement in privacy scores versus baselines.14 RevPRAG achieves 98% true positive rates for detecting poisoned responses through LLM activation analysis.15 The defense architecture for RAG must operate at every stage of the pipeline — not just at the prompt and output boundaries.

Platform Landscape

Protecto AI — RAG-native data security: PII/PHI masking across 50+ entity types, RBAC at inference time, accuracy-preserving pseudonymization
Lasso Security — Secure LLM gateway with Context-Based Access Control (CBAC), real-time prompt/output monitoring, 500+ new attack variants weekly
Vectara — Enterprise RAG-as-a-service: built-in hallucination detection (HHEM), fine-grained access controls, on-prem/VPC deployment
Weaviate — Open-source vector database with hybrid search, RBAC, multi-tenancy, and enterprise security features for regulated RAG deployments
Pinecone — Managed vector database with integrated AI inferencing, proprietary sparse-embedding model, and enterprise security hardening

Enterprise Buyer Signal

87% concern, 11% readiness

CISOs cite AI agent security as top concern — but only 11% of organizations have mature safeguards in place

Cisco State of AI Security 2026

73%

Of AI systems assessed in security audits showed prompt injection exposure — the #1 vulnerability class (OWASP/Cisco)11

$4.4B+

Global breach costs attributed to AI-related incidents in 2025, with prompt injection as the primary vector7

72–80%

Of enterprise RAG implementations currently failing to reach production, with security as the primary blocker16

The enterprise buyer signal for RAG security is bifurcating along a familiar axis: adoption velocity versus security readiness. RAG LLMs captured 38.41% revenue share of the enterprise LLM market in 2025 — the largest technology segment — with the fastest projected CAGR at 29.34%. Financial services leads enterprise RAG adoption by vertical.17 Yet 73% of enterprises cite data security as the primary barrier to AI adoption, and 35% have delayed AI rollouts specifically due to unresolved prompt injection risks. The readiness gap is compounding: enterprise copilots integrated with productivity tools show data-exfiltration vulnerabilities in 60% of real-world red-team tests. Multi-agent systems propagate attacks to 48% of co-running agents during a single prompt-injection incident. The insurance market is beginning to price this exposure. Wiley Rein predicts insurers will implement form exclusions or sublimits for AI-related losses as frequency data matures.18 Munich Re expects agentic AI to affect attack frequency more than severity in the near term.19 For underwriters, the RAG security posture of an insured — access controls on the knowledge base, data masking at the retrieval layer, provenance tracking for ingested documents, runtime monitoring for injection attempts — is becoming as material to risk assessment as patch management cadence was a decade ago.

New Vendor Watchlist

01

Lasso Security

Purpose-built for securing enterprise GenAI through a secure gateway architecture. Lasso's Context-Based Access Control evaluates not just who is querying, but the intent behind the query — if a financial analyst suddenly pulls HR data through a RAG pipeline, the system blocks retrieval at runtime. The platform integrates directly into CI/CD pipelines, running red-teaming simulations automatically on every application update. Static and multi-turn agentic attacks, 500+ new variants added weekly, and automatic guardrail recommendations from every finding. For enterprises treating RAG as production infrastructure, Lasso represents the gateway-layer security model.

02

Akto

Agentic AI Security platform for MCP servers, AI agents, and autonomous workflows. Akto maps every AI agent, MCP tool, and agent-to-system interaction across cloud infrastructure, providing complete visibility into what agents exist, what they can access, and what actions they take. Continuous automated red teaming and runtime guardrails without slowing development velocity. One of the few platforms built from the ground up around the agentic threat surface rather than retrofitted from traditional application security.

03

CalypsoAI

Model-agnostic enterprise AI security platform securing GenAI applications at inference time. Agentic red teaming simulates adversarial attacks across models, agents, and tools. Real-time defense against prompt injection and jailbreaks with full observability across LLM interactions. The model-agnostic architecture means CalypsoAI's protection extends across heterogeneous AI stacks — relevant for enterprises running multiple LLM providers behind a single RAG pipeline.

04

Sysdig

Runtime security that detected CVE-2026-33017 exploitation in Langflow within 20 hours of disclosure — before any public proof-of-concept existed. Sysdig's approach to AI pipeline security is behavioral: Falco rules detect exploitation patterns (credential theft, file system enumeration, C2 exfiltration) regardless of the specific CVE, because they monitor what processes do rather than matching vulnerability signatures. For organizations deploying RAG orchestrators like Langflow, LangChain, or Haystack, Sysdig represents runtime detection that works on day zero.

05

SD-RAG (Research-Stage)

Academic framework from January 2026 proposing enforced selective disclosure and sanitization during retrieval — not at the prompt or generation layer. By implementing fine-grained, policy-aware retrieval, SD-RAG achieves a 58% improvement in privacy scores versus baselines and demonstrates strong resilience to prompt injection targeting the generative model. While not yet a commercial product, this research represents the architectural direction the industry is moving: security enforcement at the retrieval layer itself, before data reaches the model.

Subscribe for Weekly Intelligence

Every Monday. The AI security developments that shape enterprise risk, insurance, and governance — curated by our intelligence team.

Subscribe Free

Sources

Sysdig, March 2026 — "CVE-2026-33017: How attackers compromised Langflow AI pipelines in 20 hours"

Vectra AI, 2026 — "Prompt Injection: Types, Real-World CVEs, and Enterprise Defenses" (PoisonedRAG, USENIX Security 2025)

arXiv, 2025 — CorruptRAG: Practical RAG Poisoning via Single Poisoned Text

SQ Magazine, March 2026 — "LLM Data Poisoning Statistics 2026: Critical Facts You Must Know"

Help Net Security, April 2026 — "GenAI Prompt Injection: Enterprise Data Risk" (Morris II worm, GeminiJack)

SentinelOne, April 2026 — CVE-2026-28788: Open WebUI Privilege Escalation / RAG Poisoning

SQ Magazine, March 2026 — "Prompt Injection Statistics 2026: Hidden Risks Now"

AI CERTs, March 2026 — "Zero-Click Prompts Trigger Enterprise Security Failure" (DLP 12% stat)

OWASP, 2025 — "LLM08:2025 Vector and Embedding Weaknesses"

Giskard, 2025 — "OWASP Top 10 LLM Risk Categories: What Changed in 2025"

Cisco, February 2026 — "State of AI Security 2026 Report"

Protecto AI — "Secure RAG: Build RAG on Sensitive Data Without the Risk"

Next Move Strategy Consulting, 2025 — "Retrieval-Augmented Generation (RAG) Market Outlook 2035"

arXiv, January 2026 — SD-RAG: Selective Disclosure for Retrieval-Augmented Generation

ACL Anthology, EMNLP 2025 — "Revealing Poisoning Attacks in Retrieval-Augmented Generation" (RevPRAG)

CSO Online via LinkedIn, February 2026 — "RAG Failures Rise: 72% of Enterprise Implementations Stalled"

Straits Research, December 2025 — "Enterprise LLM Market Size, Share & Trends Analysis Report"

Wiley Rein, January 2026 — "7 Predictions for Cyber Risk and Insurance in 2026"

Munich Re, March 2026 — "Cyber Insurance: Risks and Trends 2026"