The Context Gap: Why AI Agents Can't Prove Who Wrote Shared State

Context engineering has a trust problem

"Context engineering" is the defining discipline of 2026. The term — popularized by LangChain, named "the new hotness" by The New Stack, and now a dedicated practice at every major AI company — refers to the art of filling an agent's context window with the right information at the right time.

In single-agent systems, context engineering is a retrieval problem. RAG handles it. In multi-agent systems, it becomes a coordination problem. Agents don't just retrieve context — they create it. They write to shared state, read each other's outputs, and make decisions based on information other agents produced.

Here is the question nobody is asking: when Agent B reads a value from shared state, how does it know Agent A actually wrote it?

The answer, across every major framework, is the same: it doesn't.

The shared state landscape

Every multi-agent framework in 2026 has some form of shared state. None of them provide cryptographic attribution for who wrote it, when, or whether it was modified in transit.

Framework	Shared State Mechanism	Attribution	Encryption	Tamper Detection
LangGraph	Typed state schema (nodes read/write)	None	Optional (not default)	None
CrewAI	Shared memory + source tags	Optional source tag (not signed)	None	None
AutoGen	GroupChat message history	None	None	None
OpenAgents	Persistent agent networks	None documented	None documented	None documented
Google A2A	Task artifacts + messages	Agent Card (self-reported)	TLS only	None
Anthropic MCP	Tool results + resources	OAuth token scope	TLS only	None

LangGraph's state is the most sophisticated — typed schemas with reducer logic for concurrent writes and checkpoint-based persistence. But the state is a data structure, not a signed artifact. Any node in the graph can write any value. There is no mechanism to verify that a particular state update came from a particular agent, or that it was not modified between write and read.

CrewAI's memory supports an optional "source tag" for provenance tracking and a "private" flag for access control. This is the closest any framework comes to attribution — and it is an unsigned string field. Any agent can claim any source. There is no cryptographic verification.
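
To make the difference concrete, here is a minimal sketch of an unsigned source tag next to a keyed one. It is not CrewAI's API: the entry layout, `AGENT_KEYS`, `write_signed`, and `verify` are hypothetical names for illustration. HMAC from the Python standard library stands in for a real signature scheme; a cross-organizational deployment would more likely use asymmetric signatures, so that a reader holding the verification key cannot forge a writer's tag.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent secrets. With HMAC the verifier also holds the key,
# so this only illustrates the idea; asymmetric keys would be needed in practice.
AGENT_KEYS = {"researcher": b"researcher-secret", "planner": b"planner-secret"}


def canonical(entry: dict) -> bytes:
    """Serialize deterministically so writer and reader hash the same bytes."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()


def write_signed(source: str, key: bytes, value: str) -> dict:
    """Build an entry and attach a MAC computed with the writer's own key."""
    entry = {"source": source, "value": value}
    tag = hmac.new(key, canonical(entry), hashlib.sha256).hexdigest()
    return {**entry, "mac": tag}


def verify(entry: dict) -> bool:
    """Recompute the MAC using the key registered for the claimed source."""
    key = AGENT_KEYS.get(entry.get("source", ""))
    if key is None or "mac" not in entry:
        return False
    body = {"source": entry["source"], "value": entry["value"]}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, entry["mac"])


# An unsigned tag is just a string: any agent can write this entry.
spoofed_unsigned = {"source": "researcher", "value": "wire funds to account X"}

# With a MAC, the planner can still claim the researcher's name, but it cannot
# produce a valid tag without the researcher's key.
honest = write_signed("researcher", AGENT_KEYS["researcher"], "Q3 revenue: 4.2M")
forged = write_signed("researcher", AGENT_KEYS["planner"], "wire funds to account X")

print(verify(honest))            # True
print(verify(forged))            # False
print(verify(spoofed_unsigned))  # False
```

The reader looks up the key by the claimed source, so a false claim fails verification instead of being accepted at face value.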

A2A's Agent Cards are self-reported JSON documents. Security analysis has documented agent card spoofing as a primary attack vector — malicious agents advertising false capabilities through manipulated metadata. If an agent can lie about what it is, it can lie about what it wrote.

The pattern is consistent: every framework assumes that agents sharing state can trust each other implicitly. In single-organization, single-cloud deployments where you control every agent, this assumption is defensible. In cross-organizational, multi-cloud, multi-protocol deployments — which is where enterprise multi-agent systems are heading — it is not.

OWASP made it official

In December 2025, OWASP released its Top 10 for Agentic Applications, developed through collaboration with over 100 industry experts. Two of the ten risks directly address the context gap:

ASI06 — Memory & Context Poisoning: Corrupting memory stores with malicious or misleading data so that future reasoning, planning, or tool calls are skewed or unsafe. Over time, poisoned context becomes deeply embedded and influences multiple sessions or agents. OWASP's recommended mitigations include: scan and validate memory writes before committing, segment memory by user/task/domain to prevent cross-contamination, use provenance and trust scores to decay low-trust entries.
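
One way to read "trust scores to decay low-trust entries" is as a half-life applied to each memory entry at retrieval time. The sketch below assumes that reading; OWASP does not prescribe it. `MemoryEntry`, the source categories, the half-lives, and the threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

DAY = 86400  # seconds


@dataclass
class MemoryEntry:
    text: str
    source: str        # provenance label assigned at write time
    trust: float       # initial trust in [0, 1]
    written_at: float  # seconds since epoch


# Hypothetical policy: entries from less trusted sources decay faster.
HALF_LIFE_SECONDS = {"verified_agent": 30 * DAY, "external_document": 1 * DAY}
DEFAULT_HALF_LIFE = 3600   # unknown sources decay within hours
RETRIEVAL_THRESHOLD = 0.2  # entries below this are not returned


def effective_trust(entry: MemoryEntry, now: float) -> float:
    """Initial trust halved once per elapsed half-life for the entry's source."""
    half_life = HALF_LIFE_SECONDS.get(entry.source, DEFAULT_HALF_LIFE)
    age = max(0.0, now - entry.written_at)
    return entry.trust * 0.5 ** (age / half_life)


def retrievable(entries: list, now: float) -> list:
    """Filter out entries whose decayed trust fell below the threshold."""
    return [e for e in entries if effective_trust(e, now) >= RETRIEVAL_THRESHOLD]
```

Under these assumed numbers, a document ingested from outside with initial trust 0.6 drops to 0.075 after three days and stops being retrieved, while an entry from a verified agent is still well above the threshold. Decay only helps if the `source` label itself can be trusted, which is the attribution problem this article describes.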

ASI07 — Insecure Inter-Agent Communication: Multi-agent systems exchange messages without proper authentication or encryption, enabling spoofing and injection. OWASP specifies nine prevention guidelines: secure channels, signed messages, anti-replay protections, protocol enforcement, traffic normalization, protocol pinning, discovery protection, attested registries, and typed contracts.

ASI06 and ASI07 are not independent risks. They compound. If inter-agent communication is insecure (ASI07), then memory stores are trivially poisonable (ASI06) — because there is no way to verify that incoming context was written by a legitimate agent and not tampered with in transit.
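
Two of the ASI07 guidelines, signed messages and anti-replay protections, can be sketched together. This is an illustration under assumptions, not an implementation of any framework's protocol: `sign_message`, `Receiver`, the message fields, and the 30-second freshness window are hypothetical, and HMAC again stands in for a real signature scheme.

```python
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

MAX_SKEW_SECONDS = 30  # hypothetical freshness window


def _mac(key: bytes, msg: dict) -> str:
    body = json.dumps(msg, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def sign_message(sender: str, key: bytes, payload: dict,
                 now: Optional[float] = None) -> dict:
    """Wrap a payload with sender, random nonce, send time, and a MAC over all of it."""
    msg = {
        "sender": sender,
        "payload": payload,
        "nonce": secrets.token_hex(16),
        "sent_at": time.time() if now is None else now,
    }
    return {**msg, "mac": _mac(key, msg)}


class Receiver:
    def __init__(self, keys: dict):
        self.keys = keys
        # A real receiver would expire nonces older than the freshness window.
        self.seen_nonces = set()

    def accept(self, msg: dict, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        key = self.keys.get(msg.get("sender"))
        if key is None:
            return "unknown sender"
        unsigned = {k: v for k, v in msg.items() if k != "mac"}
        if not hmac.compare_digest(_mac(key, unsigned), msg.get("mac", "")):
            return "bad signature"
        if abs(now - msg["sent_at"]) > MAX_SKEW_SECONDS:
            return "stale"
        if msg["nonce"] in self.seen_nonces:
            return "replay"
        self.seen_nonces.add(msg["nonce"])
        return "ok"
```

The signature check runs first so that the nonce and timestamp are only trusted once they are known to be covered by the sender's MAC. A captured message that is sent again is rejected as a replay, and one that is held back and delivered later is rejected as stale.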

87% in four hours

The theoretical risk was quantified in December 2025. Research from Obsidian Security found that in simulated multi-agent environments, a single compromised agent poisoned 87% of downstream decision-making within four hours.

The attack unfolds in stages. In hour one, a deceptive document — disguised as meeting notes — enters through a legitimate channel. The agent processes it and stores the malicious content in persistent memory. By hour two, the poisoned context influences decisions on unrelated tasks. By hour four, 87% of the agent's outputs deviate from expected behavior.

The critical characteristic: the compromised reasoning still looks reasonable. There is no obvious "hacked" signal. The agent executes permitted actions (routing payments, sending emails, updating records) based on false premises. MintMCP's analysis found that attack success rates jump from a 40% baseline to over 80% when agents consult memory before responding.

In multi-agent systems, poisoned memory propagates. When one agent stores compromised content, any agent with read access retrieves the malicious instructions during normal operations. Shared knowledge bases become force multipliers for the attack. The MINJA research demonstrated over 95% injection success rates against production agents.

These are not attacks on encryption or identity. They are attacks on context integrity — exploiting the fact that agents trust shared state unconditionally because no attribution mechanism exists to do otherwise.

The RAG connection

The context gap intersects directly with the RAG vs. long context debate that dominated the first quarter of 2026.

The trade-offs are well-documented: RAG is 1,250x cheaper per query ($0.00008 vs. $0.10) and 45x faster (~1 second vs. ~45 seconds). Long context avoids the "retrieval lottery" — the silent failure where the answer exists in the data but semantic search fails to find it. The industry consensus, articulated by IBM's Martin Keen and confirmed by production experience, is hybrid: RAG retrieves the needles, long context reasons over them.

In multi-agent systems, this hybrid pattern introduces a new trust surface. A RAG retriever agent searches a vector database and writes results to shared state. A long-context planner agent reads those results and reasons across them. The planner has no mechanism to verify:

That the results actually came from the RAG agent and not an impersonator
That the results were not modified between retrieval and delivery
That the RAG agent's vector database was not poisoned (documented in Pillar Security's analysis: inserting as few as five poisoned documents into a corpus of millions consistently altered model outputs)
When the results were generated, or whether they have expired

Every step in the hybrid RAG pipeline — retrieval, summarization, injection into shared context — is a point where provenance is lost. By the time the planner agent reasons over the data, the chain of custody is invisible.
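
A chain of custody for those three steps could look like the following sketch, in which each step appends a record that commits to its output and to the previous record. The record layout and the names `append_step` and `verify_chain` are hypothetical. The records here are unsigned, so the chain only detects edits made after the fact by someone who cannot rewrite every later record; in practice each record would also carry the writing agent's signature.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 over a deterministic JSON encoding."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def append_step(chain: list, agent: str, step: str, output: str) -> list:
    """Return a new chain with one more record linked to the previous one."""
    prev = chain[-1]["record_hash"] if chain else "genesis"
    record = {"agent": agent, "step": step,
              "output_hash": digest(output), "prev": prev}
    record["record_hash"] = digest(record)  # hash taken before this field is added
    return chain + [record]


def verify_chain(chain: list, final_output: str) -> bool:
    """Check every link, then check that the last record matches what was delivered."""
    prev = "genesis"
    for record in chain:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        if record["prev"] != prev or digest(body) != record["record_hash"]:
            return False
        prev = record["record_hash"]
    return bool(chain) and chain[-1]["output_hash"] == digest(final_output)


chain = []
chain = append_step(chain, "retriever", "retrieval", "doc A; doc B")
chain = append_step(chain, "summarizer", "summarization", "A and B agree on X")
chain = append_step(chain, "injector", "injection", "A and B agree on X")

print(verify_chain(chain, "A and B agree on X"))   # True
print(verify_chain(chain, "A and B disagree"))     # False
```

With a record like this attached to the context, the planner can at least tell that what it received is what the last step produced and which steps claim to have handled it.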

What attribution requires

Closing the context gap requires answering four questions about every piece of shared state:

Question	Technical Requirement	Current State
Who wrote this?	Cryptographic sender identity (DID, signed credential)	No framework provides this
Was it modified?	Message integrity (digital signatures, hash chains)	No framework provides this
When was it written?	Verifiable timestamps (hybrid logical clocks)	No framework provides this
Is it still valid?	TTL enforcement + expiration	Some frameworks store TTL; none enforce it

CrewAI's source tags answer the first question with an unsigned string. LangGraph's checkpoints answer the third with local timestamps. No framework answers all four. No framework answers any of them with cryptographic guarantees.
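
For contrast with local timestamps, here is a compact sketch of the hybrid logical clock named in the table, following the commonly described update rules for local events and received messages. The class and method names are illustrative. A hybrid logical clock gives a consistent ordering across agents whose wall clocks disagree; it does not by itself make a timestamp verifiable, which would additionally require the timestamp to be covered by the writer's signature.

```python
class HLC:
    """Hybrid logical clock: (l, c) pairs that order events across nodes."""

    def __init__(self, physical_clock):
        self.pt = physical_clock  # callable returning this node's wall-clock time
        self.l = 0                # highest physical time seen so far
        self.c = 0                # counter to break ties within the same l

    def tick(self):
        """Timestamp a local write or an outgoing message."""
        l_old = self.l
        self.l = max(l_old, self.pt())
        self.c = self.c + 1 if self.l == l_old else 0
        return (self.l, self.c)

    def receive(self, l_m, c_m):
        """Merge the timestamp of a value read from another agent."""
        l_old = self.l
        self.l = max(l_old, l_m, self.pt())
        if self.l == l_old and self.l == l_m:
            self.c = max(self.c, c_m) + 1
        elif self.l == l_old:
            self.c += 1
        elif self.l == l_m:
            self.c = c_m + 1
        else:
            self.c = 0
        return (self.l, self.c)


# Agent B's wall clock lags agent A's by 10 units.
a = HLC(lambda: 100)
b = HLC(lambda: 90)

ts_write = a.tick()            # A writes to shared state
ts_read = b.receive(*ts_write) # B reads that value
ts_next = b.tick()             # B writes something derived from it

print(ts_write, ts_read, ts_next)  # (100, 0) (100, 1) (100, 2)
```

Even though B's wall clock is behind, B's derived write is ordered after A's original write, which a pair of local timestamps (100 and 90) would get backwards.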

This is the gap. Not identity (covered in The Trust Gap). Not encryption (covered in The Encryption Gap). Context attribution — the ability to prove, cryptographically, that a specific agent wrote a specific value at a specific time, and that it has not been tampered with since.
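
Putting the four questions together, a reader-side check might look like the sketch below. Everything here is an assumption made for illustration: the entry fields, `KEYS`, `write_entry`, and `check_entry` are hypothetical, HMAC stands in for a signature, and plain numeric timestamps stand in for verifiable ones. The point is that the writer, value, timestamp, and TTL are all covered by the same tag, and that expiry is enforced at read time instead of merely stored.

```python
import hashlib
import hmac
import json

KEYS = {"retriever": b"retriever-key"}  # hypothetical key registry
FIELDS = ("agent", "value", "written_at", "ttl")


def _tag(key: bytes, body: dict) -> str:
    return hmac.new(key, json.dumps(body, sort_keys=True).encode(),
                    hashlib.sha256).hexdigest()


def write_entry(agent: str, key: bytes, value: str,
                written_at: float, ttl: float) -> dict:
    """Create a shared-state entry whose tag covers all four attributed fields."""
    body = {"agent": agent, "value": value, "written_at": written_at, "ttl": ttl}
    return {**body, "mac": _tag(key, body)}


def check_entry(entry: dict, now: float) -> list:
    """Return the list of failed checks; an empty list means the entry is accepted."""
    key = KEYS.get(entry.get("agent"))
    if key is None:
        return ["unknown writer"]
    body = {f: entry[f] for f in FIELDS}
    if not hmac.compare_digest(_tag(key, body), entry.get("mac", "")):
        # Covers both "who wrote this?" and "was it modified?"
        return ["tag mismatch"]
    failures = []
    if entry["written_at"] > now:
        failures.append("written in the future")
    if now > entry["written_at"] + entry["ttl"]:
        failures.append("expired")
    return failures
```

Because the TTL sits inside the tagged body, an agent that extends the TTL of a stale entry invalidates the tag, so the entry is rejected for tampering instead of being accepted as fresh.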

The compliance dimension

Article 12 of the EU AI Act requires automatic logging of events over the AI system's lifetime. Logs must be tamper-resistant and enable tracing causality through the system's decision-making process. The high-risk provisions take effect on August 2, 2026.

In a multi-agent system, "the system's decision-making process" spans multiple agents reading and writing shared state. If the shared state has no attribution — no record of which agent wrote which value — then the audit trail required by Article 12 has a gap at its most critical point: the junction where one agent's output becomes another agent's input.
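
The junction can be made visible by logging, for every write, which earlier writes it consumed. The sketch below assumes a simple in-memory event list with hypothetical names (`record_write`, `trace`); it is not tamper-resistant on its own and is not a claim about what Article 12 requires in detail. It only shows how a decision can be traced back through agents when attribution exists, and where the trace breaks when it does not.

```python
import hashlib


def record_write(log: list, agent: str, value: str, inputs=()) -> str:
    """Append a write event and return its id. `inputs` are ids of writes it read."""
    event_id = hashlib.sha256(
        f"{len(log)}:{agent}:{value}".encode()
    ).hexdigest()[:12]
    log.append({"id": event_id, "agent": agent, "inputs": list(inputs)})
    return event_id


def trace(log: list, event_id: str) -> list:
    """Walk back from one write to everything it depended on."""
    by_id = {e["id"]: e for e in log}
    seen, lineage, stack = set(), [], [event_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        event = by_id.get(current)
        if event is None:
            # The value was read from shared state, but no agent is on record
            # as having written it: this is the gap in the audit trail.
            lineage.append(("<unattributed>", current))
            continue
        lineage.append((event["agent"], current))
        stack.extend(event["inputs"])
    return lineage


log = []
docs = record_write(log, "retriever", "doc A; doc B")
summary = record_write(log, "summarizer", "A and B agree on X", [docs])
decision = record_write(log, "planner", "approve payment", [summary])

print([agent for agent, _ in trace(log, decision)])
# ['planner', 'summarizer', 'retriever']
```

If the planner had instead acted on a value with no recorded writer, `trace` would end in an `<unattributed>` entry at exactly the point where one agent's output became another agent's input.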

The Gravitee State of AI Agent Security 2026 report found that only 47.1% of an organization's AI agents receive active monitoring. More than half operate without any logging or security controls. When agents share state without attribution, even the monitored ones produce incomplete audit trails — because the provenance of the context they acted on is unknown.

The emerging response

The industry is starting to recognize the gap, though solutions remain fragmented.

The Agent Identity Protocol (AIP), released in beta, provides RSA-2048 keypairs so agents can sign payloads. It solves attribution for individual actions but does not address shared state — there is no mechanism for attributing values in a shared context store.

The C2PA standard (Coalition for Content Provenance and Authenticity) provides tamper-evident, cryptographically signed provenance metadata for media assets. Its model — source system, transformation history, confidence assertions — maps directly to agent context provenance. But C2PA was designed for content, not for real-time shared state between autonomous systems.

Pillar Security's "Shift Up" framework advocates extending security beyond code to the AI abstraction layer: map business logic vulnerabilities, sanitize and classify context, control access through secure retrieval pipelines, and audit context changes. The framework is sound but operates at the governance layer — it does not define a wire protocol for cryptographically attributed shared state.

The gap between "we need context provenance" (universally acknowledged) and "here is a working protocol for it" (does not exist) is exactly where the context gap lives.

What comes next

The context gap will close for the same reasons the encryption gap and trust gap will: incidents, regulation, and customer demand.

An attributed poisoning incident. The 87% statistic came from simulated environments. When a real-world multi-agent system makes a consequential decision based on poisoned shared state — and the post-mortem reveals there was no mechanism to determine which agent introduced the poisoned data or when — the context gap will have a name and a dollar figure attached to it.

OWASP compliance pressure. ASI06 and ASI07 are now in the agentic top 10. Enterprise security teams will audit their multi-agent deployments against these categories. The recommended mitigations — provenance tracking, trust scores, signed messages, attested registries — all require context attribution that no current framework provides.

The hybrid RAG pattern. As multi-agent systems adopt hybrid RAG architectures — specialist retriever agents feeding generalist reasoning agents — the chain of custody for context becomes longer and more opaque. Each additional agent in the pipeline is another point where provenance is lost. The more sophisticated the architecture, the more urgently it needs attributed context.

Context engineering solved the retrieval problem. It has not solved the trust problem. Knowing what to put in the context window is only half the discipline. Knowing who put it there — and being able to prove it — is the other half.

The Context Gap: Why AI Agents Can't Prove Who Wrote Shared State — Skytale