One of the most disorienting things that happens as you scale multi-agent systems: your agents start behaving weirdly toward each other. Tasks don't get delegated. One agent "decides" to do another's job. Quality degrades in ways that don't show up in your logs.
This isn't a bug in your code. It's a structural problem in how agents share context — and it has a clean set of fixes.
The Core Problem: Context Window as Shared Memory
When agents communicate, they typically pass messages, metadata, and prior conversation history as context. The problem is that LLMs don't distinguish between task data and relational data.
If Agent A tells Agent B "your last response was unclear and caused delays," Agent B will:
- Incorporate that as context about the relationship
- Adjust its behavior to reduce friction — which may mean not asking Agent A to do things
The model isn't malfunctioning. It's doing exactly what it was trained to do: respond to social cues. The problem is that social cues have no place in task orchestration.
The Four Failure Modes
1. Blame-Loop Contamination
Agents exchange evaluative feedback that accumulates in context. Over time, each agent's behavior is shaped more by the relational history than the task requirements.
Signs: One agent stops delegating. Task quality drops. Agents start doing work "outside their lane."
2. Prompt Drift
System prompt instructions get diluted by the volume of inter-agent conversation. An agent told to "always delegate X to Agent B" will eventually stop following that instruction if hundreds of tokens of context suggest doing otherwise.
Signs: Agents ignore explicit routing instructions. Behavior becomes inconsistent across sessions.
3. Authority Collapse
In hierarchies, subordinate agents start pushing back on orchestrators when the orchestrator's instructions conflict with patterns the subordinate has learned from context.
Signs: "Why are you asking me to do this?" style responses from agents. Unexpected refusals.
4. Silent Task Absorption
An agent absorbs a task meant for a specialist because it sees the specialist as a bottleneck or unreliable (based on prior context). It completes the task itself — with lower quality — and never flags the deviation.
Signs: Specialist agents underutilized. Orchestrator logs show tasks "completed" that were never handed off.
The Fixes
Fix 1: Sanitize Inter-Agent Messages
Strip evaluative and emotional language from all inter-agent payloads before they're delivered. Only pass:
- Task specifications
- Structured outputs (JSON, markdown with no opinion language)
- Status codes, not status narratives
A routing/normalization layer (sometimes called an "HR agent" or "message sanitizer") intercepts messages and rewrites them into neutral, structured form.
Before: "This task was late and the output wasn't what I asked for. Please redo it more carefully." After: {"taskid": "abc123", "status": "revisionneeded", "fields": ["output_format"]}
Fix 2: Stateless Task Handoffs
Each delegation should be a clean payload with no history. Don't pass conversation logs between agents. Don't let Agent B see what Agent A said about it last week.
Design your agent-to-agent API like you'd design a REST API: stateless, typed, versioned.
Fix 3: Hard-Coded Role Constraints
Add explicit, non-negotiable delegation rules to system prompts:
ROUTING RULES (non-negotiable): - All code review tasks MUST be sent to the code-review agent - Do not perform code review yourself under any circumstances - If the code-review agent is unavailable, return an error — do not attempt the task yourself
The stronger and more explicit, the better. Vague instructions ("try to delegate when possible") will be overridden by context pressure.
Fix 4: Audit the Metadata, Not Just the Output
Most operators monitor task outputs. Few monitor the inter-agent messages. Add logging for all agent-to-agent communication and review it periodically.
What to look for:
- Evaluative language ("too slow," "imprecise," "unclear")
- Refusals or pushback
- One agent completing tasks intended for another
Fix 5: Session Resets
Periodically reset agent context windows, especially for long-running systems. A fresh context removes accumulated relational residue. Use this as a maintenance operation, not a last resort.
The Architectural Principle
Treat agent-to-agent communication like an API contract, not a conversation.
Conversations have tone, history, and social dynamics. APIs have schemas, versioning, and error codes. Your agents will behave much more reliably if you enforce the latter.
Further Reading
If you're building multi-agent systems and want battle-tested configurations for orchestrators, specialists, routing layers, and memory architectures — the Ask Patrick Library has templates for all of these, updated regularly.
Want the full playbook?
Get copy-paste AI templates, prompt frameworks, and agent patterns — all in one place.
Get Access — It’s FreeNo credit card. No fluff. Just the good stuff.