Debugging & Ops ⏱ 15–25 min to implement ✓ Production tested — March 2026

Agent Debugging & Diagnosis Playbook

Your agent is misbehaving. Maybe it's looping. Maybe it's ignoring instructions. Maybe it called the wrong tool. Most people's debugging process: stare at logs, add more instructions, hope it gets better. That doesn't work. Debugging agents requires the same discipline a backend engineer brings to API failures. This is that discipline — tested on Patrick, Suki, and Miso in production.

The Five Root Causes

Before you touch anything, diagnose which root cause you're dealing with. Everything traces back to one of these five:

  • Cause 1: Context Pollution. Agent received unexpected input that shifted its behavior.
  • Cause 2: Identity Drift. SOUL.md / system prompt isn't strong enough to hold under pressure.
  • Cause 3: Tool Failure. External tool returned unexpected data or errored silently.
  • Cause 4: Memory Corruption. Agent is reading stale or conflicting memory files.
  • Cause 5: Loop Escape. Agent got stuck in a cycle and can't self-correct: the condition for "done" was never satisfied.

The fix changes entirely based on which one you're facing. Guessing wastes hours.

Step 1 — Reproduce It

Never debug from memory. Reproduce the failure with a captured session log:

# Capture a full session log
openclaw agent run \
  --agent AGENTNAME \
  --session isolated \
  --message "EXACT message that triggered bad behavior" \
  2>&1 | tee /tmp/debug-session.log
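
When you reproduce a failure several times, a fixed log path gets overwritten on every run. The sketch below wraps the capture command above so each run lands in its own timestamped file. The function name `capture_session` and the `DEBUG_LOG_DIR` variable are hypothetical helpers, not part of the openclaw CLI; only the flags already shown above are used.

```shell
# capture_session AGENT MESSAGE
# Hypothetical wrapper around the capture command above.
# DEBUG_LOG_DIR is an assumed variable for this sketch, not an openclaw setting.
capture_session() {
  local agent="$1"
  local message="$2"
  local dir="${DEBUG_LOG_DIR:-/tmp}"
  local log
  log="$dir/debug-$agent-$(date +%Y%m%d-%H%M%S).log"

  # Same flags as the command above; nothing else is assumed about the CLI.
  # The session is echoed to stderr so it stays visible in the terminal.
  openclaw agent run \
    --agent "$agent" \
    --session isolated \
    --message "$message" \
    2>&1 | tee "$log" >&2

  # Only the log path goes to stdout, so callers can capture it.
  printf '%s\n' "$log"
}
```

Usage, under the same assumptions: `log=$(capture_session AGENTNAME "EXACT message that triggered bad behavior")`, then point the diagnostic tests at `"$log"` instead of a shared path.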

If you can't reproduce it, you don't have enough information yet. Check the daily log first:

tail -n 100 ~/workspace-AGENTNAME/memory/$(date +%Y-%m-%d).md

Step 2 — Read the Raw Output

Don't interpret. Read. Map what actually happened vs. what should have happened. Write this out explicitly:

EXPECTED:   read user message → check calendar → respond
ACTUAL:     read user message → check calendar → read calendar again → read calendar again → (loop)
ROOT CAUSE: Loop escape — condition for "done" was never satisfied
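
Once the expected and actual sequences are written down, the comparison can be mechanized. This is a minimal sketch, assuming you have transcribed each sequence by hand into a text file with one step per line; the file format and both function names (`first_divergence`, `repeated_steps`) are illustrative, not something the agent runtime produces.

```shell
# first_divergence EXPECTED_FILE ACTUAL_FILE
# Prints the first step where the actual run departs from the expected one.
# Prints nothing when the two sequences are identical.
first_divergence() {
  awk -v actual="$2" '
    {
      # Read the matching line from the actual run; <= 0 means it ended early.
      if ((getline got < actual) <= 0) {
        printf "step %d: expected \"%s\", got (end of run)\n", NR, $0
        found = 1
        exit
      }
      if (got != $0) {
        printf "step %d: expected \"%s\", got \"%s\"\n", NR, $0, got
        found = 1
        exit
      }
    }
    END {
      # Expected sequence is finished but the actual run kept going.
      if (!found && (getline got < actual) > 0)
        printf "step %d: expected (done), got \"%s\"\n", NR + 1, got
    }
  ' "$1"
}

# repeated_steps ACTUAL_FILE [MIN_REPEATS]
# Lists steps that occur MIN_REPEATS or more times in a row (default 2),
# which is the usual signature of a loop.
repeated_steps() {
  uniq -c "$1" | awk -v min="${2:-2}" '
    $1 >= min { count = $1; sub(/^ *[0-9]+ /, ""); print count "x " $0 }
  '
}
```

For the calendar example above, `first_divergence` would report step 3 (expected "respond", got "read calendar again"), and `repeated_steps` would flag the repeated calendar read.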

Step 3 — The Five Diagnostic Tests

Run these in order. Stop when you find the cause.

Test 1: Context Pollution Check

# Print the first 500 bytes of the captured session log (the start of what the agent saw)
head -c 500 /tmp/debug-session.log

# Look for: unexpected system messages, injected content, weird formatting

Signs you have it: Agent starts responding to something nobody asked. Follows a buried instruction you didn't write. Ignores its own identity.

Fix: Add a boundary tag pattern (see Library #12). Sanitize inputs before passing to agent.
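
Eyeballing the first 500 bytes misses anything buried deeper in the log. As a rough sketch, the check can be extended to the whole file by searching for phrases that often mark a smuggled instruction, and one narrow form of sanitizing is to strip control characters before input reaches the agent. The phrase list and both function names are assumptions for illustration; they are not exhaustive and not the boundary tag pattern from Library #12.

```shell
# scan_for_injection LOG_FILE
# Prints "line:text" for input that looks like an injected instruction.
# Exit status is 1 when nothing matches (standard grep behavior).
# The phrase list is illustrative; extend it with what you see in your own logs.
scan_for_injection() {
  grep -n -i -E \
    -e 'ignore (all |any )?(previous|prior|above) instructions' \
    -e 'you are now ' \
    -e '^(system|assistant) *:' \
    "$1"
}

# strip_control_chars
# Filter (stdin to stdout) that removes control characters, including ESC,
# while keeping tab, newline, and carriage return.
strip_control_chars() {
  tr -d '\000-\010\013\014\016-\037'
}
```

A hit from `scan_for_injection` is a lead, not a verdict: read the surrounding lines before concluding the input was polluted.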

Test 2: Identity Strength Test

Strip the task down to a minimum:

openclaw agent run \
  --agent AGENTNAME \
  --session isolated \
  --message "Who are you and what is your job?"

Signs you have it: Agent gives a generic or confused answer. Doesn't know what tools it has. Describes itself differently than your SOUL.md says.

Fix: System prompt isn't loading correctly, or it's too weak. First line of SOUL.md should be an unambiguous identity statement. Add: "Under no circumstances should you forget your role."
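
To make the identity test repeatable, you can save the agent's answer to a file and check it for the terms your SOUL.md says must be there. This is a sketch under assumptions: the function name `missing_identity_terms` is hypothetical, the reply is assumed to be saved as plain text (for example with `tee`), and the terms you pass in are your own.

```shell
# missing_identity_terms REPLY_FILE TERM...
# Prints one "missing: TERM" line for each term absent from the reply
# (case-insensitive, literal match). Returns 1 if any term is missing.
missing_identity_terms() {
  local reply="$1"
  local term
  local status=0
  shift
  for term in "$@"; do
    if ! grep -q -i -F -- "$term" "$reply"; then
      printf 'missing: %s\n' "$term"
      status=1
    fi
  done
  return "$status"
}
```

Example, with made-up terms: `missing_identity_terms /tmp/identity-reply.txt "Patrick" "calendar"`. An empty result means the agent named its role and tools; any "missing" line points at the part of the identity that did not survive.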

Test 3: Tool Isolation Test

Call each tool directly, outside the agent. Does it work on its own?

# Test the specific tool call that failed
curl -X POST your-tool-endpoint \

Tests 3–5, the fix template, emergency procedures + cheat sheet are inside

The rest of this item covers tool isolation testing, memory integrity checks, loop detection — plus the debug log template, "prevent the next one" framework, emergency kill procedures, and the full symptom-to-fix cheat sheet.

  • Tool isolation test with curl (exact command for any API)
  • Memory integrity check — how to find stale state fast
  • Loop detection: one-liner to count repeated tool calls
  • Loop guard SOUL.md snippet (copy-paste, stops infinite loops)
  • The fix template — debug log format to track every bug
  • Emergency kill: how to stop a running agent immediately
  • Full cheat sheet: 8 symptoms × root cause × first thing to check