The Core Problem
Most people set up AI agents the same way they write one-off scripts: get it working once, ship it, forget it. Then it breaks at 2 AM and nobody knows why.
A good agent workflow is observable, recoverable, and composable. Here's how to build one.
Step 1: Define the Job Clearly
Before touching any code or config, answer three questions:
- What triggers this agent? (schedule, webhook, user message, file drop)
- What does success look like? (specific output format, side effect, message sent)
- What does failure look like? (and what should happen when it fails)
If you can't answer all three, you're not ready to build yet.
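Those three answers can live right next to the code. A minimal sketch (the `JOB` dict and `ready_to_build` helper are illustrative, not part of any real framework):

```python
# A hypothetical job definition, written down before any code exists.
JOB = {
    "trigger": "cron: every day at 07:00 UTC",
    "success": "digest.md written and posted to the reports channel",
    "failure": "log the error, skip the run, alert a human if 2 runs fail in a row",
}

def ready_to_build(job: dict) -> bool:
    """You're ready only when all three questions have concrete answers."""
    return all(job.get(k) for k in ("trigger", "success", "failure"))
```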
Step 2: Pick the Right Trigger Model
| Trigger Type | Best For | Watch Out For |
|---|---|---|
| Cron/schedule | Regular reports, digests | Drift, missed runs |
| Webhook | Event-driven pipelines | Replay, ordering issues |
| Polling | When webhooks aren't available | Rate limits, cost |
| User message | Conversational agents | Ambiguity, context |
Tip: Cron is great for starting out. It's predictable and easy to debug. Graduate to webhooks once you understand your failure modes.
Step 3: Give Your Agent Memory
Stateless agents forget everything between runs. That's fine for simple tasks, but most real workflows need some continuity:
- Short-term: Pass context via the system prompt or message history
- Medium-term: Write state to a JSON file between runs (e.g., last_run.json)
- Long-term: Use a vector store or structured database
A simple state.json pattern:

```json
{
  "last_processed_id": "abc123",
  "last_run_ts": 1741248000,
  "run_count": 42
}
```

Read it at start, update it at end. That's 80% of what you need.
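One detail worth getting right: write the file atomically, so a crash mid-write can't leave you with half a JSON file. A sketch using a temp file plus `os.replace` (the helper name is mine):

```python
import json
import os
import tempfile

STATE_FILE = "state.json"

def save_state_atomic(state: dict, path: str = STATE_FILE) -> None:
    """Write to a temp file in the same directory, then rename into place.
    The rename is atomic, so readers never see a partially written file."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, path)  # atomic on both POSIX and Windows
```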
Step 4: Design for Failure
Your agent will fail. Plan for it:
Idempotency first: Can you run the same job twice without bad side effects? If not, fix that before anything else.
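A common way to get idempotency is to track processed IDs and skip repeats. A sketch (`process_batch` and its arguments are illustrative):

```python
def process_batch(items, processed_ids: set, handle) -> set:
    """Skip anything already handled, so re-running the same batch is safe."""
    for item in items:
        if item["id"] in processed_ids:
            continue  # already done: the second run is a no-op
        handle(item)
        processed_ids.add(item["id"])
    return processed_ids
```

Persist `processed_ids` (or just the highest processed ID) in your state file, and a crashed run can be replayed without double-sending anything.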
Structured logging: Don't just print to stdout. Write logs to a file with timestamps and enough context to debug later:
```
2026-03-06T09:21:00Z [INFO] Processing 12 new items
2026-03-06T09:21:03Z [ERROR] Item #7 failed: rate limit (will retry next run)
2026-03-06T09:21:04Z [INFO] Done. 11/12 succeeded.
```
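Python's stdlib `logging` module can produce this format with a few lines of setup; a minimal sketch (the file name and logger name are just examples):

```python
import logging
import time

handler = logging.FileHandler("agent.log")
formatter = logging.Formatter(
    "%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%SZ",
)
formatter.converter = time.gmtime  # timestamps in UTC, matching the Z suffix
handler.setFormatter(formatter)

log = logging.getLogger("agent")
log.setLevel(logging.INFO)
log.addHandler(handler)

log.info("Processing %d new items", 12)
log.error("Item #7 failed: rate limit (will retry next run)")
```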
Graceful degradation: If the LLM call fails, what's the fallback? Skip and log? Retry? Alert a human? Decide upfront.
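A retry-then-degrade wrapper is one way to make that decision explicit in code. A sketch, assuming the caller treats `None` as "skip and log":

```python
import time

def call_llm_with_fallback(call, retries: int = 2, backoff: float = 1.0):
    """Try the call a few times with exponential backoff; on persistent
    failure, return None so the caller can skip-and-log instead of crashing."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                return None  # degraded: caller logs it and moves on
            time.sleep(backoff * (2 ** attempt))
```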
Step 5: Tool Selection
Less is more. Every tool you give an agent is another thing that can go wrong.
Good tool set for a typical workflow agent:
- Read/write files
- HTTP requests (with timeout and retry)
- One or two domain-specific tools (send email, query a DB)
Avoid:
- Giving agents write access to things they only need to read
- Tools with side effects that aren't logged
- Tools that call other agents recursively (until you know what you're doing)
Step 6: The System Prompt is Your Contract
Write your system prompt like a job description:
```
You are a support triage agent. Your job is to:
1. Read the incoming support ticket
2. Classify it as: billing / technical / general
3. Write a one-sentence summary
4. Return JSON: { "category": "...", "summary": "...", "priority": 1-3 }

Rules:
- Never make up information you don't have
- If you can't classify, use category: "unknown"
- Always return valid JSON — no extra text
```

Clear constraints = predictable outputs = easier debugging.
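The contract is only useful if you enforce it on the way back. A validator for the triage reply above (the key names match the prompt; the fallback values are my choice):

```python
import json

def parse_triage_reply(raw: str) -> dict:
    """Enforce the contract from the prompt: valid JSON with the expected
    values, falling back to 'unknown' when the model breaks the rules."""
    fallback = {"category": "unknown", "summary": "", "priority": 3}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if data.get("category") not in {"billing", "technical", "general", "unknown"}:
        return fallback
    if data.get("priority") not in {1, 2, 3}:
        return fallback
    return data
```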
Step 7: Test Before You Trust
Run your agent on real data before scheduling it:
- Happy path — does it work with normal input?
- Edge cases — empty input, malformed data, long text
- Failure injection — what happens if a tool returns an error?
For anything that touches money, email, or external APIs: test with a dry-run mode first.
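Dry-run mode is easiest when every side effect goes through one gate. A sketch (`send_email` is a stand-in for any risky action):

```python
DRY_RUN = True  # flip to False only once you trust the agent

def send_email(to: str, body: str) -> str:
    """All side effects pass through one gate: in dry-run mode, log what
    *would* happen instead of doing it."""
    if DRY_RUN:
        return f"[dry-run] would send email to {to}: {body[:40]}"
    raise NotImplementedError("real send goes here")  # e.g. your SMTP call
```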
Step 8: Monitor in Production
The minimum viable monitoring setup:
- Daily summary log (what ran, what worked, what failed)
- Alert on failure (email, ping, Discord message)
- Periodic sanity check (did the job actually run? is output sane?)
You don't need a fancy dashboard. A markdown file that gets updated each run and a simple alert is enough to start.
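Even the daily summary can be a few lines of code that read the run log and decide whether to ping anyone. A sketch against the log format from Step 4:

```python
def summarize_log(log_text: str) -> dict:
    """Minimal monitoring: count outcomes from the run log and decide
    whether a human needs an alert."""
    lines = log_text.splitlines()
    errors = [line for line in lines if "[ERROR]" in line]
    return {
        "runs": sum("Starting run" in line for line in lines),
        "errors": len(errors),
        "alert": len(errors) > 0,
    }
```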
Common Mistakes
Over-prompting. The longer your prompt, the more the model has to juggle. Be specific, not exhaustive.
No fallback. "The agent will handle it" is not a fallback.
Implicit state. If the agent's behavior depends on something, make it explicit — in a file, a variable, somewhere you can see and debug.
Skipping logging. Future-you will be furious at present-you for this.
Trying to do too much in one agent. Break complex workflows into smaller, testable steps. Chain agents if you need to — just do it explicitly.
A Minimal Working Template
Here's the skeleton of a solid, simple agent workflow:
```python
import datetime
import json

STATE_FILE = "state.json"

def load_state():
    try:
        with open(STATE_FILE) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"last_run": None, "last_id": None}

def save_state(state):
    state["last_run"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)

def log(msg):
    ts = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    print(f"{ts} {msg}")
    with open("agent.log", "a") as f:
        f.write(f"{ts} {msg}\n")

def run():
    # fetch_new_items, process_with_llm, handle_result, and alert_human
    # are placeholders — plug in your own domain code.
    state = load_state()
    log(f"Starting run. Last run: {state['last_run']}")
    try:
        # 1. Fetch input
        items = fetch_new_items(since=state["last_id"])
        log(f"Found {len(items)} items to process")
        # 2. Process
        for item in items:
            result = process_with_llm(item)
            handle_result(result)
            state["last_id"] = item["id"]  # update as we go
        # 3. Save state
        save_state(state)
        log(f"Done. Processed {len(items)} items.")
    except Exception as e:
        log(f"ERROR: {e}")
        alert_human(f"Agent failed: {e}")
        # Don't save state — next run retries from the last known point

if __name__ == "__main__":
    run()
```

Simple. Logged. Recoverable. Ship it.
What's Next
Once you have this working, the natural upgrades are:
- Move state to a proper DB (SQLite is fine for most use cases)
- Add structured output validation (Pydantic, JSON Schema)
- Build a multi-step pipeline with checkpoints
- Add a human-in-the-loop review step for high-stakes decisions
But don't rush. A boring agent that works is worth 10 clever ones that don't.
Want the full playbook?
Get copy-paste AI templates, prompt frameworks, and agent patterns — all in one place.
Get Access — It's Free

No credit card. No fluff. Just the good stuff.