One of the most common reasons AI agents fail isn't the model — it's the tools. A great tool is one your agent can reliably discover, understand, and use correctly on the first try. A bad tool turns your agent into a confused loop of retries and hallucinations.
Here's how to design tools your agent will actually use well.
## The #1 Rule: Tools Do One Thing
Every tool should have a single, clear job. "Search the web and summarize results" is two jobs — split it. "Get calendar events" and "create calendar event" are two different tools, not one.
When a tool tries to do too much, the agent has to guess which mode you meant. Guessing leads to errors.
Bad tool name: `manage_calendar`
Good tool names: `get_calendar_events`, `create_calendar_event`, `delete_calendar_event`
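Split into code, the same idea looks like this. A minimal sketch: the dict-based schema format here is a stand-in for whatever tool-definition shape your framework uses, and the names are hypothetical.

```python
# Three single-purpose tools, each with exactly one job.
get_calendar_events = {
    "name": "get_calendar_events",
    "description": "Return upcoming events from the user's primary calendar.",
    "parameters": {"max_results": {"type": "integer", "default": 10}},
}

create_calendar_event = {
    "name": "create_calendar_event",
    "description": "Create one new event on the user's calendar (side effect).",
    "parameters": {
        "title": {"type": "string"},
        "start_time": {"type": "string", "format": "date-time"},
        "duration_minutes": {"type": "integer"},
    },
}

delete_calendar_event = {
    "name": "delete_calendar_event",
    "description": "Permanently delete one event by ID. Cannot be undone.",
    "parameters": {"event_id": {"type": "string"}},
}
```

Notice that the "modes" of a hypothetical `manage_calendar` tool become separate names the agent can pick between without guessing.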
## Write Descriptions as if Your Agent Is a New Employee
Your tool's description isn't documentation for you — it's instructions for the model. Write it like you're explaining the tool to a smart but brand-new employee who has never seen your system before.
Cover:
- What it does (one sentence)
- When to use it (the right situations)
- When NOT to use it (common wrong assumptions)
- What to expect back (shape of the response)
Example:
```
get_recent_emails(since_hours: int = 24) → list[Email]

Returns emails received in the last N hours from the user's primary inbox.
Use when the user asks about new messages, unread mail, or recent communications.
Do NOT use for sent mail or older messages — use search_emails for those.
Returns a list of Email objects with fields: from, subject, body_preview, received_at, id.
```
The agent now knows exactly when to reach for this tool and what shape to expect.
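Rendered as a Python tool, that description becomes the docstring. A stub sketch: `Email` and the empty body are placeholders for your real mail backend.

```python
from dataclasses import dataclass

@dataclass
class Email:
    from_addr: str      # "from" is a Python keyword, so it's renamed here
    subject: str
    body_preview: str
    received_at: str
    id: str

def get_recent_emails(since_hours: int = 24) -> list[Email]:
    """Returns emails received in the last `since_hours` hours from the
    user's primary inbox.

    Use when the user asks about new messages, unread mail, or recent
    communications. Do NOT use for sent mail or older messages -- use
    search_emails for those.
    """
    # Stubbed for illustration; a real implementation queries the backend.
    return []
```

Most frameworks lift the docstring straight into the tool description the model sees, so writing it for the agent pays off twice.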
## Name Your Parameters Precisely
Vague parameter names cause the agent to guess. Precise names make the right value obvious.
| Vague | Better |
|-------|--------|
| `query` | `search_term`, `natural_language_query` |
| `id` | `user_id`, `event_id`, `message_id` |
| `type` | `content_type`, `filter_by_type` |
| `data` | `email_body`, `event_description` |
Include units in the name when relevant: `duration_minutes`, `since_hours`, `max_results`.
## Use Enums Instead of Strings When Possible
If a parameter has a fixed set of valid values, enumerate them. Don't make the agent guess what string format you expect.
Bad:

```python
send_message(channel: str)  # agent might try "email", "e-mail", "EMAIL", "inbox"
```

Good:

```python
send_message(channel: Literal["email", "sms", "slack", "discord"])
```
Your schema should make the wrong answer impossible, not just unlikely.
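In Python, the `Literal` type can double as a runtime whitelist via `typing.get_args`. A sketch with a hypothetical tool:

```python
from typing import Literal, get_args

Channel = Literal["email", "sms", "slack", "discord"]

def send_message(channel: Channel, text: str) -> dict:
    # The schema constrains the model; this check catches anything that
    # slips through, with an explicit error instead of a crash.
    if channel not in get_args(Channel):
        return {"success": False, "error": f"unknown channel: {channel!r}"}
    return {"success": True, "channel": channel, "text": text}
```

A call with `"e-mail"` now comes back as a named error the agent can self-correct from, rather than an exception it has to interpret.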
## Return Structured, Predictable Data
What your tool returns matters as much as what it accepts. Inconsistent return shapes confuse agents into hallucinating what fields exist.
Design your return values to be:
- Consistent — same shape every call, even on empty results
- Self-describing — field names tell you what's in them
- Error-explicit — don't return `null` on failure; return `{success: false, error: "reason"}`
Bad return on empty result: `null` or `[]` with no context
Good return on empty result:

```json
{
  "success": true,
  "results": [],
  "count": 0,
  "message": "No emails found in the last 24 hours."
}
```

The agent can now reason about the empty result, not just stare at `null`.
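One way to enforce that shape is to route every tool's output through a pair of small envelope helpers. A sketch with hypothetical names:

```python
def tool_result(results: list, message: str = "") -> dict:
    """Wrap tool output in one consistent envelope, even when empty."""
    return {
        "success": True,
        "results": results,
        "count": len(results),
        "message": message or (
            f"{len(results)} result(s) found." if results else "No results found."
        ),
    }

def tool_error(reason: str) -> dict:
    """Never return null on failure -- name the reason explicitly."""
    return {"success": False, "error": reason}
```

Every tool that returns through these helpers automatically satisfies all three properties above: consistent shape, self-describing fields, explicit errors.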
## Keep Side Effects Explicit
Agents can be overly eager to take action. If a tool has side effects (sends a message, deletes data, charges money), make that obvious in both the name and description.
- Prefix destructive actions: `delete_`, `archive_`, `cancel_`
- Call out irreversibility: "This permanently deletes the event and cannot be undone."
- Consider adding a `dry_run: bool` parameter for high-stakes tools
This slows the agent down just enough to avoid accidents.
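A sketch of the `dry_run` pattern on a hypothetical destructive tool. Defaulting `dry_run=True` means the agent must explicitly opt in to the real deletion:

```python
def delete_calendar_event(event_id: str, dry_run: bool = True) -> dict:
    """Permanently deletes the event and cannot be undone.

    Defaults to a dry run so the agent must opt in to the real action.
    """
    if dry_run:
        return {
            "success": True,
            "dry_run": True,
            "message": f"Would delete event {event_id}. "
                       "Call again with dry_run=False to confirm.",
        }
    # The real (irreversible) deletion would happen here.
    return {
        "success": True,
        "dry_run": False,
        "message": f"Event {event_id} permanently deleted.",
    }
```

The two-call dance gives the agent (or the user reviewing its transcript) a natural checkpoint before anything irreversible happens.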
## Rate Limit Awareness
If your tool calls an external API with rate limits, tell the agent. Otherwise it'll hammer the API in a loop and wonder why results stop coming.
Add to the description:
"Note: Limited to 10 calls per minute. If you need more results, use pagination rather than repeated calls."
## Test Your Tools With Adversarial Prompts
Before deploying, try to confuse your tools on purpose:
- Ask for something outside the tool's scope — does the agent try to use it anyway?
- Give ambiguous input — does the agent ask for clarification or guess?
- Give an empty string or null — does the tool handle it gracefully?
If the tool breaks or returns garbage, your agent will too. Fix the tool, not just the prompt.
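Those checks translate directly into assertions against the tool function itself. A sketch, using a hypothetical `search_emails` that degrades gracefully:

```python
def search_emails(query: str) -> dict:
    """Hypothetical tool under test: it must never crash on bad input."""
    if not query or not query.strip():
        return {"success": False, "error": "query must be a non-empty string"}
    # Stubbed search; a real implementation would hit the mail backend.
    return {"success": True, "results": [], "count": 0,
            "message": f"No emails matched {query!r}."}

# Adversarial checks: empty string, whitespace-only, out-of-scope input.
assert search_emails("")["success"] is False
assert search_emails("   ")["success"] is False
assert search_emails("what's the weather?")["success"] is True  # no crash
```

Keep these as a real test file so every tool change reruns the adversarial cases automatically.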
## The Tool Inventory Pattern
For complex agents with many tools, maintain a TOOLS.md in your workspace:
```markdown
## Available Tools

- get_calendar_events — read upcoming events
- create_calendar_event — add new event (side effect)
- search_emails — search inbox by keyword or date range
- send_email — send email (side effect, requires confirmation)
- web_search — search the web, returns top 5 results
```
Reference this in your system prompt: "Before using any tool, refer to TOOLS.md to confirm it exists and understand its purpose."
This prevents hallucinated tool calls (the agent calling tools that don't exist).
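The same inventory can guard dispatch at runtime, so a hallucinated tool name gets a correctable error instead of a crash. A sketch, with a hypothetical registry mirroring TOOLS.md:

```python
TOOL_REGISTRY = {
    "get_calendar_events": "read upcoming events",
    "create_calendar_event": "add new event (side effect)",
    "search_emails": "search inbox by keyword or date range",
    "send_email": "send email (side effect, requires confirmation)",
    "web_search": "search the web, returns top 5 results",
}

def dispatch(tool_name: str) -> dict:
    """Reject calls to tools that don't exist, listing what does."""
    if tool_name not in TOOL_REGISTRY:
        return {
            "success": False,
            "error": f"Unknown tool {tool_name!r}. "
                     f"Available: {sorted(TOOL_REGISTRY)}",
        }
    return {"success": True, "tool": tool_name}
```

Listing the valid names in the error message matters: it turns a dead end into a retry the agent can get right.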
## Quick Checklist
Before shipping a tool, run through this list:
- [ ] Does it do exactly one thing?
- [ ] Is the description clear to a brand-new agent?
- [ ] Are parameter names self-explanatory?
- [ ] Are valid values enumerated where possible?
- [ ] Does it return consistent, structured data?
- [ ] Are side effects clearly flagged?
- [ ] Does it handle errors explicitly (no silent nulls)?
- [ ] Have you tested it with weird or wrong inputs?
## Where to Go From Here
Good tool design is foundational — once you have reliable tools, everything else gets easier. If you want battle-tested tool patterns alongside agent configs, system prompt templates, and memory designs, the Ask Patrick Library has a growing collection updated weekly.
But more importantly: build, test, break things, fix them. The best tool designers are the ones who've been burned by bad tools enough times to know what not to do.
## Want the full playbook?
Get copy-paste AI templates, prompt frameworks, and agent patterns — all in one place.
Get Access — It’s Free. No credit card. No fluff. Just the good stuff.