Most agents don't need a vector database. Some do. The wrong choice costs you either money (over-engineering) or reliability (under-engineering). This is the decision guide — with real costs, real thresholds, and the exact point where you should upgrade.
Every agent memory system is some combination of these four. They differ in how long information lives, how it's retrieved, and what it costs to operate.
Three files. Zero infrastructure. Handles the first 6 months of any agent deployment.
```
workspace/
├── MEMORY.md            # Semantic — curated long-term facts (under 500 lines)
├── memory/
│   ├── 2026-03-05.md    # Episodic — what happened today
│   ├── 2026-03-04.md    # Episodic — what happened yesterday
│   └── ...
└── SOUL.md              # Includes recall instructions
```
Working memory = the context window (automatic, no setup needed).
Episodic memory = daily markdown files. Agent writes a summary at session end.
Semantic memory = MEMORY.md. Curated lessons and standing decisions. Promoted from daily files when a pattern repeats 3+ times.
## Memory Protocol
- At session start: search MEMORY.md, then read today + yesterday's daily file
- At session end: write a summary to memory/YYYY-MM-DD.md
- Weekly: review last 7 daily files, promote patterns to MEMORY.md
- If MEMORY.md exceeds 400 lines: prune entries not referenced in 30 days
MEMORY.md at 400 lines ≈ 2,000 tokens. Two daily files ≈ 1,000 tokens. That's 3,000 tokens of memory overhead per session — about $0.003 on Sonnet pricing. At 20 sessions/day, you're paying $0.06/day for memory. Negligible.
A vector store converts text into numerical embeddings and retrieves by meaning similarity instead of keyword match. This is the right tool when file-based search can't find what you need.
`pip install chromadb` and you're running in 5 minutes. But you may not need a separate database at all: `memory_search` provides built-in semantic search across your memory files, no external database needed. This is the "ClawVault" approach: files on disk, semantic retrieval on top. Instead of migrating to a database, keep your files and add a semantic search layer. Your daily files and MEMORY.md remain the source of truth; the search index is a read-only view that can be rebuilt at any time.
```python
# OpenClaw already does this with memory_search:
# it indexes MEMORY.md + memory/*.md and returns semantic matches.
# For a custom setup with Chroma:
import chromadb

client = chromadb.PersistentClient(path="./memory-index")
collection = client.get_or_create_collection("agent_memory")

# Index a daily file
with open("memory/2026-03-05.md") as f:
    content = f.read()

# Split into chunks (one per "## " section or paragraph)
chunks = content.split("\n## ")
for i, chunk in enumerate(chunks):
    collection.add(
        documents=[chunk],
        ids=[f"2026-03-05-{i}"],
        metadatas=[{"date": "2026-03-05", "source": "daily"}],
    )

# Retrieve by meaning
results = collection.query(
    query_texts=["decisions about payment processing"],
    n_results=5,
)
# Returns the 5 most semantically relevant chunks —
# even if none of them contain the word "payment"
```
Chroma (local): Free. Disk space only. 10K memory chunks ≈ 50MB.
Pinecone (managed): Free up to 100K vectors. Beyond that, $0.33/hr for dedicated pods.
Embedding cost: You pay to convert text to vectors. OpenAI's text-embedding-3-small: $0.02 per million tokens. Indexing 6 months of daily files (≈ 500K tokens) costs $0.01 total. Retrieval queries cost fractions of a cent each.
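That indexing estimate is worth making explicit. A one-liner, assuming text-embedding-3-small's published $0.02 per million tokens (the token count is the article's rough estimate, not a measurement):

```python
def embedding_index_cost(total_tokens: int, price_per_mtok: float = 0.02) -> float:
    """Dollar cost to embed `total_tokens` at a given $/million-token rate."""
    return total_tokens * price_per_mtok / 1_000_000


# Six months of daily files at roughly 500K tokens:
# 500_000 * 0.02 / 1e6 = $0.01
cost = embedding_index_cost(500_000)
```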
The real cost is complexity — another service to run, another failure point, another thing to debug at 2 AM.
Q1: How many daily memory files do you have?
→ Under 100 files: File-based. Stop here. You don't need vectors yet.
→ Over 100 files: Continue to Q2.
Q2: Can keyword search find what you need?
→ Yes, grep/search works fine: File-based. Add better file naming conventions.
→ No, I need to search by meaning: Continue to Q3.
Q3: Are you already running a database?
→ Yes (Postgres): Add pgvector. One extension, no new service.
→ No: Continue to Q4.
Q4: Do you want to manage infrastructure?
→ Yes / I run locally: Chroma. Free, local, 5-minute setup.
→ No / I want managed: Pinecone free tier or OpenClaw memory_search.
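The four questions above collapse into a small pure function. A hypothetical encoding, useful as a checklist; the return strings are labels, not product endorsements:

```python
def choose_memory_backend(
    daily_files: int,
    keyword_search_works: bool,
    has_postgres: bool,
    wants_managed: bool,
) -> str:
    """Walk the four-question decision tree in order."""
    if daily_files < 100:                             # Q1: corpus size
        return "file-based"
    if keyword_search_works:                          # Q2: retrieval quality
        return "file-based + naming conventions"
    if has_postgres:                                  # Q3: existing database
        return "pgvector"
    # Q4: appetite for infrastructure
    return "Pinecone / memory_search" if wants_managed else "Chroma"
```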
Memory architecture is a reliability decision, not just a storage decision. The wrong setup causes specific, predictable failures.
Solo agent, 20 sessions/day, file-based:
Memory overhead: 3K tokens/session × 20 × 30 = 1.8M tokens/month → $1.80/month (Sonnet)
3-agent team, 50 sessions/day, file-based:
Memory overhead: 3K × 50 × 30 = 4.5M tokens/month → $4.50/month
Same team + Chroma vector search:
File memory: $4.50 + Embedding indexing: ~$0.05 + Chroma: free (local) → $4.55/month
Same team + Pinecone:
File memory: $4.50 + Embeddings: ~$0.05 + Pinecone: free tier → $4.55/month (up to 100K vectors)
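All three scenarios follow the same back-of-envelope formula. A sketch using the roughly $1 per million tokens implied by the article's figures; substitute your model's actual input rate:

```python
def monthly_memory_cost(
    tokens_per_session: int,
    sessions_per_day: int,
    price_per_mtok: float = 1.0,  # assumed rate; check your model's pricing
    days: int = 30,
) -> float:
    """Monthly dollar cost of memory-context overhead."""
    return tokens_per_session * sessions_per_day * days * price_per_mtok / 1_000_000


solo = monthly_memory_cost(3_000, 20)   # 1.8M tokens/month -> $1.80
team = monthly_memory_cost(3_000, 50)   # 4.5M tokens/month -> $4.50
```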
Bottom line: Memory is one of the cheapest parts of running agents. The cost difference between file-based and vector-augmented is essentially zero. The real cost is the engineering time to set it up and the operational complexity of maintaining it.
Start with MEMORY.md and a memory/ directory in your workspace. Upgrade to memory_search when keyword search stops finding what you need. Not before.

Tested memory patterns, cost optimization, multi-agent coordination, and more. Every guide runs in production. $9/month, cancel anytime.
Get The Library — $9/mo. 30-day money-back guarantee.