AI Memory Architecture for Agentic Systems
Current AI agent memory systems suffer from dimensional poverty: most store memories along a single axis, typically vector similarity, and discard the temporal, relational, and causal structure that gives memories meaning.

The ideal AI memory needs both: fast semantic retrieval over embeddings, and a graph of typed relationships that can be traversed for reasoning.
SurrealDB is uniquely positioned as a multi-model database that unifies:
| Capability | Traditional Approach | SurrealDB Approach |
|---|---|---|
| Documents | MongoDB / PostgreSQL JSONB | Native, schemaless records |
| Graph Relations | Neo4j (+ separate vector DB) | Built-in RELATE statements |
| Vector Search | Pinecone / Weaviate (+ graph DB) | Native vector indexes |
| Full-Text Search | Elasticsearch (+ sync layer) | Integrated FTS indexes |
| Real-Time Sync | WebSocket + pub/sub | Live queries (built-in) |
The `RELATE` statement creates graph edges as first-class records, while vector fields enable similarity search, all in one query language (SurrealQL).
```surql
DEFINE TABLE memory_node SCHEMAFULL;

DEFINE FIELD content ON memory_node TYPE string;
DEFINE FIELD embedding ON memory_node TYPE array<float>;
DEFINE FIELD embedding.* ON memory_node TYPE float;
DEFINE FIELD importance ON memory_node TYPE float DEFAULT 0.5;
DEFINE FIELD created_at ON memory_node TYPE datetime DEFAULT time::now();
DEFINE FIELD accessed_at ON memory_node TYPE datetime DEFAULT time::now();
DEFINE FIELD access_count ON memory_node TYPE int DEFAULT 0;
DEFINE FIELD memory_type ON memory_node TYPE string;
DEFINE FIELD source ON memory_node TYPE string;
DEFINE FIELD metadata ON memory_node TYPE object;

-- Vector index for similarity search
DEFINE INDEX memory_embedding_idx ON memory_node
    FIELDS embedding
    MTREE DIMENSION 1536
    DIST COSINE;
```
```surql
-- Temporal: memory A happened before memory B
DEFINE TABLE temporal_rel SCHEMAFULL TYPE RELATION;
DEFINE FIELD in ON temporal_rel TYPE record<memory_node>;
DEFINE FIELD out ON temporal_rel TYPE record<memory_node>;
DEFINE FIELD strength ON temporal_rel TYPE float DEFAULT 1.0;

-- Semantic: memory A is related to memory B
DEFINE TABLE semantic_rel SCHEMAFULL TYPE RELATION;
DEFINE FIELD in ON semantic_rel TYPE record<memory_node>;
DEFINE FIELD out ON semantic_rel TYPE record<memory_node>;
DEFINE FIELD relation_type ON semantic_rel TYPE string;
DEFINE FIELD strength ON semantic_rel TYPE float;

-- Causal: memory A caused/led to memory B
DEFINE TABLE causal_rel SCHEMAFULL TYPE RELATION;
DEFINE FIELD in ON causal_rel TYPE record<memory_node>;
DEFINE FIELD out ON causal_rel TYPE record<memory_node>;
DEFINE FIELD confidence ON causal_rel TYPE float;
```
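Populating these edge tables uses the `RELATE` statement directly. As a minimal sketch, the record IDs and field values below are illustrative, and the remaining required `memory_node` fields (embedding, memory_type, and so on) are omitted for brevity:

```surql
-- Create two memories and a typed, weighted edge between them
CREATE memory_node:meeting SET content = "Discussed Q3 roadmap";
CREATE memory_node:decision SET content = "Chose SurrealDB for memory";

RELATE memory_node:meeting->causal_rel->memory_node:decision
    SET confidence = 0.9;

-- Edges are ordinary records: they can be selected and filtered directly
SELECT * FROM causal_rel WHERE confidence > 0.8;
```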
```surql
DEFINE ANALYZER memory_analyzer
    TOKENIZERS class, camel
    FILTERS lowercase, snowball(english);

DEFINE INDEX memory_content_search ON memory_node
    FIELDS content
    SEARCH ANALYZER memory_analyzer
    BM25;
```
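With the analyzer and index in place, a BM25-ranked lookup can use the full-text match operator; `search::score(1)` pairs with the `@1@` match reference, and the query text here is illustrative:

```surql
-- Rank memories by BM25 relevance to a text query
SELECT *, search::score(1) AS score
FROM memory_node
WHERE content @1@ 'project deadline'
ORDER BY score DESC
LIMIT 10;
```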
```surql
-- Find memories semantically similar to query embedding
SELECT *, vector::similarity::cosine(embedding, $query_vector) AS similarity
FROM memory_node
WHERE embedding <|10|> $query_vector
ORDER BY similarity DESC;
```
```surql
-- Find all memories temporally connected to a specific memory (two hops)
SELECT * FROM memory_node:specific_id
    ->temporal_rel->memory_node
    ->temporal_rel->memory_node;

-- Find memories related through any edge type (? is the any-table wildcard)
SELECT * FROM memory_node:start_id->?->memory_node
WHERE importance > 0.7;
```
```surql
-- Find similar memories, then traverse their connections
LET $similar = (
    SELECT VALUE id FROM memory_node
    WHERE embedding <|5|> $query_vector
);

-- Expand to the neighbours reachable over semantic edges
LET $neighbors = array::flatten(
    (SELECT VALUE ->semantic_rel->memory_node FROM $similar)
);

-- Score the union of the seed nodes and their connected memories
SELECT *,
    vector::similarity::cosine(embedding, $query_vector) AS similarity
FROM array::union($similar, $neighbors)
ORDER BY importance * similarity DESC
LIMIT 20;
```
```surql
-- Boost memories that are both similar AND important
SELECT *,
    vector::similarity::cosine(embedding, $query_vector) * importance AS score
FROM memory_node
WHERE embedding <|20|> $query_vector
ORDER BY score DESC
LIMIT 10;
```
```surql
-- Get memories in a time window, plus those temporally linked to an anchor
SELECT * FROM memory_node
WHERE (created_at > $start_time AND created_at < $end_time)
    OR id IN (
        SELECT VALUE out FROM temporal_rel
        WHERE in = $anchor_memory
    )
ORDER BY created_at;
```
- Architecture: OpenClaw tool → SurrealDB SDK → SurrealDB instance
- Architecture: Working memory (files) + Deep memory (SurrealDB)
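One way the deep-memory side can feed working memory is SurrealDB's built-in live queries. As a sketch, with the importance threshold being illustrative, a subscriber receives matching changes over the existing WebSocket connection:

```surql
-- Push every new or updated high-importance memory to the subscriber
LIVE SELECT * FROM memory_node WHERE importance > 0.8;
```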
```python
from surrealdb import Surreal
from openai import AsyncOpenAI


class GraphMemory:
    def __init__(self, db_url: str = "ws://localhost:8000"):
        # Note: exact connect/signin calls vary between SDK versions
        self.db = Surreal(db_url)
        self.oai = AsyncOpenAI()

    async def connect(self):
        # The SDK's network calls are coroutines and must be awaited,
        # so connection setup lives outside __init__
        await self.db.connect()
        await self.db.signin({"user": "root", "pass": "root"})
        await self.db.use("agent", "memory")

    async def store(self, content: str, memory_type: str = "observation"):
        # Generate embedding
        embedding = await self._embed(content)
        # Create memory node; created_at/accessed_at come from the
        # schema's DEFAULT time::now(), so they are not passed here
        memory = await self.db.create("memory_node", {
            "content": content,
            "embedding": embedding,
            "memory_type": memory_type,
            "importance": 0.5,
        })
        # Link to recent memories (helper elided)
        await self._create_temporal_links(memory[0]["id"])
        return memory[0]

    async def recall(self, query: str, limit: int = 10):
        embedding = await self._embed(query)
        # Hybrid: vector search, then graph expansion over semantic edges
        results = await self.db.query(
            """
            LET $similar = (
                SELECT VALUE id FROM memory_node
                WHERE embedding <|5|> $embedding
            );
            LET $neighbors = array::flatten(
                (SELECT VALUE ->semantic_rel->memory_node FROM $similar)
            );
            SELECT *, vector::similarity::cosine(embedding, $embedding) AS sim
            FROM array::union($similar, $neighbors)
            ORDER BY importance * sim DESC
            LIMIT $limit;
            """,
            {"embedding": embedding, "limit": limit},
        )
        return results

    async def _embed(self, text: str) -> list[float]:
        response = await self.oai.embeddings.create(
            model="text-embedding-3-small",
            input=text,
        )
        return response.data[0].embedding
```
| Aspect | Neo4j + GraphRAG Python | SurrealDB Native |
|---|---|---|
| Vector Storage | Separate (Neo4j + Pinecone/Weaviate) | Built-in |
| Query Language | Cypher + Python SDK | SurrealQL (SQL-like) |
| Deployment | Complex (multi-service) | Single binary / container |
| Real-Time | Polling / custom | Live queries (WebSocket) |
| Maturity | Enterprise-proven | Rapidly evolving (v2.x) |
| Ecosystem | Rich (LangChain, etc.) | Growing |
| Self-Hosting | Requires Aura or self-managed | Single binary, edge-ready |
| Risk | Mitigation |
|---|---|
| SurrealDB v2.x breaking changes | Pin version, test upgrades in staging |
| Vector dimension limits | Use 1536 (OpenAI) or test 768 (local models) |
| Query performance at scale | Index optimization, query result caching |
| Embedding generation cost | Batch processing, local model fallback |
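The schema's `access_count` and `accessed_at` fields support the kind of maintenance these mitigations imply. As an illustrative sketch (the decay factor and thresholds are arbitrary), a periodic job could decay importance and prune stale memories:

```surql
-- Decay importance for memories not accessed in 30 days
UPDATE memory_node
SET importance = importance * 0.9
WHERE accessed_at < time::now() - 30d;

-- Prune memories that have decayed below a floor and were rarely used
DELETE memory_node
WHERE importance < 0.05 AND access_count < 2;
```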
SurrealDB's multi-model approach eliminates the need for separate vector and graph databases. For an AI agent requiring both semantic search and relational reasoning, this reduces operational complexity while enabling sophisticated memory patterns that mirror human cognition.
The trade-off is maturity: Neo4j has years of production hardening behind it, while SurrealDB is newer but improving rapidly. For a lean core that prioritizes architectural elegance over enterprise legacy, SurrealDB is the sharper tool.