Python SDK Reference
pip install agentbay
Local-first memory for coding agents. Methods live directly on the client (ab.store, ab.recall, ...) and return plain dicts and lists — no wrapper objects. Every snippet on this page was run against agentbay 1.9.2.
Cloud (recommended)
Sign in once via the CLI. Your key is saved to ~/.agentbay/, any local memories migrate automatically, and AgentBay() picks up the saved login from then on.
pip install agentbay
agentbay login # opens your browser; sign in, done
from agentbay import AgentBay
ab = AgentBay() # finds the saved login -> cloud mode
ab.store("JWT auth with 24h refresh tokens", title="Auth pattern", type="PATTERN")
results = ab.recall("authentication")
Local (no signup)
With no API key (no argument, no AGENTBAY_API_KEY, no saved login), the same code runs fully local against SQLite at ~/.agentbay/local.db. Real output shown below.
from agentbay import AgentBay
ab = AgentBay() # no credentials anywhere -> local brain
# stderr: AgentBay: local brain ready (memories stay on this machine). ...
result = ab.store(
"Next.js 16 + Prisma + PostgreSQL with pgvector",
title="Project stack",
type="ARCHITECTURE",
)
print(result)
# {'id': '08887ce8-35f3-4e96-8298-57c8758e6c41', 'deduplicated': False}
memories = ab.recall("what stack does this project use?", limit=3)
print(memories[0])
# {'id': '08887ce8-35f3-4e96-8298-57c8758e6c41', 'title': 'Project stack',
# 'content': 'Next.js 16 + Prisma + PostgreSQL with pgvector',
# 'type': 'ARCHITECTURE', 'tier': 'semantic', 'tags': [],
# 'confidence': 0.5, 'summary': 'Next.js 16 + Prisma + PostgreSQL with pgvector',
# 'score': 0.0625}
One-time download: the first store/recall that needs vector search fetches a small ONNX embedding model (all-MiniLM-L6-v2, ~22 MB) via fastembed. If the download fails (e.g. offline), the SDK prints a warning and falls back to full-text search. Vector search resumes automatically once the model can be fetched.
AgentBay()
ab = AgentBay(api_key=None, base_url="https://www.aiagentsbay.com", project_id=None, timeout=30, telemetry=True, local=False)
Create a client. Credential routing: no api_key argument, no AGENTBAY_API_KEY env var, and no saved login means a fully working local brain (SQLite, no signup, never raises). Credentials found anywhere means cloud mode. local=True forces local even when credentials exist.
Parameters:
| api_key? | str | API key (ab_live_...). Falls back to AGENTBAY_API_KEY env var, then the login saved by `agentbay login`. |
| base_url? | str | API base URL. Default: https://www.aiagentsbay.com |
| project_id? | str | Default project for cloud memory operations. If omitted, the first memory call auto-creates a "My Brain" project. |
| timeout? | int | Request timeout in seconds. Default: 30 |
| telemetry? | bool | Set False to disable anonymous usage telemetry (counts only). Equivalent to AGENTBAY_TELEMETRY=0. |
| local? | bool | Force local mode even when credentials exist. |
Example:
import os
from agentbay import AgentBay
# Local mode (zero config)
ab = AgentBay()
# Force local even with credentials present
ab = AgentBay(local=True)
# Cloud mode, explicit key
ab = AgentBay(api_key="ab_live_...", project_id="your-project-id")
# Cloud mode from environment
os.environ["AGENTBAY_API_KEY"] = "ab_live_..."
ab = AgentBay(project_id="your-project-id")
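The credential routing described for AgentBay() can be pictured as a small decision function. This is only a sketch of the documented order, not the SDK's code: resolve_mode and its saved_login parameter are hypothetical names, and the saved login is passed in as a plain string rather than read from ~/.agentbay/.

```python
import os
from typing import Mapping, Optional


def resolve_mode(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    saved_login: Optional[str] = None,  # hypothetical stand-in for the key saved by `agentbay login`
    local: bool = False,
) -> str:
    """Return "local" or "cloud" following the documented routing order."""
    env = os.environ if env is None else env
    if local:
        # local=True forces local mode even when credentials exist.
        return "local"
    # Explicit argument first, then AGENTBAY_API_KEY, then the saved login.
    key = api_key or env.get("AGENTBAY_API_KEY") or saved_login
    return "cloud" if key else "local"


print(resolve_mode(env={}))                                          # local
print(resolve_mode(api_key="ab_live_example", env={}))               # cloud
print(resolve_mode(env={"AGENTBAY_API_KEY": "ab_live_example"}))     # cloud
print(resolve_mode(api_key="ab_live_example", env={}, local=True))   # local
```

Passing env explicitly keeps the sketch testable without touching the real process environment.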
All memory methods are top-level on the client and work identically in local and cloud mode. project_id arguments apply to cloud mode only.
store()
result = ab.store(content, title=None, project_id=None, type='PATTERN', tier='semantic', tags=None, user_id=None)
Store a memory. content is the first positional argument; title is optional (auto-generated from the first sentence if omitted). Local mode deduplicates semantically (cosine > 0.9) or by title + type.
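To make the local deduplication rule concrete, here is a minimal sketch of a check that treats a new entry as a duplicate when its embedding has cosine similarity above 0.9 with an existing one, or when title and type both match. The function names, the entry layout, and the three-dimensional toy vectors are assumptions for illustration; the SDK's internal data model is not documented here.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate(new: Dict, existing: List[Dict], threshold: float = 0.9) -> Optional[str]:
    """Return the id of the first matching entry, or None if the entry is new."""
    for entry in existing:
        same_title = entry["title"] == new["title"] and entry["type"] == new["type"]
        if same_title or cosine(entry["vector"], new["vector"]) > threshold:
            return entry["id"]
    return None


# Toy data: real embeddings have hundreds of dimensions.
existing = [{"id": "a", "title": "Auth pattern", "type": "PATTERN", "vector": [1.0, 0.0, 0.0]}]

near = {"title": "JWT auth", "type": "PATTERN", "vector": [0.95, 0.1, 0.0]}
other = {"title": "Project stack", "type": "ARCHITECTURE", "vector": [0.0, 1.0, 0.0]}
retitled = {"title": "Auth pattern", "type": "PATTERN", "vector": [0.0, 0.0, 1.0]}

print(find_duplicate(near, existing))      # a  (cosine is about 0.99)
print(find_duplicate(other, existing))     # None
print(find_duplicate(retitled, existing))  # a  (same title and type)
```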
Parameters:
| content | str | The knowledge content to store. |
| title? | str | Short title. Auto-generated if omitted. |
| type? | str | PATTERN, PITFALL, DECISION, PROCEDURE, ARCHITECTURE, FACT, PREFERENCE, CONTEXT. Default: PATTERN |
| tier? | str | semantic, episodic, procedural. Default: semantic |
| tags? | list[str] | Tags for filtering. |
| user_id? | str | Optional user scoping (stored as a user:<id> tag in cloud mode). |
Returns: dict — local mode: {'id': str, 'deduplicated': bool}; cloud mode: the created entry from the API
Example:
result = ab.store(
"checkRateLimit is async - always await it",
title="checkRateLimit is async",
type="PITFALL",
tags=["api", "rate-limit"],
)
print(result)
# {'id': 'da7da285-b994-4130-88ab-cc5e43d57a91', 'deduplicated': False}
recall()
memories = ab.recall(query, project_id=None, limit=5, tier=None, tags=None, user_id=None)
Search memories by semantic similarity. Local mode fuses FTS5, vector cosine similarity, and keyword TF-IDF via Reciprocal Rank Fusion.
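Reciprocal Rank Fusion combines several ranked lists by giving each item a score of 1 / (k + rank) per list and summing. The sketch below shows the general technique on three hypothetical rankings; the constant k = 60 is the value commonly used in the literature and is an assumption here, as are the memory ids. It is not taken from the SDK's source.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse ranked id lists; items ranked high in several lists rise to the top."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-method results, best match first.
fts_hits = ["m1", "m2", "m3"]
vector_hits = ["m2", "m1", "m4"]
keyword_hits = ["m2", "m3"]

fused = reciprocal_rank_fusion([fts_hits, vector_hits, keyword_hits])
print([doc_id for doc_id, _ in fused])  # ['m2', 'm1', 'm3', 'm4']
```

Because only ranks are used, the three methods do not need comparable raw scores, which is the usual reason to pick RRF for hybrid search.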
Parameters:
| query | str | Natural-language search query. |
| limit? | int | Max results (1-50). Default: 5 |
| tier? | str | Filter by storage tier (cloud mode). |
| tags? | list[str] | Filter by tags. |
| user_id? | str | Optional user scoping. |
Returns: list[dict] — each dict has id, title, content, type, tier, tags, confidence, summary (plus score in local mode)
Example:
memories = ab.recall("rate limiting", limit=3)
for m in memories:
    print(m["title"], m["confidence"])
verify()
ab.verify(knowledge_id, project_id=None)
Confirm a memory is still accurate. Local mode bumps helpful_count and confidence; cloud mode resets confidence decay. Call it when an entry was helpful.
Parameters:
| knowledge_id | str | ID of the entry to verify. |
Returns: None
Example:
ab.verify(result["id"])
forget()
ab.forget(knowledge_id, project_id=None)
Archive (soft-delete) a memory entry. The row stays on disk with archived=1 and disappears from every read path.
Parameters:
| knowledge_id | str | ID of the entry to forget. |
Returns: None
Example:
ab.forget(result["id"])
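The soft-delete behaviour can be illustrated with plain sqlite3. Only the archived flag comes from the description above; the table name, the other columns, and the helper functions are hypothetical, so treat this as a sketch of the pattern rather than the SDK's schema.

```python
import sqlite3

# In-memory database with a hypothetical table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "id TEXT PRIMARY KEY, title TEXT, archived INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO memories (id, title) VALUES (?, ?)",
    [("a", "Auth pattern"), ("b", "Project stack")],
)


def forget(db: sqlite3.Connection, memory_id: str) -> None:
    """Soft-delete: flag the row instead of removing it."""
    db.execute("UPDATE memories SET archived = 1 WHERE id = ?", (memory_id,))
    db.commit()


def visible_titles(db: sqlite3.Connection) -> list:
    """Every read path filters on archived = 0."""
    rows = db.execute("SELECT title FROM memories WHERE archived = 0 ORDER BY id")
    return [row[0] for row in rows]


forget(conn, "a")
print(visible_titles(conn))  # ['Project stack']

total_rows = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
print(total_rows)  # 2 (the archived row is still on disk)
```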
health()
stats = ab.health(project_id=None)
Get memory health statistics. Real local-mode output shown below.
Returns: dict
Example:
stats = ab.health()
print(stats)
# {'total_entries': 1, 'by_tier': {'semantic': 1},
# 'by_type': {'ARCHITECTURE': 1}, 'total_tokens': 14,
# 'has_embeddings': 1, 'search_methods': ['fts5', 'keyword', 'vector']}
add()
result = ab.add(data, user_id=None, agent_id=None, metadata=None)
Mem0-style store. Pass a string; AgentBay auto-detects the type and extracts a title.
Returns: dict — {'id': str, 'deduplicated': bool} in local mode
Example:
r = ab.add("We decided to use pgvector over a separate vector DB")
print(r)
# {'id': 'd984890e-092d-4df9-8269-38c24027ecbe', 'deduplicated': False}
search()
results = ab.search(query, user_id=None, limit=5)
Mem0-style alias for recall().
Returns: list[dict]
Example:
hits = ab.search("vector database decision", limit=2)
print([h["title"] for h in hits])
# ['We decided to use pgvector over a separate vector DB', 'Project stack']
chat()
response = ab.chat(messages, model=None, provider='auto', project_id=None, user_id=None, auto_recall=True, auto_store=True, recall_limit=3, **kwargs)
Wrap an LLM call with automatic memory: recalls relevant memories and injects them as context, calls the provider, then scans the response for learnings and stores them in the background. Bring your own LLM key (e.g. ANTHROPIC_API_KEY or OPENAI_API_KEY); the provider package must be installed.
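The recall, inject, call, store sequence can be sketched with stand-in callables. Everything here is illustrative: chat_with_memory, the fake recall and LLM functions, and the "Learned:" line convention are assumptions, and the sketch runs synchronously and returns a string, whereas chat() stores in the background and returns the raw provider response.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_with_memory(
    messages: List[Message],
    recall: Callable[[str, int], List[str]],
    call_llm: Callable[[List[Message]], str],
    store: Callable[[str], None],
    recall_limit: int = 3,
) -> str:
    # 1. Recall memories relevant to the latest user message.
    query = next((m["content"] for m in reversed(messages) if m["role"] == "user"), "")
    memories = recall(query, recall_limit)
    # 2. Inject them as a leading system message.
    if memories:
        context = "Relevant memories:\n" + "\n".join(f"- {m}" for m in memories)
        messages = [{"role": "system", "content": context}] + list(messages)
    # 3. Call the provider.
    reply = call_llm(messages)
    # 4. Scan the reply for learnings (naive heuristic: lines starting with "Learned:").
    for line in reply.splitlines():
        if line.lower().startswith("learned:"):
            store(line.split(":", 1)[1].strip())
    return reply


# Stand-ins so the sketch runs without an LLM key or the SDK.
stored: List[str] = []
seen: List[List[Message]] = []


def fake_recall(query: str, limit: int) -> List[str]:
    return ["Sessions expire after 24h"][:limit]


def fake_llm(msgs: List[Message]) -> str:
    seen.append(msgs)
    return "Fixed the expiry check.\nLearned: refresh tokens rotate on every use"


reply = chat_with_memory(
    [{"role": "user", "content": "fix the auth session expiry bug"}],
    fake_recall,
    fake_llm,
    stored.append,
)
print(stored)  # ['refresh tokens rotate on every use']
```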
Parameters:
| messages | list[dict] | Chat messages in OpenAI format: [{"role": "user", "content": "..."}] |
| model? | str | Model name. Defaults to the provider's default model. |
| provider? | str | Provider name, or "auto" to detect from available API keys / local LLM servers. Default: auto |
| auto_recall? | bool | Recall relevant memories before the call. Default: True |
| auto_store? | bool | Store learnings from the response. Default: True |
| recall_limit? | int | Max memories to inject (1-10). Default: 3 |
| **kwargs? | Any | Passed to the LLM client (max_tokens, temperature, api_key, ...). |
Returns: The raw provider response object (Anthropic Message, OpenAI ChatCompletion, ...) — memory is a side effect
Example:
response = ab.chat(
[{"role": "user", "content": "fix the auth session expiry bug"}],
provider="anthropic", # or "auto" to detect from env keys
)
# response is the raw Anthropic Message object
chat() supports 16 providers. Most are OpenAI-compatible and need only the openai package plus the right API key.
provider="auto" checks API-key env vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, ...) in priority order, then probes for local LLM servers (Ollama, LM Studio, llama.cpp).
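A rough sketch of what such auto-detection could look like follows. The exact priority order, the provider name strings for local servers, and the probe mechanism are assumptions; the URLs are the usual default ports of Ollama, LM Studio, and the llama.cpp server, not values confirmed by this page. The reachability check is injected so the example runs without any network access.

```python
from typing import Callable, Mapping, Optional

# Assumed priority order; only the two variable names appear in the text above.
ENV_PRIORITY = (
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
)

# Hypothetical provider names paired with each tool's usual default address.
LOCAL_SERVERS = (
    ("ollama", "http://localhost:11434"),
    ("lmstudio", "http://localhost:1234"),
    ("llamacpp", "http://localhost:8080"),
)


def detect_provider(
    env: Mapping[str, str], is_reachable: Callable[[str], bool]
) -> Optional[str]:
    """Check API-key env vars first, then probe local servers; None if nothing is found."""
    for name, variable in ENV_PRIORITY:
        if env.get(variable):
            return name
    for name, url in LOCAL_SERVERS:
        if is_reachable(url):
            return name
    return None


nothing_running = lambda url: False
only_ollama = lambda url: url.endswith(":11434")

print(detect_provider({"OPENAI_API_KEY": "sk-example"}, nothing_running))  # openai
print(detect_provider({}, only_ollama))                                    # ollama
print(detect_provider({}, nothing_running))                                # None
```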
LocalMemory is the local engine behind AgentBay() in local mode: pure Python + SQLite (FTS5 + optional vector search), usable directly. It exposes the same store / recall / verify / forget / health methods, plus export(), auto_learn(text) (extract learnings via Ollama, the Anthropic/OpenAI API, or heuristics), and upgrade(api_key) to migrate to cloud.
from agentbay import LocalMemory
mem = LocalMemory(quiet=True) # ~/.agentbay/local.db
print(mem)
# LocalMemory(db='/Users/you/.agentbay/local.db', entries=2, search=[fts5, keyword, vector])
The CLI is installed with the package as agentbay:
agentbay init # choose cloud (recommended) or local-only setup
agentbay login # open browser, sign up/in, migrate local memories to cloud
agentbay status # local memory count + cloud connection state
agentbay sync # push any unsynced local memories to cloud