Korely

SDK

korely-memory is a typed Python and Node.js client over the REST API. Every method maps 1:1 onto an endpoint, so anything you can do with curl you can do with the SDK, and the JSON shapes in the API reference are the attribute shapes here. It uses the same backend and the same memory as the CLI.

# 1. Connect
from korely_memory import Korely
korely = Korely(api_key="kor_live_...")
# 2. Remember (typed facts + bi-temporal built-in)
korely.add("User prefers TypeScript with strict mode", user_id="dana")
# 3. The contradiction resolves itself: the new fact
# invalidates the old one, never deletes it
korely.add("Actually I switched to Rust", user_id="dana")
# 4. Recall — assembles the active facts into a prompt-ready block
ctx = korely.get_context(query="preferred language", user_id="dana")
print(ctx.context)
# → "## Known facts\n- dana likes Rust (since ...)"
# the superseded TypeScript fact is excluded automatically

Install

pip install korely-memory

Python 3.9 or later. The Node SDK needs Node 18+. No heavy dependencies: the SDK is a thin HTTP client. Embeddings, entity extraction, and fact extraction with contradiction checking all run server-side on our own infrastructure, so your install stays small and your process stays light.

Initialize

from korely_memory import Korely
korely = Korely(api_key="kor_live_...")
# Or read the key from the environment (KORELY_API_KEY)
korely = Korely()

Keys look like kor_live_... and are scoped per agent namespace: a key minted for agent_id=support-bot only reads and writes inside that namespace. All keys belong to the eu region. Your data is stored and processed in the EU, on our own infrastructure.

Core methods

Every method wraps exactly one REST endpoint:

| Method | REST endpoint | Path |
| --- | --- | --- |
| korely.add(...) | POST /v1/memories | Write |
| korely.search(...) | POST /v1/memories/search | Read |
| korely.get_all(...) | GET /v1/memories | Read |
| korely.get(id) | GET /v1/memories/:id | Read |
| korely.update(id, ...) | PATCH /v1/memories/:id | Write |
| korely.delete(id) | DELETE /v1/memories/:id | Write |
| korely.delete_all(user_id=...) | DELETE /v1/users/:user_id/memories | Write |
| korely.add_fact_triple(...) | POST /v1/facts | Write |
| korely.get_facts(...) | GET /v1/facts | Read |
| korely.get_profile(user_id=...) | GET /v1/profile | Read |
| korely.get_context(...) | GET /v1/context | Read |
| korely.history(id) | GET /v1/memories/:id/history | Read |
| korely.users(...) | GET /v1/users | Read |
| korely.batch(...) | POST /v1/batch | Write |
| korely.batch_status(id) | GET /v1/batch/:id | Read |

add also accepts a list of chat messages (role / content dicts), not just a string — they are joined into one block before storing, so you can hand it a conversation as-is.
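
To make the input shape concrete, here is a minimal sketch of what joining a conversation into one block could look like. The join_messages helper and its "role: content" line format are assumptions for illustration only; the service does the joining itself, and its exact output format is not specified on this page.

```python
def join_messages(messages):
    """Flatten role/content dicts into one text block, one line per message.

    Hypothetical helper: the real join happens server-side and may differ.
    """
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


conversation = [
    {"role": "user", "content": "I moved the renewal date to October 1st."},
    {"role": "assistant", "content": "Noted, renewal is now October 1st."},
]
block = join_messages(conversation)
```

With the SDK there is no need to join anything yourself: the conversation list can be handed to add as-is.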

Reads are retrieval, not generation. No generative model ever composes output on the read path. There is no reranker model and no answer synthesis; your agent's own model does the reasoning. The write path is where the intelligence runs: embeddings, entity extraction, typed-fact extraction with contradiction checking and bi-temporal validity, about a tenth of a cent per memory, all included. That is why read quotas are an order of magnitude more generous than write quotas.

add

Maps to POST /v1/memories. One call runs the full write pipeline. The return value includes the facts that were extracted, and which older facts the write superseded.

memory = korely.add(
    "Northwind Hosting costs 50 euro per month since the June upgrade.",
    agent_id="infra-bot",
    user_id="customer-4812",
    metadata={"source": "slack"},
)
print(memory.id)  # mem_8f2c1a
for fact in memory.facts:
    print(fact.subject, fact.predicate, fact.object)
    # Northwind Hosting costs 50 euro per month
    print(fact.invalidated)  # ["fct_a774"]: the old price, retired

search

Maps to POST /v1/memories/search. Semantic vector search: cosine similarity over the memory embeddings. The only model call on the read path is the query embedding, a fraction of a hundredth of a cent. Keyword-style queries (1 to 5 words) work best. For recall you almost always want get_context (below) instead — it assembles the active typed facts into a prompt-ready block, which is the primary recall path. search returns raw memory snippets and is the secondary path.

results = korely.search(
    "northwind pricing",
    user_id="customer-4812",
    limit=5,
)
for hit in results:
    print(hit.id, hit.score, hit.snippet)
    # mem_8f2c1a 0.91 Northwind Hosting costs 50 euro per month...

Optional filters mirror the REST params: agent_id, user_id, limit. Filters are additive (AND).
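
For readers who want to see what "cosine similarity over the memory embeddings" means in practice, the following is a small self-contained sketch. The three-dimensional vectors and the memory ids are made up for illustration; real embeddings have many more dimensions, are computed server-side, and the service's actual ranking code is not shown on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, memories, limit=5):
    """Return (memory_id, score) pairs sorted by score, best match first."""
    scored = [(mem_id, cosine_similarity(query_vec, vec)) for mem_id, vec in memories.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors, hypothetical: one memory about pricing, one about standups.
memories = {
    "mem_pricing": [0.9, 0.1, 0.0],
    "mem_standup": [0.0, 0.2, 0.9],
}
top = rank([1.0, 0.0, 0.0], memories, limit=1)
```

Because the score depends only on vector direction, a short keyword query and a long memory can still score highly when they point the same way in embedding space.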

get_all

Maps to GET /v1/memories. List memories in a scope, newest first, no search query needed. Pass user_id or agent_id to narrow the scope, and page with limit (default 50) and offset. Listing is plain SQL on the read path, no model calls.

page = korely.get_all(
    user_id="customer-4812",
    agent_id=None,
    limit=50,
    offset=0,
)
print(page.total)  # 218
for memory in page:
    print(memory.id, memory.created_at)
    # mem_8f2c1a 2026-06-07T09:14:00Z
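
Walking every page with limit and offset can be wrapped in a small generator. This is a sketch, not part of the SDK: iter_all_memories is a hypothetical helper, and it assumes only what the example above shows, namely that a page is iterable and exposes .total.

```python
def iter_all_memories(client, page_size=50, **scope):
    """Yield every memory in a scope by requesting limit/offset pages in order.

    `client` is any object with a get_all(limit=..., offset=..., **scope) method.
    """
    offset = 0
    while True:
        page = client.get_all(limit=page_size, offset=offset, **scope)
        items = list(page)
        if not items:
            return  # empty page: nothing left to read
        yield from items
        offset += len(items)
        if offset >= page.total:
            return  # every memory reported by the server has been yielded
```

Usage would look like `for memory in iter_all_memories(korely, user_id="customer-4812"): ...`. Because the list is ordered newest first, memories written while you are paging can shift later pages, so treat the result as a best-effort snapshot.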

get

Maps to GET /v1/memories/:id. Full content, metadata, and the facts extracted from this memory.

memory = korely.get("mem_8f2c1a")
print(memory.content) # Northwind Hosting costs 50 euro per month...
print(memory.metadata) # {"source": "slack"}
print(memory.created_at) # 2026-06-07T09:14:00Z

update

Maps to PATCH /v1/memories/:id. Updating content re-runs extraction, so facts stay in sync with what the memory says. Pass expected_updated_at for optimistic concurrency: if another writer got there first, the call raises instead of clobbering.

memory = korely.update(
    "mem_8f2c1a",
    content="Northwind Hosting costs 55 euro per month after the storage add-on.",
    expected_updated_at="2026-06-07T09:14:00Z",
)
print(memory.facts[0].object)  # 55 euro per month
print(memory.facts[0].invalidated)  # ["fct_b91e"]
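
The idea behind expected_updated_at is a compare-and-set: the write goes through only if the stored timestamp still matches the one the caller last saw. The sketch below models that locally with an in-memory store. Store, Record, and ConflictError are hypothetical names for illustration; this is not the service's implementation, and the exception the SDK actually raises on a conflict is not specified here.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when the stored row is newer than the caller expected."""


@dataclass
class Record:
    content: str
    updated_at: str  # ISO-8601 timestamp of the last write


class Store:
    def __init__(self):
        self._rows = {}

    def put(self, memory_id, content, now):
        self._rows[memory_id] = Record(content, now)

    def update(self, memory_id, content, now, expected_updated_at=None):
        row = self._rows[memory_id]
        # Compare-and-set: refuse the write if another writer touched the row first.
        if expected_updated_at is not None and row.updated_at != expected_updated_at:
            raise ConflictError(f"{memory_id} was changed at {row.updated_at}")
        row.content, row.updated_at = content, now
        return row


store = Store()
store.put("mem_8f2c1a", "costs 50 euro", now="2026-06-07T09:14:00Z")
# First writer holds the current timestamp, so the update succeeds.
store.update("mem_8f2c1a", "costs 55 euro", now="2026-06-08T10:00:00Z",
             expected_updated_at="2026-06-07T09:14:00Z")
try:
    # Second writer still holds the old timestamp and is rejected.
    store.update("mem_8f2c1a", "costs 60 euro", now="2026-06-08T10:05:00Z",
                 expected_updated_at="2026-06-07T09:14:00Z")
    stale_write_rejected = False
except ConflictError:
    stale_write_rejected = True
```

The usual recovery is to re-read the memory, reapply the change on top of the fresh content, and retry with the new timestamp.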

delete

Maps to DELETE /v1/memories/:id. Forget one memory. Audited invalidation, not a hard row delete: the memory and its facts drop out of every default read, and an audit stub records when and by which key it was forgotten.

receipt = korely.delete("mem_8f2c1a")
print(receipt.status) # forgotten
print(receipt.facts_invalidated) # 1
print(receipt.audit_id) # aud_3d0f

delete_all

Maps to DELETE /v1/users/:user_id/memories. Bulk erasure for one end user: every memory and fact scoped to that user_id is invalidated in a single call, with one audit record. Deleting every memory for a user becomes one method call.

receipt = korely.delete_all(user_id="customer-4812")
print(receipt.memories_forgotten) # 218
print(receipt.facts_invalidated) # 64
print(receipt.audit_id) # aud_91xb

get_facts

Maps to GET /v1/facts. Typed (subject, predicate, object) triples with bi-temporal validity, the core of Korely's memory model. Reads from the fact store are deterministic SQL, no model calls, typically under 50 ms. Returns a flat list of facts (use get_profile for the grouped-by-family view). Pass as_of for a point-in-time query: what was true on that date. Works on every tier, hobby included.

# Current state: only active facts
facts = korely.get_facts(entity="Northwind Hosting")
print(facts[0].object) # 50 euro per month
print(facts[0].invalid_at) # None — active
# Point-in-time: what did we believe on June 1?
facts = korely.get_facts(entity="Northwind Hosting", as_of="2026-06-01")
print(facts[0].object) # 40 euro per month
print(facts[0].invalid_at) # 2026-06-07T09:14:00Z — superseded since
# Full history chain, superseded facts included
facts = korely.get_facts(
    entity="Northwind Hosting",
    include_invalidated=True,
)

Filters mirror the REST contract: subject, entity (matches either side of the triple), predicate, predicate_family, include_invalidated, as_of, limit. See temporal facts for how invalidation works.
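
A point-in-time query boils down to an interval test on each fact: it counts as true at a given moment if it had already become valid and had not yet been invalidated. The sketch below shows that test locally. The field names valid_from and invalid_at come from the Fact shape on this page; the fact ids, the old fact's valid_from value, and the half-open interval convention are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def ts(text):
    """Parse an ISO-8601 date or timestamp (a trailing Z is allowed) as an aware UTC datetime."""
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


@dataclass
class Fact:
    id: str
    subject: str
    predicate: str
    object: str
    valid_from: str
    invalid_at: Optional[str] = None  # None means the fact is still active


def facts_as_of(facts, as_of=None):
    """Return active facts, or the facts that held at `as_of` when it is given."""
    if as_of is None:
        return [f for f in facts if f.invalid_at is None]
    moment = ts(as_of)
    return [
        f for f in facts
        if ts(f.valid_from) <= moment
        and (f.invalid_at is None or moment < ts(f.invalid_at))
    ]


# Illustrative history; ids and the first valid_from are hypothetical.
history = [
    Fact("fct_old", "Northwind Hosting", "costs", "40 euro per month",
         valid_from="2026-01-01T00:00:00Z", invalid_at="2026-06-07T09:14:00Z"),
    Fact("fct_new", "Northwind Hosting", "costs", "50 euro per month",
         valid_from="2026-06-07T09:14:00Z"),
]
current = facts_as_of(history)
on_june_1 = facts_as_of(history, as_of="2026-06-01")
```

Nothing is deleted to answer either question: the superseded fact stays in the list with its invalid_at set, which is what makes the past queryable.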

get_context

Maps to GET /v1/context. The one-call method: it assembles a prompt-ready context block (profile plus relevant facts plus relevant memories) within a token budget. Assembly is deterministic retrieval and formatting, not generation. Drop the returned string into your system prompt. This is the method most agent frameworks use.

ctx = korely.get_context(
    query="plan infra budget",
    user_id="customer-4812",
    token_budget=800,
)
print(ctx.tokens)  # 642
print(ctx.sources)  # ["fct_b91e", "mem_8f2c1a"]
# user_message holds the end user's current turn
messages = [
    {"role": "system", "content": f"You are a helpful assistant.\n\n{ctx.context}"},
    {"role": "user", "content": user_message},
]
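
Assembling "within a token budget" can be pictured as greedy packing: take lines in priority order and stop when the next one would not fit. The sketch below is an illustration under loud assumptions: assemble_context and estimate_tokens are hypothetical helpers, the one-token-per-word count is a crude stand-in for a real tokenizer, and the service's actual ordering and formatting rules are not documented on this page.

```python
def estimate_tokens(text):
    """Crude estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def assemble_context(sections, token_budget=800):
    """Pack (heading, lines) sections in priority order until the budget is spent.

    Returns the assembled text and the estimated tokens used. A heading is
    only emitted (and paid for) together with the first line under it.
    """
    lines, used = [], 0
    for heading, items in sections:
        heading_added = False
        for item in items:
            cost = estimate_tokens(item)
            if not heading_added:
                cost += estimate_tokens(heading)
            if used + cost > token_budget:
                return "\n".join(lines), used  # budget reached: stop here
            if not heading_added:
                lines.append(heading)
                heading_added = True
            lines.append(item)
            used += cost
    return "\n".join(lines), used


sections = [
    ("## Known facts", ["- Northwind Hosting costs 55 euro per month"]),
    ("## Relevant memories", ["- Asked to be contacted on Slack, not email."]),
]
text, used = assemble_context(sections, token_budget=12)
```

The point of the sketch is the design choice, not the numbers: assembly is deterministic selection and formatting, so the same inputs and budget always yield the same block.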

batch

Maps to POST /v1/batch. Bulk import for migrations: up to 500 memory objects per call (same shape as add), processed asynchronously. Items count against the memory quota.

job = korely.batch([
    {"content": "Prefers async standups over meetings.", "user_id": "customer-0001"},
    {"content": "Renewal date moved to October 1st.", "user_id": "customer-0002"},
])
print(job.status)  # processing
job = korely.batch_status(job.id)
print(job.status)  # completed
print(job.imported)  # 2
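
For imports larger than the 500-item limit, one workable pattern is to split the list into chunks and poll each job until it leaves the processing state. The helpers below are hypothetical, not part of the SDK. They assume only what this page shows: batch returns a job with .id, and batch_status returns an object with .status and .imported. The poll interval and the poll cap are arbitrary choices.

```python
import time

MAX_BATCH = 500  # per-call limit stated above


def chunked(items, size=MAX_BATCH):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_all(client, items, poll_seconds=2.0, max_polls=150, sleep=time.sleep):
    """Submit items chunk by chunk and return the total number imported.

    Raises TimeoutError if a job is still processing after max_polls checks.
    """
    imported = 0
    for chunk in chunked(items):
        job = client.batch(chunk)
        for _ in range(max_polls):
            status = client.batch_status(job.id)
            if status.status != "processing":
                break
            sleep(poll_seconds)
        else:
            raise TimeoutError(f"batch job {job.id} is still processing")
        imported += status.imported
    return imported
```

A call such as `import_all(korely, rows)` processes chunks one at a time, which keeps the sketch simple; a production migration would likely also inspect the failed and errors fields listed for batch_status.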

add_fact_triple

Maps to POST /v1/facts. Write a typed (subject, predicate, object) fact directly, skipping extraction, when your agent already has the structured form. The contradiction check still runs, and the fact is bi-temporal — pass valid_from for a historical fact.

fact = korely.add_fact_triple(
    "Marco", "works_at", "Acme GmbH",
    user_id="customer-4812",
    subject_type="person",
    valid_from="2026-06-01",
)
print(fact.invalidated)  # ids of any facts this one superseded
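
To illustrate what "supersede, never delete" means for stored triples, here is a deliberately naive local model. It treats any active fact with the same subject and predicate but a different object as contradicted. That rule, the add_fact helper, and the sample employer "Initech" are assumptions for illustration; the real contradiction check runs server-side and may follow different rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    id: str
    subject: str
    predicate: str
    object: str
    valid_from: str
    invalid_at: Optional[str] = None
    invalidated: List[str] = field(default_factory=list)  # ids this fact superseded


def add_fact(store, new):
    """Append `new`, retiring active facts it contradicts (naive same-slot rule)."""
    for old in store:
        same_slot = (old.subject, old.predicate) == (new.subject, new.predicate)
        if same_slot and old.invalid_at is None and old.object != new.object:
            old.invalid_at = new.valid_from  # retire the old fact, keep the row
            new.invalidated.append(old.id)
    store.append(new)
    return new


store = []
add_fact(store, Fact("fct_1", "Marco", "works_at", "Initech", valid_from="2025-01-01"))
latest = add_fact(store, Fact("fct_2", "Marco", "works_at", "Acme GmbH", valid_from="2026-06-01"))
```

After the second write both rows are still in the store; only the invalid_at marker on the first one changes, which is what lets a later as_of query see the earlier employer.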

get_profile

Maps to GET /v1/profile. The assembled profile of one end user: the active facts known about them, the end user's own facts first, grouped by family. Pass as_of for the profile as it stood on a past date.

profile = korely.get_profile(user_id="customer-4812")
print(profile.total) # 7
print(list(profile.by_family)) # ["places", "work", "preferences"]
# The profile as it stood on March 1st
past = korely.get_profile(user_id="customer-4812", as_of="2026-03-01")

history

Maps to GET /v1/memories/:id/history. The lifecycle of one memory, keyed on the mem_ id: the events created, updated, fact_extracted, and fact_invalidated, each timestamped. (To walk a single fact's supersede chain, use get_facts with include_invalidated=True instead.)

h = korely.history("mem_8f2c1a")
for event in h.events:
    print(event.event, event.at)  # created ... / fact_extracted ...

users

Maps to GET /v1/users. The end users you've stored data for, each with active memory and fact counts. Returns a page: iterable like a list, with .total for pagination.

page = korely.users()
print(page.total)  # 1
for u in page:
    print(u.user_id, u.memories, u.facts)

Migrating from another memory API? The migration guide maps the request shapes side by side.

Method signatures at a glance

In Python, every parameter after the * in a signature is keyword-only; in Node, those parameters are passed as the second positional object. Node method names are camelCase versions of the Python names (get_all becomes getAll, delete_all becomes deleteAll, and so on).

Memory

| Method (Python) | Parameters | Returns |
| --- | --- | --- |
| add(content, *, user_id?, agent_id?, run_id?, metadata?) | content: str or list of role/content dicts. Scope with any combination of user_id, agent_id, run_id. metadata: free-form dict stored alongside the memory. | Memory with id, facts, invalidated |
| search(query, *, user_id?, agent_id?, limit?) | query: str, keyword or natural-language. Filters are AND-combined. limit: default 15, max 50. | list of SearchHit with id, score, snippet, user_id, agent_id, metadata |
| get_all(*, user_id?, agent_id?, limit?, offset?) | No query. Lists by recency. limit: default 50. | MemoryPage with memories, total |
| get(id) | id: str, e.g. mem_8f2c1a | Memory, the full object including facts |
| update(id, *, content, expected_updated_at?) | Re-runs extraction on new content. expected_updated_at enables optimistic concurrency. | The updated Memory |
| delete(id) | id: str | DeleteReceipt with status, facts_invalidated, audit_id |
| delete_all(*, user_id) | user_id: str. Erases every memory for one end user. | BulkReceipt with memories_forgotten, facts_invalidated, audit_id |
| history(id) | id: str | MemoryHistory with list of events |
| batch(items) | items: list of add-shaped dicts, max 500 | BatchJob with id, status, received |
| batch_status(id) | id: str, the job id from batch() | BatchJob with status, received, imported, failed, errors |

Facts

| Method (Python) | Parameters | Returns |
| --- | --- | --- |
| get_facts(*, entity?, subject?, predicate?, predicate_family?, user_id?, agent_id?, as_of?, include_invalidated?, limit?) | All filters optional. entity matches either side of the triple. predicate is normalized server-side (the raw verb is returned as predicate_raw); predicate_family is a limited taxonomy, so many predicates map to other. Filter by entity when in doubt. as_of: ISO-8601 date string for point-in-time queries. include_invalidated: bool, default false. Works on every tier, hobby included. | flat list of Fact with id, subject, predicate, predicate_raw, object, predicate_family, valid_from, invalid_at, invalidated_by, source_memory_id (a fact is live when invalid_at is null) |
| add_fact_triple(subject, predicate, object, *, user_id?, agent_id?, subject_type?, valid_from?) | Writes a typed triple directly, skipping NLP extraction. Contradiction check still runs. valid_from: ISO-8601 for historical back-dating. | Fact with invalidated list |

Context and profile

| Method (Python) | Parameters | Returns |
| --- | --- | --- |
| get_context(*, query, user_id?, agent_id?, token_budget?) | query: the current user turn or topic. token_budget: default 800. Assembly is deterministic retrieval, no generation. | Context with context (str ready for system prompt), tokens, sources |
| get_profile(*, user_id, as_of?) | user_id: required. as_of: ISO-8601 for a historical snapshot. | Profile with total, by_family dict |

Admin

| Method (Python) | Parameters | Returns |
| --- | --- | --- |
| users(*, limit?, offset?) | Lists end users you have stored data for. | UserPage with users, total |

End-to-end example

A support bot that remembers each customer across sessions. On every turn it pulls a prompt-ready context block, calls the LLM, then stores what the user said. Three calls per turn: get_context, the LLM call, and add.

"""
Support bot with persistent per-customer memory.
Uses korely-memory for context recall and fact storage.
"""
import os

import google.generativeai as genai
from korely_memory import Korely, APIError

korely = Korely(api_key=os.environ["KORELY_API_KEY"])
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")
AGENT = "support-bot"


def chat(user_id: str, user_message: str) -> str:
    # 1. Assemble memory context for this user + query
    ctx = korely.get_context(
        query=user_message,
        agent_id=AGENT,
        user_id=user_id,
        token_budget=600,
    )
    system = (
        "You are a concise support assistant. "
        "Use the memory context below to personalise your reply.\n\n"
        + ctx.context
    )
    # 2. Call the LLM
    reply = model.generate_content(
        [{"role": "user", "parts": [system + "\n\nUser: " + user_message]}]
    ).text
    # 3. Persist what the user said (extracts facts, detects contradictions)
    try:
        korely.add(
            user_message,
            agent_id=AGENT,
            user_id=user_id,
            metadata={"turn": "user"},
        )
    except APIError as err:
        if err.code != "quota_exceeded":
            raise
        pass  # over the write quota: log and continue, the reply was already generated
    return reply


# --- run a two-turn session ---
uid = "customer-4812"
print(chat(uid, "Hi, I am on the Developer plan and I prefer async updates."))
# context is empty on first turn, so the bot greets + confirms
print(chat(uid, "What plan am I on again?"))
# second turn: context block contains the plan + preference facts from turn 1
# → bot answers correctly without the user repeating themselves

What happens inside add on turn 1: Korely extracts two typed facts — (customer-4812, subscribed_to, Developer plan) and (customer-4812, likes, async updates) (the raw verb "prefers" is normalized to likes, and kept verbatim in predicate_raw) — and stores them alongside the full memory text. Extraction runs server-side and is asynchronous, so memory.facts may be empty on the immediate response and fill in a moment later. On turn 2, get_context retrieves both: the context block your system prompt receives contains the plan and preference before the LLM sees the query. No prompt engineering required beyond dropping in ctx.context.
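
Because extraction is asynchronous, code that needs the extracted facts right after a write may have to re-read the memory a few times. The helper below is a hypothetical sketch, not an SDK function. It assumes that get(id) returns an object whose .facts list fills in once extraction finishes; the attempt count and delay are arbitrary.

```python
import time


def wait_for_facts(client, memory_id, attempts=5, delay_seconds=0.5, sleep=time.sleep):
    """Re-fetch a memory until its facts list is non-empty or attempts run out.

    Returns the facts, or an empty list if none appeared in time.
    """
    for attempt in range(attempts):
        memory = client.get(memory_id)
        if memory.facts:
            return memory.facts
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give server-side extraction time to finish
    return []
```

In the support bot above this is rarely needed, since the facts are only read on the next turn through get_context; the helper matters mostly in tests and scripts that write and read back to back.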

One agent, unlimited customers. The same agent_id="support-bot" can serve thousands of user_id values. Quotas count writes and queries, never the number of end users you remember.

Scoping

Three identifiers, three levels of scope. They are the same parameters everywhere: SDK, REST, and CLI.

| Param | What it identifies | Example |
| --- | --- | --- |
| agent_id | Your application or agent. One namespace per product surface. | "support-bot" |
| user_id | Your end user. Free-form string, you choose the identifier. Scopes memory to one person your agent serves. | "customer-4812" |
| run_id | One session or agent run. Sub-scope inside a user. | "session-2026-06-11" |

End users are unlimited on every tier. One support-bot agent can remember thousands of distinct customers; quotas count memories and queries, never people. A typical product setup is one agent_id per surface and one user_id per customer:

# Write: scoped to this customer
korely.add(
"Asked to be contacted on Slack, not email.",
agent_id="support-bot",
user_id="customer-4812",
run_id="session-2026-06-11",
)
# Read: only this customer's memory comes back
results = korely.search("contact preference", user_id="customer-4812")

Always pass user_id on reads in multi-tenant products. Filters are additive (AND). A search without user_id spans every end user in the namespace, which is what you want for an internal ops agent and not what you want inside a customer-facing chat.
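
One way to make the "always pass user_id" rule hard to forget in a customer-facing code path is a thin wrapper that binds the user once. UserScopedReader is a hypothetical class, not part of the SDK; it only forwards to the search and get_context methods documented above.

```python
class UserScopedReader:
    """Read-only view of a client that always applies one user_id."""

    def __init__(self, client, user_id):
        if not user_id:
            raise ValueError("user_id is required for customer-facing reads")
        self._client = client
        self._user_id = user_id

    def search(self, query, **filters):
        # The bound user_id overrides any value the caller supplies.
        filters["user_id"] = self._user_id
        return self._client.search(query, **filters)

    def get_context(self, query, **options):
        options["user_id"] = self._user_id
        return self._client.get_context(query=query, **options)
```

A request handler would build `UserScopedReader(korely, user_id)` once per session and never touch the unscoped client, while an internal ops agent keeps using the client directly for namespace-wide searches.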

Error handling

Every error response carries the same envelope — {"code": "...", "message": "..."} — and the SDK surfaces it as a single APIError exception with .status, .code, and .message. Branch on err.code to handle each case.

| Status | code | When |
| --- | --- | --- |
| 401 | invalid_key | Missing, malformed, or revoked API key. Message: "Invalid or missing API key". |
| 404 | not_found | Memory id does not exist, or was forgotten. Message: "Memory not found". |
| 422 | invalid_request | Validation failure, e.g. "content: Field required". |
| 429 | quota_exceeded | Monthly write limit reached. The write-quota 429 has no Retry-After; only the per-second rate-limit 429 carries a Retry-After header in integer seconds. |

from korely_memory import Korely, APIError

korely = Korely(api_key="kor_live_...")
try:
    memory = korely.get("mem_8f2c1a")
except APIError as err:
    if err.code == "invalid_key":
        # 401: check the key, or rotate it in the dashboard
        raise
    elif err.code == "not_found":
        # 404: the memory does not exist, or an end user forgot it
        memory = None
    elif err.code == "quota_exceeded":
        # 429: monthly write limit reached, so upgrade or wait for the reset
        memory = None
    else:
        raise
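
The two kinds of 429 call for different reactions: a rate-limit 429 is worth retrying after the advertised delay, while a quota 429 is not. The function below sketches that decision on plain values. It is hypothetical: how APIError exposes response headers is not documented on this page, so the headers dict is an assumed input, and any code name for the rate-limit case is likewise an assumption.

```python
def retry_delay(status, code, headers):
    """Return the seconds to wait before retrying, or None if a retry will not help."""
    if status != 429:
        return None
    if code == "quota_exceeded":
        return None  # monthly limit: waiting a few seconds changes nothing
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("retry-after")
    if value is None:
        return None
    try:
        return max(0, int(value))  # the header carries integer seconds
    except (TypeError, ValueError):
        return None
```

A caller would sleep for the returned number of seconds and retry the request once or twice, and treat None as a signal to surface the error instead.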

There is no overage billing, ever. At 80% of quota you get an email and a quota.warning webhook; past 100% there is a +10% soft cap so a busy day does not break your agent; past that, writes return 429 with code: "quota_exceeded" (surfaced as APIError) until the month rolls over or you upgrade. Your bill is always exactly the tier price. See the API reference for quotas per tier.
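
The thresholds above reduce to simple arithmetic on the monthly counter. The sketch below maps a usage count to one of four states. quota_state and its state names are hypothetical, and the treatment of the exact boundaries (warning from exactly 80%, soft cap up to and including 110%) is an assumption, since the text does not say which side is inclusive.

```python
def quota_state(used, limit):
    """Classify monthly write usage against a tier limit.

    Integer arithmetic avoids floating-point surprises at the 80% and 110% marks.
    """
    if used * 10 > limit * 11:
        return "blocked"    # past the +10% soft cap: writes return 429
    if used > limit:
        return "soft_cap"   # over 100% but still inside the +10% allowance
    if used * 10 >= limit * 8:
        return "warning"    # 80% reached: email and quota.warning webhook
    return "ok"
```

For a limit of 1,000 writes this gives "ok" at 799, "warning" from 800 to 1,000, "soft_cap" from 1,001 to 1,100, and "blocked" from 1,101.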

Related

Explore further by topic:

| Topic | Where to go |
| --- | --- |
| Full REST contract | API reference: every endpoint, parameter, and response shape. The SDK is a thin wrapper; the reference is the source of truth. |
| CLI surface | CLI reference: same memory, same key, from the terminal. korely context --user-id alice "what do I know about billing?" |
| MCP surface | MCP reference: connect Claude, Cursor, or any MCP-compatible agent to your namespace over OAuth. |
| Bi-temporal facts | Temporal facts: how contradiction detection works, what as_of queries return, and how the fact timeline is stored. |
| End-to-end chatbot | Cookbook: chatbot that remembers. A fuller version of the example above, with conversation history and streaming. |
| Bulk import | Cookbook: bulk import. Migrate an existing user history using batch(), with progress tracking and error handling. |
| Surfaces overview | When to use SDK vs CLI vs REST: a decision table for choosing the right surface per use case. |
| Migration from Mem0 | Migration guide: request shapes mapped side by side, and what the Korely fact graph adds. |
| Pricing and quotas | Pricing: hobby (free), developer (€19/mo), team (€79/mo), scale (€249/mo). Writes and queries count; end users are always unlimited. |

Both SDKs are available now under the name korely-memory: Python on PyPI and Node.js on npm.