SDK

korely-memory is a typed Python and Node.js client over the REST API. Every method maps 1:1 onto an endpoint, so anything you can do with curl you can do with the SDK, and the JSON shapes in the API reference are the attribute shapes here. It talks to the same backend and the same memory as the CLI.

# 1. Connect
from korely_memory import Korely
korely = Korely(api_key="kor_live_...")

# 2. Remember (typed facts + bi-temporal built-in)
korely.add("User prefers TypeScript with strict mode", user_id="dana")

# 3. The contradiction resolves itself: the new fact
#    invalidates the old one, never deletes it
korely.add("Actually I switched to Rust", user_id="dana")

# 4. Recall — assembles the active facts into a prompt-ready block
ctx = korely.get_context(query="preferred language", user_id="dana")
print(ctx.context)
# → "## Known facts\n- dana likes Rust (since ...)"
#    the superseded TypeScript fact is excluded automatically

Install

pip install korely-memory   # Python
npm install korely-memory   # Node.js

The Python SDK requires Python 3.9 or later; the Node SDK needs Node 18+. There are no heavy dependencies: the SDK is a thin HTTP client. Embeddings, entity extraction, and fact extraction with contradiction checking all run server-side on our own infrastructure, so your install stays small and your process stays light.

Initialize

from korely_memory import Korely

korely = Korely(api_key="kor_live_...")

# Or read the key from the environment (KORELY_API_KEY)
korely = Korely()

Keys look like kor_live_... and are scoped per agent namespace: a key minted for agent_id=support-bot only reads and writes inside that namespace. All keys belong to the eu region. Your data is stored and processed in the EU, on our own infrastructure.
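
If you would rather fail early than discover a missing key on the first request, a small guard can check the environment before the client is constructed. The helper below is a sketch, not part of the SDK: `load_api_key` is a hypothetical name, and the `kor_live_` prefix check rests only on the key shape described above (other prefixes may exist).

```python
import os


def load_api_key(env_var: str = "KORELY_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper: the SDK already reads KORELY_API_KEY itself,
    this only adds an early, explicit failure.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    # Assumption: live keys start with "kor_live_", as in the examples above.
    if not key.startswith("kor_live_"):
        raise RuntimeError(f"{env_var} does not look like a Korely live key")
    return key
```

With the SDK installed, the result would be passed straight through: `Korely(api_key=load_api_key())`.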

Core methods

Every method wraps exactly one REST endpoint:

Method	REST endpoint	Path
`korely.add(...)`	`POST /v1/memories`	Write
`korely.search(...)`	`POST /v1/memories/search`	Read
`korely.get_all(...)`	`GET /v1/memories`	Read
`korely.get(id)`	`GET /v1/memories/:id`	Read
`korely.update(id, ...)`	`PATCH /v1/memories/:id`	Write
`korely.delete(id)`	`DELETE /v1/memories/:id`	Write
`korely.delete_all(user_id=...)`	`DELETE /v1/users/:user_id/memories`	Write
`korely.add_fact_triple(...)`	`POST /v1/facts`	Write
`korely.get_facts(...)`	`GET /v1/facts`	Read
`korely.get_profile(user_id=...)`	`GET /v1/profile`	Read
`korely.get_context(...)`	`GET /v1/context`	Read
`korely.history(id)`	`GET /v1/memories/:id/history`	Read
`korely.users(...)`	`GET /v1/users`	Read
`korely.batch(...)`	`POST /v1/batch`	Write
`korely.batch_status(id)`	`GET /v1/batch/:id`	Read

add also accepts a list of chat messages (role / content dicts), not just a string — they are joined into one block before storing, so you can hand it a conversation as-is.
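
To make "joined into one block" concrete, here is one plausible way a conversation could be flattened before storage. The exact join format is not documented here, so the `role: content` layout and the `join_messages` name are assumptions for illustration only.

```python
from typing import Dict, List


def join_messages(messages: List[Dict[str, str]]) -> str:
    """Flatten role/content dicts into one text block, one line per message."""
    lines = []
    for msg in messages:
        # Assumption: each message carries "role" and "content" keys.
        lines.append(f"{msg['role']}: {msg['content'].strip()}")
    return "\n".join(lines)


conversation = [
    {"role": "user", "content": "I moved to Lisbon last month."},
    {"role": "assistant", "content": "Noted, I will update your timezone."},
]
print(join_messages(conversation))
```

In practice you do not need to do this yourself: passing `conversation` directly to add is the documented path.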

Reads are retrieval, not generation. No generative model ever composes output on the read path. There is no reranker model and no answer synthesis; your agent's own model does the reasoning. The write path is where the intelligence runs: embeddings, entity extraction, typed-fact extraction with contradiction checking and bi-temporal validity, about a tenth of a cent per memory, all included. That is why read quotas are an order of magnitude more generous than write quotas.

add

Maps to POST /v1/memories. One call runs the full write pipeline. The return value includes the facts that were extracted, and which older facts the write superseded.

memory = korely.add(
    "Northwind Hosting costs 50 euro per month since the June upgrade.",
    agent_id="infra-bot",
    user_id="customer-4812",
    metadata={"source": "slack"},
)

print(memory.id)                       # mem_8f2c1a
for fact in memory.facts:
    print(fact.subject, fact.predicate, fact.object)
    # Northwind Hosting costs 50 euro per month
    print(fact.invalidated)            # ["fct_a774"] — the old price, retired

search

Maps to POST /v1/memories/search. Semantic vector search: cosine similarity over the memory embeddings. The only model call on the read path is the query embedding, a fraction of a hundredth of a cent. Keyword-style queries (1 to 5 words) work best. For recall you almost always want get_context (below) instead — it assembles the active typed facts into a prompt-ready block, which is the primary recall path. search returns raw memory snippets and is the secondary path.

results = korely.search(
    "northwind pricing",
    user_id="customer-4812",
    limit=5,
)

for hit in results:
    print(hit.id, hit.score, hit.snippet)
    # mem_8f2c1a 0.91 Northwind Hosting costs 50 euro per month...

Optional filters mirror the REST params: agent_id, user_id, limit. Filters are additive (AND).
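
For readers unfamiliar with the scoring, the sketch below shows cosine similarity and a top-k ranking over toy vectors. It illustrates the general technique only: the three-dimensional vectors, the `rank` helper, and the memory ids are made up, and the real embeddings and index live server-side.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def rank(query: Sequence[float], memories: Dict[str, Sequence[float]],
         limit: int = 5) -> List[Tuple[str, float]]:
    """Score every memory against the query and keep the best `limit`."""
    scored = [(mem_id, cosine_similarity(query, vec)) for mem_id, vec in memories.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy embeddings, invented for the example.
toy_memories = {
    "mem_pricing": [0.9, 0.1, 0.0],
    "mem_contact": [0.0, 0.2, 0.9],
}
print(rank([1.0, 0.0, 0.0], toy_memories))
```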

get_all

Maps to GET /v1/memories. List memories in a scope, newest first, no search query needed. Pass user_id or agent_id to narrow the scope, and page with limit (default 50) and offset. Listing is plain SQL on the read path, no model calls.

page = korely.get_all(
    user_id="customer-4812",
    agent_id=None,
    limit=50,
    offset=0,
)

print(page.total)  # 218
for memory in page:
    print(memory.id, memory.created_at)
    # mem_8f2c1a 2026-06-07T09:14:00Z
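
When a scope holds more memories than one page, the limit and offset parameters can be walked in a loop. The generator below is a minimal sketch of that pattern. `iter_all_memories`, `FakePage`, and `FakeClient` are hypothetical names; the stub only exists so the example runs without network access, and a real call would pass a Korely client instead.

```python
from typing import Any, Iterator, List, Optional


def iter_all_memories(client: Any, *, user_id: Optional[str] = None,
                      agent_id: Optional[str] = None,
                      page_size: int = 50) -> Iterator[Any]:
    """Yield every memory in a scope by walking limit/offset pages."""
    offset = 0
    while True:
        page = client.get_all(user_id=user_id, agent_id=agent_id,
                              limit=page_size, offset=offset)
        items = list(page)
        if not items:
            return
        yield from items
        offset += len(items)
        if offset >= page.total:
            return


class FakePage(list):
    """Stand-in for a page: iterable, with a .total attribute."""
    def __init__(self, items, total):
        super().__init__(items)
        self.total = total


class FakeClient:
    """Hypothetical in-memory stub that mimics get_all paging."""
    def __init__(self, ids: List[str]):
        self._ids = ids

    def get_all(self, *, user_id=None, agent_id=None, limit=50, offset=0):
        return FakePage(self._ids[offset:offset + limit], total=len(self._ids))


fake = FakeClient([f"mem_{i:03d}" for i in range(120)])
print(len(list(iter_all_memories(fake, user_id="customer-4812"))))
```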

get

Maps to GET /v1/memories/:id. Full content, metadata, and the facts extracted from this memory.

memory = korely.get("mem_8f2c1a")

print(memory.content)     # Northwind Hosting costs 50 euro per month...
print(memory.metadata)    # {"source": "slack"}
print(memory.created_at)  # 2026-06-07T09:14:00Z

update

Maps to PATCH /v1/memories/:id. Updating content re-runs extraction, so facts stay in sync with what the memory says. Pass expected_updated_at for optimistic concurrency: if another writer got there first, the call raises instead of clobbering.

memory = korely.update(
    "mem_8f2c1a",
    content="Northwind Hosting costs 55 euro per month after the storage add-on.",
    expected_updated_at="2026-06-07T09:14:00Z",
)

print(memory.facts[0].object)       # 55 euro per month
print(memory.facts[0].invalidated)  # ["fct_b91e"]
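
A common way to use optimistic concurrency is a read, edit, write loop that re-reads when another writer got there first. The sketch below shows that loop under several assumptions that are not confirmed by this page: that a Memory exposes `updated_at`, and that a lost race surfaces as an APIError with status 409 and code `"conflict"`. The `APIError` class here is a local stand-in (real code would import it from korely_memory), and `FakeClient` is a hypothetical stub.

```python
from typing import Any, Callable


class APIError(Exception):
    """Local stand-in for korely_memory.APIError (status, code, message)."""
    def __init__(self, status: int, code: str, message: str):
        super().__init__(message)
        self.status, self.code, self.message = status, code, message


def update_with_retry(client: Any, memory_id: str, edit: Callable[[str], str],
                      attempts: int = 3) -> Any:
    """Read, edit, and write back; re-read and retry if another writer won."""
    for _ in range(attempts):
        current = client.get(memory_id)
        try:
            return client.update(
                memory_id,
                content=edit(current.content),
                # Assumption: Memory exposes an updated_at timestamp.
                expected_updated_at=current.updated_at,
            )
        except APIError as err:
            # Assumption: a lost race surfaces as code "conflict".
            if err.code != "conflict":
                raise
    raise RuntimeError(f"gave up on {memory_id} after {attempts} attempts")


class FakeMemory:
    def __init__(self, content, updated_at):
        self.content, self.updated_at = content, updated_at


class FakeClient:
    """Hypothetical stub: the first update loses the race, the second wins."""
    def __init__(self):
        self.memory = FakeMemory("costs 50 euro", "t1")
        self.calls = 0

    def get(self, memory_id):
        return FakeMemory(self.memory.content, self.memory.updated_at)

    def update(self, memory_id, *, content, expected_updated_at):
        self.calls += 1
        if self.calls == 1:
            # Simulate a concurrent writer bumping the timestamp first.
            self.memory = FakeMemory("costs 52 euro", "t2")
        if expected_updated_at != self.memory.updated_at:
            raise APIError(409, "conflict", "Memory was modified")
        self.memory = FakeMemory(content, "t3")
        return self.memory


fake = FakeClient()
result = update_with_retry(fake, "mem_8f2c1a", lambda text: text.replace("euro", "EUR"))
print(result.content, fake.calls)
```

The edit function is applied to the freshly read content on every attempt, so the concurrent writer's change is preserved rather than overwritten.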

delete

Maps to DELETE /v1/memories/:id. Forget one memory. Audited invalidation, not a hard row delete: the memory and its facts drop out of every default read, and an audit stub records when and by which key it was forgotten.

receipt = korely.delete("mem_8f2c1a")

print(receipt.status)             # forgotten
print(receipt.facts_invalidated)  # 1
print(receipt.audit_id)           # aud_3d0f

delete_all

Maps to DELETE /v1/users/:user_id/memories. Bulk erasure for one end user: every memory and fact scoped to that user_id is invalidated in a single call, with one audit record.

receipt = korely.delete_all(user_id="customer-4812")

print(receipt.memories_forgotten)  # 218
print(receipt.facts_invalidated)   # 64
print(receipt.audit_id)            # aud_91xb

get_facts

Maps to GET /v1/facts. Returns typed (subject, predicate, object) triples with bi-temporal validity. Reads from the fact store are deterministic SQL with no model calls, and typically take under 50 ms. The result is a flat list of facts (use get_profile for the grouped-by-family view). Pass as_of for a point-in-time query: what was true on that date. Works on every tier, hobby included.

# Current state: only active facts
facts = korely.get_facts(entity="Northwind Hosting")
print(facts[0].object)      # 50 euro per month
print(facts[0].invalid_at)  # None — active

# Point-in-time: what did we believe on June 1?
facts = korely.get_facts(entity="Northwind Hosting", as_of="2026-06-01")
print(facts[0].object)      # 40 euro per month
print(facts[0].invalid_at)  # 2026-06-07T09:14:00Z — superseded since

# Full history chain, superseded facts included
facts = korely.get_facts(
    entity="Northwind Hosting",
    include_invalidated=True,
)

Filters mirror the REST contract: subject, entity (matches either side of the triple), predicate, predicate_family, include_invalidated, as_of, limit. See temporal facts for how invalidation works.
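
The point-in-time behaviour can be pictured as a window check on each fact. The sketch below is a local illustration, not the server's implementation: it collapses the two time axes of a bi-temporal store into a single validity window, the `Fact` dataclass is a simplified stand-in for the SDK's type, and the January start date of the old price is invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Fact:
    """Local illustration only, not the SDK's Fact class."""
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # None means still active


def facts_as_of(facts: List[Fact], as_of: datetime) -> List[Fact]:
    """Keep the facts whose validity window contains `as_of`."""
    return [
        f for f in facts
        if f.valid_from <= as_of and (f.invalid_at is None or as_of < f.invalid_at)
    ]


upgrade = datetime(2026, 6, 7, 9, 14, tzinfo=timezone.utc)
timeline = [
    # The start date of the old price is made up for this example.
    Fact("Northwind Hosting", "costs", "40 euro per month",
         valid_from=datetime(2026, 1, 1, tzinfo=timezone.utc), invalid_at=upgrade),
    Fact("Northwind Hosting", "costs", "50 euro per month", valid_from=upgrade),
]

june_1 = datetime(2026, 6, 1, tzinfo=timezone.utc)
print([f.object for f in facts_as_of(timeline, june_1)])
```

The superseded fact is never removed from `timeline`; it simply stops matching once the query date passes its `invalid_at`.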

get_context

Maps to GET /v1/context. The one-call method: it assembles a prompt-ready context block (profile plus relevant facts plus relevant memories) within a token budget. Assembly is deterministic retrieval and formatting, not generation. Drop the returned string into your system prompt. This is the method most agent frameworks use.

ctx = korely.get_context(
    query="plan infra budget",
    user_id="customer-4812",
    token_budget=800,
)

print(ctx.tokens)   # 642
print(ctx.sources)  # ["fct_b91e", "mem_8f2c1a"]

messages = [
    {"role": "system", "content": f"You are a helpful assistant.\n\n{ctx.context}"},
    {"role": "user", "content": user_message},
]
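
To give a feel for what "within a token budget" can mean, here is a generic greedy packing sketch. It is not Korely's server-side algorithm, which this page does not describe: the word-count token estimate, the scoring, and the names `estimate_tokens` and `pack_context` are all assumptions chosen to keep the example small and deterministic.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    # Crude assumption: about one token per whitespace-separated word.
    return len(text.split())


def pack_context(candidates: List[Tuple[float, str]], token_budget: int = 800) -> str:
    """Greedily keep the highest-scored lines that still fit the budget."""
    chosen, used = [], 0
    for _score, line in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(line)
        if used + cost > token_budget:
            continue  # this line does not fit, keep trying smaller ones
        chosen.append(line)
        used += cost
    return "\n".join(chosen)


candidates = [
    (0.9, "- Northwind Hosting costs 50 euro per month"),
    (0.7, "- customer-4812 asked to be contacted on Slack, not email"),
    (0.4, "- renewal date moved to October 1st"),
]
print(pack_context(candidates, token_budget=15))
```

The same inputs always produce the same block, which is the property the section above describes: retrieval and formatting, with no generation involved.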

batch

Maps to POST /v1/batch. Bulk import for migrations: up to 500 memory objects per call (same shape as add), processed asynchronously. Items count against the memory quota.

job = korely.batch([
    {"content": "Prefers async standups over meetings.", "user_id": "customer-0001"},
    {"content": "Renewal date moved to October 1st.", "user_id": "customer-0002"},
])

print(job.status)   # processing

job = korely.batch_status(job.id)
print(job.status)   # completed
print(job.imported) # 2
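
For imports larger than the 500-item limit, the list can be split into chunks and each job polled until it finishes. The following is a minimal sketch of that loop. `chunked`, `import_all`, `FakeJob`, and `FakeClient` are hypothetical names, the stub stands in for a real client, and treating every status other than "completed" as a failure is an assumption. A production version would also want a timeout.

```python
import time
from typing import Any, Dict, Iterator, List

MAX_BATCH = 500  # per-call limit stated above


def chunked(items: List[Dict[str, Any]],
            size: int = MAX_BATCH) -> Iterator[List[Dict[str, Any]]]:
    """Split a list into consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_all(client: Any, items: List[Dict[str, Any]],
               poll_seconds: float = 2.0) -> int:
    """Submit items in chunks, wait for each job, return the total imported."""
    imported = 0
    for chunk in chunked(items):
        job = client.batch(chunk)
        while job.status == "processing":
            time.sleep(poll_seconds)
            job = client.batch_status(job.id)
        # Assumption: any status other than "completed" is a failure.
        if job.status != "completed":
            raise RuntimeError(f"batch {job.id} ended with status {job.status}")
        imported += job.imported
    return imported


class FakeJob:
    def __init__(self, job_id, status, imported=0):
        self.id, self.status, self.imported = job_id, status, imported


class FakeClient:
    """Hypothetical stub: every job completes on the first status poll."""
    def __init__(self):
        self._sizes = {}

    def batch(self, items):
        job_id = f"job_{len(self._sizes)}"
        self._sizes[job_id] = len(items)
        return FakeJob(job_id, "processing")

    def batch_status(self, job_id):
        return FakeJob(job_id, "completed", imported=self._sizes[job_id])


items = [{"content": f"note {i}", "user_id": "customer-0001"} for i in range(1200)]
print(import_all(FakeClient(), items, poll_seconds=0))
```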

add_fact_triple

Maps to POST /v1/facts. Write a typed (subject, predicate, object) fact directly, skipping extraction, when your agent already has the structured form. The contradiction check still runs, and the fact is bi-temporal — pass valid_from for a historical fact.

fact = korely.add_fact_triple(
    "Marco", "works_at", "Acme GmbH",
    user_id="customer-4812",
    subject_type="person",
    valid_from="2026-06-01",
)

print(fact.invalidated)  # ids of any facts this one superseded

get_profile

Maps to GET /v1/profile. The assembled profile of one end user: the active facts known about them, the end user's own facts first, grouped by family. Pass as_of for the profile as it stood on a past date.

profile = korely.get_profile(user_id="customer-4812")
print(profile.total)              # 7
print(list(profile.by_family))    # ["places", "work", "preferences"]

# The profile as it stood on March 1st
past = korely.get_profile(user_id="customer-4812", as_of="2026-03-01")

history

Maps to GET /v1/memories/:id/history. The lifecycle of one memory, keyed on the mem_ id: the events created, updated, fact_extracted, and fact_invalidated, each timestamped. (To walk a single fact's supersede chain, use get_facts with include_invalidated=True instead.)

h = korely.history("mem_8f2c1a")
for event in h.events:
    print(event.event, event.at)   # created ... / fact_extracted ...

users

Maps to GET /v1/users. The end users you've stored data for, each with active memory and fact counts. Returns a page: iterable like a list, with .total for pagination.

page = korely.users()
print(page.total)                 # 1
for u in page:
    print(u.user_id, u.memories, u.facts)

Migrating from another memory API? The migration guide maps the request shapes side by side.

Method signatures at a glance

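In Python, every parameter after the bare * in a signature is keyword-only; the parameters before it (such as content, query, or id) are positional. In Node, the keyword parameters are passed as the second positional object. Node method names are camelCase versions of the Python names (get_all becomes getAll, delete_all becomes deleteAll, and so on).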
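
The naming rule is mechanical enough to express in a few lines. The sketch below applies it to a handful of method names. Only getAll and deleteAll are confirmed above; the other Node names are inferred from the stated rule, and `to_camel_case` is a hypothetical helper.

```python
def to_camel_case(name: str) -> str:
    """Convert a snake_case method name to camelCase."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


for python_name in ["get_all", "delete_all", "add_fact_triple", "batch_status", "add"]:
    print(python_name, "->", to_camel_case(python_name))
```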

Memory

Method (Python)	Parameters	Returns
`add(content, *, user_id?, agent_id?, run_id?, metadata?)`	`content`: str or list of role/content dicts. Scope with any combination of `user_id`, `agent_id`, `run_id`. `metadata`: free-form dict stored alongside the memory.	`Memory` with `id`, `facts`, `invalidated`
`search(query, *, user_id?, agent_id?, limit?)`	`query`: str, keyword or natural-language. Filters are AND-combined. `limit`: default 15, max 50.	list of `SearchHit` with `id`, `score`, `snippet`, `user_id`, `agent_id`, `metadata`
`get_all(*, user_id?, agent_id?, limit?, offset?)`	No query. Lists by recency. `limit` default 50.	`MemoryPage` with `memories`, `total`
`get(id)`	`id`: str — e.g. `mem_8f2c1a`	`Memory` full object including `facts`
`update(id, *, content, expected_updated_at?)`	Re-runs extraction on new content. `expected_updated_at` enables optimistic concurrency.	`Memory` updated
`delete(id)`	`id`: str	`DeleteReceipt` with `status`, `facts_invalidated`, `audit_id`
`delete_all(*, user_id)`	`user_id`: str — erases every memory for one end user	`BulkReceipt` with `memories_forgotten`, `facts_invalidated`, `audit_id`
`history(id)`	`id`: str	`MemoryHistory` with list of `events`
`batch(items)`	`items`: list of add-shaped dicts, max 500	`BatchJob` with `id`, `status`, `received`
`batch_status(id)`	`id`: str — job id from `batch()`	`BatchJob` with `status`, `received`, `imported`, `failed`, `errors`

Facts

Method (Python)	Parameters	Returns
`get_facts(*, entity?, subject?, predicate?, predicate_family?, user_id?, agent_id?, as_of?, include_invalidated?, limit?)`	All filters optional. `entity` matches either side of the triple. `predicate` is normalized server-side (the raw verb is returned as `predicate_raw`); `predicate_family` is a limited taxonomy, so many predicates map to `other` — filter by `entity` when in doubt. `as_of`: ISO-8601 date string for point-in-time queries. `include_invalidated`: bool, default false. Works on every tier, hobby included.	flat list of `Fact` with `id`, `subject`, `predicate`, `predicate_raw`, `object`, `predicate_family`, `valid_from`, `invalid_at`, `invalidated_by`, `source_memory_id` (liveness = `invalid_at` is null)
`add_fact_triple(subject, predicate, object, *, user_id?, agent_id?, subject_type?, valid_from?)`	Writes a typed triple directly, skipping NLP extraction. Contradiction check still runs. `valid_from`: ISO-8601 for historical back-dating.	`Fact` with `invalidated` list

Context and profile

Method (Python)	Parameters	Returns
`get_context(*, query, user_id?, agent_id?, token_budget?)`	`query`: the current user turn or topic. `token_budget`: default 800. Assembly is deterministic retrieval, no generation.	`Context` with `context` (str ready for system prompt), `tokens`, `sources`
`get_profile(*, user_id, as_of?)`	`user_id`: required. `as_of`: ISO-8601 for a historical snapshot.	`Profile` with `total`, `by_family` dict

Admin

Method (Python)	Parameters	Returns
`users(*, limit?, offset?)`	Lists end users you have stored data for.	`UserPage` with `users`, `total`

End-to-end example

A support bot that remembers each customer across sessions. On every turn it pulls a prompt-ready context block, calls the LLM, then stores what the user said. Three calls per turn: get_context, the LLM call (model.generate_content in this example), and add. Only the first and last are SDK calls.

"""
Support bot with persistent per-customer memory.
Uses korely-memory for context recall and fact storage.
"""
import os
import google.generativeai as genai
from korely_memory import Korely, APIError

korely = Korely(api_key=os.environ["KORELY_API_KEY"])
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")

AGENT = "support-bot"

def chat(user_id: str, user_message: str) -> str:
    # 1. Assemble memory context for this user + query
    ctx = korely.get_context(
        query=user_message,
        agent_id=AGENT,
        user_id=user_id,
        token_budget=600,
    )

    system = (
        "You are a concise support assistant. "
        "Use the memory context below to personalise your reply.\n\n"
        + ctx.context
    )

    # 2. Call the LLM
    reply = model.generate_content(
        [{"role": "user", "parts": [system + "\n\nUser: " + user_message]}]
    ).text

    # 3. Persist what the user said (extracts facts, detects contradictions)
    try:
        korely.add(
            user_message,
            agent_id=AGENT,
            user_id=user_id,
            metadata={"turn": "user"},
        )
    except APIError as err:
        if err.code != "quota_exceeded":
            raise
        # Over the write quota: skip the write and continue, the reply was already generated

    return reply


# --- run a two-turn session ---
uid = "customer-4812"

print(chat(uid, "Hi, I am on the Developer plan and I prefer async updates."))
# context is empty on first turn — the bot greets + confirms

print(chat(uid, "What plan am I on again?"))
# second turn: context block contains the plan + preference facts from turn 1
# → bot answers correctly without the user repeating themselves

What happens inside add on turn 1: Korely extracts two typed facts — (customer-4812, subscribed_to, Developer plan) and (customer-4812, likes, async updates) (the raw verb "prefers" is normalized to likes, and kept verbatim in predicate_raw) — and stores them alongside the full memory text. Extraction runs server-side and is asynchronous, so memory.facts may be empty on the immediate response and fill in a moment later. On turn 2, get_context retrieves both: the context block your system prompt receives contains the plan and preference before the LLM sees the query. No prompt engineering required beyond dropping in ctx.context.
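
Because extraction is asynchronous, code that needs the extracted facts right after a write may have to wait briefly. One option is to re-read the memory until facts appear. The helper below is a sketch under stated assumptions: `wait_for_facts` is a hypothetical name, the polling interval and timeout are arbitrary, and the stub client stands in for a real one. A memory that yields no facts at all will simply run into the timeout and return an empty list.

```python
import time
from typing import Any, List


def wait_for_facts(client: Any, memory_id: str, timeout: float = 10.0,
                   interval: float = 0.5) -> List[Any]:
    """Poll get(id) until facts appear or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        facts = list(client.get(memory_id).facts)
        if facts or time.monotonic() >= deadline:
            return facts
        time.sleep(interval)


class FakeMemory:
    def __init__(self, facts):
        self.facts = facts


class FakeClient:
    """Hypothetical stub: facts show up on the third read."""
    def __init__(self):
        self.reads = 0

    def get(self, memory_id):
        self.reads += 1
        return FakeMemory(["fct_1", "fct_2"] if self.reads >= 3 else [])


fake = FakeClient()
print(wait_for_facts(fake, "mem_8f2c1a", timeout=1.0, interval=0.0), fake.reads)
```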

One agent, unlimited customers. The same agent_id="support-bot" can serve thousands of user_id values. Quotas count writes and queries, never the number of end users you remember.

Scoping

Three identifiers, three levels of scope. They are the same parameters everywhere: SDK, REST, and CLI.

Param	What it identifies	Example
`agent_id`	Your application or agent. One namespace per product surface.	`"support-bot"`
`user_id`	Your end user. Free-form string, you choose the identifier. Scopes memory to one person your agent serves.	`"customer-4812"`
`run_id`	One session or agent run. Sub-scope inside a user.	`"session-2026-06-11"`

End users are unlimited on every tier, so one support-bot agent can remember thousands of distinct customers. A typical product setup is one agent_id per surface and one user_id per customer:

# Write: scoped to this customer
korely.add(
    "Asked to be contacted on Slack, not email.",
    agent_id="support-bot",
    user_id="customer-4812",
    run_id="session-2026-06-11",
)

# Read: only this customer's memory comes back
results = korely.search("contact preference", user_id="customer-4812")

Always pass user_id on reads in multi-tenant products. Filters are additive (AND). A search without user_id spans every end user in the namespace, which is what you want for an internal ops agent and not what you want inside a customer-facing chat.
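
One way to make a missing user_id impossible in a customer-facing code path is to bind the scope once and route every call through it. The wrapper below is a minimal sketch of that idea. `UserScope` and `RecordingClient` are hypothetical names that are not part of the SDK, and only three methods are wrapped to keep the example short.

```python
from typing import Any


class UserScope:
    """Bind agent_id and user_id once so calls cannot forget them.

    Hypothetical wrapper, not part of the SDK.
    """

    def __init__(self, client: Any, *, agent_id: str, user_id: str):
        if not user_id:
            raise ValueError("user_id is required for customer-facing reads")
        self._client = client
        self._scope = {"agent_id": agent_id, "user_id": user_id}

    def add(self, content, **kwargs):
        # The bound scope is merged last, so callers cannot override it.
        return self._client.add(content, **{**kwargs, **self._scope})

    def search(self, query, **kwargs):
        return self._client.search(query, **{**kwargs, **self._scope})

    def get_context(self, query, **kwargs):
        return self._client.get_context(query=query, **{**kwargs, **self._scope})


class RecordingClient:
    """Hypothetical stub that records the arguments it receives."""
    def __init__(self):
        self.calls = []

    def add(self, content, **kwargs):
        self.calls.append(("add", content, kwargs))

    def search(self, query, **kwargs):
        self.calls.append(("search", query, kwargs))
        return []

    def get_context(self, *, query, **kwargs):
        self.calls.append(("get_context", query, kwargs))


recorder = RecordingClient()
scope = UserScope(recorder, agent_id="support-bot", user_id="customer-4812")
scope.search("contact preference", limit=5)
print(recorder.calls[-1])
```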

Error handling

Every error response carries the same envelope — {"code": "...", "message": "..."} — and the SDK surfaces it as a single APIError exception with .status, .code, and .message. Branch on err.code to handle each case.

Status	`code`	When
`401`	`invalid_key`	Missing, malformed, or revoked API key. Message: `"Invalid or missing API key"`.
`404`	`not_found`	Memory id does not exist, or was forgotten. Message: `"Memory not found"`.
`422`	`invalid_request`	Validation failure, e.g. `"content: Field required"`.
`429`	`quota_exceeded`	Monthly write limit reached. The write-quota 429 has no `Retry-After`; only the per-second rate-limit 429 carries a `Retry-After` header in integer seconds.

from korely_memory import Korely, APIError

korely = Korely(api_key="kor_live_...")

try:
    memory = korely.get("mem_8f2c1a")
except APIError as err:
    if err.code == "invalid_key":
        # 401: check the key, or rotate it in the dashboard
        raise
    elif err.code == "not_found":
        # 404: the memory does not exist, or an end user forgot it
        memory = None
    elif err.code == "quota_exceeded":
        # 429: monthly write limit reached — upgrade or wait for the reset
        memory = None
    else:
        raise

There is no overage billing, ever. At 80% of quota you get an email and a quota.warning webhook; past 100% there is a +10% soft cap so a busy day does not break your agent; past that, writes return 429 with code: "quota_exceeded" (surfaced as APIError) until the month rolls over or you upgrade. Your bill is always exactly the tier price. See the API reference for quotas per tier.
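
The thresholds above can be written down as a small classification function, which is handy for dashboards or alerts on your side. This is a sketch of the arithmetic only: `classify_usage` is a hypothetical name, the limit of 1000 is an invented figure rather than a real tier quota, and the exact boundary behaviour (for example whether usage at exactly 110% is still accepted) is an assumption.

```python
def classify_usage(used: int, limit: int) -> str:
    """Map monthly usage to the stage described above.

    Integer arithmetic avoids float rounding at the 80% and 110% marks.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if used * 100 < limit * 80:
        return "ok"
    if used <= limit:
        return "warning"    # 80% reached: email and quota.warning webhook
    if used * 10 <= limit * 11:
        return "soft_cap"   # inside the +10% grace band
    return "blocked"        # writes return 429 with code quota_exceeded


# The limit of 1000 is a made-up figure for illustration.
for used in (500, 800, 1000, 1050, 1101):
    print(used, classify_usage(used, limit=1000))
```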

Explore further by topic:

Topic	Where to go
Full REST contract	API reference — every endpoint, parameter, and response shape. The SDK is a thin wrapper; the reference is the source of truth.
CLI surface	CLI reference — same memory, same key, from the terminal. `korely context --user-id alice "what do I know about billing?"`
MCP surface	MCP reference — connect Claude, Cursor, or any MCP-compatible agent to your namespace over OAuth.
Bi-temporal facts	Temporal facts — how contradiction detection works, what `as_of` queries return, and how the fact timeline is stored.
End-to-end chatbot	Cookbook: chatbot that remembers — a fuller version of the example above, with conversation history and streaming.
Bulk import	Cookbook: bulk import — migrate an existing user history using `batch()`, with progress tracking and error handling.
Surfaces overview	When to use SDK vs CLI vs REST — a decision table for choosing the right surface per use case.
Migration from Mem0	Migration guide — request shapes mapped side by side, and what the Korely fact graph adds.
Pricing and quotas	Pricing — hobby (free), developer (€19/mo), team (€79/mo), scale (€249/mo). Writes and queries count; end users are always unlimited.

The Python and Node.js SDKs are available now, both published as korely-memory, on PyPI and npm respectively.