Korely

Surfaces

Korely has one memory and three ways to reach it. Every surface shares the same account, the same scoping model (user_id / agent_id / run_id), and the same permission model. You don't have to pick one, and nothing you write through one surface is invisible to another.

The three agent surfaces are: the REST API at https://api.korely.ai/v1 (the foundation everything else builds on), the SDK (korely-memory for Python and Node, wraps the REST API), and the CLI (ships inside the Python package, pipeable into any shell).

Install / connect

Every surface points at the same base URL: https://api.korely.ai/v1, EU-hosted in Helsinki. Auth is a single header on every request: Authorization: Bearer kor_live_.... Get a key from the dashboard after signup.

Terminal window
pip install korely-memory
# Python 3.9+; no heavy deps — embeddings and extraction run server-side
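
For callers that skip the SDK, the base URL and bearer header above are enough to form a request by hand. The following is a minimal sketch using only Python's standard library: it builds, but does not send, a POST to /memories/search. The helper name build_request is ours, and the JSON Content-Type header is an assumption; the body fields follow the REST endpoint table further down this page.

```python
import json
import urllib.request

BASE_URL = "https://api.korely.ai/v1"


def build_request(path, api_key, body=None, method="POST"):
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        # Assumption: the API expects JSON request bodies.
        req.add_header("Content-Type", "application/json")
    return req


req = build_request(
    "/memories/search",
    api_key="kor_live_...",  # placeholder; use a real key from the dashboard
    body={"query": "invoice preferences", "user_id": "customer-4812", "limit": 3},
)
print(req.get_method(), req.full_url)
# POST https://api.korely.ai/v1/memories/search
```

Sending the request would be a call to urllib.request.urlopen(req); the sketch stops short of that so it can run without a key.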

Write from anywhere

Any surface

  • Your product via SDK
  • A cron job via CLI
  • A pipeline via REST

One account

Your memory store

  • Cloud store (Postgres + pgvector), EU-hosted
  • Bi-temporal facts
  • One permission model

Read from anywhere

Every other surface

  • CLI writes at 2am, SDK reads at 9am
  • Your app writes a fact, your agent recalls it

Capability matrix

Operation | CLI | SDK / REST
Search memories | yes | yes
Read a memory, related items | yes | yes
Typed facts (bi-temporal) [1] | yes | yes
Write memories [2] | yes | yes
Batch import [3] | no | yes
Point-in-time facts (as_of) | yes | yes
Delete all memories for a user_id | yes | yes
Answer generation [4] | no | no
  1. Typed facts — the bi-temporal core — are available on every tier, including the free Hobby tier. Reading facts costs only a query against your quota.
  2. The CLI exposes korely add for writing a new memory and korely delete / korely delete-all for removal.
  3. Bulk import (POST /v1/batch) is REST/SDK-only. Point-in-time fact queries (as_of) work on every surface, including the CLI (korely facts --as-of). See the API reference.
  4. No surface generates answers, by design. Reads are retrieval, not generation: no model composes output on the read path, so your agent's own model does the reasoning, and query quotas are at least 25 times the write quotas on every tier (see Pricing). Full detail in Architecture.
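
To make the point-in-time idea concrete, here is a local, in-memory sketch of what an as_of filter selects. It is not Korely's implementation and it models only the validity-time axis of a bi-temporal store. The valid_to field name and the half-open interval are assumptions for illustration; only valid_from and as_of appear on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_to: Optional[date] = None  # hypothetical field: None means still active


def facts_as_of(facts: List[Fact], as_of: date) -> List[Fact]:
    """Return the facts valid on the given date (valid_from <= as_of < valid_to)."""
    return [
        f for f in facts
        if f.valid_from <= as_of and (f.valid_to is None or as_of < f.valid_to)
    ]


# Illustrative history: an older preference superseded by a newer one.
history = [
    Fact("customer-4812", "prefers_invoice_format", "paper", date(2025, 1, 5), date(2026, 6, 11)),
    Fact("customer-4812", "prefers_invoice_format", "PDF", date(2026, 6, 11)),
]

print([f.object for f in facts_as_of(history, date(2026, 1, 1))])  # ['paper']
print([f.object for f in facts_as_of(history, date(2026, 7, 1))])  # ['PDF']
```

The same question asked with two different dates returns two different answers, which is the behaviour korely facts --as-of and the as_of parameter are described as exposing.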

Python and Node.js are available now.

CLI

The scripting surface. Search, read, add, and delete memories from any terminal, pipeable into anything. Reach for it when you already know what to pull: cron jobs, CI prep, editor commands, shell pipelines.

korely cli zsh
$ korely search "pricing decisions" --limit 3
1.  Pricing review June         score 0.92
2.  Plan change approval        score 0.87
3.  Q2 budget retro             score 0.81

$ korely search "open action items" --json > weekly.json
$ jq '.results[0].snippet' weekly.json
"Open actions, week 24"

# Pipe results into any local model
$ korely search "customer feedback" | ollama run llama3.1 "summarize"

Read the CLI deep dive → Install, every command, --json for scripts, and all flags.
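
When a script needs more than jq, the --json output can be consumed from Python as well. This is a small sketch that assumes the payload shape implied by the jq example above (a results list whose items carry a snippet); the function name top_snippets is ours, and real output may include more fields.

```python
import json


def top_snippets(raw_json, n=3):
    """Return up to n snippets from a `korely search --json` payload."""
    payload = json.loads(raw_json)
    return [hit["snippet"] for hit in payload.get("results", [])[:n]]


# Sample payload shaped like the jq example; the second snippet is made up.
sample = '{"results": [{"snippet": "Open actions, week 24"}, {"snippet": "Pricing review June"}]}'
print(top_snippets(sample, n=1))  # ['Open actions, week 24']
```

In a real pipeline the raw_json argument would typically be the stdout of korely search "..." --json, captured with subprocess.run(..., capture_output=True, text=True).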

SDK / REST

The product surface. Per-customer memory inside your own application: developer API keys, user_id scoping for your end users (unlimited on every tier), agent_id namespaces for your app, run_id for sessions. One call to write, one call to read.

korely_memory.py python
from korely_memory import Korely

korely = Korely(api_key="kor_live_...")  # EU-hosted, Helsinki

# Write: graph + typed-fact extraction included in the call
korely.add(
    "Prefers invoices as PDF, replies fastest before 10am CET",
    user_id="customer-4812",  # your end user, unlimited on every tier
    agent_id="billing-bot",    # your app's namespace
)

# Recall: fact-assembled context block, prompt-ready (the moat)
ctx = korely.get_context(query="invoice preferences", user_id="customer-4812")
print(ctx.context)  # active typed facts + relevant memories, no AI on the read path

Read the SDK deep dive → Client setup, scoping, facts, and the full REST API reference behind it.

Which surface should I use?

  • You are scripting or automating (cron, CI, shell pipelines, editor commands): use the CLI. Composable, --json output, no SDK dependency in the pipeline.
  • You are building a product with memory per customer: use the SDK. API keys, user_id scoping, quotas.
  • You want direct HTTP control (any language, any runtime): call the REST API directly. Same auth header, same scoping model.

Use case | Best surface | Why
Per-customer memory inside your product | SDK / REST | user_id scoping, API keys, quotas; end users unlimited on every tier.
Pipe memory into CI / cron scripts | CLI | Composable, --json output, no SDK dependency in the pipeline.
A support bot serving 10,000 customers | SDK / REST | One agent_id, 10,000 user_id values. Filters are additive.
Automation workflows in n8n | SDK / REST | HTTP calls, full scoping model, nothing to host yourself.
Audit or correct what an agent remembered about me | Memory Panel (in-app) | Human-in-the-loop: see, edit, forget any fact.
Migrating off Mem0 | SDK / REST | Same user_id / agent_id / run_id scoping. See the API reference.

Use them together

The surfaces aren't competing options. They are entry points into the same account. A setup we run ourselves: the SDK embedded in a product so each customer gets their own memory under their own user_id, the CLI in a nightly cron that adds a digest memory, and the REST API hit directly from a webhook handler. Three surfaces, one memory store. The memory the cron job writes at 2am is in the agent's search results at 9am.

One scoping model everywhere. user_id is your end user's identifier (free-form, e.g. "customer-4812", unlimited on every tier). agent_id is your application's namespace (e.g. "support-bot"). run_id is one session. The same three keys mean the same thing on every surface. Full detail in Memory model.
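
As an illustration of how the scoping keys travel over REST, the sketch below builds a GET /memories URL from whichever filters are set. The helper name list_memories_url is hypothetical; the filter names (user_id, agent_id, limit, offset) are the ones listed for GET /memories in the endpoint table on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.korely.ai/v1"


def list_memories_url(user_id=None, agent_id=None, limit=None, offset=None):
    """Build a GET /memories URL, leaving out any filter that is not set."""
    params = {"user_id": user_id, "agent_id": agent_id, "limit": limit, "offset": offset}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/memories" + (f"?{query}" if query else "")


print(list_memories_url(user_id="customer-4812", agent_id="support-bot", limit=20))
# https://api.korely.ai/v1/memories?user_id=customer-4812&agent_id=support-bot&limit=20
```

Passing both user_id and agent_id expresses "this end user, within this application"; dropping agent_id widens the request to everything stored for that user under the key.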

Why reads feel free. Most read operations are pure SQL lookups, zero AI calls. Search embeds the query (a fraction of a hundredth of a cent) and retrieves by semantic vector similarity (cosine over embeddings). Facts reads are deterministic and typically return in under 50 ms. The intelligence runs on the write path: embeddings, entity extraction, typed-fact extraction with contradiction checking, about a tenth of a cent per memory, all included. Details in Architecture.

REST endpoints

Every SDK method and CLI subcommand maps to exactly one endpoint. The base URL is https://api.korely.ai/v1. Auth is always Authorization: Bearer kor_live_....

Method | Path | What it does
POST | /memories | Write a memory. Runs the full pipeline: embed, entity extract, fact extract with contradiction check.
GET | /memories | List memories in a scope, newest first. Filters: user_id, agent_id, limit, offset.
GET | /memories/{id} | Full content, metadata, and extracted facts for one memory.
PATCH | /memories/{id} | Update content. Re-runs extraction. Pass expected_updated_at for optimistic concurrency (409 on stale write).
DELETE | /memories/{id} | Forget one memory. Audited invalidation, not a hard delete. Returns an audit id.
GET | /memories/{id}/history | Lifecycle of one memory: created, updated, forgotten, plus every fact it produced.
POST | /memories/search | Semantic vector search (cosine over embeddings). Returns {results:[...]} with a snippet per hit. Body: query, user_id, agent_id, limit.
GET | /facts | Typed (subject, predicate, object) triples, the bi-temporal core. Filters: entity, subject, predicate, predicate_family, as_of, include_invalidated, limit. Available on every tier. Returns flat JSON {facts:[...],total}.
POST | /facts | Write a typed fact directly. Contradiction check still runs. Bi-temporal: pass valid_from for historical facts.
GET | /context | The moat recall path: a fact-assembled, prompt-ready context block within a token budget. Query: query, user_id, agent_id, token_budget (default 800).
POST | /batch | Bulk import up to 500 memory objects, processed async. Returns {id, status, received}.
GET | /batch/{id} | Status and result counts for a batch job: {id, status, received, imported, failed, errors}.
GET | /users | End users with stored data, as objects {user_id, memories, facts, last_active}. Paginated.
DELETE | /users/{end_user}/memories | GDPR bulk forget: every memory and fact for one end user, one audit record. Returns 200 + {user_id, memories_forgotten, facts_invalidated, audit_id}.
GET | /agents | Agent namespaces tied to the key, with memory counts.
DELETE | /agents/{id} | Delete an agent namespace and all its memories.
GET | /ping | Auth check. Requires the auth header like every endpoint. Returns {"ok":true,"tier":"...","scopes":[...]}.
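
The optimistic-concurrency rule on PATCH /memories/{id} is easier to see in a toy model. The sketch below is a local simulation, not the service: FakeStore and StaleWrite are stand-ins we made up, and an integer counter plays the role of the updated_at timestamp. It shows why the second of two writers who read the same version gets rejected instead of silently overwriting the first.

```python
class StaleWrite(Exception):
    """Local stand-in for the 409 stale_write response."""


class FakeStore:
    """In-memory stand-in for one memory record; not the real service."""

    def __init__(self, content):
        self.content = content
        self.updated_at = 1  # integer counter standing in for a timestamp

    def update(self, content, expected_updated_at):
        # Reject the write if the caller's view is older than the current record.
        if expected_updated_at < self.updated_at:
            raise StaleWrite("409 stale_write")
        self.content = content
        self.updated_at += 1
        return self.updated_at


store = FakeStore("Prefers invoices as PDF")
seen_by_a = store.updated_at  # writer A reads the record
seen_by_b = store.updated_at  # writer B reads the same version

store.update("Prefers invoices as PDF, sent monthly", expected_updated_at=seen_by_a)

try:
    store.update("Prefers paper invoices", expected_updated_at=seen_by_b)
    outcome = "overwrote A's change"
except StaleWrite:
    outcome = "rejected; B must re-read and retry"

print(outcome)  # rejected; B must re-read and retry
```

Against the real API the recovery step would be the same in spirit: fetch the memory again, take its current updated_at, and resubmit the change.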

SDK method reference

Python uses snake_case; Node.js uses camelCase for the same methods. The table below gives the Python name first; where the Node.js name differs, it follows after a slash.

Method (Python / Node.js) | Signature (key params) | REST
add() | content, *, user_id, agent_id, run_id, metadata | POST /memories
search() | query, user_id, agent_id, limit | POST /memories/search
get_all() / getAll() | user_id, agent_id, limit, offset | GET /memories
get(id) | id | GET /memories/{id}
update(id, ...) | id, content, expected_updated_at | PATCH /memories/{id}
delete(id) | id | DELETE /memories/{id}
delete_all() / deleteAll() | user_id | DELETE /users/{end_user}/memories
history(id) | id | GET /memories/{id}/history
get_facts() / getFacts() | entity, subject, predicate, predicate_family, as_of, include_invalidated, limit | GET /facts
add_fact_triple() / addFactTriple() | subject, predicate, object, *, user_id, agent_id, subject_type, valid_from | POST /facts
get_context() / getContext() | query, user_id, agent_id, token_budget | GET /context
get_profile() / getProfile() | user_id, as_of | GET /profile
batch() | items: list of add-shaped objects | POST /batch
batch_status() / batchStatus() | job_id | GET /batch/{id}
users() | agent_id, limit, offset | GET /users

Every method is documented with examples in the SDK deep dive. The REST shapes are in the API reference.
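
Because POST /batch accepts at most 500 memory objects per call, a larger import has to be split first. The helper below is a minimal sketch of that splitting step in plain Python; chunk_for_batch is a name we chose, not part of the SDK, and the item fields are illustrative.

```python
from typing import Dict, Iterator, List

MAX_BATCH_SIZE = 500  # limit stated for POST /batch


def chunk_for_batch(items: List[Dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1,200 add-shaped objects: content plus scoping keys.
items = [{"content": f"note {i}", "user_id": "customer-4812"} for i in range(1200)]
print([len(chunk) for chunk in chunk_for_batch(items)])  # [500, 500, 200]
```

Each chunk would then go to batch(), with batch_status() polled for the returned job id; since batches are processed asynchronously, collecting the ids first and polling afterwards is one reasonable arrangement.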

CLI command reference

The korely command ships inside the Python package. All subcommands accept --user-id, --agent-id, and --json unless noted. The facts subcommand has additional filter flags.

Command | Key flags | One-line example
korely auth | --api-key | korely auth (verifies the key and prints base URL + end-user count)
korely add "..." | --user-id --agent-id --run-id | korely add "Prefers PDF invoices" --user-id customer-4812
korely search "..." | --user-id --agent-id --limit --json | korely search "invoice prefs" --user-id customer-4812 --limit 5
korely context "..." | --user-id --agent-id --json | korely context "budget decisions" --user-id alice | ollama run llama3.1 "summarize"
korely facts | --user-id --agent-id --entity --subject --family --as-of --include-invalidated --limit --json | korely facts --user-id customer-4812 --family preferences --json
korely profile | --user-id --json | korely profile --user-id customer-4812
korely get <id> | --json | korely get mem_8f2c1a
korely users | --agent-id --limit --json | korely users --json | jq '.users[].user_id'
korely delete <id> | --json | korely delete mem_8f2c1a
korely delete-all | --user-id --yes --json | korely delete-all --user-id customer-4812 --yes

Every subcommand is documented with all flags and worked examples in the CLI deep dive.

Error codes

Every error response carries the same flat envelope: {"code":"<slug>","message":"<text>"} — never an error or detail field. The SDK raises a typed exception per status (with APIError as the base for anything else); the CLI exits non-zero and prints the error JSON to stderr.

Status | Code | SDK exception | When it happens
401 | invalid_key | AuthenticationError | Missing, malformed, or revoked API key.
403 | agent_cap_exceeded | NamespaceForbiddenError | A new agent_id would exceed your plan's agent limit.
404 | not_found | NotFoundError | Memory or fact id does not exist, or was forgotten.
409 | stale_write | StaleWriteError | update called with an expected_updated_at older than the current record.
422 | invalid_request | APIError | Request body failed schema validation. Message is flat, e.g. "content: Field required".
429 | quota_exceeded | QuotaExceededError | Monthly write or query usage went past the quota plus its 10% grace allowance. The write-quota 429 carries no Retry-After; only a rate-limit 429 sends Retry-After (integer seconds, header).
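
For direct REST callers (the SDK already raises typed exceptions), the flat envelope and the Retry-After rule can be handled in a few lines. This is a sketch under stated assumptions: KorelyError, parse_error, and should_retry are names we made up, the sample messages are invented, and the rate_limited slug is a placeholder because the code used for rate limiting is not listed on this page.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class KorelyError:
    status: int
    code: str
    message: str
    retry_after: Optional[int] = None  # seconds; only sent on rate-limit 429s


def parse_error(status: int, body: str, headers: Mapping[str, str]) -> KorelyError:
    """Turn an error response into a small record. Hypothetical helper."""
    envelope = json.loads(body)  # flat shape: {"code": ..., "message": ...}
    raw = headers.get("Retry-After")  # assumes the caller normalised header casing
    retry_after = int(raw) if raw is not None and raw.isdigit() else None
    return KorelyError(status, envelope["code"], envelope["message"], retry_after)


def should_retry(err: KorelyError) -> bool:
    """Retry only when the server said when to come back."""
    return err.status == 429 and err.retry_after is not None


# Sample bodies are illustrative; "rate_limited" is a placeholder slug.
quota = parse_error(429, '{"code":"quota_exceeded","message":"monthly write quota exceeded"}', {})
limited = parse_error(429, '{"code":"rate_limited","message":"too many requests"}', {"Retry-After": "2"})
print(should_retry(quota), should_retry(limited))  # False True
```

Treating a quota 429 as non-retryable follows from the table: without Retry-After, waiting a few seconds will not help until the month rolls over or the plan changes.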

There is no overage billing. At 80% of quota you receive an email and a quota.warning webhook. Past 100%, a grace allowance of 10% above the tier limit still applies; writes and searches beyond that soft cap return 429 until the month rolls over or you upgrade. Your bill is always exactly the tier price.
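
The thresholds above reduce to simple arithmetic. The sketch below classifies the n-th request of a month against a tier limit; it is our reading of the rules, not server code, and the exact boundary handling (whether the request that lands precisely on the soft cap is served) is an assumption.

```python
def quota_state(count: int, limit: int) -> str:
    """Classify the count-th request of the month, using integer arithmetic."""
    if count * 10 > limit * 11:   # beyond the soft cap (limit + 10%)
        return "blocked"          # the request would return 429
    if count > limit:
        return "grace"            # past 100% but within the 10% allowance
    if count * 10 >= limit * 8:   # 80% of quota reached
        return "warning"          # email + quota.warning webhook
    return "ok"


HOBBY_WRITES = 1_000  # Hobby tier write quota from the pricing table
for count in (799, 800, 1_000, 1_001, 1_100, 1_101):
    print(count, quota_state(count, HOBBY_WRITES))
# 799 ok
# 800 warning
# 1000 warning
# 1001 grace
# 1100 grace
# 1101 blocked
```

On the Hobby tier this puts the warning at write 800 and the first rejected write at 1,101, with the bill unchanged throughout.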

End-to-end example

A customer support agent that remembers preferences from one conversation and recalls them in the next. One agent namespace, one end user, three calls total.

from korely_memory import Korely

korely = Korely(api_key="kor_live_...", region="eu")

# ── Conversation 1: store what the customer told us ──────────────────────────
memory = korely.add(
    "Prefers invoices as PDF, replies fastest before 10am CET",
    user_id="customer-4812",
    agent_id="support-bot",
    metadata={"source": "support-chat"},
)
print(memory.id)          # mem_8f2c1a
print(len(memory.facts))  # 2, e.g. "prefers PDF invoices", "active before 10am CET"

# ── Conversation 2 (next day): assemble context before the model call ────────
ctx = korely.get_context(
    query="billing preferences",
    user_id="customer-4812",
    agent_id="support-bot",
    token_budget=400,
)
messages = [
    {"role": "system", "content": f"You are a helpful billing assistant.\n\n{ctx.context}"},
    {"role": "user", "content": "Can you send me the invoice?"},
]
# → model sees: "customer-4812 prefers invoices as PDF (from 2026-06-11)"
# → model replies: "I'll send the invoice to you as a PDF right away."

# ── GDPR: customer asks to be forgotten ─────────────────────────────────────
receipt = korely.delete_all(user_id="customer-4812")
print(receipt.memories_forgotten)  # 1
print(receipt.facts_invalidated)   # 2
print(receipt.audit_id)            # aud_3d0f

Pricing

Four tiers, all priced in EUR. Quotas reset monthly. End users (user_id values) are unlimited on every tier.

Tier | Price | Write quota | Query quota | Agent cap
Hobby | Free | 1,000 writes/mo | 25,000 queries/mo | 2 agents
Developer | €19/mo | 5,000 writes/mo | 250,000 queries/mo | 10 agents
Team | €79/mo | 25,000 writes/mo | 1,000,000 queries/mo | 100 agents
Scale | €249/mo | 75,000 writes/mo | 10,000,000 queries/mo | 500 agents

No overage billing. A soft +10% grace period applies past the monthly quota limit; requests that exceed that cap return 429. Typed facts (get_facts, add_fact_triple, as_of queries) — the bi-temporal core — are available on every tier, including the free Hobby tier. Reading them costs only a query against your quota.

Related

  • SDK deep dive — every method with full signatures, scoping rules, error handling, and worked examples.
  • CLI deep dive — every subcommand with all flags, JSON output, and shell pipeline patterns.
  • API reference — the full REST contract, request and response shapes, endpoint by endpoint.
  • Memory model — how user_id, agent_id, and run_id scoping works across all surfaces.
  • Temporal facts — the bi-temporal model behind get_facts and as_of.
  • Architecture — why writes cost and reads are nearly free.