Surfaces

Korely has one memory and three ways to reach it. Every surface shares the same account, the same scoping model (user_id / agent_id / run_id), and the same permission model. You don't have to pick one: anything you write through one surface is visible from the others.

The three agent surfaces are: the REST API at https://api.korely.ai/v1 (the foundation everything else builds on), the SDK (korely-memory for Python and Node, wraps the REST API), and the CLI (ships inside the Python package, pipeable into any shell).

Install / connect

Every surface points at the same base URL: https://api.korely.ai/v1, EU-hosted in Helsinki. Auth is a single header on every request: Authorization: Bearer kor_live_.... Get a key from the dashboard after signup.

pip install korely-memory
# Python 3.9+; no heavy deps — embeddings and extraction run server-side

npm install korely-memory
# Node 18+; same package name, same method names (camelCase)

pip install korely-memory
# The korely command ships inside the Python package
korely auth   # verifies key + prints base URL and end-user count

# No install — just curl or any HTTP client
# Every endpoint, including /v1/ping, requires the auth header
curl https://api.korely.ai/v1/ping \
  -H "Authorization: Bearer kor_live_..."
# {"ok":true,"tier":"hobby","scopes":["memories:read","memories:write"]}
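
The same check can be sketched with Python's standard library alone. This is an illustration, not part of the korely-memory package: the helper name `build_ping_request` is made up here, and the sketch only builds the request so it runs without a real key.

```python
import urllib.request

BASE_URL = "https://api.korely.ai/v1"  # base URL as documented above


def build_ping_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET /ping request."""
    return urllib.request.Request(
        f"{BASE_URL}/ping",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_ping_request("kor_live_...")  # placeholder key, as in the curl example
print(req.get_method(), req.full_url)
# To send it, pass req to urllib.request.urlopen() and read the response body.
```

Keeping request construction separate from sending makes the auth header easy to verify in a unit test before any network call happens.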

Write from anywhere (any surface): your product via the SDK, a cron job via the CLI, a pipeline via REST.

One account (your memory store): cloud store (Postgres + pgvector), EU-hosted; bi-temporal facts; one permission model.

Read from anywhere (every other surface): the CLI writes at 2am, the SDK reads at 9am; your app writes a fact, your agent recalls it.

Capability matrix

Operation	CLI	SDK / REST
Search memories	✓	✓
Read a memory, related items	✓	✓
Typed facts (bi-temporal) ¹	✓	✓
Write memories ²	✓	✓
Batch import ³	—	✓
Point-in-time facts (`as_of`)	✓	✓
Delete all memories for a `user_id`	✓	✓
Answer generation ⁴	—	—

¹ Typed facts, the bi-temporal core, are available on every tier, including the free Hobby tier. Reading facts costs only a query against your quota.
² The CLI exposes korely add for writing a new memory and korely delete / korely delete-all for removal.
³ Bulk import (POST /v1/batch) is REST/SDK-only. Point-in-time fact queries (as_of) work on every surface, including the CLI (korely facts --as-of). See the API reference.
⁴ No surface generates answers, by design. Reads are retrieval, not generation: no model composes output on the read path, so your agent's own model does the reasoning, and read quotas are 25 to 133 times larger than write quotas, depending on tier. Full detail in Architecture.

The SDK is available for Python and Node.js now.

CLI

The scripting surface. Search, read, add, and delete memories from any terminal, pipeable into anything. Reach for it when you already know what to pull: cron jobs, CI prep, editor commands, shell pipelines.

$ korely search "pricing decisions" --limit 3
1.  Pricing review June         score 0.92
2.  Plan change approval        score 0.87
3.  Q2 budget retro             score 0.81

$ korely search "open action items" --json > weekly.json
$ jq '.results[0].snippet' weekly.json
"Open actions, week 24"

# Pipe results into any local model
$ korely search "customer feedback" | ollama run llama3.1 "summarize"
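
When the consumer of `--json` output is a Python script rather than jq, the same extraction might look like the sketch below. It assumes only the `{results:[{snippet}]}` shape used in the jq example; the helper name `top_snippets` and the second sample snippet are invented for illustration.

```python
import json


def top_snippets(payload: str, n: int = 3) -> list[str]:
    """Return the snippet of the first n hits from `korely search --json` output."""
    data = json.loads(payload)
    return [hit["snippet"] for hit in data.get("results", [])[:n]]


# Hand-written sample in the documented shape, not real service output.
sample = '{"results": [{"snippet": "Open actions, week 24"}, {"snippet": "Retro notes"}]}'
print(top_snippets(sample, n=1))
```

In a pipeline, `payload` would come from the file written by `korely search ... --json > weekly.json` or from the command's stdout.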

Read the CLI deep dive → Install, every command, --json for scripts, and all flags.

SDK / REST

The product surface. Per-customer memory inside your own application: developer API keys, user_id scoping for your end users (unlimited on every tier), agent_id namespaces for your app, run_id for sessions. One call to write, one call to read.

from korely_memory import Korely

korely = Korely(api_key="kor_live_...")  # EU-hosted, Helsinki

# Write: graph + typed-fact extraction included in the call
korely.add(
    "Prefers invoices as PDF, replies fastest before 10am CET",
    user_id="customer-4812",  # your end user, unlimited on every tier
    agent_id="billing-bot",    # your app's namespace
)

# Recall: fact-assembled, prompt-ready context block
ctx = korely.get_context(query="invoice preferences", user_id="customer-4812")
print(ctx.context)  # active typed facts + relevant memories; no model generates text on the read path

Read the SDK deep dive → Client setup, scoping, facts, and the full REST API reference behind it.

Which surface should I use?

You are scripting or automating (cron, CI, shell pipelines, editor commands): use the CLI. Composable, --json output, no SDK dependency in the pipeline.
You are building a product with memory per customer: use the SDK. API keys, user_id scoping, quotas.
You want direct HTTP control (any language, any runtime): call the REST API directly. Same auth header, same scoping model.

Use case	Best surface	Why
Per-customer memory inside your product	SDK / REST	`user_id` scoping, API keys, quotas — end users unlimited on every tier.
Pipe memory into CI / cron scripts	CLI	Composable, `--json` output, no SDK dependency in the pipeline.
A support bot serving 10,000 customers	SDK / REST	One `agent_id`, 10,000 `user_id` values. Filters are additive.
Automation workflows in n8n	SDK / REST	HTTP calls, full scoping model, nothing to host yourself.
Audit or correct what an agent remembered about me	Memory Panel (in-app)	Human-in-the-loop: see, edit, forget any fact.
Migrating off Mem0	SDK / REST	Same `user_id` / `agent_id` / `run_id` scoping. See the API reference.

Use them together

The surfaces aren't competing options. They are entry points into the same account. A setup we run ourselves: the SDK embedded in a product so each customer gets their own memory under their own user_id, the CLI in a nightly cron that adds a digest memory, and the REST API hit directly from a webhook handler. Three surfaces, one memory store. The memory the cron job writes at 2am is in the agent's search results at 9am.

One scoping model everywhere. user_id is your end user's identifier (free-form, e.g. "customer-4812", unlimited on every tier). agent_id is your application's namespace (e.g. "support-bot"). run_id is one session. The same three keys mean the same thing on every surface. Full detail in Memory model.
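
One way to keep the three keys consistent across calls is to hold them in a single value object. The `Scope` class below is a hypothetical helper, not something the SDK ships; it only shows how unset keys can be dropped before they are passed on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical helper: one place to hold the three scoping keys."""
    user_id: str
    agent_id: Optional[str] = None
    run_id: Optional[str] = None

    def as_params(self) -> dict:
        """Return only the keys that are set, ready to pass as keyword arguments."""
        fields = {"user_id": self.user_id, "agent_id": self.agent_id, "run_id": self.run_id}
        return {k: v for k, v in fields.items() if v is not None}


scope = Scope(user_id="customer-4812", agent_id="support-bot")
print(scope.as_params())
# With the SDK this would be used as: korely.add("...", **scope.as_params())
```

Because the same three names are used on every surface, the same dictionary could also feed a REST body or be mapped onto CLI flags.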

Why reads feel free. Most read operations are pure SQL lookups, zero AI calls. Search embeds the query (a fraction of a hundredth of a cent) and retrieves by semantic vector similarity (cosine over embeddings). Facts reads are deterministic and typically return in under 50 ms. The intelligence runs on the write path: embeddings, entity extraction, typed-fact extraction with contradiction checking, about a tenth of a cent per memory, all included. Details in Architecture.
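
To make "cosine over embeddings" concrete, here is a self-contained toy ranking. The three-dimensional vectors and memory names are invented; real embeddings are much larger and are produced server-side, so this is only a sketch of the similarity measure, not of Korely's search implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), or 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "embeddings" for a query and three stored memories.
query = [1.0, 0.0, 1.0]
memories = {
    "pricing review": [1.0, 0.0, 1.0],
    "budget retro": [1.0, 1.0, 0.0],
    "lunch menu": [0.0, 1.0, 0.0],
}
ranked = sorted(memories, key=lambda name: cosine(query, memories[name]), reverse=True)
print(ranked)
```

The memory pointing in the same direction as the query ranks first, and the orthogonal one ranks last; vector length does not affect the score.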

REST endpoints

Every SDK method and CLI subcommand maps to exactly one endpoint. The base URL is https://api.korely.ai/v1. Auth is always Authorization: Bearer kor_live_....

Method	Path	What it does
`POST`	`/memories`	Write a memory. Runs the full pipeline: embed, entity extract, fact extract with contradiction check.
`GET`	`/memories`	List memories in a scope, newest first. Filters: `user_id`, `agent_id`, `limit`, `offset`.
`GET`	`/memories/{id}`	Full content, metadata, and extracted facts for one memory.
`PATCH`	`/memories/{id}`	Update content. Re-runs extraction. Pass `expected_updated_at` for optimistic concurrency (409 on stale write).
`DELETE`	`/memories/{id}`	Forget one memory. Audited invalidation, not a hard delete. Returns an audit id.
`GET`	`/memories/{id}/history`	Lifecycle of one memory: created, updated, forgotten, plus every fact it produced.
`POST`	`/memories/search`	Semantic vector search (cosine over embeddings). Returns `{results:[...]}` with a `snippet` per hit. Body: `query`, `user_id`, `agent_id`, `limit`.
`GET`	`/facts`	Typed (subject, predicate, object) triples — the bi-temporal core. Filters: `entity`, `subject`, `predicate`, `predicate_family`, `as_of`, `include_invalidated`, `limit`. Available on every tier. Returns flat JSON `{facts:[...],total}`.
`POST`	`/facts`	Write a typed fact directly. Contradiction check still runs. Bi-temporal: pass `valid_from` for historical facts.
`GET`	`/profile`	Profile for one end user, as referenced by `get_profile()` in the SDK method reference. Query: `user_id`, `as_of`.
`GET`	`/context`	The primary recall path: a fact-assembled, prompt-ready context block within a token budget. Query: `query`, `user_id`, `agent_id`, `token_budget` (default 800).
`POST`	`/batch`	Bulk import up to 500 memory objects, processed async. Returns `{id, status, received}`.
`GET`	`/batch/{id}`	Status and result counts for a batch job: `{id, status, received, imported, failed, errors}`.
`GET`	`/users`	End users with stored data, as objects `{user_id, memories, facts, last_active}`. Paginated.
`DELETE`	`/users/{end_user}/memories`	GDPR bulk forget: every memory and fact for one end user, one audit record. Returns 200 + `{user_id, memories_forgotten, facts_invalidated, audit_id}`.
`GET`	`/agents`	Agent namespaces tied to the key, with memory counts.
`DELETE`	`/agents/{id}`	Delete an agent namespace and all its memories.
`GET`	`/ping`	Auth check. Requires the auth header like every endpoint. Returns `{"ok":true,"tier":"...","scopes":[...]}`.
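
The `as_of` filter on `GET /facts` is easier to picture with a small local model. The sketch below is an in-memory illustration, not the service's schema: the `valid_to` field and the predicate name are hypothetical, and it covers only the valid-time axis, whereas a bi-temporal store also tracks when each fact was recorded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_to: Optional[date] = None  # hypothetical field: None means still active


def facts_as_of(facts: list[Fact], as_of: date) -> list[Fact]:
    """Keep facts whose validity interval [valid_from, valid_to) contains as_of."""
    return [
        f for f in facts
        if f.valid_from <= as_of and (f.valid_to is None or as_of < f.valid_to)
    ]


# Invented history: the preference changed on 2026-06-11.
history = [
    Fact("customer-4812", "prefers_invoice_format", "paper", date(2025, 1, 1), date(2026, 6, 11)),
    Fact("customer-4812", "prefers_invoice_format", "PDF", date(2026, 6, 11)),
]
print([f.object for f in facts_as_of(history, date(2026, 1, 15))])
print([f.object for f in facts_as_of(history, date(2026, 7, 1))])
```

A query dated before the change returns the old preference and a query dated after it returns the new one, which is the behavior a point-in-time read is meant to give.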

SDK method reference

Python uses snake_case; Node.js uses camelCase for the same methods. The table below gives the Python name first; where the Node.js name differs, it follows after a slash.

Method (Python / Node.js)	Signature (key params)	REST
`add()`	`content, *, user_id, agent_id, run_id, metadata`	`POST /memories`
`search()`	`query, user_id, agent_id, limit`	`POST /memories/search`
`get_all() / getAll()`	`user_id, agent_id, limit, offset`	`GET /memories`
`get(id)`	`id`	`GET /memories/{id}`
`update(id, ...)`	`id, content, expected_updated_at`	`PATCH /memories/{id}`
`delete(id)`	`id`	`DELETE /memories/{id}`
`delete_all() / deleteAll()`	`user_id`	`DELETE /users/{end_user}/memories`
`history(id)`	`id`	`GET /memories/{id}/history`
`get_facts() / getFacts()`	`entity, subject, predicate, predicate_family, as_of, include_invalidated, limit`	`GET /facts`
`add_fact_triple() / addFactTriple()`	`subject, predicate, object, *, user_id, agent_id, subject_type, valid_from`	`POST /facts`
`get_context() / getContext()`	`query, user_id, agent_id, token_budget`	`GET /context`
`get_profile() / getProfile()`	`user_id, as_of`	`GET /profile`
`batch()`	`items: list of add-shaped objects`	`POST /batch`
`batch_status() / batchStatus()`	`job_id`	`GET /batch/{id}`
`users()`	`agent_id, limit, offset`	`GET /users`

Every method is documented with examples in the SDK deep dive. The REST shapes are in the API reference.
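
Since `POST /batch` accepts up to 500 memory objects per call, a larger import has to be split client-side. A minimal sketch of that chunking, assuming add-shaped objects carry `content` and `user_id`; the helper name `chunked` is illustrative.

```python
from typing import Iterator

MAX_BATCH = 500  # per-call limit stated for POST /batch


def chunked(items: list[dict], size: int = MAX_BATCH) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1,200 add-shaped objects become three calls: 500 + 500 + 200.
items = [{"content": f"note {i}", "user_id": "customer-4812"} for i in range(1200)]
print([len(chunk) for chunk in chunked(items)])
# With the SDK, each chunk would presumably go to korely.batch(chunk),
# followed by polling batch_status() with the returned job id.
```

Batches are processed asynchronously, so a real importer would keep the returned job ids and check each one before reporting the import as complete.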

CLI command reference

The korely command ships inside the Python package. All subcommands accept --user-id, --agent-id, and --json unless noted. The facts subcommand has additional filter flags.

Command	Key flags	One-line example
`korely auth`	`--api-key`	`korely auth` — verify key and print base URL + end-user count
`korely add "..."`	`--user-id --agent-id --run-id`	`korely add "Prefers PDF invoices" --user-id customer-4812`
`korely search "..."`	`--user-id --agent-id --limit --json`	`korely search "invoice prefs" --user-id customer-4812 --limit 5`
`korely context "..."`	`--user-id --agent-id --json`	`korely context "budget decisions" --user-id alice | ollama run llama3.1 "summarize"`
`korely facts`	`--user-id --agent-id --entity --subject --family --as-of --include-invalidated --limit --json`	`korely facts --user-id customer-4812 --family preferences --json`
`korely profile`	`--user-id --json`	`korely profile --user-id customer-4812`
`korely get <id>`	`--json`	`korely get mem_8f2c1a`
`korely users`	`--agent-id --limit --json`	`korely users --json | jq '.users[].user_id'`
`korely delete <id>`	`--json`	`korely delete mem_8f2c1a`
`korely delete-all`	`--user-id --yes --json`	`korely delete-all --user-id customer-4812 --yes`

Every subcommand is documented with all flags and worked examples in the CLI deep dive.
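
For scripts that drive the CLI from Python, building the argument list separately from running it keeps the flags testable. The sketch below uses only flags listed in the table; the function names are hypothetical, and it assumes `--json` prints a single JSON object on stdout and that `--as-of` takes a date string.

```python
import json
import subprocess
from typing import Optional


def facts_argv(user_id: str, family: Optional[str] = None, as_of: Optional[str] = None) -> list[str]:
    """Build the argument list for `korely facts` using flags from the table above."""
    argv = ["korely", "facts", "--user-id", user_id, "--json"]
    if family:
        argv += ["--family", family]
    if as_of:
        argv += ["--as-of", as_of]
    return argv


def run_facts(user_id: str, **filters) -> dict:
    """Run the CLI and parse stdout; raises CalledProcessError on a non-zero exit."""
    done = subprocess.run(facts_argv(user_id, **filters), capture_output=True, text=True, check=True)
    return json.loads(done.stdout)


print(facts_argv("customer-4812", family="preferences"))
```

Passing a list to `subprocess.run` avoids shell quoting problems when a user id or filter value contains spaces.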

Error codes

Every error response carries the same flat envelope: {"code":"<slug>","message":"<text>"} — never an error or detail field. The SDK raises a typed exception per status (with APIError as the base for anything else); the CLI exits non-zero and prints the error JSON to stderr.

Status	Code	SDK exception	When it happens
`401`	`invalid_key`	`AuthenticationError`	Missing, malformed, or revoked API key.
`403`	`agent_cap_exceeded`	`NamespaceForbiddenError`	A new `agent_id` would exceed your plan's agent limit.
`404`	`not_found`	`NotFoundError`	Memory or fact id does not exist, or was forgotten.
`409`	`stale_write`	`StaleWriteError`	`update` with an `expected_updated_at` older than the current record.
`422`	`invalid_request`	`APIError`	Request body failed schema validation. Message is flat, e.g. `"content: Field required"`.
`429`	`quota_exceeded`	`QuotaExceededError`	Monthly write or query usage went past the quota plus its +10% grace allowance. The write-quota 429 carries no `Retry-After` header; only a rate-limit 429 sends `Retry-After` (integer seconds).
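
For callers using raw HTTP rather than the SDK, the flat envelope and the status table can be combined into a small lookup. This is a sketch: the mapping is copied from the table above, while the function name and output format are invented.

```python
import json

# Status -> SDK exception name, copied from the error table.
EXCEPTION_BY_STATUS = {
    401: "AuthenticationError",
    403: "NamespaceForbiddenError",
    404: "NotFoundError",
    409: "StaleWriteError",
    429: "QuotaExceededError",
}


def describe_error(status: int, body: str) -> str:
    """Turn a flat {"code", "message"} envelope into a one-line description."""
    envelope = json.loads(body)
    exc = EXCEPTION_BY_STATUS.get(status, "APIError")  # APIError is the documented base
    return f"{exc} [{status} {envelope['code']}]: {envelope['message']}"


print(describe_error(422, '{"code": "invalid_request", "message": "content: Field required"}'))
```

Because the envelope never nests an `error` or `detail` field, reading `code` and `message` directly is enough for logging or for choosing a retry strategy.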

There is no overage billing. At 80% of quota you receive an email and a quota.warning webhook. Past 100%, a soft +10% grace allowance applies; writes and searches beyond that soft cap (110% of the tier limit) return 429 until the month rolls over or you upgrade. Your bill is always exactly the tier price.
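
The thresholds in this paragraph can be written out as a small classifier. It is a sketch under assumptions: the state names are invented, and whether each boundary is inclusive or exclusive is a guess made here, not documented behavior.

```python
def quota_state(used: int, limit: int) -> str:
    """Classify monthly usage against the 80% / 100% / 110% thresholds."""
    if used * 100 > limit * 110:   # beyond the soft cap: requests return 429
        return "blocked"
    if used > limit:               # inside the +10% grace allowance
        return "grace"
    if used * 100 >= limit * 80:   # warning email and quota.warning webhook
        return "warning"
    return "ok"


# Hobby tier write quota: 1,000 per month, so the soft cap is 1,100.
for used in (500, 800, 1000, 1050, 1100, 1101):
    print(used, quota_state(used, 1000))
```

Integer arithmetic is used so the comparisons at exactly 80% and 110% are not affected by floating-point rounding.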

End-to-end example

A customer support agent that remembers preferences from one conversation and recalls them in the next. One agent namespace, one end user, three calls total.

from korely_memory import Korely

korely = Korely(api_key="kor_live_...", region="eu")

# ── Conversation 1: store what the customer told us ──────────────────────────
memory = korely.add(
    "Prefers invoices as PDF, replies fastest before 10am CET",
    user_id="customer-4812",
    agent_id="support-bot",
    metadata={"source": "support-chat"},
)
print(memory.id)      # mem_8f2c1a
print(len(memory.facts))  # 2 — e.g. "prefers PDF invoices", "active before 10am CET"

# ── Conversation 2 (next day): assemble context before the model call ────────
ctx = korely.get_context(
    query="billing preferences",
    user_id="customer-4812",
    agent_id="support-bot",
    token_budget=400,
)

messages = [
    {"role": "system", "content": f"You are a helpful billing assistant.\n\n{ctx.context}"},
    {"role": "user", "content": "Can you send me the invoice?"},
]
# → model sees: "customer-4812 prefers invoices as PDF (from 2026-06-11)"
# → model replies: "I'll send the invoice to you as a PDF right away."

# ── GDPR: customer asks to be forgotten ─────────────────────────────────────
receipt = korely.delete_all(user_id="customer-4812")
print(receipt.memories_forgotten)  # 1
print(receipt.facts_invalidated)   # 2
print(receipt.audit_id)            # aud_3d0f

import { Korely } from "korely-memory";

const korely = new Korely({ apiKey: "kor_live_...", region: "eu" });

// ── Conversation 1: store what the customer told us ──────────────────────────
const memory = await korely.add(
  "Prefers invoices as PDF, replies fastest before 10am CET",
  {
    user_id: "customer-4812",
    agent_id: "support-bot",
    metadata: { source: "support-chat" },
  },
);
console.log(memory.id);           // mem_8f2c1a
console.log(memory.facts.length); // 2

// ── Conversation 2 (next day): assemble context before the model call ────────
const ctx = await korely.getContext({
  query: "billing preferences",
  user_id: "customer-4812",
  agent_id: "support-bot",
  token_budget: 400,
});

const messages = [
  { role: "system", content: `You are a helpful billing assistant.\n\n${ctx.context}` },
  { role: "user", content: "Can you send me the invoice?" },
];
// → model sees: "customer-4812 prefers invoices as PDF (from 2026-06-11)"
// → model replies: "I'll send the invoice to you as a PDF right away."

// ── GDPR: customer asks to be forgotten ─────────────────────────────────────
const receipt = await korely.deleteAll({ user_id: "customer-4812" });
console.log(receipt.memories_forgotten); // 1
console.log(receipt.facts_invalidated);  // 2
console.log(receipt.audit_id);           // aud_3d0f

# Conversation 1: store a memory
curl -s -X POST https://api.korely.ai/v1/memories \
  -H "Authorization: Bearer kor_live_..." \
  -H "Content-Type: application/json" \
  -d '{"content":"Prefers invoices as PDF, replies fastest before 10am CET",
       "user_id":"customer-4812","agent_id":"support-bot"}' | jq .id
# "mem_8f2c1a"

# Conversation 2: get a prompt-ready context block
curl -s "https://api.korely.ai/v1/context?query=billing+preferences&user_id=customer-4812&agent_id=support-bot&token_budget=400" \
  -H "Authorization: Bearer kor_live_..." | jq .context
# "customer-4812 prefers invoices as PDF (from 2026-06-11)\n..."

# GDPR: forget everything for this customer
curl -s -X DELETE "https://api.korely.ai/v1/users/customer-4812/memories" \
  -H "Authorization: Bearer kor_live_..." | jq .audit_id
# "aud_3d0f"
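
When these URLs are assembled in code instead of typed by hand, the query string and the path segment need encoding. A standard-library sketch follows; the helper names are illustrative, and the endpoints and parameters are the ones used in the curl calls above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.korely.ai/v1"


def context_url(query: str, user_id: str, agent_id: str, token_budget: int = 800) -> str:
    """GET /context URL with the query string safely encoded."""
    params = {"query": query, "user_id": user_id, "agent_id": agent_id, "token_budget": token_budget}
    return f"{BASE_URL}/context?{urlencode(params)}"


def forget_user_url(user_id: str) -> str:
    """DELETE /users/{end_user}/memories URL; the id is percent-encoded as a path segment."""
    return f"{BASE_URL}/users/{quote(user_id, safe='')}/memories"


print(context_url("billing preferences", "customer-4812", "support-bot", token_budget=400))
print(forget_user_url("customer-4812"))
```

Encoding matters because user_id is free-form: an identifier containing a slash or a space would otherwise change the path or break the query string.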

Pricing

Four tiers, all priced in EUR. Quotas reset monthly. End users (user_id values) are unlimited on every tier.

Tier	Price	Write quota	Query quota	Agent cap
Hobby	Free	1,000 writes/mo	25,000 queries/mo	2 agents
Developer	€19/mo	5,000 writes/mo	250,000 queries/mo	10 agents
Team	€79/mo	25,000 writes/mo	1,000,000 queries/mo	100 agents
Scale	€249/mo	75,000 writes/mo	10,000,000 queries/mo	500 agents

No overage billing. A soft +10% grace period applies past the monthly quota limit; requests that exceed that cap return 429. Typed facts (get_facts, add_fact_triple, as_of queries) — the bi-temporal core — are available on every tier, including the free Hobby tier. Reading them costs only a query against your quota.
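
The quota figures can be checked with a few lines of arithmetic. The numbers below are copied from the pricing table; the `summarize` helper is illustrative, and applying the +10% allowance as a simple multiplier on the write quota is an assumption about how the soft cap is computed.

```python
# (write quota, query quota) per month, copied from the pricing table above.
TIERS = {
    "Hobby": (1_000, 25_000),
    "Developer": (5_000, 250_000),
    "Team": (25_000, 1_000_000),
    "Scale": (75_000, 10_000_000),
}


def summarize(writes: int, queries: int) -> tuple[int, int]:
    """Return (write soft cap with +10% grace, whole queries per write)."""
    return writes * 110 // 100, queries // writes


for name, (writes, queries) in TIERS.items():
    soft_cap, ratio = summarize(writes, queries)
    print(f"{name}: write soft cap {soft_cap:,}, about {ratio} queries per write")
```

The queries-per-write figure ranges from 25 on Hobby to 133 on Scale, which is a quick way to judge whether a read-heavy workload fits a tier.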

SDK deep dive — every method with full signatures, scoping rules, error handling, and worked examples.
CLI deep dive — every subcommand with all flags, JSON output, and shell pipeline patterns.
API reference — the full REST contract, request and response shapes, endpoint by endpoint.
Memory model — how user_id, agent_id, and run_id scoping works across all surfaces.
Temporal facts — the bi-temporal model behind get_facts and as_of.
Architecture — why writes cost and reads are nearly free.