Korely

Migrate from Supermemory

If you built on Supermemory, the Korely API will feel familiar. The verbs match: add a memory, search memories, list them, delete them. The main difference is scoping. Supermemory scopes everything with one free-form containerTag; Korely names the dimensions: user_id identifies your end user, agent_id identifies your application, run_id identifies a session. Switching is mechanical: map the calls, replay the corpus through the batch endpoint, verify with a search.

The short version: three steps. Export your memories with Supermemory's list endpoint, import them with POST /v1/batch, then run a search against the migrated corpus. Entity extraction, graph edges, and typed facts build automatically on ingest. There is nothing to enable and nothing to configure.

Concept mapping

Before touching code, map the mental model. Supermemory has one scoping dimension and one document type. Korely names three dimensions and adds a typed fact layer on top of raw memories.

| Supermemory concept | Korely equivalent | Notes |
| --- | --- | --- |
| containerTag | user_id | Free-form string. End users are unlimited on every plan. |
| Compound tag, e.g. supportbot_customer-4812 | agent_id="supportbot" + user_id="customer-4812" | Split once at migration time; filters are additive after that. |
| customId (dedup key) | metadata.original_id | Carry it in metadata; no first-class dedup key in Korely. |
| Document (raw text) | Memory | Same unit. POST /v1/memories or korely.add(). |
| Extracted fact (search_mode="memories") | Typed fact (GET /v1/facts) | Auto-built at ingest. Subject + predicate + object triple, bi-temporal. |
| Project / workspace | agent_id | One API key, multiple logical applications via agent_id. |
| Memory history | GET /v1/memories/{id}/history | Also available via korely.history(memory_id). |

Call mapping

Every core Supermemory call has a direct Korely equivalent, in the Python SDK and over REST. The full REST contract is in the API reference.

| Supermemory call | Korely SDK | Korely REST |
| --- | --- | --- |
| client.add(content="...", container_tag="u1") | korely.add("...", user_id="u1") | POST /v1/memories |
| client.search.memories(q="...", container_tag="u1") | korely.search("...", user_id="u1") | POST /v1/memories/search |
| client.documents.list(container_tags=["u1"]) | korely.get_all(user_id="u1") | GET /v1/memories?user_id=u1 |
| GET /v3/documents/:id | korely.get(memory_id) | GET /v1/memories/:id |
| DELETE /v3/documents/:id | korely.delete(memory_id) | DELETE /v1/memories/:id |
| search_mode="memories" (extracted facts) | korely.get_facts(user_id="u1") | GET /v1/facts?user_id=u1 |

To clear one end user entirely, call korely.delete_all(user_id="u1"). It maps to DELETE /v1/users/:user_id/memories and deletes every memory for that user in one call.

On scoping: a containerTag that holds an end-user identifier becomes user_id, free-form, and end users are unlimited on every tier. If your tags are compound, for example supportbot_customer-4812, split them: the application part becomes agent_id, the user part becomes user_id, and a session identifier becomes run_id. Filters are additive, so you keep the same isolation with named dimensions instead of string conventions. One Supermemory detail to remember while exporting: memory search takes containerTag (singular) while the document list endpoint takes containerTags (an array).
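The split described above can be done once, in the migration script. The sketch below assumes your compound tags follow an "application_user" convention with an underscore separator, and that the application part never contains an underscore. `split_compound_tag` is a hypothetical helper, not part of either SDK; adjust the separator to your own tags.

```python
def split_compound_tag(tag: str, sep: str = "_") -> dict:
    """Map a compound containerTag to Korely scope fields.

    Assumes the convention "<application><sep><user>" (an assumption,
    not something either API enforces).
    """
    app, found, user = tag.partition(sep)
    if not (found and app and user):
        # No separator, or an empty half: treat the whole tag as the end user.
        return {"user_id": tag}
    return {"agent_id": app, "user_id": user}


print(split_compound_tag("supportbot_customer-4812"))
# {'agent_id': 'supportbot', 'user_id': 'customer-4812'}
print(split_compound_tag("customer-4812"))
# {'user_id': 'customer-4812'}
```

Splitting on the first separator only keeps user identifiers that themselves contain underscores intact.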

If you used customId for deduplication, carry it over inside metadata so every imported memory keeps a pointer to its original record.
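One way to carry the key over is sketched below. It assumes each exported Supermemory document exposes its key in a `customId` field (add it to your export if you rely on it). Because Korely has no first-class dedup key, the sketch also drops repeats on the client before import. Both helpers are hypothetical names, not SDK methods.

```python
def carry_custom_id(doc: dict) -> dict:
    """Return metadata for the imported memory, keeping the old dedup key."""
    metadata = dict(doc.get("metadata") or {})
    custom_id = doc.get("customId")  # assumed field name on the exported document
    if custom_id:
        metadata["original_id"] = custom_id
    return metadata


def dedupe_by_original_id(memories: list) -> list:
    """Keep the first memory seen for each metadata.original_id."""
    seen, unique = set(), []
    for memory in memories:
        key = (memory.get("metadata") or {}).get("original_id")
        if key is not None:
            if key in seen:
                continue  # same original record, already queued for import
            seen.add(key)
        unique.append(memory)
    return unique
```

Memories without an `original_id` are always kept, since there is nothing to compare them on.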

In code, the switch looks like this:

korely_memory.py python
# before
from supermemory import Supermemory
client = Supermemory(api_key="...")
client.add(content="Prefers invoices as PDF", container_tag="customer-4812")

# after
from korely_memory import Korely
korely = Korely(api_key="kor_live_...")
korely.add("Prefers invoices as PDF", user_id="customer-4812")

results = korely.search("invoice preferences", user_id="customer-4812")

The Python and Node SDKs are live on PyPI and npm as korely-memory. See the SDK reference for the full method list.

Step 1: export from Supermemory

Supermemory's list endpoint, POST /v3/documents/list, returns your memories page by page (up to 200 per page) with their containerTags and metadata. It does not return the full text, so fetch each document body with GET /v3/documents/:id and write everything to one JSON file:

import json
import requests

SM = "https://api.supermemory.ai"
headers = {"Authorization": "Bearer sm_..."}
PAGE_SIZE = 200  # maximum page size for the list endpoint

export, page = [], 1
while True:
    # List one page of documents for this container tag (no full text yet).
    r = requests.post(
        f"{SM}/v3/documents/list",
        headers=headers,
        json={"containerTags": ["customer-4812"], "limit": PAGE_SIZE, "page": page},
    )
    r.raise_for_status()
    batch = r.json().get("memories", [])
    for doc in batch:
        # The list response omits the body, so fetch each document in full.
        detail = requests.get(f"{SM}/v3/documents/{doc['id']}", headers=headers)
        detail.raise_for_status()
        full = detail.json()
        export.append(
            {
                "content": full["content"],
                "containerTags": doc.get("containerTags", []),
                "metadata": doc.get("metadata") or {},
            }
        )
    if len(batch) < PAGE_SIZE:  # a short page is the last page
        break
    page += 1

with open("supermemory_export.json", "w") as f:
    json.dump(export, f)
print(len(export), "memories exported")

Run it once per container tag, or drop the containerTags filter to export the whole project in one pass.

Step 2: import with the batch endpoint

POST /v1/batch accepts up to 500 memory objects per request, same shape as POST /v1/memories, processed asynchronously. You get a job id back and poll it until it completes. Each imported item runs the full write pipeline: document and chunk embeddings, entity extraction on our own infrastructure, typed-fact extraction with contradiction checking and bi-temporal validity. About a tenth of a cent of intelligence per memory, all included in your plan.

import json
import requests

with open("supermemory_export.json") as f:
    export = json.load(f)


def to_scope(tags):
    # one containerTag per end user: the tag is the user_id
    return {"user_id": tags[0]} if tags else {}


memories = [
    {
        "content": item["content"],
        # keep the original metadata and mark the record as migrated
        "metadata": {**(item.get("metadata") or {}), "source": "supermemory_export"},
        **to_scope(item.get("containerTags", [])),
    }
    for item in export
]

# POST /v1/batch takes up to 500 memories per request
for i in range(0, len(memories), 500):
    r = requests.post(
        "https://api.korely.ai/v1/batch",
        headers={"Authorization": "Bearer kor_live_..."},
        json={"memories": memories[i : i + 500]},
    )
    r.raise_for_status()
    print(r.json())  # {"id": "bat_9b21f4", "status": "processing", "received": 500}

If your tags are compound, replace to_scope with a split: agent_id from the application part, user_id from the user part. Poll the job until it reports completed:

curl https://api.korely.ai/v1/batch/bat_9b21f4 \
-H "Authorization: Bearer kor_live_..."
# 200 OK
{"id": "bat_9b21f4", "status": "completed", "received": 500, "imported": 500, "failed": 0, "errors": []}
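The same poll can be scripted. This is a minimal sketch, assuming that any status other than "processing" is terminal (only "processing" and "completed" appear above) and using hypothetical helper names `fetch_job` and `wait_for_batch`. The fetch step is passed in as a callable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_job(job_id: str, api_key: str) -> dict:
    """GET /v1/batch/{id} and return the decoded job report."""
    req = urllib.request.Request(
        f"https://api.korely.ai/v1/batch/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_batch(
    fetch: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job leaves "processing" or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job.get("status") != "processing":
            return job  # assumed terminal: "completed" or anything else
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {job.get('id')} still processing after {timeout}s")
        sleep(interval)
```

A typical call would be `wait_for_batch(lambda: fetch_job("bat_9b21f4", "kor_live_..."))`, run once per job id returned by the import loop.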

Keep the import inspectable. Tag migrated memories in metadata, for example {"source": "supermemory_export"}, so that when you list a user's memories with GET /v1/memories?user_id=u1 you can tell exactly which ones came over.
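Before moving on, it can help to reconcile the completed job reports against the export. The sketch below relies only on the `imported`, `failed`, and `errors` fields shown in the job report above; `reconcile` is a hypothetical helper.

```python
def reconcile(job_reports: list, expected: int) -> dict:
    """Sum completed batch reports and compare them with the export size."""
    imported = sum(job.get("imported", 0) for job in job_reports)
    failed = sum(job.get("failed", 0) for job in job_reports)
    errors = [err for job in job_reports for err in job.get("errors", [])]
    return {
        "expected": expected,
        "imported": imported,
        "failed": failed,
        # memories that no report accounts for at all
        "unaccounted": expected - imported - failed,
        "errors": errors,
    }
```

A non-zero `failed` or `unaccounted` value is the signal to inspect `errors` and re-send those items before verifying recall.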

Step 3: verify recall

The primary recall path on Korely is GET /v1/context: it assembles a compact answer from the typed facts and relevant memories it holds about one end user, with the source ids attached. This is the call you put in front of your agent's model — it returns the active facts first, not a raw list to re-rank. Point it at a query you know your old corpus can answer:

curl "https://api.korely.ai/v1/context?query=invoice%20preferences&user_id=customer-4812" \
-H "Authorization: Bearer kor_live_..."
# 200 OK
{
"context": "## Known facts\n- customer-4812 likes PDF invoices (since 2026-06-15)\n\n## Relevant memories\n- Prefers invoices as PDF, replies fastest before 10am CET.",
"tokens": 30,
"sources": ["fct_b91e", "mem_7c20de"]
}

The context string is fact-assembled from the bi-temporal layer, and sources mixes the fact ids (fct_) and memory ids (mem_) that backed it. The only model call on the read path is the query embedding, a fraction of a hundredth of a cent. No generative model composes the output. Your agent's own model does the reasoning over what Korely returns.
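To illustrate how that response might be placed in front of your agent's model, here is a minimal sketch. It assumes only the `context` and `sources` fields shown in the response above; the prompt wording and both helper names are hypothetical.

```python
def build_prompt(ctx: dict, question: str) -> str:
    """Prepend Korely's assembled context to the user's question."""
    return (
        "Use the memory context below when it is relevant.\n\n"
        f"{ctx['context']}\n\n"
        f"User question: {question}"
    )


def split_sources(ctx: dict) -> tuple:
    """Separate fact ids (fct_) from memory ids (mem_) for logging or audit."""
    sources = ctx.get("sources", [])
    facts = [s for s in sources if s.startswith("fct_")]
    memories = [s for s in sources if s.startswith("mem_")]
    return facts, memories


ctx = {
    "context": "## Known facts\n- customer-4812 likes PDF invoices (since 2026-06-15)",
    "tokens": 30,
    "sources": ["fct_b91e", "mem_7c20de"],
}
print(split_sources(ctx))
# (['fct_b91e'], ['mem_7c20de'])
```

Logging the split source ids alongside each model call makes it possible to trace an answer back to the fact or memory that backed it.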

If you want the raw matches instead — the closest Supermemory analogue — POST /v1/memories/search returns semantic vector matches (cosine similarity over the stored embeddings) scoped to one end user. Use it as a secondary path when you want the underlying memories rather than the assembled context:

curl -X POST https://api.korely.ai/v1/memories/search \
-H "Authorization: Bearer kor_live_..." \
-H "Content-Type: application/json" \
-d '{"query": "invoice preferences", "user_id": "customer-4812", "limit": 5}'
# 200 OK
{
"results": [
{"id": "mem_7c20de", "score": 0.91,
"snippet": "Prefers invoices as PDF, replies fastest before 10am CET.",
"user_id": "customer-4812", "agent_id": null,
"metadata": {"source": "supermemory_export"}}
]
}
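A small recall check over that response shape might look like the following. It is a sketch, assuming the `results`, `snippet`, and `score` fields shown above; `recall_hits` and the phrases you pass in are your own.

```python
def recall_hits(response: dict, expected_phrases: list, min_score: float = 0.0) -> dict:
    """Report, per phrase, whether any returned snippet contains it."""
    snippets = [
        result["snippet"].lower()
        for result in response.get("results", [])
        if result.get("score", 0.0) >= min_score
    ]
    return {
        phrase: any(phrase.lower() in snippet for snippet in snippets)
        for phrase in expected_phrases
    }


response = {
    "results": [
        {"id": "mem_7c20de", "score": 0.91,
         "snippet": "Prefers invoices as PDF, replies fastest before 10am CET."}
    ]
}
print(recall_hits(response, ["invoices as PDF", "wire transfer"]))
# {'invoices as PDF': True, 'wire transfer': False}
```

Running it with a handful of phrases you know were answerable in the old corpus gives a quick pass or fail per phrase after the import.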

If you relied on search_mode="memories" for extracted facts, the equivalent check is GET /v1/facts?user_id=customer-4812: extraction runs on ingest, so the typed layer fills in as the batch completes.

Gotchas

A few things that catch migrating teams off guard:

  • Back-dating is REST-only and per memory. Supermemory lets you back-date a memory with a timestamp at write time. Over REST, POST /v1/memories accepts an optional timestamp field, stored as the fact's valid_from, so you can preserve original write dates by sending those memories one request at a time. The SDK add() convenience method does not expose it, and the batch endpoint has no per-item timestamp. If you import through the SDK or the batch endpoint and need temporal ordering, encode the original date in metadata and use it in your prompts.
  • search() has no time_filter or include_history. Supermemory accepts date ranges in search. Korely search takes query, user_id, agent_id, and limit only. For temporal queries use GET /v1/facts?as_of=2025-01-01 on the typed-fact layer.
  • Agent cap is plan-limited; end users are not. If you register more than your plan's agent_id limit you receive a 403 agent_cap_exceeded response. End users (user_id) are always unlimited. Keep application identifiers in agent_id and per-customer identifiers in user_id.
  • Write quota is enforced; reads are generous. Each plan has a monthly write quota (memories added or updated). When you hit it, the API returns 429 with body {"code": "quota_exceeded", "message": "..."}. The write-quota 429 carries no Retry-After header; you clear it by upgrading or waiting for the monthly reset, not by retrying. (Only the separate per-second rate-limit 429 carries a Retry-After header, in integer seconds.) Queries are an order of magnitude more generous. The batch endpoint counts each memory in the job individually against the write quota, so check that your remaining quota covers the corpus, and run large imports in off-peak hours. Quotas per plan are on the pricing page.
  • Optimistic concurrency on update(). PATCH /v1/memories/{id} accepts an optional expected_updated_at field. If the memory was modified between your read and your write, the API returns 409 stale_write. Re-fetch and retry. This guards against one agent overwriting another's change in concurrent environments; Supermemory does not have an equivalent.
  • Batch jobs are asynchronous. POST /v1/batch returns a job id immediately with status processing. Entity extraction, graph edges, and typed facts are populated when the job completes. Poll GET /v1/batch/{id} before running your verification search, or you will search a partially-imported corpus.
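The error cases in this list can be folded into one decision function in your writer. The sketch below assumes the 409 and 403 responses carry their error name in the same `code` body field as the quota example (the text above only shows that field for 429), and it falls back to a one-second wait when a rate-limit response has no Retry-After header. `next_action` is a hypothetical helper.

```python
def next_action(status: int, headers: dict, body: dict) -> tuple:
    """Decide what a writer should do after a Korely write response.

    Returns (action, wait_seconds); action is one of
    "ok", "wait", "refetch", "stop", or "error".
    """
    # Header names are case-insensitive, so normalise before looking one up.
    lowered = {key.lower(): value for key, value in headers.items()}
    code = body.get("code")

    if 200 <= status < 300:
        return ("ok", None)
    if status == 429:
        if code == "quota_exceeded":
            # Monthly write quota: retrying cannot succeed before the reset.
            return ("stop", None)
        # Per-second rate limit: honour Retry-After (integer seconds).
        retry_after = lowered.get("retry-after")
        return ("wait", int(retry_after) if retry_after is not None else 1)
    if status == 409 and code == "stale_write":
        # Someone else updated the memory: re-read it, then retry the PATCH.
        return ("refetch", None)
    if status == 403 and code == "agent_cap_exceeded":
        return ("stop", None)
    return ("error", None)
```

Keeping the quota case separate from the rate-limit case is the important part: the first should halt the import, while the second only pauses it.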

What your agents gain

A typed knowledge graph

Every imported memory goes through entity extraction. People, companies, products, places, and concepts become nodes; relations between them become typed edges, drawn from ~50 canonical predicates in 9 families (preferences, people, places, work, ownership, health, financial, events, other). The graph is traversable via the SDK and REST: fetch all memories for a user, or query facts directly with korely.get_facts() / GET /v1/facts. Graph reads are pure SQL and graph lookups with zero AI calls, reusing stored vectors rather than computing new embeddings.

korely_memory.py python
# retrieve typed facts for a user — flat list, the graph edge layer
facts = korely.get_facts(user_id="customer-4812")
# each fact normalizes the verb: predicate "likes", predicate_raw "prefers"
# facts[0].subject "customer-4812", facts[0].predicate "likes", facts[0].object "PDF invoices"

# or via REST → JSON {"facts": [...], "total": N}
# GET /v1/facts?user_id=customer-4812
# list all memories: GET /v1/memories?user_id=customer-4812

Bi-temporal facts

Each memory also yields typed (subject, predicate, object) triples, and every triple carries valid_from and invalid_at. When new information contradicts an existing fact, a two-stage contradiction check at write time invalidates the old fact and keeps it as history. Default reads return only what is true now; pass include_invalidated=true for the full supersede chain, or as_of on GET /v1/facts for a point-in-time view. Facts reads are deterministic SQL, typically under 50 ms. The temporal engine runs on every tier; GET /v1/facts and korely.get_facts() work on every plan, including the free hobby tier. The full model is in temporal facts.
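To make the validity window concrete, here is a client-side sketch of what a point-in-time read selects. It is illustrative only: the server does this for you through as_of. The sketch assumes ISO dates (YYYY-MM-DD), a null invalid_at for facts that are still true, and a half-open window that includes valid_from and excludes invalid_at. The example facts and the `active_at` helper are hypothetical.

```python
from datetime import date


def active_at(facts: list, as_of: date) -> list:
    """Return the facts whose validity window contains as_of."""

    def parse(value):
        return date.fromisoformat(value) if value else None

    selected = []
    for fact in facts:
        start = parse(fact.get("valid_from"))
        end = parse(fact.get("invalid_at"))
        started = start is None or start <= as_of
        not_ended = end is None or as_of < end
        if started and not_ended:
            selected.append(fact)
    return selected


facts = [
    {"subject": "customer-4812", "predicate": "likes", "object": "paper invoices",
     "valid_from": "2025-01-10", "invalid_at": "2026-06-15"},
    {"subject": "customer-4812", "predicate": "likes", "object": "PDF invoices",
     "valid_from": "2026-06-15", "invalid_at": None},
]
print([f["object"] for f in active_at(facts, date(2025, 6, 1))])
# ['paper invoices']
print([f["object"] for f in active_at(facts, date(2026, 7, 1))])
# ['PDF invoices']
```

The superseded fact stays in the list with its invalid_at set, which is what lets a historical read return it while a default read does not.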

Memory your end users can see

Korely ships a desktop and web app where the human sees the same memory your agents use. The Memory Panel lists every fact; each one can be edited or forgotten, with an audit cascade behind the delete, and an Entity Profile drawer shows everything known about one entity. A fact a user erases there does not come back to agents, on any read path. If agents carry long-term memory about people, the people get the delete button. Read human in the loop for the full model.

Underneath all surfaces: reads are retrieval, not generation. The intelligence runs once, at write time, which is why read quotas are an order of magnitude more generous than write quotas. Quotas for each plan are on the pricing page.

Next steps

  • Quickstart: wire Korely into your stack and run your first search in five minutes.
  • API reference: the complete REST contract behind every call in the mapping table.
  • Architecture: where the write-time intelligence runs and why the read path stays deterministic.
  • Questions about a large corpus or an unusual export shape? Email [email protected]. We read every message.