Migrate from Supermemory
If you built on Supermemory, the Korely API will feel familiar. The verbs
match: add a memory, search memories, list them, delete them. The main
difference is scoping. Supermemory scopes everything with one free-form
containerTag; Korely names the dimensions:
user_id identifies your end user, agent_id
identifies your application, run_id identifies a session.
Switching is mechanical: map the calls, replay the corpus through the
batch endpoint, verify with a search.
The short version: three steps. Export your memories
with Supermemory's list endpoint, import them with
POST /v1/batch, then run a search against the migrated
corpus. Entity extraction, graph edges, and typed facts build
automatically on ingest. There is nothing to enable and nothing to
configure.
Concept mapping
Before touching code, map the mental model. Supermemory has one scoping dimension and one document type. Korely names three dimensions and adds a typed fact layer on top of raw memories.
| Supermemory concept | Korely equivalent | Notes |
|---|---|---|
| containerTag | user_id | Free-form string. End users are unlimited on every plan. |
| Compound tag, e.g. supportbot_customer-4812 | agent_id="supportbot" + user_id="customer-4812" | Split once at migration time; filters are additive after that. |
| customId (dedup key) | metadata.original_id | Carry it in metadata; no first-class dedup key in Korely. |
| Document (raw text) | Memory | Same unit. POST /v1/memories or korely.add(). |
| Extracted fact (search_mode="memories") | Typed fact (GET /v1/facts) | Auto-built at ingest. Subject + predicate + object triple, bi-temporal. |
| Project / workspace | agent_id | One API key, multiple logical applications via agent_id. |
| Memory history | GET /v1/memories/{id}/history | Also available via korely.history(memory_id). |
Call mapping
Every core Supermemory call has a direct Korely equivalent, in the Python SDK and over REST. The full REST contract is in the API reference.
| Supermemory call | Korely SDK | Korely REST |
|---|---|---|
| client.add(content="...", container_tag="u1") | korely.add("...", user_id="u1") | POST /v1/memories |
| client.search.memories(q="...", container_tag="u1") | korely.search("...", user_id="u1") | POST /v1/memories/search |
| client.documents.list(container_tags=["u1"]) | korely.get_all(user_id="u1") | GET /v1/memories?user_id=u1 |
| GET /v3/documents/:id | korely.get(memory_id) | GET /v1/memories/:id |
| DELETE /v3/documents/:id | korely.delete(memory_id) | DELETE /v1/memories/:id |
| search_mode="memories" (extracted facts) | korely.get_facts(user_id="u1") | GET /v1/facts?user_id=u1 |
To clear one end user entirely, call korely.delete_all(user_id="u1"),
which maps to DELETE /v1/users/:user_id/memories and deletes every
memory for that user in one call.
On scoping: a containerTag that holds an end-user identifier
becomes user_id, free-form, and end users are unlimited on
every tier. If your tags are compound, for example
supportbot_customer-4812, split them: the application part
becomes agent_id, the user part becomes user_id,
and a session identifier becomes run_id. Filters are
additive, so you keep the same isolation with named dimensions instead of
string conventions. One Supermemory detail to remember while exporting:
memory search takes containerTag (singular) while the
document list endpoint takes containerTags (an array).
If you used customId for deduplication, carry it over inside
metadata so every imported memory keeps a pointer to its
original record.
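The split described above can be sketched as a small pure function. This is a minimal sketch, assuming your compound tags use a single underscore between the application part and the user part (as in supportbot_customer-4812) and that the application part never contains an underscore. The names split_compound_tag and to_korely_memory are hypothetical helpers for illustration, not part of either SDK; only the metadata.original_id convention comes from the mapping table.

```python
from typing import Optional


def split_compound_tag(tag: str, sep: str = "_") -> dict:
    """Split 'app_user' into agent_id and user_id; a tag without sep is a plain user_id."""
    # Assumption: the application part never contains sep, so split on the first one.
    app, found, user = tag.partition(sep)
    if not found or not app or not user:
        return {"user_id": tag}
    return {"agent_id": app, "user_id": user}


def to_korely_memory(content: str, tag: str, custom_id: Optional[str] = None,
                     metadata: Optional[dict] = None) -> dict:
    """Build one memory object with named scope and the old dedup key kept in metadata."""
    meta = dict(metadata or {})
    if custom_id is not None:
        meta["original_id"] = custom_id  # pointer back to the Supermemory record
    return {"content": content, "metadata": meta, **split_compound_tag(tag)}


memory = to_korely_memory(
    "Prefers invoices as PDF", "supportbot_customer-4812", custom_id="inv-001"
)
print(memory)
```

If your tags use a different separator or put the user part first, adjust the split rule before importing; the scope is much easier to get right once at migration time than to repair afterwards.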
In code, the switch looks like this:
# before
from supermemory import Supermemory
client = Supermemory(api_key="...")
client.add(content="Prefers invoices as PDF", container_tag="customer-4812")
# after
from korely_memory import Korely
korely = Korely(api_key="kor_live_...")
korely.add("Prefers invoices as PDF", user_id="customer-4812")
results = korely.search("invoice preferences", user_id="customer-4812")
The Python and Node SDKs are live on PyPI and npm as korely-memory.
See the SDK reference for the full method list.
Step 1: export from Supermemory
Supermemory's list endpoint, POST /v3/documents/list, returns
your memories page by page (up to 200 per page) with their
containerTags and metadata. It does not return
the full text, so fetch each document body with
GET /v3/documents/:id and write everything to one JSON file:
import json, requests
SM = "https://api.supermemory.ai"
headers = {"Authorization": "Bearer sm_..."}

export, page = [], 1
while True:
    r = requests.post(
        f"{SM}/v3/documents/list",
        headers=headers,
        json={"containerTags": ["customer-4812"], "limit": 200, "page": page},
    )
    batch = r.json().get("memories", [])
    for doc in batch:
        full = requests.get(f"{SM}/v3/documents/" + doc["id"], headers=headers).json()
        export.append(
            {
                "content": full["content"],
                "containerTags": doc.get("containerTags", []),
                "metadata": doc.get("metadata") or {},
            }
        )
    if len(batch) < 200:
        break
    page += 1

json.dump(export, open("supermemory_export.json", "w"))
print(len(export), "memories exported")
Run it once per container tag, or drop the containerTags
filter to export the whole project in one pass.
Step 2: import with the batch endpoint
POST /v1/batch accepts up to 500 memory objects per request,
same shape as POST /v1/memories, processed asynchronously.
You get a job id back and poll it until it completes. Each imported item
runs the full write pipeline: document and chunk embeddings, entity
extraction on our own infrastructure, typed-fact extraction with
contradiction checking and bi-temporal validity. That is about a tenth of a
cent of intelligence per memory, all included in your plan.
import json, requests
export = json.load(open("supermemory_export.json"))
def to_scope(tags):
    # one containerTag per end user: the tag is the user_id
    return {"user_id": tags[0]} if tags else {}

memories = [
    {
        "content": item["content"],
        "metadata": item.get("metadata", {}),
        **to_scope(item.get("containerTags", [])),
    }
    for item in export
]

# POST /v1/batch takes up to 500 memories per request
for i in range(0, len(memories), 500):
    r = requests.post(
        "https://api.korely.ai/v1/batch",
        headers={"Authorization": "Bearer kor_live_..."},
        json={"memories": memories[i : i + 500]},
    )
    print(r.json())  # {"id": "bat_9b21f4", "status": "processing", "received": 500}
If your tags are compound, replace to_scope with a split:
agent_id from the application part, user_id from
the user part. Poll the job until it reports completed:
curl https://api.korely.ai/v1/batch/bat_9b21f4 \
  -H "Authorization: Bearer kor_live_..."
# 200 OK
{"id": "bat_9b21f4", "status": "completed", "received": 500, "imported": 500, "failed": 0, "errors": []}
Keep the import inspectable. Tag migrated memories in
metadata, for example
{"source": "supermemory_export"}, so you can filter them
with GET /v1/memories?user_id=u1 and audit exactly what
came over.
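The polling step in this section can also be done from Python. The following is a minimal sketch, not an SDK feature: wait_for_batch is a hypothetical helper, and it assumes that any status other than "processing" is terminal (only "processing" and "completed" are documented above). The status fetcher is passed in as a function so the loop can be exercised without network access; the canned responses below stand in for GET /v1/batch/{id}.

```python
import time
from typing import Callable


def wait_for_batch(fetch_status: Callable[[], dict], interval: float = 2.0,
                   timeout: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job leaves 'processing' or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        # Assumption: anything other than 'processing' is a terminal status.
        if job.get("status") != "processing":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {job.get('id')} still processing after {timeout}s")
        sleep(interval)


# Stand-in for GET /v1/batch/{id}: two 'processing' responses, then 'completed'.
_responses = iter([
    {"id": "bat_9b21f4", "status": "processing", "received": 500},
    {"id": "bat_9b21f4", "status": "processing", "received": 500},
    {"id": "bat_9b21f4", "status": "completed", "received": 500,
     "imported": 500, "failed": 0, "errors": []},
])
job = wait_for_batch(lambda: next(_responses), sleep=lambda seconds: None)
print(job["status"], job["imported"], job["failed"])
```

In a real import, fetch_status would wrap a requests.get call against https://api.korely.ai/v1/batch/ plus the job id, with the same Authorization header as the curl example. Check failed and errors on the returned job before moving on to verification.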
Step 3: verify recall
The primary recall path on Korely is GET /v1/context: it
assembles a compact answer from the typed facts and relevant memories it
holds about one end user, with the source ids attached. This is the call
you put in front of your agent's model — it returns the active facts first,
not a raw list to re-rank. Point it at a query you know your old corpus can
answer:
curl "https://api.korely.ai/v1/context?query=invoice%20preferences&user_id=customer-4812" \
  -H "Authorization: Bearer kor_live_..."
# 200 OK
{
  "context": "## Known facts\n- customer-4812 likes PDF invoices (since 2026-06-15)\n\n## Relevant memories\n- Prefers invoices as PDF, replies fastest before 10am CET.",
  "tokens": 30,
  "sources": ["fct_b91e", "mem_7c20de"]
}
The context string is fact-assembled from the bi-temporal
layer, and sources mixes the fact ids
(fct_) and memory ids (mem_) that backed it. The
only model call on the read path is the query embedding, a fraction of a
hundredth of a cent. No generative model composes the output. Your agent's
own model does the reasoning over what Korely returns.
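When auditing a migration it can help to separate the fact ids from the memory ids in sources. A minimal sketch, assuming only the two prefixes named above (fct_ and mem_); split_sources is a hypothetical helper, and the "other" bucket is there so an unexpected id is surfaced rather than silently dropped.

```python
def split_sources(sources: list) -> dict:
    """Group source ids from a /v1/context response by their id prefix."""
    groups = {"facts": [], "memories": [], "other": []}
    for source_id in sources:
        if source_id.startswith("fct_"):
            groups["facts"].append(source_id)
        elif source_id.startswith("mem_"):
            groups["memories"].append(source_id)
        else:
            groups["other"].append(source_id)
    return groups


# The sources array from the example response above.
grouped = split_sources(["fct_b91e", "mem_7c20de"])
print(grouped)
```

The memory ids can then be looked up with GET /v1/memories/:id to confirm they carry the metadata you attached at import time.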
If you want the raw matches instead — the closest Supermemory analogue —
POST /v1/memories/search returns semantic vector matches
(cosine similarity over the stored embeddings) scoped to one end user. Use
it as a secondary path when you want the underlying memories rather than the
assembled context:
curl -X POST https://api.korely.ai/v1/memories/search \
  -H "Authorization: Bearer kor_live_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "invoice preferences", "user_id": "customer-4812", "limit": 5}'
# 200 OK
{
  "results": [
    {
      "id": "mem_7c20de",
      "score": 0.91,
      "snippet": "Prefers invoices as PDF, replies fastest before 10am CET.",
      "user_id": "customer-4812",
      "agent_id": null,
      "metadata": {"source": "supermemory_export"}
    }
  ]
}
If you relied on search_mode="memories" for extracted facts,
the equivalent check is GET /v1/facts?user_id=customer-4812:
extraction runs on ingest, so the typed layer fills in as the batch
completes.
Gotchas
A few things that catch migrating teams off guard:
- Back-dating is REST-only, not on the SDK add() convenience. Supermemory lets you back-date a memory with a timestamp at write time. Over REST, POST /v1/memories accepts an optional timestamp field, stored as the fact's valid_from, so you can preserve original write dates during a bulk import. The SDK add() convenience method does not expose it, and the batch endpoint has no per-item timestamp. If you need temporal ordering on the SDK path, encode the original date in metadata and use it in your prompts.
- search() has no time_filter or include_history. Supermemory accepts date ranges in search. Korely search takes query, user_id, agent_id, and limit only. For temporal queries use GET /v1/facts?as_of=2025-01-01 on the typed-fact layer.
- Agent cap is plan-limited; end users are not. If you register more than your plan's agent_id limit you receive a 403 agent_cap_exceeded response. End users (user_id) are always unlimited. Keep application identifiers in agent_id and per-customer identifiers in user_id.
- Write quota is enforced; reads are generous. Each plan has a monthly write quota (memories added or updated). When you hit it the API returns 429 with body {"code": "quota_exceeded", "message": "..."}. The write-quota 429 carries no Retry-After header; you clear it by upgrading or waiting for the monthly reset, not by retrying. (Only the separate per-second rate-limit 429 carries a Retry-After header, in integer seconds.) Queries are an order of magnitude more generous. The batch endpoint counts each memory in the job individually against the write quota, so import in off-peak hours if your corpus is large. Quotas per plan are on the pricing page.
- Optimistic concurrency on update(). PATCH /v1/memories/{id} accepts an optional expected_updated_at field. If the memory was modified between your read and your write, the API returns 409 stale_write. Re-fetch and retry. This is a guard against clobber in concurrent agent environments; Supermemory does not have an equivalent.
- Batch jobs are asynchronous. POST /v1/batch returns a job id immediately with status processing. Entity extraction, graph edges, and typed facts are populated when the job completes. Poll GET /v1/batch/{id} before running your verification search, or you will search a partially-imported corpus.
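The error cases in the list above call for different reactions, which can be sketched as one decision function. This is a minimal sketch under stated assumptions: next_action is a hypothetical helper, the body shape with a "code" field is documented above only for the write-quota 429 (the 409 and 403 bodies are assumed to look the same), and headers is treated as a plain dict with the exact key "Retry-After" (HTTP header names are case-insensitive, so normalize first if your client does not).

```python
def next_action(status: int, headers: dict, body: dict) -> dict:
    """Map an error response to what the caller should do next."""
    code = body.get("code")
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            # Per-second rate limit: wait the stated integer seconds, then retry.
            return {"action": "retry", "wait_seconds": int(retry_after)}
        # No Retry-After header: monthly write quota. Retrying will not clear it.
        return {"action": "stop", "reason": code or "quota_exceeded"}
    if status == 409:
        # stale_write: the memory changed after it was read; re-fetch, then write again.
        return {"action": "refetch_and_retry", "reason": code or "stale_write"}
    if status == 403:
        # For example agent_cap_exceeded: a plan limit, not a transient failure.
        return {"action": "stop", "reason": code or "forbidden"}
    return {"action": "raise", "reason": code or f"http_{status}"}


print(next_action(429, {"Retry-After": "3"}, {}))
print(next_action(429, {}, {"code": "quota_exceeded", "message": "..."}))
print(next_action(409, {}, {"code": "stale_write"}))
```

The point of the split is that only the rate-limit 429 and the 409 are worth retrying automatically; a quota or agent-cap response should stop the import loop and surface to a human.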
What your agents gain
A typed knowledge graph
Every imported memory goes through entity extraction. People, companies,
products, places, and concepts become nodes; relations between them become
typed edges, drawn from ~50 canonical predicates in 9 families
(preferences, people, places, work, ownership, health, financial, events,
other). The graph is traversable via the SDK and REST: fetch all memories
for a user, or query facts directly with korely.get_facts() /
GET /v1/facts. Graph reads are pure SQL and graph lookups
with zero AI calls, reusing stored vectors rather than computing new
embeddings.
# retrieve typed facts for a user — flat list, the graph edge layer
facts = korely.get_facts(user_id="customer-4812")
# each fact normalizes the verb: predicate "likes", predicate_raw "prefers"
# facts[0].subject "customer-4812", facts[0].predicate "likes", facts[0].object "PDF invoices"
# or via REST → JSON {"facts": [...], "total": N}
# GET /v1/facts?user_id=customer-4812
# list all memories: GET /v1/memories?user_id=customer-4812
Bi-temporal facts
Each memory also yields typed (subject, predicate, object) triples, and
every triple carries valid_from and invalid_at.
When new information contradicts an existing fact, a two-stage
contradiction check at write time invalidates the old fact and keeps it as
history. Default reads return only what is true now; pass
include_invalidated=true for the full supersede chain, or
as_of on GET /v1/facts for a point-in-time view.
Facts reads are deterministic SQL, typically under 50 ms. The temporal
engine runs on every tier; GET /v1/facts and
korely.get_facts() work on every plan, including the free
hobby tier. The full model is in
temporal facts.
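To make the point-in-time view concrete, here is a local model of what an as_of read selects. It is an illustration only, not Korely's implementation: the Fact class and facts_as_of are hypothetical, the sample history is made up, and the interval rule (valid_from inclusive, invalid_at exclusive) is an assumption about boundary behavior that the text above does not specify.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    invalid_at: Optional[date] = None  # None means the fact is still active


def facts_as_of(facts: list, as_of: date) -> list:
    """Return facts with valid_from <= as_of < invalid_at (open-ended if invalid_at is None)."""
    return [
        f for f in facts
        if f.valid_from <= as_of and (f.invalid_at is None or as_of < f.invalid_at)
    ]


# Made-up supersede chain: the newer fact invalidated the older one on 2026-06-15.
history = [
    Fact("customer-4812", "likes", "paper invoices", date(2024, 3, 1),
         invalid_at=date(2026, 6, 15)),
    Fact("customer-4812", "likes", "PDF invoices", date(2026, 6, 15)),
]

earlier = [f.object for f in facts_as_of(history, date(2025, 1, 1))]
later = [f.object for f in facts_as_of(history, date(2026, 7, 1))]
print(earlier, later)
```

Under these assumptions the same history answers differently depending on the date asked about, which is what keeping invalidated facts as history buys you.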
Memory your end users can see
Korely ships a desktop and web app where the human sees the same memory your agents use. The Memory Panel lists every fact; each one can be edited or forgotten, with an audit cascade behind the delete, and an Entity Profile drawer shows everything known about one entity. A fact a user erases there does not come back to agents, on any read path. If agents carry long-term memory about people, the people get the delete button. Read human in the loop for the full model.
Underneath all surfaces: reads are retrieval, not generation. The intelligence runs once, at write time, which is why read quotas are an order of magnitude more generous than write quotas. Quotas for each plan are on the pricing page.
Next steps
- Quickstart: wire Korely into your stack and run your first search in five minutes.
- API reference: the complete REST contract behind every call in the mapping table.
- Architecture: where the write-time intelligence runs and why the read path stays deterministic.
- Questions about a large corpus or an unusual export shape? Email [email protected]. We read every message.