Support
FAQs
The questions developers ask before they wire Korely in.
What is Korely Agents?
A managed cloud memory layer for AI agents. Your agent writes what it learns
with add() and pulls a prompt-ready block back with
get_context(), across sessions and scoped per end user. It is reachable
from the SDK, the CLI, or the REST API.
How is it different from a vector database?
A vector database stores embeddings and returns nearest neighbours. Korely
stores the memory and mines it into typed
(subject, predicate, object) facts, resolves contradictions over
time, and links entities in a graph. A read returns the current
truth, not just the closest text.
What is the difference between a memory and a fact?
A memory is the text you store ("Maria moved to the Pro
plan"). Facts are the typed triples extracted from it
(Maria — plan — Pro), each with a valid_from and an
invalid_at, so the timeline of what was true stays queryable.
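To make the memory/fact distinction concrete, here is a minimal sketch of how a bi-temporal fact could be modelled and queried on the client side. Only the field names valid_from and invalid_at and the (subject, predicate, object) shape come from the text above. The Fact class, the value_at helper, the half-open validity interval, the use of None for "still true", and the sample dates and earlier Hobby fact are all assumptions made for illustration, not Korely's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical client-side view of one extracted triple."""
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # assumption: None means "still true"


def value_at(facts: Iterable[Fact], subject: str, predicate: str,
             when: datetime) -> Optional[str]:
    """Return the object that held for (subject, predicate) at `when`.

    Assumes each fact is valid on the half-open interval
    [valid_from, invalid_at).
    """
    for fact in facts:
        if fact.subject != subject or fact.predicate != predicate:
            continue
        ended = fact.invalid_at is not None and fact.invalid_at <= when
        if fact.valid_from <= when and not ended:
            return fact.obj
    return None


# Illustrative timeline: the dates and the earlier Hobby fact are invented.
timeline = [
    Fact("Maria", "plan", "Hobby", datetime(2024, 1, 1), datetime(2024, 6, 1)),
    Fact("Maria", "plan", "Pro", datetime(2024, 6, 1)),
]
```

Asking value_at(timeline, "Maria", "plan", ...) for a date in March returns the superseded value, while a date in July returns the current one, which is the sense in which the timeline stays queryable.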
How do I keep one customer's data separate from another's?
Scope every call with user_id (your end user). agent_id
scopes one of your apps and run_id a single session. The scope is
enforced server-side on every query — cross-tenant reads are impossible by
construction, not by prompt discipline.
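As a sketch of what "scope every call" can look like in client code, the helper below builds a request body that always carries user_id and optionally agent_id and run_id. The function name scoped_memory is hypothetical, and placing the three ids next to content in one flat object is an assumption based on the batch example elsewhere on this page. The server remains the authority on scoping; this only keeps an unscoped write from leaving your code.

```python
from typing import Optional


def scoped_memory(content: str, user_id: str,
                  agent_id: Optional[str] = None,
                  run_id: Optional[str] = None) -> dict:
    """Build a memory payload that is always scoped to one end user.

    Hypothetical helper: the flat body layout is an assumption.
    """
    if not user_id:
        raise ValueError("user_id is required so the memory is scoped to one end user")
    body = {"content": content, "user_id": user_id}
    if agent_id is not None:
        body["agent_id"] = agent_id  # scopes to one of your apps
    if run_id is not None:
        body["run_id"] = run_id  # scopes to a single session
    return body
```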
Where is my data stored?
In the EU (Helsinki): Postgres and pgvector, on our own infrastructure, on every tier. It is not replicated to the US. See Data & governance.
How do reads and writes count against my plan?
Writes (add, batch) count against your monthly memories quota. Reads (search, context, facts, get, users) count against your queries quota. Read quotas are an order of magnitude more generous because reads run no model — they are deterministic lookups.
Is there a free tier?
Yes — Hobby is €0 (1,000 memories and 25,000 queries a month, 2 agents). Developer (€19) and Team (€79) raise the limits. End users are unlimited on every tier. See Pricing.
Can a person see and edit what my agent remembered?
Yes. The same store is inspectable and editable by a human in the Korely app — correct a fact, or erase everything about a user with one call. See Human in the loop.
Do you train models on my data?
No. Your stored memories are never used to train models.
What happens when I hit my quota (429)?
Once your monthly write quota is reached, every write (add, batch) returns
429 with the standard error envelope — code is
quota_exceeded. The write-quota 429 carries no Retry-After header; the quota resets at the start of the next
calendar month. Read calls (search, context, facts, get, users) are never
blocked by a write quota: they have their own, larger queries quota. The
error body looks like:
{
  "code": "quota_exceeded",
  "message": "Monthly memory write limit reached (1000). Upgrade to add more."
}
A separate rate-limit 429 (too many requests per second) does carry
a Retry-After header in integer seconds — that is the only 429
that tells you when to retry. All Korely errors share one envelope:
{"code": "...", "message": "..."}. Upgrade your plan
to raise the write quota. Current monthly limits per plan (writes / queries):
Hobby 1,000 / 25,000, Developer 5,000 / 250,000, Team 20,000 / 1,000,000,
Scale 75,000 / 10,000,000. See Pricing.
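The two kinds of 429 call for different client behaviour, which can be sketched as one small decision function. The quota_exceeded code and the Retry-After header in integer seconds come from the text above. The function name retry_delay_seconds is hypothetical, and so is the rate_limited code used in the usage note, since this page does not name the rate-limit error code.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        body: Mapping[str, str]) -> Optional[int]:
    """Return how many seconds to wait before retrying, or None if
    retrying will not help (not a 429, or the monthly quota is used up)."""
    if status != 429:
        return None
    if body.get("code") == "quota_exceeded":
        # Monthly write quota: no Retry-After, resets next calendar month.
        return None
    # Header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0, int(value))
            except ValueError:
                return None
    return None
```

For example, retry_delay_seconds(429, {"Retry-After": "3"}, {"code": "rate_limited", "message": "..."}) returns 3, while the quota_exceeded body shown above returns None, signalling that the caller should stop writing or upgrade rather than retry.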
What is the agent_id cap and how does the 403 work?
Each plan allows a fixed number of distinct agent_id values
(Hobby: 2, Developer: 10, Team: 100, Scale: 500). The first time you pass a
new agent_id that would push you past that ceiling,
the request returns 403 agent_cap_exceeded — the memory is
not written. Existing agents keep working. To add a new
agent, either upgrade or delete an unused agent with
DELETE /v1/agents/{id}.
There is no cap on the number of end users (user_id) or
sessions (run_id) on any tier.
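The cap rule can be mirrored client-side as a pre-flight check, for instance to warn before a deploy introduces a new agent_id. The per-plan numbers come from the text above. The lowercase plan keys and the function name would_exceed_agent_cap are hypothetical, and the server's 403 agent_cap_exceeded remains the authoritative answer.

```python
from typing import Iterable

# Distinct agent_id values allowed per plan, as listed above.
# The lowercase keys are illustrative labels, not API values.
AGENT_CAPS = {"hobby": 2, "developer": 10, "team": 100, "scale": 500}


def would_exceed_agent_cap(plan: str, known_agent_ids: Iterable[str],
                           agent_id: str) -> bool:
    """Return True if writing with `agent_id` would introduce a new agent
    beyond the plan's cap. Existing agents never trip the check."""
    known = set(known_agent_ids)
    if agent_id in known:
        return False
    return len(known) + 1 > AGENT_CAPS[plan]
```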
Can I import a large history in bulk?
Yes — use the batch endpoint. Submit a batch of memories in a single request;
the server processes them asynchronously and returns a job id
(e.g. batch_4e1aa0) with its received count. Poll
GET /v1/batch/{id} until status is
done or failed; the result reports how many were
imported versus failed. Every fact the import
extracts is run through the same contradiction check, so a bulk load lands
as resolved, typed facts — not just dumped rows.
POST /v1/batch
{
  "memories": [
    { "content": "Alice prefers dark mode", "user_id": "customer-4812" },
    { "content": "Bob is on the Team plan", "user_id": "customer-7203" }
  ]
}
Response: { "id": "batch_4e1aa0", "status": "queued", "received": 2 }

GET /v1/batch/batch_4e1aa0
Response: {
  "id": "batch_4e1aa0",
  "status": "done",
  "received": 2,
  "imported": 2,
  "failed": 0,
  "errors": []
}
Batch writes count against the same monthly memories quota as individual
add() calls.
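The polling step described above can be sketched as a small loop that stops on either terminal status. The status values done and failed and the response fields come from the examples above. The function wait_for_batch, its parameters, and the fetch_status callable (which you would implement with your HTTP client against GET /v1/batch/{id}) are hypothetical, and the default interval and poll limit are arbitrary choices rather than recommended values.

```python
import time
from typing import Callable, Mapping

TERMINAL_STATUSES = ("done", "failed")


def wait_for_batch(fetch_status: Callable[[str], Mapping],
                   batch_id: str,
                   interval: float = 1.0,
                   max_polls: int = 60,
                   sleep: Callable[[float], None] = time.sleep) -> Mapping:
    """Poll a batch job until its status is terminal and return the job.

    `fetch_status` is supplied by the caller and should return the parsed
    JSON body of GET /v1/batch/{id}. Raises TimeoutError after `max_polls`
    non-terminal responses.
    """
    for attempt in range(max_polls):
        job = fetch_status(batch_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"batch {batch_id} not finished after {max_polls} polls")
```

Passing the fetch and sleep functions in keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.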
How fast are reads in practice?
Read endpoints run no LLM — they are deterministic lookups against Postgres and pgvector. Measured on the production Hetzner cluster (Helsinki):
GET /v1/context (fact-assembled recall, the primary path): p95 under 80 ms
GET /v1/facts (filtered, typed fact lookup): p95 25-35 ms
GET /v1/memories/{id}: p95 under 30 ms
POST /v1/memories/search (semantic vector search): p95 under 150 ms
POST /v1/memories (add) does more work, since it triggers entity extraction
and embedding, but that work happens asynchronously after the 200 is returned,
so your agent is never blocked.
Is there a self-hosted option?
Not at this time. Korely Agents is a managed cloud service, EU-hosted in Helsinki. The entity graph, contradiction detection, and bi-temporal fact store rely on server-side infrastructure (Postgres + pgvector + the extraction pipeline) that is not packaged for self-deployment. If EU data residency is the concern, all data stays in the EU on every tier — including Hobby. See Data & governance.