FAQs

The questions developers ask before they wire Korely in.

What is Korely Agents?

A managed cloud memory layer for AI agents. Your agent writes what it learns with add() and pulls a prompt-ready block back with get_context() — across sessions, scoped per end user. Reachable from the SDK, the CLI, or the REST API.

How is it different from a vector database?

A vector database stores embeddings and returns nearest neighbours. Korely stores the memory and mines it into typed (subject, predicate, object) facts, resolves contradictions over time, and links entities in a graph. A read returns the current truth, not just the closest text.

What is the difference between a memory and a fact?

A memory is the text you store ("Maria moved to the Pro plan"). Facts are the typed triples extracted from it (Maria — plan — Pro), each with a valid_from and an invalid_at, so the timeline of what was true stays queryable.
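To make the timeline idea concrete, here is a minimal Python sketch of how validity windows like these can be queried. The Fact class and the fact_as_of helper are hypothetical illustrations, not part of the Korely SDK; only the triple fields, valid_from and invalid_at come from the description above. The sketch assumes that an invalid_at of None means the fact is still true, and the dates and the earlier "Hobby" value are made-up sample data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    # Field names mirror the FAQ; the class itself is a hypothetical illustration.
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # assumption: None means "still true"


def fact_as_of(facts, subject, predicate, when):
    """Return the fact that was true for (subject, predicate) at `when`, or None."""
    for fact in facts:
        if fact.subject != subject or fact.predicate != predicate:
            continue
        started = fact.valid_from <= when
        not_ended = fact.invalid_at is None or when < fact.invalid_at
        if started and not_ended:
            return fact
    return None


# Made-up sample timeline: the older fact was closed when the newer one began.
timeline = [
    Fact("Maria", "plan", "Hobby", datetime(2024, 1, 1), datetime(2024, 6, 1)),
    Fact("Maria", "plan", "Pro", datetime(2024, 6, 1)),
]

print(fact_as_of(timeline, "Maria", "plan", datetime(2024, 3, 15)).object)
print(fact_as_of(timeline, "Maria", "plan", datetime(2024, 9, 1)).object)
```

Because a superseded fact is closed rather than deleted, asking about a past date still returns the value that held at that time.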

How do I keep one customer's data separate from another's?

Scope every call with user_id (your end user). agent_id scopes one of your apps and run_id a single session. The scope is enforced server-side on every query — cross-tenant reads are impossible by construction, not by prompt discipline.
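As a rough sketch of what scoping every call can look like over the REST API, the Python below builds (without sending) a write and a read for the same scope. Several details are assumptions rather than documented behaviour: the host api.korely.example is a placeholder, Bearer authentication is assumed, the "support-bot" agent name is invented, and passing the scope as query parameters on GET /v1/context is a guess. The endpoint paths and the content and user_id fields are the ones that appear elsewhere in this FAQ.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.korely.example"  # hypothetical placeholder host


def scope(user_id, agent_id=None, run_id=None):
    """Collect the scope fields, dropping the optional ones that are unset."""
    fields = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    return {key: value for key, value in fields.items() if value is not None}


def build_add_request(api_key, content, **scope_fields):
    """Build (but do not send) a POST /v1/memories request for one memory."""
    body = {"content": content, **scope(**scope_fields)}
    return urllib.request.Request(
        BASE_URL + "/v1/memories",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_context_request(api_key, **scope_fields):
    """Build (but do not send) a GET /v1/context request for the same scope."""
    # Assumption: the scope travels as query parameters on reads.
    query = urlencode(scope(**scope_fields))
    return urllib.request.Request(
        BASE_URL + "/v1/context?" + query,
        headers={"Authorization": "Bearer " + api_key},
        method="GET",
    )


request = build_add_request(
    "YOUR_API_KEY", "Maria moved to the Pro plan",
    user_id="customer-4812", agent_id="support-bot",
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Making user_id a required argument of the scope helper means an unscoped call fails in your own code before it ever reaches the network.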

Where is my data stored?

In the EU (Helsinki): Postgres and pgvector, on our own infrastructure, on every tier. It is not replicated to the US. See Data & governance.

How do reads and writes count against my plan?

Writes (add, batch) count against your monthly memories quota. Reads (search, context, facts, get, users) count against your queries quota. Read quotas are far larger (25 times the write quota on Hobby, and more on the paid plans) because reads run no model: they are deterministic lookups.

Is there a free tier?

Yes — Hobby is €0 (1,000 memories and 25,000 queries a month, 2 agents). Developer (€19) and Team (€79) raise the limits. End users are unlimited on every tier. See Pricing.

Can a person see and edit what my agent remembered?

Yes. The same store is inspectable and editable by a human in the Korely app — correct a fact, or erase everything about a user with one call. See Human in the loop.

Do you train models on my data?

No. Your stored memories are never used to train models.

What happens when I hit my quota (429)?

Once your monthly write quota is reached, every write (add, batch) returns 429 with the standard error envelope — code is quota_exceeded. The write-quota 429 carries no Retry-After header; the quota resets at the start of the next calendar month. Read calls (search, context, facts, get, users) are never blocked by a write quota: they have their own, larger queries quota. The error body looks like:

{
  "code": "quota_exceeded",
  "message": "Monthly memory write limit reached (1000). Upgrade to add more."
}

A separate rate-limit 429 (too many requests per second) does carry a Retry-After header, in integer seconds. It is the only 429 that tells you when to retry. All Korely errors share one envelope: {"code": "...", "message": "..."}.

To raise the write quota, upgrade your plan. Current monthly limits per plan (writes / queries): Hobby 1,000 / 25,000; Developer 5,000 / 250,000; Team 20,000 / 1,000,000; Scale 75,000 / 10,000,000. See Pricing.
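One way a client might tell the two kinds of 429 apart is sketched below in Python. The function name classify_429 and its return values are hypothetical, the one-second fallback is an assumption, and the "rate_limited" code in the second call is invented for the example because this FAQ does not name the rate-limit error code. Only quota_exceeded and the Retry-After behaviour come from the text above.

```python
def classify_429(headers, body):
    """Decide how to react to a 429 from its headers and parsed error envelope.

    Returns ("retry", seconds) for a rate limit, or ("stop", None) when the
    monthly write quota is exhausted and retrying cannot help.
    """
    if body.get("code") == "quota_exceeded":
        return ("stop", None)
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None and retry_after.strip().isdigit():
        return ("retry", int(retry_after))
    # Assumption: fall back to a short fixed wait if the header is missing.
    return ("retry", 1)


quota_body = {
    "code": "quota_exceeded",
    "message": "Monthly memory write limit reached (1000). Upgrade to add more.",
}
print(classify_429({}, quota_body))

# "rate_limited" is a made-up code value used only for this illustration.
print(classify_429({"Retry-After": "3"}, {"code": "rate_limited", "message": "..."}))
```

Checking the code field first keeps a quota error from being retried in a loop that cannot succeed until the quota resets or the plan changes.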

What is the agent_id cap and how does the 403 work?

Each plan allows a fixed number of distinct agent_id values (Hobby: 2, Developer: 10, Team: 100, Scale: 500). The first time you pass a new agent_id that would push you past that ceiling, the request returns 403 agent_cap_exceeded — the memory is not written. Existing agents keep working. To add a new agent, either upgrade or delete an unused agent with DELETE /v1/agents/{id}. There is no cap on the number of end users (user_id) or sessions (run_id) on any tier.
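If you track your own agent_id values, a client-side pre-check can flag a write that would hit the cap before it is sent. The Python below is a sketch under that assumption: the helper name and the idea of keeping a local set of known agents are hypothetical, and the server remains the authority. The per-plan caps are the ones listed above.

```python
# Caps per plan, as listed in this FAQ.
AGENT_CAPS = {"Hobby": 2, "Developer": 10, "Team": 100, "Scale": 500}


def would_exceed_agent_cap(plan, existing_agent_ids, agent_id):
    """Return True if a first write with `agent_id` would push past the plan cap."""
    known = set(existing_agent_ids)
    if agent_id in known:
        return False  # existing agents keep working, even at the cap
    return len(known) + 1 > AGENT_CAPS[plan]


print(would_exceed_agent_cap("Hobby", {"support-bot", "sales-bot"}, "billing-bot"))
print(would_exceed_agent_cap("Hobby", {"support-bot", "sales-bot"}, "support-bot"))
```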

Can I import a large history in bulk?

Yes — use the batch endpoint. Submit a batch of memories in a single request; the server processes them asynchronously and returns a job id (e.g. batch_4e1aa0) with its received count. Poll GET /v1/batch/{id} until status is done or failed; the result reports how many were imported versus failed. Every fact the import extracts is run through the same contradiction check, so a bulk load lands as resolved, typed facts — not just dumped rows.

POST /v1/batch
{
  "memories": [
    { "content": "Alice prefers dark mode", "user_id": "customer-4812" },
    { "content": "Bob is on the Team plan", "user_id": "customer-7203" }
  ]
}

Response: { "id": "batch_4e1aa0", "status": "queued", "received": 2 }

GET /v1/batch/batch_4e1aa0
Response: {
  "id": "batch_4e1aa0",
  "status": "done",
  "received": 2,
  "imported": 2,
  "failed": 0,
  "errors": []
}

Batch writes count against the same monthly writes quota as individual add() calls.
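The polling step can be sketched in Python as follows. This is an illustration, not SDK code: wait_for_batch, its interval and timeout defaults, and the injected fetch_status function are all hypothetical. fetch_status stands for whatever performs GET /v1/batch/{id} in your client and returns the parsed JSON; here it is replaced by canned replies copied from the example above so the sketch runs offline.

```python
import time


def wait_for_batch(fetch_status, batch_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll a batch job until its status is "done" or "failed".

    `fetch_status` is a caller-supplied function that performs
    GET /v1/batch/{id} and returns the parsed JSON body as a dict.
    """
    waited = 0.0
    while True:
        job = fetch_status(batch_id)
        if job["status"] in ("done", "failed"):
            return job
        if waited >= timeout:
            raise TimeoutError(f"batch {batch_id} still {job['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Stand-in for the real HTTP call so the sketch runs without a network.
replies = iter([
    {"id": "batch_4e1aa0", "status": "queued", "received": 2},
    {"id": "batch_4e1aa0", "status": "done", "received": 2,
     "imported": 2, "failed": 0, "errors": []},
])
result = wait_for_batch(lambda _id: next(replies), "batch_4e1aa0", sleep=lambda _s: None)
print(result["imported"], "imported,", result["failed"], "failed")
```

Injecting the fetch and sleep functions keeps the loop testable and lets you swap in your own HTTP client, and the timeout guards against a job that never reaches a final status.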

How fast are reads in practice?

Read endpoints run no LLM — they are deterministic lookups against Postgres and pgvector. Measured on the production Hetzner cluster (Helsinki):

  • GET /v1/context (fact-assembled recall — the primary path) — p95 under 80 ms
  • GET /v1/facts (filtered, typed fact lookup) — p95 25-35 ms
  • GET /v1/memories/{id} — p95 under 30 ms
  • POST /v1/memories/search (semantic vector search) — p95 under 150 ms

POST /v1/memories (add) does more work, since it triggers entity extraction and embedding, but that processing runs asynchronously after the 200 is returned, so your agent is never blocked.

Is there a self-hosted option?

Not at this time. Korely Agents is a managed cloud service, EU-hosted in Helsinki. The entity graph, contradiction detection, and bi-temporal fact store rely on server-side infrastructure (Postgres + pgvector + the extraction pipeline) that is not packaged for self-deployment. If EU data residency is the concern, all data stays in the EU on every tier — including Hobby. See Data & governance.