Korely

Context & facts

Get context

Assemble a prompt-ready Markdown block, the end user's known facts plus query-relevant memories, under a token budget. Deterministic retrieval and formatting only; no LLM generation.

GET /v1/context

SDK: korely.get_context(query, ...). This is the one call you drop into a prompt. Korely cosine-ranks the end user's typed facts, searches the query-relevant memories, packs both under your token budget, and hands back a Markdown string you can paste straight into the model's context. Nothing is generated, the assembly is deterministic.

Authentication

HTTP header, required: Authorization: Bearer kor_live_.... The key must carry the memories:read scope.

Query parameters

ParameterTypeRequiredDescription
querystringRequiredThe query the context block is retrieved and ranked against, facts are cosine-ranked and memories are searched against it. min_length=1, max_length=2000.
user_idstringOptionalThe end user to build context for (maps to end_user_id). When omitted, facts are not filtered by end user and the memory search runs across all of the API key owner's end users. Default null.
agent_idstringOptionalOptional agent scope filter applied to facts (AgentFact.agent_id). Default null.
token_budgetintegerOptionalTotal token budget for the assembled block. Facts get roughly 50% of the budget; memories fill the rest. ge=50, le=8000. Default 800.

Example request

Terminal window
curl -G https://api.korely.ai/v1/context \
-H "Authorization: Bearer kor_live_..." \
--data-urlencode "query=What does Giulia want for the weekly sync?" \
--data-urlencode "user_id=customer-giulia-4812" \
--data-urlencode "agent_id=support-bot" \
--data-urlencode "token_budget=800"

Response

200 OK. A prompt-ready Markdown block, its estimated token count, and the ordered source ids that contributed to it.

{
"context": "_The facts below are a compact profile of the user; the memories are the verbatim source. If a fact does not directly answer the question, rely on the memories._
## Known facts
- Giulia prefers async standups (since 2026-05-12)
- Giulia works_at Acme Corp (since 2026-04-01)
## Relevant memories
- Giulia asked to move the weekly sync to Mondays and keep it under 30 minutes.",
"tokens": 91,
"sources": ["fct_a1", "fct_a2", "mem_8f2c1a"]
}
FieldTypeDescription
contextstringThe Markdown block: an optional reader-trust note, then ## Known facts (top facts, cosine-ranked, capped at 10 or 50% of the budget), then ## Relevant memories. Empty string when nothing fits.
tokensintegerEstimated token count of the context string (a char / 4 heuristic).
sourcesarray<string>Ordered list of the source public ids that contributed to the block: fact ids (fct_) first, then memory ids (mem_).

Errors

StatusCodeCause
401invalid_keyThe Authorization: Bearer credentials are missing, or the kor_live_ key does not resolve to a live API key. Response carries WWW-Authenticate: Bearer.
403forbiddenThe API key does not carry the memories:read scope.
429rate_limit_exceededPer-tier fixed-window minute/hour/day rate limit exceeded. Response carries Retry-After, X-RateLimit-Limit, and X-RateLimit-Remaining headers.
429quota_exceededMonthly query quota reached (tier queries_per_month plus a 10% grace). Upgrade for more.
422invalid_requestRequest validation failed, query is missing or violates min_length=1/max_length=2000, or token_budget falls outside ge=50/le=8000.

Notes

  • Read-only. A plain GET, no soft-delete or pagination semantics on this endpoint. It is deterministic retrieval and formatting only; there is no LLM generation.
  • Budget split. Facts are capped at 50% of token_budget and at the top 10 by cosine relevance. Memories (the verbatim source) fill the remainder; the top memory may be truncated at a sentence boundary to fit.
  • Budget is clamped. token_budget is clamped server-side to [50, 8000] even though the query parameter already enforces the same bounds.
  • Reader-trust note. The leading guidance note is prepended only when facts exist and the budget is at least 150 tokens.
  • Graceful degradation. When embeddings are unavailable, fact ranking falls back to most-recent-active. If the memory search is unavailable, the block degrades to facts-only rather than erroring.
  • End-user scope. user_id is passed as the end-user scope (end_user_id); omitting it builds context across all of the API key owner's end users.
  • Counts as a read. Every successful call records usage and counts against your monthly read quota.

Related