Context & facts
Get context
Assemble a prompt-ready Markdown block, the end user's known facts plus query-relevant memories, under a token budget. Deterministic retrieval and formatting only; no LLM generation.
/v1/context
SDK: korely.get_context(query, ...). This is the one call you
drop into a prompt. Korely cosine-ranks the end user's typed facts, searches
the query-relevant memories, packs both under your token budget, and hands
back a Markdown string you can paste straight into the model's context.
Nothing is generated, the assembly is deterministic.
Authentication
HTTP header, required: Authorization: Bearer kor_live_.... The key must carry the memories:read scope.
Query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
query | string | Required | The query the context block is retrieved and ranked against, facts are cosine-ranked and memories are searched against it. min_length=1, max_length=2000. |
user_id | string | Optional | The end user to build context for (maps to end_user_id). When omitted, facts are not filtered by end user and the memory search runs across all of the API key owner's end users. Default null. |
agent_id | string | Optional | Optional agent scope filter applied to facts (AgentFact.agent_id). Default null. |
token_budget | integer | Optional | Total token budget for the assembled block. Facts get roughly 50% of the budget; memories fill the rest. ge=50, le=8000. Default 800. |
Example request
curl -G https://api.korely.ai/v1/context \ -H "Authorization: Bearer kor_live_..." \ --data-urlencode "query=What does Giulia want for the weekly sync?" \ --data-urlencode "user_id=customer-giulia-4812" \ --data-urlencode "agent_id=support-bot" \ --data-urlencode "token_budget=800"Response
200 OK. A prompt-ready Markdown block, its estimated token
count, and the ordered source ids that contributed to it.
{ "context": "_The facts below are a compact profile of the user; the memories are the verbatim source. If a fact does not directly answer the question, rely on the memories._
## Known facts- Giulia prefers async standups (since 2026-05-12)- Giulia works_at Acme Corp (since 2026-04-01)
## Relevant memories- Giulia asked to move the weekly sync to Mondays and keep it under 30 minutes.", "tokens": 91, "sources": ["fct_a1", "fct_a2", "mem_8f2c1a"]}| Field | Type | Description |
|---|---|---|
context | string | The Markdown block: an optional reader-trust note, then ## Known facts (top facts, cosine-ranked, capped at 10 or 50% of the budget), then ## Relevant memories. Empty string when nothing fits. |
tokens | integer | Estimated token count of the context string (a char / 4 heuristic). |
sources | array<string> | Ordered list of the source public ids that contributed to the block: fact ids (fct_) first, then memory ids (mem_). |
Errors
| Status | Code | Cause |
|---|---|---|
401 | invalid_key | The Authorization: Bearer credentials are missing, or the kor_live_ key does not resolve to a live API key. Response carries WWW-Authenticate: Bearer. |
403 | forbidden | The API key does not carry the memories:read scope. |
429 | rate_limit_exceeded | Per-tier fixed-window minute/hour/day rate limit exceeded. Response carries Retry-After, X-RateLimit-Limit, and X-RateLimit-Remaining headers. |
429 | quota_exceeded | Monthly query quota reached (tier queries_per_month plus a 10% grace). Upgrade for more. |
422 | invalid_request | Request validation failed, query is missing or violates min_length=1/max_length=2000, or token_budget falls outside ge=50/le=8000. |
Notes
- Read-only. A plain GET, no soft-delete or pagination semantics on this endpoint. It is deterministic retrieval and formatting only; there is no LLM generation.
- Budget split. Facts are capped at 50% of
token_budgetand at the top 10 by cosine relevance. Memories (the verbatim source) fill the remainder; the top memory may be truncated at a sentence boundary to fit. - Budget is clamped.
token_budgetis clamped server-side to[50, 8000]even though the query parameter already enforces the same bounds. - Reader-trust note. The leading guidance note is prepended only when facts exist and the budget is at least 150 tokens.
- Graceful degradation. When embeddings are unavailable, fact ranking falls back to most-recent-active. If the memory search is unavailable, the block degrades to facts-only rather than erroring.
- End-user scope.
user_idis passed as the end-user scope (end_user_id); omitting it builds context across all of the API key owner's end users. - Counts as a read. Every successful call records usage and counts against your monthly read quota.
Related
- Get context, guide, the narrative walkthrough with context.
- Get facts, the raw typed facts for an end user.
- Search memories, the underlying memory search.
- Add a memory, write the memories this block is built from.