Core operations
Get context
Retrieve a single, prompt-ready context block assembled from the most relevant memories and current facts — one call, right before your LLM generates.
get_context is the read call an agent makes immediately before sending a prompt to
an LLM. It searches across all stored memories and current facts for the given query,
assembles them into a single formatted string that fits inside your token budget, and
returns the source ids so you can surface citations. The context string is ready to drop
directly into a system prompt or user message — no post-processing required.
get_context is the primary recall path, and the one place Korely's moat shows up in a
single call. It does not just rank raw text: it assembles the user's currently-valid typed
facts — (subject, predicate, object) triples with bi-temporal validity, so superseded
facts are excluded and only what holds right now is included — alongside the most relevant memories. The
returned sources array mixes fact ids (prefixed fct_) and memory ids (prefixed mem_),
so you can see exactly which records went into the block.
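To make "currently valid" concrete, here is a minimal sketch of filtering typed facts by a validity interval. The Fact class and its field names are hypothetical and are not Korely's actual schema. It models only the valid-time axis; the second, record-time axis of a bi-temporal model is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Fact:
    # Hypothetical shape for illustration; not Korely's documented schema.
    id: str
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the fact still holds


def currently_valid(facts, now):
    """Keep facts whose interval [valid_from, valid_to) contains `now`."""
    return [
        f for f in facts
        if f.valid_from <= now and (f.valid_to is None or now < f.valid_to)
    ]


facts = [
    # Superseded fact: its validity ended when the newer one began.
    Fact("fct_old", "user", "diet", "omnivore",
         datetime(2024, 1, 1, tzinfo=timezone.utc),
         datetime(2025, 6, 1, tzinfo=timezone.utc)),
    Fact("fct_new", "user", "diet", "vegetarian",
         datetime(2025, 6, 1, tzinfo=timezone.utc)),
]
now = datetime(2025, 12, 1, tzinfo=timezone.utc)
valid_ids = [f.id for f in currently_valid(facts, now)]
```

Only the fact that holds at `now` survives the filter, which is the behavior the paragraph above describes for superseded facts.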
Unlike search, which runs semantic vector retrieval and
returns a list of raw memory hits, get_context ranks, deduplicates, and formats the most
relevant material — facts first — into a cohesive block. Use it when you want memory with zero glue code.
Use search when you need the raw results to apply your own ranking or filtering logic.
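The rank, deduplicate, and format steps can be pictured as greedy packing under a token budget. The sketch below is an illustration only: Korely's real ranking and deduplication logic is not documented here, and the function names, the tuple layout, and the exact-match dedup rule are all assumptions. The 4-characters-per-token estimate is the one quoted in the Notes on this page.

```python
def estimate_tokens(text: str) -> int:
    # Rough 4-characters-per-token estimate, rounded up.
    return (len(text) + 3) // 4


def pack_context(candidates, token_budget=800):
    """Greedy packing sketch.

    candidates: list of (id, text, score) tuples (hypothetical layout).
    Facts (ids starting with "fct_") are placed first, then memories,
    each group ordered by descending score.
    """
    ordered = sorted(
        candidates,
        key=lambda c: (not c[0].startswith("fct_"), -c[2]),
    )
    seen, lines, sources, used = set(), [], [], 0
    for cid, text, _score in ordered:
        key = text.strip().lower()
        if key in seen:
            continue  # drop duplicates (exact match after trim + lowercase)
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            continue  # skip items that do not fit; smaller ones still may
        seen.add(key)
        lines.append(text)
        sources.append(cid)
        used += cost
    # `used` is approximate: it ignores the newline separators.
    return {"context": "\n".join(lines), "tokens": used, "sources": sources}


result = pack_context(
    [
        ("mem_1", "Prefers high-protein meals.", 0.71),
        ("fct_1", "User is vegetarian.", 0.64),
        ("mem_2", "prefers high-protein meals.", 0.60),  # duplicate of mem_1
        ("mem_3", "x" * 400, 0.90),                      # too large for the budget
    ],
    token_budget=20,
)
```

With search you would receive the raw hits and write this kind of glue yourself; get_context returns the already-assembled block.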
flowchart LR
A([Agent]) -->|"GET /v1/context?query=..."| B[Korely API]
B --> C[(Memories)]
B --> D[(Facts)]
C --> E[Rank + dedup]
D --> E
E --> F[Format to token budget]
F -->|"{ context, tokens, sources }"| A
A -->|"system prompt + context"| G([LLM])

Request
Endpoint: GET /v1/context. SDK: korely.get_context(*, query, user_id=None, agent_id=None, token_budget=800).
| Param | Type | Notes |
|---|---|---|
| query | string | Required. The text to retrieve relevant context for. Typically the user's latest message or the topic your agent is about to reason over. |
| user_id | string | Optional. Scopes retrieval to memories belonging to a specific end-user. Pass your app's user identifier (e.g. customer-4812). If omitted, retrieval is scoped to the agent namespace only. |
| agent_id | string | Optional. Scopes retrieval to a specific agent within the namespace. Useful when multiple agent roles share the same API key. |
| token_budget | integer | Optional. Maximum approximate token count for the returned context string. Default: 800. Korely packs the highest-scoring content that fits within this budget. Lower values keep context tight for small models; raise it for more coverage. |
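If you call the endpoint without the SDK, the parameters in the table map onto a query string. The following is a minimal sketch that only builds the request and does not send it. The host api.korely.example is a placeholder, and the Bearer scheme for the Authorization header is an assumption; check the API reference for the real base URL and auth format.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.korely.example"  # hypothetical placeholder host


def build_context_request(api_key, query, user_id=None, agent_id=None,
                          token_budget=800):
    """Build (but do not send) a GET /v1/context request."""
    if not query:
        raise ValueError("query is required")
    params = {"query": query, "token_budget": int(token_budget)}
    # Optional scopes are omitted entirely when not provided.
    if user_id is not None:
        params["user_id"] = user_id
    if agent_id is not None:
        params["agent_id"] = agent_id
    url = f"{BASE_URL}/v1/context?{urlencode(params)}"
    # Assumption: the key is sent as a bearer token.
    headers = {"Authorization": f"Bearer {api_key}"}
    return Request(url, headers=headers, method="GET")


req = build_context_request(
    "kor_live_...",
    "dietary preferences",
    user_id="customer-4812",
    token_budget=600,
)
```

Passing the result to urllib.request.urlopen would perform the call; urlencode handles escaping of spaces and special characters in query.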
Example
from korely_memory import Korely

korely = Korely(api_key="kor_live_...")

# Call right before you build the LLM prompt
result = korely.get_context(
    query="What are this user's dietary preferences?",
    user_id="customer-4812",
    token_budget=600,
)

system_prompt = f"""You are a helpful nutrition assistant.
Relevant context about this user:
{result.context}
Answer using the context above when relevant."""

# result.tokens → 312
# result.sources → ["mem_8f2c1a", "fct_b91e", "mem_3d0f44"]

Response
{
  "context": "User is vegetarian and avoids gluten. Allergic to tree nuts (flagged 2025-11). Prefers high-protein meals. Recent sessions indicate interest in Mediterranean cuisine. Goals: maintain weight, increase energy.",
  "tokens": 312,
  "sources": [
    "mem_8f2c1a",
    "fct_b91e",
    "mem_3d0f44"
  ]
}

| Field | Type | Description |
|---|---|---|
| context | string | A formatted, prompt-ready block of text containing the most relevant memories and facts. Ready to embed directly into a system or user message. |
| tokens | integer | Approximate token count of the returned context string. Always at or below the requested token_budget. |
| sources | string[] | Ids of the memories and facts included in the context block. Use these to power citations or to retrieve full objects via search or GET /v1/memories/{id}. |
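Because sources mixes fact and memory ids, a common first step is to split them by prefix before rendering citations. A minimal sketch, using an abbreviated copy of the response above; split_sources and citation_note are illustrative names, not part of the SDK.

```python
import json

# Abbreviated copy of the example response body shown above.
raw = """{
  "context": "User is vegetarian and avoids gluten.",
  "tokens": 312,
  "sources": ["mem_8f2c1a", "fct_b91e", "mem_3d0f44"]
}"""


def split_sources(sources):
    """Separate fact ids (fct_) from memory ids (mem_), keeping order."""
    facts = [s for s in sources if s.startswith("fct_")]
    memories = [s for s in sources if s.startswith("mem_")]
    return facts, memories


body = json.loads(raw)
fact_ids, memory_ids = split_sources(body["sources"])

# A simple citation footer an agent could append to its answer.
citation_note = "Sources: " + ", ".join(body["sources"])
```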
Errors
| Status | Code string | When it happens |
|---|---|---|
| 401 | invalid_key | The Authorization header is missing, malformed, or the key has been revoked. Re-authenticate and retry. |
| 403 | agent_cap_exceeded | The agent_id you passed is new and your plan has reached its agent-namespace limit (2 on Hobby, 10 on Developer, 100 on Team, 500 on Scale). Use an existing agent_id or upgrade your plan. |
| 422 | invalid_request | Request validation failed. Typically a missing or wrong-type query value, or a non-integer token_budget. Like every Korely error, the body is the flat envelope {"code": "invalid_request", "message": "query: Field required"}. |
| 429 | quota_exceeded | Your monthly query quota (including the +10% grace) is exhausted. Upgrade your plan or wait for the reset. The monthly-quota 429 does not carry a Retry-After header; only a transient per-minute rate-limit 429 would, as integer seconds. |
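One way to act on these errors is to map the status, the code field of the flat envelope, and the optional Retry-After header to a next step. This is a sketch under assumptions: the action labels are invented for illustration, and how your HTTP client exposes status, body, and headers will differ.

```python
def classify_error(status, body, headers=None):
    """Return (action, wait_seconds) for a failed get_context call.

    `body` is the parsed flat error envelope, e.g.
    {"code": "invalid_request", "message": "query: Field required"}.
    The action strings are hypothetical labels for this sketch.
    """
    headers = headers or {}
    code = body.get("code")
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            # Transient rate limit: header carries integer seconds.
            return ("retry", int(retry_after))
        # Monthly quota exhausted: no header, retrying will not help.
        return ("stop_quota", None)
    if status == 401 and code == "invalid_key":
        return ("reauthenticate", None)
    if status == 403 and code == "agent_cap_exceeded":
        return ("change_agent_or_upgrade", None)
    if status == 422:
        return ("fix_request", None)
    return ("unexpected", None)


quota = classify_error(429, {"code": "quota_exceeded", "message": "..."})
transient = classify_error(429, {"code": "quota_exceeded"}, {"Retry-After": "7"})
```

The key design point is that the presence of Retry-After, not the status code alone, decides whether a 429 is worth retrying.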
Notes
- Idempotency. GET /v1/context is a pure read: calling it multiple times with the same parameters produces the same result (assuming no new memories have been written). It is safe to retry on transient network errors without risk of duplicate writes.
- Scoping. Pass user_id in customer-facing agents to restrict retrieval to one end user. A call without user_id draws from all end users in the namespace, which is correct for an internal ops agent and wrong for a per-customer chat. agent_id adds a second, narrowing scope on top of user_id; it does not broaden.
- Token budget. The budget is an approximation based on a 4-character-per-token estimate. The actual token count your LLM sees may differ by a few percent depending on the model's tokenizer. Leave a 10-20% buffer below your model's hard context limit.
- Rate limiting. get_context counts against your monthly query quota, the same pool as search. Hobby plans include 25k queries per month; Developer 250k; Team 1M; Scale 10M. There is no per-minute rate limit on reads, but hitting the monthly cap returns 429 quota_exceeded.
- Empty context. If no memories or facts are stored for the given scope, context is an empty string, tokens is 0, and sources is an empty array. This is a valid 200 response, not an error. Agents should handle the empty-context case gracefully by generating a neutral reply rather than surfacing an error to the user.
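The token-budget and empty-context notes can be combined into two small helpers. This is a sketch, not SDK functionality: both function names are hypothetical, and the 15% default is simply a midpoint of the 10-20% buffer suggested above.

```python
def safe_token_budget(model_context_limit, prompt_tokens, reply_tokens,
                      buffer_pct=15):
    """Pick a token_budget that leaves headroom below the model's limit.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    usable = model_context_limit * (100 - buffer_pct) // 100
    return max(0, usable - prompt_tokens - reply_tokens)


def build_system_prompt(base_prompt, context):
    """Fall back to the bare prompt when get_context returns nothing."""
    if not context:
        # Empty context is a valid 200 response, so no error is raised.
        return base_prompt
    return f"{base_prompt}\n\nRelevant context about this user:\n{context}"


budget = safe_token_budget(8192, prompt_tokens=500, reply_tokens=1000)
empty_case = build_system_prompt("You are a helpful nutrition assistant.", "")
filled_case = build_system_prompt("You are a helpful nutrition assistant.",
                                  "User is vegetarian.")
```

In the empty case the agent proceeds with a neutral prompt instead of surfacing an error to the user.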
Related
- Search memories — returns raw ranked hits when you need to apply your own ranking or filtering before building a prompt.
- Add a memory — write new content and run the full extraction pipeline so future context calls include it.
- Delete a memory — soft-delete a memory so it is excluded from future context blocks.
- API reference — complete endpoint contract with all parameters and response shapes.