OpenAI Agents SDK
The OpenAI Agents SDK is a small Python framework for building agentic apps
around three pieces: Agent, Runner, and function tools. A function tool
is just a Python function with the @function_tool decorator: the function
name becomes the tool name, the docstring becomes the description, and
your type hints become the JSON schema. Korely plugs in as the memory layer:
write three tools that call the korely-memory SDK, attach them
to an Agent, and the model recalls, searches and saves on its
own.
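To make the name, docstring and type-hint mapping concrete, here is a rough standard-library sketch of how a tool description could be derived from a plain function. The describe_tool helper, the type table and the output shape are hypothetical illustrations; this is not the Agents SDK's actual implementation or its exact schema format.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a tool description from a function's name, docstring and hints."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not part of the input schema
    properties = {
        name: {"type": _JSON_TYPES.get(tp, "string")} for name, tp in hints.items()
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            # Simplification: every annotated parameter is treated as required.
            "required": list(properties),
        },
    }


def recall_memory(query: str) -> str:
    """Recall settled facts about the user."""
    return ""


spec = describe_tool(recall_memory)
print(spec["name"], spec["parameters"]["properties"])
```

The practical takeaway is the same as in the text: the function name, the docstring and the annotations are all the model gets to see, so they deserve the same care as a prompt.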
The moat: facts, not just snippets
Most memory layers hand back the chunks that matched a vector search. Korely
does that too, but its read path leads with something different.
get_context assembles the user's active typed facts into a single
ready-to-prompt block. These facts are the bi-temporal
(subject, predicate, object) triples that Korely extracts from everything
you save and keeps current through two-stage contradiction resolution. When
the user upgrades from Free to Pro, the old fact is not deleted; it is
invalidated and the new one supersedes it, with both timestamps preserved.
So the recall tool below returns a settled view ("Luca is on Pro", "prefers
async standups") instead of forcing the model to re-read raw snippets every
turn. That assembled block, not a pile of rows, is what makes an Agents SDK
agent feel like it actually knows the user.
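The invalidate-and-supersede behavior described above can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: the Fact and FactStore names are hypothetical, each (subject, predicate) pair is treated as single-valued, and a plain inequality check stands in for Korely's contradiction resolution, whose internals are not described here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    invalidated_at: Optional[datetime] = None  # None means still active


@dataclass
class FactStore:
    facts: list[Fact] = field(default_factory=list)

    def assert_fact(self, subject: str, predicate: str, obj: str) -> Fact:
        """Add a fact, invalidating any active fact it contradicts."""
        now = datetime.now(timezone.utc)
        for fact in self.facts:
            same_slot = (fact.subject, fact.predicate) == (subject, predicate)
            if not same_slot or fact.invalidated_at is not None:
                continue
            if fact.obj == obj:
                return fact  # already known and active, nothing to change
            fact.invalidated_at = now  # keep the row, just close it
        new_fact = Fact(subject, predicate, obj, valid_from=now)
        self.facts.append(new_fact)
        return new_fact

    def active(self) -> list[Fact]:
        return [f for f in self.facts if f.invalidated_at is None]

    def context_block(self) -> str:
        """Render only the active facts as a prompt-ready block."""
        return "\n".join(f"- {f.subject} {f.predicate} {f.obj}" for f in self.active())


store = FactStore()
store.assert_fact("Luca", "is on plan", "Free")
store.assert_fact("Luca", "is on plan", "Pro")
print(store.context_block())
```

Both rows stay in store.facts, so the history remains queryable, while the rendered block carries only the fact that is currently true.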
Install
You need the OpenAI Agents SDK and the Korely Python package. Note that the
Agents SDK's install name and import name differ: you
pip install openai-agents but import from
agents. Install both packages:
pip install openai-agents korely-memory
Python 3.10 or later is required for the Agents SDK. You need two keys: your OpenAI key (the model runs on OpenAI) and your Korely key (copy it from Settings → API Keys in the Korely app). Export both:
export OPENAI_API_KEY="sk-..."
export KORELY_API_KEY="kor_live_..."
Define the memory tools
Three tools cover the loop: recall (the moat: fact-assembled context,
so reach for this first), save (write something worth remembering),
and search (find the one memory that mentioned a thing). Each is a
plain function decorated with @function_tool; the docstring is what
the model reads to decide when to call it, so write it for the model.
import os

from agents import function_tool
from korely_memory import Korely

korely = Korely(api_key=os.environ["KORELY_API_KEY"], region="eu")


@function_tool
def recall_memory(query: str) -> str:
    """Recall settled facts and relevant memories about the user.

    Reach for this first, before answering anything personal. It returns an
    assembled block of the user's active typed facts plus relevant memories.

    Args:
        query: What you want to remember about, e.g. "preferred language".
    """
    ctx = korely.get_context(query=query, token_budget=800)
    return ctx.context


@function_tool
def save_memory(content: str) -> str:
    """Save something worth remembering for later turns.

    Args:
        content: A short, self-contained statement to remember,
            e.g. "User switched their main language to Rust".
    """
    memory = korely.add(content)
    return f"Saved (id={memory.id})."


@function_tool
def search_memory(query: str) -> str:
    """Find the exact memory that mentioned something.

    Use this when you need a specific past detail rather than a settled fact.

    Args:
        query: A keyword-style query, 1 to 5 words.
    """
    hits = korely.search(query, limit=5)
    if not hits:
        return "No matching memories."
    return "\n".join(f"- ({h.score:.2f}) {h.snippet}" for h in hits)

Why recall_memory wins. search_memory returns matching snippets, useful for "find the
memory that mentioned the Lisbon hotel". recall_memory calls
get_context, which returns the user's current typed
facts already resolved for contradictions and assembled into a prompt
block. Give the model both and it will reach for context when it just needs
to know the user, and for search when it needs one exact past
detail.
Attach them to an agent
Pass the decorated functions in the tools=[...] list on
Agent. The instructions string is the system
prompt; that is where you nudge the model to consult memory before
answering. Then run it with Runner.run_sync for a one-shot
script, or await Runner.run(...) inside an async function. The
answer is on result.final_output.
from agents import Agent, Runner

from memory_tools import recall_memory, save_memory, search_memory

agent = Agent(
    name="Assistant",
    instructions=(
        "You are a helpful assistant with long-term memory. "
        "Call recall_memory before answering anything about the user. "
        "When the user tells you something durable, call save_memory."
    ),
    tools=[recall_memory, save_memory, search_memory],
)

# --- Option A: synchronous (no event loop needed) ---
result = Runner.run_sync(
    agent,
    "Remember that I switched my main language to Rust. "
    "Then tell me what language I prefer.",
)
print(result.final_output)
Inside a server, a notebook, or anywhere there is already an event loop, use
the async runner instead of run_sync:
import asyncio

from agents import Agent, Runner

from memory_tools import recall_memory, save_memory, search_memory

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant with long-term memory.",
    tools=[recall_memory, save_memory, search_memory],
)


async def main() -> None:
    result = await Runner.run(
        agent,
        input="What language do I prefer, and why did I switch?",
    )
    print(result.final_output)


asyncio.run(main())

run_sync vs an active event loop. Runner.run_sync is a convenience wrapper that drives the async
runner for you; it is perfect for a plain script. Do not call it from
inside a running event loop (a Jupyter cell, an async web handler); there,
use await Runner.run(agent, input=...) instead. The first
positional argument is always the agent; the user message is the second
positional argument, or pass it as input=....
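If the same module may be imported from both a plain script and an async server, it can help to check which situation applies before picking a runner. The sketch below uses only the standard library; loop_is_running is a hypothetical helper name, not part of the Agents SDK.

```python
import asyncio


def loop_is_running() -> bool:
    """Return True when called from inside a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises when no loop is running in this thread.
        return False
    return True


async def inside_loop() -> bool:
    # Called from a coroutine, so a loop is necessarily running here.
    return loop_is_running()


print(loop_is_running())           # checked from a plain script
print(asyncio.run(inside_loop()))  # checked from within a coroutine
```

When the check is true, the async path (await Runner.run) is the one to take; when it is false, the synchronous wrapper is safe.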
What comes back
Korely's read path is pure retrieval: get_context assembles
facts and search runs semantic vector search. No generative
model runs on Korely's side; your OpenAI model does all the reasoning over
the block Korely returns. That is also why read quotas are far more generous
than write quotas: reads are cheap on our side.
$ python agent.py
→ tool call save_memory(content="User switched their main language to Rust")
← Saved (id=mem_...).
→ tool call recall_memory(query="preferred language")
← ## Known facts
- dana likes Rust (since 2026-06-16)
the superseded TypeScript fact is excluded automatically
Agent › You prefer Rust now. You told me you switched your main
language to it, and I've recorded that as your current preference —
the earlier TypeScript note is kept on file but no longer active.
Fact extraction is asynchronous. korely.add(...)
returns as soon as the raw memory is stored, but the typed facts it
produces are extracted a few seconds later, server-side. So a
save_memory call followed immediately by a
recall_memory in the same turn may not yet reflect
the just-saved fact. That is fine for a chat agent: the fact is settled
and contradiction-checked by the next turn. Do not poll
get_context in a tight loop expecting the fact to appear
instantly; let it land between turns.
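The timing described above can be illustrated with a toy model of a store whose extraction lags behind writes. ToyMemory and its settle method are hypothetical stand-ins invented for this sketch; they do not mirror the Korely client or its server-side pipeline.

```python
class ToyMemory:
    """Toy stand-in for a store whose fact extraction lags behind writes."""

    def __init__(self) -> None:
        self.raw: list[str] = []      # stored immediately on add()
        self.pending: list[str] = []  # waiting for background extraction
        self.facts: list[str] = []    # what get_context() can see

    def add(self, content: str) -> int:
        self.raw.append(content)
        self.pending.append(content)
        return len(self.raw) - 1  # returns at once, before extraction

    def settle(self) -> None:
        """Simulate the background extractor finishing between turns."""
        self.facts.extend(self.pending)
        self.pending.clear()

    def get_context(self) -> str:
        return "\n".join(f"- {fact}" for fact in self.facts)


memory = ToyMemory()
memory.add("User switched their main language to Rust")
same_turn = memory.get_context()  # extraction has not run yet
memory.settle()                   # time passes between turns
next_turn = memory.get_context()
print(repr(same_turn), repr(next_turn))
```

The write is durable as soon as add returns; only the derived fact view is delayed, which is why waiting for the next turn is enough.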
Scoping by end user
The tools above are workspace-scoped: they write into your account. When
each end user of your product needs an isolated memory, bind a
user_id server-side and pass it on every Korely call. Capture it
in a closure so the model can never set it or cross into another user's memory:
import os

from agents import Agent, Runner, function_tool
from korely_memory import Korely

korely = Korely(api_key=os.environ["KORELY_API_KEY"], region="eu")


def build_agent(user_id: str) -> Agent:
    """user_id is bound here, server-side. The model never sees it."""

    @function_tool
    def recall_memory(query: str) -> str:
        """Recall settled facts and memories about this user."""
        return korely.get_context(query=query, user_id=user_id).context

    @function_tool
    def save_memory(content: str) -> str:
        """Save something worth remembering about this user."""
        memory = korely.add(content, user_id=user_id, agent_id="support-bot")
        return f"Saved (id={memory.id})."

    return Agent(
        name="Support bot",
        instructions="Recall memory before answering. Save durable facts.",
        tools=[recall_memory, save_memory],
    )


# Per request, bound to the authenticated session:
agent = build_agent(user_id=session.user_id)
result = Runner.run_sync(agent, user_message)
print(result.final_output)
Each user_id is its own memory space: facts, graph and recall
never cross between them. agent_id labels which application
wrote the memory, and run_id can tag a single session.
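The closure pattern can be checked without any network calls. In the sketch below a module-level dictionary stands in for the per-user store, and build_tools is a hypothetical helper; neither is the Korely client. The point is that the functions handed to the model expose only their own parameters, while user_id stays captured on the server side.

```python
import inspect
from collections import defaultdict

# Hypothetical in-memory stand-in for the per-user store; not the Korely client.
_STORE: dict[str, list[str]] = defaultdict(list)


def build_tools(user_id: str):
    """Return (recall, save) functions with user_id captured in a closure."""

    def recall_memory(query: str) -> str:
        """Recall memories for the bound user that mention the query."""
        hits = [m for m in _STORE[user_id] if query.lower() in m.lower()]
        return "\n".join(hits) or "No matching memories."

    def save_memory(content: str) -> str:
        """Save a memory for the bound user."""
        _STORE[user_id].append(content)
        return "Saved."

    return recall_memory, save_memory


recall_a, save_a = build_tools("user-a")
recall_b, _ = build_tools("user-b")
save_a("Prefers Rust")

print(list(inspect.signature(recall_a).parameters))  # only the model-visible args
print(recall_a("rust"))
print(recall_b("rust"))
```

Because user_id never appears in the signature, it cannot end up in the tool schema, so there is no argument the model could fill in to reach another user's memories.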
Where to go next
- Memory as a tool: the end-to-end pattern of exposing recall/save/search as agent tools, with prompt-tuning tips for getting the model to call them.
- Get context: how the assembled fact block is built, what token_budget controls, and what each source in the response means.
Something not working? Email
[email protected] with your
openai-agents and korely-memory versions and the
error output. We read every message.