Korely

OpenAI Agents SDK

The OpenAI Agents SDK is a small Python framework for building agentic apps from three pieces: Agent, Runner, and function tools. A function tool is just a Python function with the @function_tool decorator: the function name becomes the tool name, the docstring becomes the description, and the type hints become the JSON schema. Korely plugs in as the memory layer. Write three tools that call the korely-memory SDK, attach them to an Agent, and the model recalls, searches, and saves on its own.
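To make the name/docstring/type-hint mapping concrete, here is a rough standard-library sketch of how a tool description could be derived from a plain function. It is not the Agents SDK's implementation: describe_tool, the _JSON_TYPES table, and get_weather are hypothetical names invented for this illustration, and the real decorator handles far more cases.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping from Python annotations to JSON Schema type names.
# Hypothetical and deliberately tiny; unknown types fall back to "string".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Build a tool description from a function's name, docstring and hints."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not part of the input schema
    params = inspect.signature(func).parameters
    properties = {
        name: {"type": _JSON_TYPES.get(hints.get(name), "string")}
        for name in params
    }
    # Parameters without a default value are treated as required.
    required = [
        name for name, p in params.items() if p.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_weather(city: str, days: int = 1) -> str:
    """Return a short forecast for a city."""
    return f"{city}: sunny for {days} day(s)"


spec = describe_tool(get_weather)
print(spec["name"])
print(spec["parameters"])
```

The point of the sketch is only that the model sees the function's name, docstring, and parameter types, so those three things are worth writing carefully.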

The moat: facts, not just snippets

Most memory layers hand back the chunks that matched a vector search. Korely does that too, but its read path leads with something different. get_context assembles the user's active typed facts into a single ready-to-prompt block. These facts are bi-temporal (subject, predicate, object) triples that Korely extracts from everything you save and keeps current through two-stage contradiction resolution. When the user upgrades from Free to Pro, the old fact is not deleted; it is invalidated and the new one supersedes it, with both timestamps preserved. So the recall tool below returns a settled view ("Luca is on Pro", "prefers async standups") instead of forcing the model to re-read raw snippets every turn. That assembled block, not a pile of rows, is what makes an Agents SDK agent feel like it actually knows the user.
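The "invalidated, not deleted" idea can be sketched with a small in-memory model. This is a toy illustration under stated assumptions, not Korely's data model or its contradiction resolution: Fact, FactStore, the field names, and the dates are all hypothetical, and the sketch treats every predicate as single-valued.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    invalidated_at: Optional[datetime] = None  # None means the fact is still active


class FactStore:
    """Toy store: a new fact supersedes the active one with the same key."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, obj: str, at: datetime) -> None:
        # Invalidate (do not delete) any active fact for the same subject and predicate.
        for fact in self.facts:
            same_key = (fact.subject, fact.predicate) == (subject, predicate)
            if same_key and fact.invalidated_at is None:
                fact.invalidated_at = at
        self.facts.append(Fact(subject, predicate, obj, valid_from=at))

    def active(self) -> list[Fact]:
        # The settled view: only facts that have not been invalidated.
        return [f for f in self.facts if f.invalidated_at is None]


store = FactStore()
store.assert_fact("Luca", "plan", "Free", datetime(2026, 1, 10, tzinfo=timezone.utc))
store.assert_fact("Luca", "plan", "Pro", datetime(2026, 6, 16, tzinfo=timezone.utc))

active = [(f.subject, f.predicate, f.obj) for f in store.active()]
print(active)
print(len(store.facts))
```

The settled view contains only the Pro fact, while the Free fact stays in the store with both of its timestamps, which is the property the paragraph above describes.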

Install

You need the OpenAI Agents SDK and the Korely Python package. Note that the install name and the import name differ for the Agents SDK. You pip install openai-agents but import from agents:

Terminal window
pip install openai-agents korely-memory

Python 3.10 or later is required for the Agents SDK. You need two keys: your OpenAI key (the model runs on OpenAI) and your Korely key (copy it from Settings → API Keys in the Korely app). Export both:

Terminal window
export OPENAI_API_KEY="sk-..."
export KORELY_API_KEY="kor_live_..."

Define the memory tools

Three tools cover the loop: recall (the moat: fact-assembled context, so reach for this first), save (write something worth remembering), and search (find the one memory that mentioned a thing). Each is a plain function decorated with @function_tool. The docstring is what the model reads to decide when to call it, so write it for the model.

memory_tools.py
import os

from agents import function_tool
from korely_memory import Korely

korely = Korely(api_key=os.environ["KORELY_API_KEY"], region="eu")


@function_tool
def recall_memory(query: str) -> str:
    """Recall settled facts and relevant memories about the user.

    Reach for this first, before answering anything personal. It returns an
    assembled block of the user's active typed facts plus relevant memories.

    Args:
        query: What you want to remember about, e.g. "preferred language".
    """
    ctx = korely.get_context(query=query, token_budget=800)
    return ctx.context


@function_tool
def save_memory(content: str) -> str:
    """Save something worth remembering for later turns.

    Args:
        content: A short, self-contained statement to remember,
            e.g. "User switched their main language to Rust".
    """
    memory = korely.add(content)
    return f"Saved (id={memory.id})."


@function_tool
def search_memory(query: str) -> str:
    """Find the exact memory that mentioned something.

    Use this when you need a specific past detail rather than a settled fact.

    Args:
        query: A keyword-style query, 1 to 5 words.
    """
    hits = korely.search(query, limit=5)
    if not hits:
        return "No matching memories."
    return "\n".join(f"- ({h.score:.2f}) {h.snippet}" for h in hits)

Why recall_memory wins. search_memory returns matching snippets, useful for "find the memory that mentioned the Lisbon hotel". recall_memory calls get_context, which returns the user's current typed facts already resolved for contradictions and assembled into a prompt block. Give the model both and it will reach for context when it just needs to know the user, and for search when it needs one exact past detail.

Attach them to an agent

Pass the decorated functions in the tools=[...] list on Agent. The instructions string is the system prompt; that is where you nudge the model to consult memory before answering. Then run it with Runner.run_sync for a one-shot script, or await Runner.run(...) inside an async function. The answer is on result.final_output.

agent.py
from agents import Agent, Runner
from memory_tools import recall_memory, save_memory, search_memory

agent = Agent(
    name="Assistant",
    instructions=(
        "You are a helpful assistant with long-term memory. "
        "Call recall_memory before answering anything about the user. "
        "When the user tells you something durable, call save_memory."
    ),
    tools=[recall_memory, save_memory, search_memory],
)

# --- Option A: synchronous (no event loop needed) ---
result = Runner.run_sync(
    agent,
    "Remember that I switched my main language to Rust. "
    "Then tell me what language I prefer.",
)
print(result.final_output)

Inside a server, a notebook, or anywhere there is already an event loop, use the async runner instead of run_sync:

import asyncio

from agents import Agent, Runner
from memory_tools import recall_memory, save_memory, search_memory

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant with long-term memory.",
    tools=[recall_memory, save_memory, search_memory],
)


async def main() -> None:
    result = await Runner.run(
        agent,
        input="What language do I prefer, and why did I switch?",
    )
    print(result.final_output)


asyncio.run(main())

run_sync vs an active event loop. Runner.run_sync is a convenience wrapper that drives the async runner for you, which makes it perfect for a plain script. Do not call it from inside a running event loop (a Jupyter cell, an async web handler); there, await Runner.run(agent, input=...) instead. The first positional argument is always the agent; the user message is the second positional argument, or pass it as input=....
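If you are unsure which situation your code is in, the standard library can tell you. The sketch below uses only asyncio; loop_is_running and check_inside are hypothetical helper names for this illustration, and nothing here calls the Agents SDK. asyncio.get_running_loop() raises RuntimeError when no loop is running in the current thread, which is the case where Runner.run_sync is appropriate.

```python
import asyncio


def loop_is_running() -> bool:
    """Return True when called from inside a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: a plain script, where run_sync is fine.
        return False
    # A loop is already running: await the async runner instead.
    return True


async def check_inside() -> bool:
    # Called from a coroutine, so a loop is running by definition.
    return loop_is_running()


outside = loop_is_running()            # evaluated at module level in a plain script
inside = asyncio.run(check_inside())   # evaluated from within a running loop
print(outside, inside)
```

A check like this is mostly useful as a guard or a debugging aid. In application code it is usually clearer to pick one style per entry point than to branch at runtime.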

What comes back

Korely's read path is pure retrieval: get_context assembles facts and search runs semantic vector search. No generative model runs on Korely's side; your OpenAI model does all the reasoning over the block Korely returns. That is also why read quotas are far more generous than write quotas: reads are cheap on our side.

Terminal window
$ python agent.py

→ tool call save_memory(content="User switched their main language to Rust")
← Saved (id=mem_...).

→ tool call recall_memory(query="preferred language")
← ## Known facts
   - dana likes Rust (since 2026-06-16)
   the superseded TypeScript fact is excluded automatically

Agent › You prefer Rust now. You told me you switched your main
language to it, and I've recorded that as your current preference —
the earlier TypeScript note is kept on file but no longer active.

Fact extraction is asynchronous. korely.add(...) returns as soon as the raw memory is stored, but the typed facts it produces are extracted a few seconds later, server-side. So a save_memory call followed immediately by a recall_memory call in the same turn may not yet reflect the just-saved fact. That is fine for a chat agent: the fact is settled and contradiction-checked by the next turn. Do not poll get_context in a tight loop expecting the fact to appear instantly; let it land between turns.

Scoping by end user

The tools above are workspace-scoped: they write into your account. When each end user of your product needs an isolated memory, bind a user_id server-side and pass it on every Korely call. Capture it in a closure so the model can never set it or cross into another user's memory:

import os

from agents import Agent, Runner, function_tool
from korely_memory import Korely

korely = Korely(api_key=os.environ["KORELY_API_KEY"], region="eu")


def build_agent(user_id: str) -> Agent:
    """user_id is bound here, server-side. The model never sees it."""

    @function_tool
    def recall_memory(query: str) -> str:
        """Recall settled facts and memories about this user."""
        return korely.get_context(query=query, user_id=user_id).context

    @function_tool
    def save_memory(content: str) -> str:
        """Save something worth remembering about this user."""
        memory = korely.add(content, user_id=user_id, agent_id="support-bot")
        return f"Saved (id={memory.id})."

    return Agent(
        name="Support bot",
        instructions="Recall memory before answering. Save durable facts.",
        tools=[recall_memory, save_memory],
    )


# Per request, bound to the authenticated session.
# session and user_message are placeholders for values from your own request handler.
agent = build_agent(user_id=session.user_id)
result = Runner.run_sync(agent, user_message)
print(result.final_output)

Each user_id is its own memory space: facts, graph and recall never cross between them. agent_id labels which application wrote the memory, and run_id can tag a single session.
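The closure pattern can be checked in isolation, without either SDK. The sketch below swaps Korely for a hypothetical in-memory dictionary (_store) and uses plain functions instead of @function_tool, so it only demonstrates the binding itself: build_tools and the naive substring recall are assumptions made for this illustration.

```python
import inspect
from collections import defaultdict

# Stand-in for the memory backend (hypothetical): one list of statements per user_id.
_store = defaultdict(list)


def build_tools(user_id: str):
    """Return (recall, save) functions bound to a single user_id."""

    def save_memory(content: str) -> str:
        _store[user_id].append(content)  # user_id comes from the closure
        return "Saved."

    def recall_memory(query: str) -> str:
        # Naive substring match, only to make the isolation visible.
        matches = [m for m in _store[user_id] if query.lower() in m.lower()]
        return "\n".join(matches) or "No matching memories."

    return recall_memory, save_memory


recall_a, save_a = build_tools("user-a")
recall_b, save_b = build_tools("user-b")

save_a("Prefers Rust")
from_a = recall_a("rust")
from_b = recall_b("rust")
exposed = list(inspect.signature(recall_a).parameters)

print(from_a)
print(from_b)
print(exposed)
```

Because user_id never appears in the tool's signature, it cannot end up in the schema the model sees, so the model has no argument through which to reach another user's memory.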

Where to go next

  • Memory as a tool: the end-to-end pattern of exposing recall/save/search as agent tools, with prompt-tuning tips for getting the model to call them.
  • Get context: how the assembled fact block is built, what token_budget controls, and what each source in the response means.

Something not working? Email [email protected] with your openai-agents and korely-memory versions and the error output. We read every message.