Korely

Vercel AI SDK

The AI SDK is the TypeScript toolkit for building AI applications: generateText, streamText, agents, and UI hooks for React, Vue and Svelte. Korely plugs in as a memory layer through the korely-memory Node package (or direct fetch calls): define thin tool() wrappers around Korely's REST endpoints and pass them to the model. From that point the model can search memories, read typed facts, and write back — all inside a normal generateText call.

What you get

Each Korely tool wrapper calls one REST endpoint under the hood. The model receives clean JSON and decides when to invoke memory. On the read side the primary path is get context (fact-assembled recall over the user's active typed facts), backed by search, read a single memory, list facts, and fetch a user profile. On the write side you can add a memory and delete a memory. There is no storage schema to design and no glue code beyond the tool definitions and fetch calls shown below.

Install

You need the AI SDK, zod (the examples use it for tool input schemas), and the Korely Node package:

Terminal window
npm install ai zod korely-memory

You also need a Korely API key. Copy it from Settings → API Keys in the Korely app and export it:

Terminal window
export KORELY_API_KEY="kor_live_..."

Authentication

Every request to https://api.korely.ai/v1 carries the key in the Authorization header:

const headers = {
  Authorization: 'Bearer ' + process.env.KORELY_API_KEY,
  'Content-Type': 'application/json',
};
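A missing or malformed key only surfaces as a 401 at request time. The sketch below fails fast at startup instead. `korelyHeaders` is a hypothetical helper, not part of korely-memory, and the prefix check assumes every key starts with kor_live_ as in the examples on this page.

```typescript
// Sketch: build Korely request headers, failing fast on a bad key.
// `korelyHeaders` is a hypothetical helper, not part of korely-memory.
function korelyHeaders(apiKey: string | undefined): Record<string, string> {
  if (!apiKey) {
    throw new Error('KORELY_API_KEY is not set');
  }
  // ASSUMPTION: keys start with kor_live_, as in this page's examples.
  // Relax this check if your key uses a different prefix.
  if (!apiKey.startsWith('kor_live_')) {
    throw new Error('KORELY_API_KEY does not start with kor_live_');
  }
  return {
    Authorization: 'Bearer ' + apiKey,
    'Content-Type': 'application/json',
  };
}
```

Call it once at module scope, for example `const headers = korelyHeaders(process.env.KORELY_API_KEY);`, so a misconfigured deployment stops before the first tool call.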

Complete example

A single-file agent: define two memory tools, let the model decide when to reach for them. stopWhen: stepCountIs(5) lets the model chain tool calls (search, then answer) before producing the final text.

agent.ts
import { generateText, stepCountIs, tool } from 'ai';
import { z } from 'zod';

const KORELY_API = 'https://api.korely.ai/v1';

const headers = {
  Authorization: 'Bearer ' + process.env.KORELY_API_KEY,
  'Content-Type': 'application/json',
};

const tools = {
  search_memory: tool({
    description: 'Search what has been remembered so far.',
    inputSchema: z.object({
      query: z.string().describe('Keyword-style query, 1 to 5 words'),
      limit: z.number().optional().default(5),
    }),
    execute: async ({ query, limit }) => {
      const res = await fetch(KORELY_API + '/memories/search', {
        method: 'POST',
        headers,
        body: JSON.stringify({ query, limit }),
      });
      return res.json();
    },
  }),
  save_memory: tool({
    description: 'Save something worth remembering.',
    inputSchema: z.object({ content: z.string() }),
    execute: async ({ content }) => {
      const res = await fetch(KORELY_API + '/memories', {
        method: 'POST',
        headers,
        body: JSON.stringify({ content }),
      });
      return res.json();
    },
  }),
};

const { text } = await generateText({
  model: 'anthropic/claude-sonnet-4.5',
  tools,
  stopWhen: stepCountIs(5),
  prompt:
    'What do my notes say about the Lisbon offsite? ' +
    'Search memory before answering.',
});

console.log(text);

Run it and the model calls search_memory on its own:

$ npx tsx agent.ts

→ tool call search_memory(query="lisbon offsite", limit=5)

← JSON from /v1/memories/search:

   { "results": [
     { "id": "mem_...", "score": 0.94, "snippet": "Three days, June 24 to 26.
       Day 1 product review, day 2 strategy workshop, day 3 free morning +
       flights home. Hotel booked near Cais do Sodré.",
       "user_id": "...", "agent_id": null, "metadata": {} },
     { "id": "mem_...", "score": 0.81, "snippet": "Flights confirmed for the
       23rd evening. Bring the demo laptop, HDMI dongle, offsite agenda
       printouts...", "user_id": "...", "agent_id": null, "metadata": {} }
   ] }

Agent › The Lisbon offsite runs June 24 to 26: product review on
day 1, a strategy workshop on day 2, and a free morning on day 3
before flights home. You fly in the evening of the 23rd, the hotel
is near Cais do Sodré, and your checklist flags the demo laptop,
an HDMI dongle and printed agendas.

What comes back from /v1/memories/search is pure retrieval: semantic vector search (cosine similarity over embeddings). No generative model runs on Korely's read path; your model does the reasoning. That is also why read quotas are an order of magnitude more generous than write quotas.
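To make the score field concrete, here is a minimal illustration of cosine similarity and top-k ranking over embeddings. It shows the metric only and is not Korely's implementation; the `Embedded` type and both function names are made up for this example.

```typescript
// Illustration only: how a cosine-similarity score is computed and used
// to rank candidates. This is not Korely's implementation.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape for a stored memory and its embedding.
interface Embedded {
  id: string;
  embedding: number[];
}

// Score every candidate against the query vector and keep the top `limit`.
function topMatches(query: number[], items: Embedded[], limit: number) {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Nothing in this path generates text, which is the point of the paragraph above: ranking is arithmetic, and the reasoning happens in your model.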

Prefer get_context for recall. Raw search returns matching memory snippets. GET /v1/context goes further: it assembles the user's active typed facts — the bi-temporal (subject, predicate, object) triples Korely extracts and keeps current through contradiction resolution — into a compact, ready-to-prompt block. That is the differentiator. Give the model a get_context tool and it gets settled facts ("Luca upgraded to Pro", "prefers async standups") instead of having to re-read raw snippets every turn. Search stays useful for "find the exact memory that mentioned X"; context is what you reach for when you want the agent to simply know the user.
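If you are using raw fetch, the execute body of a get_context tool might look like the sketch below. The endpoint path comes from this page, but the query-string parameter names (`query`, `user_id`) are assumptions, as are the helper names; check the API reference before relying on them.

```typescript
// Sketch of the execute body for a get_context tool using raw fetch.
// ASSUMPTION: GET /v1/context accepts `query` and `user_id` as
// query-string parameters. This page names the endpoint, not its parameters.
const KORELY_API = 'https://api.korely.ai/v1';

function contextUrl(query: string, userId?: string): string {
  const url = new URL(KORELY_API + '/context');
  url.searchParams.set('query', query);
  if (userId) url.searchParams.set('user_id', userId);
  return url.toString();
}

async function getContext(
  apiKey: string,
  query: string,
  userId?: string,
): Promise<unknown> {
  const res = await fetch(contextUrl(query, userId), {
    headers: { Authorization: 'Bearer ' + apiKey },
  });
  if (!res.ok) {
    // Return a plain object so the model sees the failure as tool output
    // instead of the whole turn throwing.
    return { error: 'context request failed', status: res.status };
  }
  return res.json();
}
```

Wrapping `getContext` in a tool() definition then works the same way as the search_memory tool in agent.ts.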

Using the korely-memory Node SDK

If you prefer a typed client to raw fetch, the korely-memory package exports a Korely class whose methods wrap the same endpoints:

import { tool } from 'ai';
import { z } from 'zod';
import { Korely } from 'korely-memory';

const korely = new Korely({ apiKey: process.env.KORELY_API_KEY });

// The primary recall tool: fact-assembled context. Reach for this first.
const recall = tool({
  description: 'Recall settled facts and relevant memories about this user.',
  inputSchema: z.object({ query: z.string() }),
  execute: async ({ query }) => korely.getContext(query),
});

const search_memory = tool({
  description: 'Find the exact memory that mentioned something.',
  inputSchema: z.object({ query: z.string() }),
  execute: async ({ query }) => korely.search(query, { limit: 5 }),
});

const save_memory = tool({
  description: 'Save something worth remembering.',
  inputSchema: z.object({ content: z.string() }),
  execute: async ({ content }) => korely.add(content),
});

Streaming

With streamText the call returns before the model finishes. The tools are stateless HTTP calls so there is nothing extra to close; just define the tools outside the handler and reuse them:

import { streamText, stepCountIs } from 'ai';

// `tools` is the object defined in agent.ts above.
const result = streamText({
  model: 'anthropic/claude-sonnet-4.5',
  tools,
  stopWhen: stepCountIs(5),
  prompt: 'Summarize what you know about my current apartment lease.',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Serverless route handlers: define the tools at module scope so they are not recreated on every request. Each execute is just a fetch call — there is no connection to open or close.

Multi-tenant chatbots: scope by user_id

The examples above are workspace-scoped: they write memories into your account. When you are building a product where each end user needs their own isolated memory, scope every read and write with an end-user identifier. Korely's naming for the three scopes: user_id is the end user (unlimited on every plan, you choose the string), agent_id is your application, run_id is one session.
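The three scopes can be attached to a request body along these lines. This is a sketch: `withScope` and `MemoryScope` are hypothetical names, and sending run_id in the JSON body is an assumption modeled on how this page's examples send user_id and agent_id.

```typescript
// Sketch: attach Korely's three scopes to a request body.
// ASSUMPTION: run_id travels in the JSON body the same way user_id and
// agent_id do in this page's examples.
interface MemoryScope {
  userId: string;   // the end user; you choose the string
  agentId?: string; // your application
  runId?: string;   // one session
}

function withScope<T extends object>(
  body: T,
  scope: MemoryScope,
): Record<string, unknown> {
  const scoped: Record<string, unknown> = { ...body, user_id: scope.userId };
  // Only include the optional scopes when they are actually set.
  if (scope.agentId !== undefined) scoped['agent_id'] = scope.agentId;
  if (scope.runId !== undefined) scoped['run_id'] = scope.runId;
  return scoped;
}
```

For example, `JSON.stringify(withScope({ content }, { userId, agentId: 'support-bot' }))` yields a body with content, user_id and agent_id, and no run_id key.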

Bind user_id from your session server-side, so the model can never reach across tenants no matter what the conversation says:

import { generateText, stepCountIs, tool } from 'ai';
import { z } from 'zod';

const KORELY_API = 'https://api.korely.ai/v1';

const headers = {
  Authorization: 'Bearer ' + process.env.KORELY_API_KEY,
  'Content-Type': 'application/json',
};

// user_id is bound here, server-side. The model never sees or sets it.
function korelyMemoryTools(userId: string) {
  return {
    search_memory: tool({
      description: 'Search everything known about this user.',
      inputSchema: z.object({
        query: z.string().describe('Keyword-style query, 1 to 5 words'),
      }),
      execute: async ({ query }) => {
        const res = await fetch(KORELY_API + '/memories/search', {
          method: 'POST',
          headers,
          body: JSON.stringify({ query, user_id: userId, limit: 5 }),
        });
        return res.json();
      },
    }),
    save_memory: tool({
      description: 'Save something worth remembering about this user.',
      inputSchema: z.object({
        content: z.string(),
      }),
      execute: async ({ content }) => {
        const res = await fetch(KORELY_API + '/memories', {
          method: 'POST',
          headers,
          body: JSON.stringify({
            content,
            user_id: userId,
            agent_id: 'support-bot',
          }),
        });
        return res.json();
      },
    }),
  };
}

// Per chat request (session and messages come from your own request handler):
const { text } = await generateText({
  model: 'anthropic/claude-sonnet-4.5',
  tools: korelyMemoryTools(session.userId),
  stopWhen: stepCountIs(5),
  messages,
});

Each user_id is its own memory space: facts, graph and retrieval never cross between them. When a user asks you to wipe what the bot knows, one call deletes every memory for that user (DELETE /v1/users/{userId}/memories, see the API reference). The "chatbot that remembers" cookbook builds this pattern out end to end.
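A minimal sketch of that wipe call with raw fetch follows. The path is the one given above; the helper names are hypothetical, and because the response body is not described on this page, only the HTTP status is inspected.

```typescript
// Sketch of the "forget this user" call. The path comes from this page;
// the helper names are hypothetical.
const KORELY_API = 'https://api.korely.ai/v1';

function userMemoriesUrl(userId: string): string {
  // Encode the id so characters like "/" or "@" cannot alter the path.
  return KORELY_API + '/users/' + encodeURIComponent(userId) + '/memories';
}

async function wipeUserMemories(apiKey: string, userId: string): Promise<boolean> {
  const res = await fetch(userMemoriesUrl(userId), {
    method: 'DELETE',
    headers: { Authorization: 'Bearer ' + apiKey },
  });
  // The response body is not documented here, so report only whether
  // the request returned a 2xx status.
  return res.ok;
}
```

As with the memory tools, take `userId` from your server-side session rather than from anything the model or the client supplies.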

Troubleshooting

| Symptom | Fix |
| --- | --- |
| 401 from api.korely.ai | Key missing or malformed. Check that KORELY_API_KEY starts with kor_live_ and is exported in the process environment where the agent runs. |
| Tools load but the model never calls them | Nudge it in the prompt ("search memory before answering") or in the system message. Tool choice is the model's; smaller models need the hint more than larger ones. |
| 429 on writes | You have hit the write quota for your plan (hobby: 1,000 writes / month). Upgrade on the pricing page or reduce write frequency by deduplicating before saving. |
| One slow tool call stalls the whole turn | Cap chains with stopWhen: stepCountIs(n) and keep limit low on search calls; 5 results is usually plenty for a chat turn. |
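For the 429 case, deduplicating before saving can be as simple as the sketch below. `createWriteDeduper` is a hypothetical helper, not part of korely-memory. It keeps its set in process memory, so it resets on restart and is not shared across serverless instances; treat it as a first line of defense rather than a guarantee.

```typescript
// Sketch: skip saves whose normalized content was already written by
// this process. Hypothetical helper; in-memory only.
function normalize(content: string): string {
  // Collapse case and whitespace so trivial variations count as duplicates.
  return content.trim().toLowerCase().replace(/\s+/g, ' ');
}

function createWriteDeduper() {
  const seen = new Set<string>();
  return {
    // Returns true the first time content is seen for a user, false after.
    shouldSave(userId: string, content: string): boolean {
      const key = userId + '\u0000' + normalize(content);
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    },
  };
}
```

Inside save_memory's execute, check `shouldSave(userId, content)` before the POST and return a short "already saved" result when it is false, so a repeated fact does not spend a write.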

For the full SDK method surface, see the SDK reference. For how facts, supersession and time work, see temporal facts.

Something not working? Email [email protected] with your ai and korely-memory versions and the error output. We read every message.