Semantic Memory
How Org in a Box stores, retrieves, and injects long-term memory into every agent turn.
How It Works
Every piece of information an agent learns about you is stored in the memories table with a 1536-dimensional embedding vector. Before each agent turn, the memory plugin fetches the most semantically similar memories and injects them into the system prompt.
User message → embed via OpenAI/Azure → cosine search → top-K memories → system prompt
When no embedder is configured, the plugin falls back to recency-ordered recall (most recent first).
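The retrieval step can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `Memory` shape, the `recall` function, and its parameter names are hypothetical, and the embedding call itself is left out (the query vector is passed in, or `null` when no embedder is configured).

```typescript
// Hypothetical row shape for illustration; the real schema is not shown in this document.
interface Memory {
  id: string;
  content: string;
  createdAt: number;           // epoch milliseconds
  embedding: number[] | null;  // null until an embedder has processed the row
}

// Cosine similarity of two equal-length vectors; returns 0 if either is a zero vector.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// With a query vector: rank by similarity, drop scores under the threshold, keep the top K.
// Without one (no embedder configured): fall back to most-recent-first.
function recall(
  memories: Memory[],
  queryVector: number[] | null,
  limit = 8,
  threshold = 0.3,
): Memory[] {
  if (queryVector === null) {
    return [...memories].sort((x, y) => y.createdAt - x.createdAt).slice(0, limit);
  }
  return memories
    .filter((m): m is Memory & { embedding: number[] } => m.embedding !== null)
    .map((m) => ({ memory: m, score: cosine(queryVector, m.embedding) }))
    .filter((s) => s.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((s) => s.memory);
}
```

In this sketch, memories that have no embedding yet are skipped by the similarity path and only surface through the recency fallback.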
Memory Kinds
| Kind | When Used |
|---|---|
| fact | Explicit facts: "Alice is the VP of Engineering at Acme." |
| preference | Auto-extracted: "I prefer dark mode", "call me Bob" |
| event | Time-anchored events: "Board meeting on Thursday" |
| skill-note | Reflections from the learning loop |
Visibility
| Scope | Who Sees It |
|---|---|
| private (default) | Only the owning user |
| team | All users in the same tenant |
| org | All users in the organisation |
Visibility is enforced by RBAC: a user must be granted memory.read_team to read team-scoped memories and memory.read_org to read org-scoped memories.
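A visibility check along these lines might look like the sketch below. The `Reader` and `ScopedMemory` shapes, the `orgId` field, and the `canRead` helper are all hypothetical names introduced for illustration; only the scope names and the two permission strings come from this page. It also assumes that owners can always read their own memories regardless of scope.

```typescript
type Visibility = "private" | "team" | "org";

// Hypothetical shapes; the real user and memory models are not shown in this document.
interface Reader {
  userId: string;
  tenantId: string;
  orgId: string;
  permissions: Set<string>;
}

interface ScopedMemory {
  ownerId: string;
  tenantId: string;
  orgId: string;
  visibility: Visibility;
}

function canRead(reader: Reader, memory: ScopedMemory): boolean {
  // Assumption: the owner can always read their own memory.
  if (memory.ownerId === reader.userId) return true;

  if (memory.visibility === "team") {
    return (
      memory.tenantId === reader.tenantId &&
      reader.permissions.has("memory.read_team")
    );
  }
  if (memory.visibility === "org") {
    return (
      memory.orgId === reader.orgId &&
      reader.permissions.has("memory.read_org")
    );
  }
  // "private": visible only to the owner, handled above.
  return false;
}
```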
System Prompt Injection
The experimental.chat.system.transform hook fires before every turn. It:
- Reads the last user message text (cached by the chat.message hook)
- Embeds it using buildEmbedderFromEnv() (Azure OpenAI preferred, OpenAI fallback)
- Runs cosine similarity against all stored embeddings (threshold: 0.3)
- Takes the top 8 results
- Truncates to a 2000-token budget
- Wraps the block in <memory-context>...</memory-context>
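The last two steps, budgeting and wrapping, could be sketched as below. Several things here are assumptions rather than documented behaviour: the token count is a rough 4-characters-per-token estimate, the truncation drops whole memories instead of cutting text mid-memory, each memory is rendered as a `- ` bullet, and the `estimateTokens` and `buildMemoryContext` names are hypothetical.

```typescript
// Rough token estimate: about 4 characters per token. An assumption for
// illustration only; the plugin's actual tokenizer is not described here.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Takes memory texts already ranked best-first, keeps as many as fit in the
// budget, and wraps them in a <memory-context> block. Returns "" if none fit.
function buildMemoryContext(ranked: string[], tokenBudget = 2000): string {
  const kept: string[] = [];
  let used = 0;
  for (const content of ranked) {
    const line = `- ${content}`;
    const cost = estimateTokens(line);
    if (used + cost > tokenBudget) break; // stop at the first memory that would overflow
    kept.push(line);
    used += cost;
  }
  if (kept.length === 0) return "";
  return `<memory-context>\n${kept.join("\n")}\n</memory-context>`;
}
```

Because the input is ranked best-first, stopping at the first overflow keeps the most relevant memories and discards the tail.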
Adding Memories
Agents can add memories directly:
memory add fact "The customer's renewal date is Q3 2026"
memory add preference "User wants all reports in bullet-point format"
Via REST API:
POST /v1/memories
Authorization: Bearer <token>
Content-Type: application/json
{
  "kind": "fact",
  "content": "ACME Corp uses Salesforce for CRM",
  "visibility": "team"
}
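From TypeScript, the same request might be assembled like this. The `NewMemory` type and `createMemoryRequest` helper are hypothetical, the base URL is whatever host your deployment uses, and treating `visibility` as optional is an assumption based on private being listed as the default scope.

```typescript
interface NewMemory {
  kind: "fact" | "preference" | "event" | "skill-note";
  content: string;
  visibility?: "private" | "team" | "org"; // assumed optional; private is the documented default
}

// Builds the URL and request options for POST /v1/memories.
function createMemoryRequest(baseUrl: string, token: string, memory: NewMemory) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/memories`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(memory),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`. The response format is not documented on this page, so the sketch stops at building the request.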
Backfilling Embeddings
Memories created before an embedder was configured have embedding = NULL. To generate the missing embeddings, enqueue a backfill job:
POST /v1/jobs
{ "kind": "embed-memories", "payload": { "limit": 500 } }
Or schedule it as a nightly cron trigger.
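Conceptually, a backfill pass selects rows that still lack an embedding, up to the limit, and embeds them. The sketch below illustrates that idea only; it is not the job's actual implementation. The `Row` shape, the `Embedder` signature, and both function names are hypothetical, and the rows are updated in memory where a real job would write back to the memories table.

```typescript
// Hypothetical row shape; only the NULL embedding column comes from this page.
interface Row {
  id: string;
  content: string;
  embedding: number[] | null;
}

// Hypothetical embedder signature: one vector per input text, in the same order.
type Embedder = (texts: string[]) => Promise<number[][]>;

// Picks up to `limit` rows that still lack an embedding.
function selectPending(rows: Row[], limit: number): Row[] {
  return rows.filter((r) => r.embedding === null).slice(0, limit);
}

// Embeds the pending rows in place and returns how many were updated.
async function backfill(rows: Row[], embed: Embedder, limit = 500): Promise<number> {
  const pending = selectPending(rows, limit);
  if (pending.length === 0) return 0;
  const vectors = await embed(pending.map((r) => r.content));
  pending.forEach((row, i) => {
    row.embedding = vectors[i];
  });
  return pending.length;
}
```

Under this model, a table with more unembedded rows than the limit needs several runs, which is one reason to schedule the job rather than run it once.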
Configuration
| Option | Default | Description |
|---|---|---|
| recallLimit | 8 | Max memories injected per turn |
| similarityThreshold | 0.3 | Minimum cosine similarity to include |
| tokenBudget | 2000 | Max tokens of memory context |
Pass these as plugin options in opencode.jsonc:
{
  "plugin": ["@orginabox/plugin-memory"],
  "pluginOptions": {
    "@orginabox/plugin-memory": {
      "recallLimit": 12,
      "similarityThreshold": 0.25
    }
  }
}
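Options that are left out keep their defaults. A minimal sketch of that merge, assuming the plugin overlays the supplied options on the defaults from the table (the `MemoryPluginOptions` type and `resolveOptions` helper are hypothetical names, not the plugin's API):

```typescript
interface MemoryPluginOptions {
  recallLimit: number;
  similarityThreshold: number;
  tokenBudget: number;
}

// Default values as listed in the configuration table.
const DEFAULTS: MemoryPluginOptions = {
  recallLimit: 8,
  similarityThreshold: 0.3,
  tokenBudget: 2000,
};

// Overlays user-supplied options on the defaults, ignoring undefined values.
function resolveOptions(overrides: Partial<MemoryPluginOptions> = {}): MemoryPluginOptions {
  const merged: MemoryPluginOptions = { ...DEFAULTS };
  for (const key of Object.keys(DEFAULTS) as (keyof MemoryPluginOptions)[]) {
    const value = overrides[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}
```

With the configuration above, this would resolve to a recallLimit of 12, a similarityThreshold of 0.25, and the default tokenBudget of 2000.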