Semantic Memory

How Org in a Box stores, retrieves, and injects long-term memory into every agent turn.

How It Works

Every piece of information an agent learns about you is stored in the memories table with a 1536-dimensional embedding vector. Before each agent turn, the memory plugin fetches the most semantically similar memories and injects them into the system prompt.

User message → embed via OpenAI/Azure → cosine search → top-K memories → system prompt

When no embedder is configured, the plugin falls back to recency-ordered recall (most recent first).
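The recall step and its recency fallback can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the `Memory` shape, the `createdAt` field, and the `recall` function are hypothetical names, and a production system would more likely run the search inside the database than in application memory.

```typescript
// Hypothetical row shape; the real memories table may differ.
interface Memory {
  id: string;
  content: string;
  embedding: number[] | null; // null until an embedder has processed the row
  createdAt: number; // epoch milliseconds (assumed field)
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Returns up to `limit` memories: most similar first when a query
// embedding is available, otherwise most recent first.
function recall(
  memories: Memory[],
  queryEmbedding: number[] | null,
  limit: number,
): Memory[] {
  if (queryEmbedding === null) {
    // No embedder configured: fall back to recency ordering.
    return memories
      .slice()
      .sort((x, y) => y.createdAt - x.createdAt)
      .slice(0, limit);
  }
  return memories
    .filter((m): m is Memory & { embedding: number[] } => m.embedding !== null)
    .map((m) => ({ memory: m, score: cosineSimilarity(queryEmbedding, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((x) => x.memory);
}
```

In this sketch, rows without an embedding are skipped during similarity search, which is why backfilling older memories matters once an embedder is configured.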

Memory Kinds

Kind	When Used
`fact`	Explicit facts: "Alice is the VP of Engineering at Acme."
`preference`	Auto-extracted: "I prefer dark mode", "call me Bob"
`event`	Time-anchored events: "Board meeting on Thursday"
`skill-note`	Reflections from the learning loop

Visibility

Scope	Who Sees It
`private` (default)	Only the owning user
`team`	All users in the same tenant
`org`	All users in the organisation

Visibility is enforced by RBAC: a user can read `team` memories only if granted the memory.read_team permission, and `org` memories only if granted memory.read_org.
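A visibility check along these lines might look like the sketch below. The `Viewer` and `ScopedMemory` shapes, the `tenantId` and `orgId` fields, and the rule that owners can always read their own memories are assumptions made for illustration; only the scope names and the two permission strings come from this page.

```typescript
type Visibility = "private" | "team" | "org";

// Hypothetical shapes; field names are assumptions.
interface ScopedMemory {
  ownerId: string;
  tenantId: string;
  orgId: string;
  visibility: Visibility;
}

interface Viewer {
  userId: string;
  tenantId: string;
  orgId: string;
  permissions: string[];
}

function hasPermission(viewer: Viewer, permission: string): boolean {
  return viewer.permissions.indexOf(permission) !== -1;
}

// Decides whether `viewer` may read `memory` under the three scopes.
function canRead(viewer: Viewer, memory: ScopedMemory): boolean {
  // Assumption: the owner can always read their own memories.
  if (memory.ownerId === viewer.userId) return true;
  if (memory.visibility === "team") {
    return (
      memory.tenantId === viewer.tenantId &&
      hasPermission(viewer, "memory.read_team")
    );
  }
  if (memory.visibility === "org") {
    return (
      memory.orgId === viewer.orgId &&
      hasPermission(viewer, "memory.read_org")
    );
  }
  // "private" memories are visible to the owner only.
  return false;
}
```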

System Prompt Injection

The experimental.chat.system.transform hook fires before every turn. It:

1. Reads the last user message text (cached by the chat.message hook)
2. Embeds it using buildEmbedderFromEnv() (Azure OpenAI preferred, OpenAI fallback)
3. Runs cosine similarity against all stored embeddings (threshold: 0.3)
4. Takes the top 8 results
5. Truncates to a 2000-token budget
6. Wraps the block in <memory-context>...</memory-context>
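The filtering, ranking, truncation, and wrapping steps above can be sketched as a single function. This is an illustration under stated assumptions, not the plugin's implementation: `buildMemoryContext` and `estimateTokens` are hypothetical names, token counts are approximated at roughly four characters per token, truncation is assumed to drop whole memories rather than cut one mid-text, and the one-line-per-memory layout inside the wrapper is a guess.

```typescript
interface ScoredMemory {
  content: string;
  score: number; // cosine similarity against the last user message
}

// Assumption: a rough heuristic of ~4 characters per token.
// A real implementation would likely use a proper tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function buildMemoryContext(
  scored: ScoredMemory[],
  recallLimit = 8,
  similarityThreshold = 0.3,
  tokenBudget = 2000,
): string {
  // Keep memories at or above the threshold, best first, capped at recallLimit.
  const top = scored
    .filter((m) => m.score >= similarityThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, recallLimit);

  // Add memories until the next one would exceed the token budget.
  const lines: string[] = [];
  let used = 0;
  for (const m of top) {
    const cost = estimateTokens(m.content);
    if (used + cost > tokenBudget) break;
    lines.push("- " + m.content);
    used += cost;
  }

  // Nothing relevant: inject no block at all (assumed behaviour).
  if (lines.length === 0) return "";
  return "<memory-context>\n" + lines.join("\n") + "\n</memory-context>";
}
```

Stopping at the first memory that does not fit keeps the highest-scoring memories intact; a different policy, such as skipping oversized entries and continuing, would also satisfy the budget.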

Adding Memories

Agents can add memories directly:

memory add fact "The customer's renewal date is Q3 2026"
memory add preference "User wants all reports in bullet-point format"

Memories can also be added via the REST API:

POST /v1/memories
Authorization: Bearer <token>
Content-Type: application/json

{
  "kind": "fact",
  "content": "ACME Corp uses Salesforce for CRM",
  "visibility": "team"
}
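From a script, the same request could be assembled as shown below. This is a sketch only: `buildAddMemoryRequest`, the `HttpRequest` shape, and the base URL parameter are hypothetical, and the response format is not documented here, so it is left unhandled.

```typescript
type MemoryKind = "fact" | "preference" | "event" | "skill-note";
type MemoryVisibility = "private" | "team" | "org";

interface NewMemory {
  kind: MemoryKind;
  content: string;
  visibility?: MemoryVisibility; // omitted means the default, private
}

// Plain description of an HTTP request, kept free of any HTTP client types.
interface HttpRequest {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
}

// Builds the POST /v1/memories request for a given deployment.
// `baseUrl` is an assumption; use the address of your own instance.
function buildAddMemoryRequest(
  baseUrl: string,
  token: string,
  memory: NewMemory,
): HttpRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/v1/memories",
    method: "POST",
    headers: {
      Authorization: "Bearer " + token,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(memory),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.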

Backfilling Embeddings

Memories created before an embedder was configured have embedding = NULL. Enqueue a backfill job:

POST /v1/jobs
{ "kind": "embed-memories", "payload": { "limit": 500 } }

Alternatively, schedule the same job as a nightly cron trigger.
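Conceptually, one run of such a job embeds up to `limit` rows whose embedding is still missing. The sketch below illustrates that idea; `backfillBatch` and `MemoryRow` are hypothetical names, the embedder is injected as a synchronous function to keep the example self-contained (a real embedding call would be asynchronous), and it is an assumption that `limit` caps the rows processed per run.

```typescript
interface MemoryRow {
  id: string;
  content: string;
  embedding: number[] | null; // null means "not yet embedded"
}

// Embeds up to `limit` rows that lack an embedding and returns how many
// rows were updated. Rows that already have an embedding are untouched.
function backfillBatch(
  rows: MemoryRow[],
  embed: (text: string) => number[],
  limit: number,
): number {
  let updated = 0;
  for (const row of rows) {
    if (updated >= limit) break;
    if (row.embedding !== null) continue;
    row.embedding = embed(row.content);
    updated++;
  }
  return updated;
}
```

Under these assumptions, a run that returns fewer than `limit` updates means the backlog is cleared; otherwise another run is needed, which is where a recurring trigger helps.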

Configuration

Option	Default	Description
`recallLimit`	`8`	Max memories injected per turn
`similarityThreshold`	`0.3`	Minimum cosine similarity to include
`tokenBudget`	`2000`	Max tokens of memory context

Pass these as plugin options in opencode.jsonc:

{
  "plugin": ["@orginabox/plugin-memory"],
  "pluginOptions": {
    "@orginabox/plugin-memory": {
      "recallLimit": 12,
      "similarityThreshold": 0.25
    }
  }
}
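Options left out of the config keep their defaults; in the example above, `tokenBudget` stays at 2000. A minimal sketch of that merge is shown here, assuming a hypothetical `resolveOptions` helper (how the plugin actually reads its options is not documented on this page).

```typescript
interface MemoryPluginOptions {
  recallLimit: number;
  similarityThreshold: number;
  tokenBudget: number;
}

// Defaults taken from the configuration table.
const DEFAULT_OPTIONS: MemoryPluginOptions = {
  recallLimit: 8,
  similarityThreshold: 0.3,
  tokenBudget: 2000,
};

// Fills in any option the user did not set with its default value.
function resolveOptions(
  overrides: Partial<MemoryPluginOptions> = {},
): MemoryPluginOptions {
  return {
    recallLimit: overrides.recallLimit ?? DEFAULT_OPTIONS.recallLimit,
    similarityThreshold:
      overrides.similarityThreshold ?? DEFAULT_OPTIONS.similarityThreshold,
    tokenBudget: overrides.tokenBudget ?? DEFAULT_OPTIONS.tokenBudget,
  };
}
```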