Claude, ChatGPT, Cursor, Cline, Continue, Kilo Code — whatever your team picks today and switches to tomorrow, they all read and write the same memory store. Hybrid keyword + vector + graph retrieval. Self-hostable on a laptop or as a multi-tenant brain for a whole company.
Built-in dashboard · live throughput · per-token usage · 24h history · tier hit rates
A 200K-token context isn't memory — it's a goldfish bowl that resets every session. The usual workaround, "stuff a vector DB behind it," misses everything you actually want from memory. And the moment you have more than one agent, every one of them ends up with its own private notes.
The agent landscape moves fast. Whatever you and your team use this quarter — and whatever you switch to next quarter — they all read and write the same novamem store. No re-onboarding. No silos. One canonical place where everything goes.
Tell Claude Code that the deploy target is k3s on 192.168.10.248. Switch to Cursor on a different machine — the same fact is there, retrievable by hybrid search, automatically scoped to the project. Switch to ChatGPT through the HTTP API for an architecture review — it sees every decision the others made. One canonical memory. Not three.
Every entry belongs to a single user. An entry can additionally belong to a project — a sub-brain that's shareable. User-global entries are private; project entries are visible to every member of that project. Every search runs three signals in parallel and fuses them into one ranked list.
- `memory_search` — per-call `weights` for keyword-only / vector-only / graph-heavy retrieval. Fans out across populated namespaces by default.
- `memory_remember` — accepts `namespace`, `sourceType`, `capturedFrom`, `confidence`. Worthiness gate + SHA dedup are applied automatically.
- `memory_recent` / `memory_today` — recency queries (`since`; today: last 24 h). Useful for "what did the agent learn today" digests.
- `memory_neighbors` — graph neighbour traversal around an entry.
- `memory_update` / `memory_forget` — edit or delete entries; a miss on `forget` returns `deleted:false`.
- `project_*` — project management tools.

Full usage guide → covers worthiness gates, decay maths, dream cycle, namespaces, and weight tuning.
One `docker compose up -d` brings the whole stack online. No external services. No proprietary embedding API. No hidden per-token charges.
- Decay tuning — the `7 · log₂(hits + 1)` schedule is tunable per-tenant.
- Local embeddings — `@xenova/transformers` by default: no API keys, runs on CPU. Swap in any OpenAI-compatible endpoint with a single env var.
- Hybrid retrieval — every search runs keyword (FTS), vector (cosine), and graph (neighbour traversal) in parallel. Results fuse via min-max-normalised weighted scoring with sensible defaults you can override per call.
Write a memory entry. A worthiness gate rejects conversational filler; a SHA-256 dedupe path returns the existing id for exact duplicates. The entry lands in warm + cold + graph atomically.
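The write path above can be sketched in a few lines. This is an illustrative in-memory model, not novamem's actual implementation; `remember` and `contentHash` are hypothetical names:

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory sketch of the write path: hash the content,
// return the existing id on an exact duplicate, otherwise store a new entry.
const byHash = new Map<string, string>(); // sha256 hex -> entry id
let nextId = 0;

function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

function remember(text: string): { id: string; deduped: boolean } {
  const sha = contentHash(text);
  const existing = byHash.get(sha);
  if (existing !== undefined) return { id: existing, deduped: true };
  const id = `mem-${nextId++}`;
  byHash.set(sha, id);
  return { id, deduped: false };
}

const a = remember("deploy target is k3s on 192.168.10.248");
const b = remember("deploy target is k3s on 192.168.10.248"); // exact duplicate
// b.id === a.id and b.deduped === true: the caller learns it was a dupe.
```

The real server additionally writes the accepted entry to warm, cold, and graph storage atomically; the sketch only covers the gate-and-dedupe decision.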
One query fans out to all three indexes. Results are fused with weighted scoring. Override weights per call — `{keyword:1, vector:0}` for exact-id lookups, `{vector:1}` to lean fully semantic.
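That fusion step can be modelled as per-signal min-max normalisation followed by a weighted sum. A simplified sketch, not novamem's exact scoring code; all names and scores here are illustrative:

```typescript
type Weights = { keyword: number; vector: number; graph: number };

// Min-max normalise one signal's raw scores into [0, 1].
function normalise(scores: Map<string, number>): Map<string, number> {
  const values = [...scores.values()];
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  return new Map(
    [...scores].map(([id, s]): [string, number] => [id, range === 0 ? 1 : (s - min) / range]),
  );
}

// Fuse per-signal score maps into one ranked id list via a weighted sum.
function fuse(
  signals: Record<keyof Weights, Map<string, number>>,
  weights: Weights,
): string[] {
  const fused = new Map<string, number>();
  for (const key of Object.keys(signals) as (keyof Weights)[]) {
    for (const [id, s] of normalise(signals[key])) {
      fused.set(id, (fused.get(id) ?? 0) + weights[key] * s);
    }
  }
  return [...fused].sort((x, y) => y[1] - x[1]).map(([id]) => id);
}

const signals = {
  keyword: new Map([["adr-7", 12], ["note-3", 2]]),
  vector: new Map([["note-3", 0.91], ["adr-7", 0.4]]),
  graph: new Map([["cost-analysis", 3]]),
};
const ranked = fuse(signals, { keyword: 1, vector: 1, graph: 0.5 });
const exact = fuse(signals, { keyword: 1, vector: 0, graph: 0 }); // exact-id lookup
```

With `{keyword: 1, vector: 0, graph: 0}`, only the literal keyword match can rank first, which is why that override works for exact-id lookups.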
Entries decay on a synaptic schedule — `effectiveDays = 7 · log₂(hits + 1)`. A hit on a cold entry reactively promotes it back to warm. A nightly dream cycle compacts duplicates and promotes shared neighbours.
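A worked sketch of that schedule (the `isWarm` cutoff is a hypothetical illustration, not novamem's actual promotion logic):

```typescript
// effectiveDays = 7 · log₂(hits + 1): each doubling of (hits + 1) buys 7 more days.
function effectiveDays(hits: number): number {
  return 7 * Math.log2(hits + 1);
}

// A never-hit entry (hits = 0) has 0 effective days;
// 1 hit → 7 days, 3 hits → 14, 7 hits → 21, 15 hits → 28.
const schedule = [0, 1, 3, 7, 15].map((h) => effectiveDays(h));

// Illustrative warm/cold decision: stay warm while the entry's age
// is within its effective window (the threshold shape is hypothetical).
function isWarm(ageDays: number, hits: number): boolean {
  return ageDays <= effectiveDays(hits);
}
```

The logarithm is the point: frequently recalled memories earn longer retention, but with sharply diminishing returns, so no entry becomes immortal through raw hit count alone.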
Every entry is per-user by default. Create a project to carve out a sub-brain; share it by adding members. Memory crosses user boundaries only through explicit project membership.
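That visibility rule reduces to a small predicate. A sketch with hypothetical types, assuming membership is a simple project-to-users map:

```typescript
type Entry = { id: string; userId: string; projectId?: string };

// Hypothetical membership shape: projectId -> set of member user ids.
type Memberships = Map<string, Set<string>>;

// Project entries are visible to every member of that project;
// user-global entries (no projectId) are private to their owner.
function visibleTo(entry: Entry, userId: string, members: Memberships): boolean {
  if (entry.projectId !== undefined) {
    return members.get(entry.projectId)?.has(userId) ?? false;
  }
  return entry.userId === userId;
}

const members: Memberships = new Map([["proj-api", new Set(["alice", "bob"])]]);
const globalNote: Entry = { id: "m1", userId: "alice" };
const projNote: Entry = { id: "m2", userId: "alice", projectId: "proj-api" };
// alice sees both; bob sees only the project entry; carol sees neither.
```

The sketch mirrors the stated rule: the only way memory crosses a user boundary is explicit project membership.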
A simple example. Imagine you've been remembering project-notes for weeks. Today you ask your agent: "How did we end up choosing Postgres for the main store?" Here's what each tier contributes — and why fusing all three beats any one alone.
The keyword tier extracts the query terms (`postgres`, `main store`) and returns rows where those words appear verbatim. All three signals run in parallel and fuse into one ranked list. The warm hit gets you the literal ADR. The cold hit pulls in the prior reasoning even though it never said "Postgres". The graph hit ties the decision to its supporting cost analysis. Your agent answers with all of it — not just the one tier that happened to match.
Pick whichever fits your environment. All three lead to the same server image, the same dashboard, the same MCP surface.
```sh
# clone, set 3 secrets, up
git clone https://github.com/azrtydxb/novamem.git
cd novamem && cp .env.example .env
echo "POSTGRES_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "NOVAMEM_BOOTSTRAP_ADMIN_PASSWORD=$(openssl rand -base64 24)" >> .env
echo "NOVAMEM_COOKIE_SECRET=$(openssl rand -hex 32)" >> .env
docker compose up -d   # http://localhost:7778/admin
```
```sh
# bring your own Postgres + Qdrant + FalkorDB
# prereqs: Node 20+, pnpm 9+
git clone https://github.com/azrtydxb/novamem.git
cd novamem
pnpm install && pnpm build
cp .env.example .env   # point at your existing datastores
pnpm --filter @azrtydxb/novamem-server start
```
```sh
# multi-arch image on ghcr.io, manifests in deploy/k8s/
git clone https://github.com/azrtydxb/novamem.git
cd novamem/deploy/k8s
# edit secrets.yaml + ingress.yaml host, then:
kubectl apply -k .
kubectl -n novamem rollout status deploy/novamem
```
`@azrtydxb/novamem-init` detects 30+ supported hosts, asks for your server URL + dashboard credentials, mints a fresh bearer token, and writes the config each host expects. It's idempotent — it won't clobber existing entries.
```sh
npx -y @azrtydxb/novamem-init
# detects: Claude Code · Claude Desktop · ChatGPT (via HTTP)
#          Cursor · OpenCode · Codex CLI · Cline · Continue
#          Kilo Code · RooCode · Gemini CLI · Copilot · Windsurf
#          Factory · Amazon Q · & ~16 skill-only hosts
```
Prefer manual setup? Per-host walkthroughs: Claude Code · Claude Desktop · Cursor · Kilo Code · Other hosts & skills
Same code. Same MCP surface. Same dashboard. The only thing that changes between a personal laptop and a 5,000-engineer company is how you stand it up. Multi-tenancy and project-based sub-brains are first-class from day one.
`docker compose up -d` on your laptop or homelab. One user. Private memory across every AI host on your machine. Zero SaaS dependency, zero per-token cost.

Everything you need to operate, integrate, or audit. The OpenAPI spec is generated from the same Zod schemas the server uses at runtime — so it's always accurate.
Browse it at `/api-docs` on your deployment. Container images are published to `ghcr.io/azrtydxb/novamem`. novamem is opinionated about a few things — and the rest is yours.
Every entry carries `sourceType`, `capturedFrom`, and `confidence` — so you can filter "what did Claude infer" from "what was directly observed". Security practices are documented in `SECURITY.md`, backed by an audited auth model. Self-host in a minute. Wire every AI host on your machine in one `npx` command.