Your AI has no memory. We fix that.
UPtrim gives your local LLM permanent memory — your name, preferences, projects, everything. Then it teaches local and cloud models to work hand-in-hand: draft with Llama on your own machine, review with Claude when you need the firepower, both sharing the same context. Your memory never leaves your box.
v1.0 is out today.
Open-source, free forever, runs fully offline. Don't wait for v2.0 — everything below is downloadable right now.
Download v1.0 on GitHub →
Close a chat.
Still remembered.
UPtrim watches every conversation and quietly extracts facts — your name, your preferences, the project you're working on. When you come back tomorrow, the model already knows.
- NLP fact extraction (spaCy + regex + structured triples)
- Local SQLite storage with FTS5 keyword search
- Semantic search via bundled embeddings (v2.0)
- Every fact editable & deletable from the dashboard
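To make the extract-and-store idea concrete, here is a minimal sketch of regex-based fact extraction feeding an SQLite FTS5 table. It is not UPtrim's actual code: the patterns, the `facts` table layout, and the function names are all assumptions chosen for illustration, spaCy is left out, and your Python's SQLite build needs FTS5 compiled in (most do).

```python
import re
import sqlite3

# Hypothetical patterns, for illustration only. Each maps a phrase to a
# (subject, predicate) pair; the captured text becomes the object.
PATTERNS = [
    (re.compile(r"\bmy name is (\w+)", re.I), "user", "name"),
    (re.compile(r"\bi prefer ([\w\s]+?)(?:[.,!]|$)", re.I), "user", "prefers"),
    (re.compile(r"\bi(?: am|'m) working on ([\w\s]+?)(?:[.,!]|$)", re.I),
     "user", "works_on"),
]


def extract_facts(text):
    """Return (subject, predicate, object) triples found in one message."""
    facts = []
    for pattern, subject, predicate in PATTERNS:
        for match in pattern.finditer(text):
            facts.append((subject, predicate, match.group(1).strip()))
    return facts


def open_store(path=":memory:"):
    """Open (or create) a store backed by an FTS5 virtual table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS facts "
        "USING fts5(subject, predicate, object)"
    )
    return conn


def remember(conn, text):
    """Extract facts from a message and persist them."""
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", extract_facts(text))
    conn.commit()


def recall(conn, query):
    """Keyword search over stored facts, best match first."""
    rows = conn.execute(
        "SELECT subject, predicate, object FROM facts "
        "WHERE facts MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()
```

Calling `remember(conn, "My name is Dana.")` and later `recall(conn, "dana")` returns the stored triple. Note that `MATCH` takes FTS5 query syntax, so raw user input with punctuation would need quoting before being passed in.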
One chat.
Many brains.
Small talk hits your free local LLM. Hard problems get Claude Opus. Images route to your local Stable Diffusion. UPtrim picks the right model for each request — automatically — using a Capability Matrix built from your hardware.
- OAuth sign-in for Claude & OpenAI, no raw keys
- Per-request cost tracking & session budget
- Capability Matrix benchmarks your local models
- Keep everything local when you want to
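One way to picture per-request routing is a lookup over a small table of model capabilities. The sketch below is an assumption-heavy illustration, not the real Capability Matrix: the model names, quality scores, costs, and the `pick_model` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    local: bool
    quality: float        # 0..1, hypothetical benchmark score
    cost_per_call: float  # USD; 0.0 for local models
    kinds: frozenset      # request kinds this model can handle


# Hypothetical matrix; real entries would come from benchmarking your hardware.
MATRIX = [
    Model("local-llama", True, 0.60, 0.00, frozenset({"chat", "code"})),
    Model("local-sd", True, 0.70, 0.00, frozenset({"image"})),
    Model("cloud-claude", False, 0.95, 0.05, frozenset({"chat", "code"})),
]


def pick_model(kind, min_quality, budget_left, local_only=False):
    """Pick the cheapest model that meets the quality bar and the budget.

    Falls back to a local model for the same kind when nothing qualifies,
    so a tight budget or local-only mode degrades instead of failing.
    """
    candidates = [
        m for m in MATRIX
        if kind in m.kinds
        and m.quality >= min_quality
        and m.cost_per_call <= budget_left
        and (m.local or not local_only)
    ]
    if not candidates:
        candidates = [m for m in MATRIX if kind in m.kinds and m.local]
    if not candidates:
        raise LookupError(f"no model can handle {kind!r}")
    # Cheapest first; among equally cheap models prefer higher quality.
    return min(candidates, key=lambda m: (m.cost_per_call, -m.quality))
```

With this table, small talk (`min_quality=0.5`) stays on the free local model, a hard coding request (`min_quality=0.9`) goes to the cloud model, and the same request with `local_only=True` falls back to the local one.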
Agents you can
actually watch.
Spawn autonomous subagents for multi-step work — code review, research, data analysis. Each runs with a scoped memory slice, a persistent task ID, and a shared scratchpad. You can resume, replay, or kill any task.
- Pre-built modes: reviewer, researcher, analyst, coder
- Full audit trail & live execution view
- Shared scratchpad between collaborating agents
- Every tool call logged & inspectable
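The pieces named above (scoped memory slice, persistent task ID, shared scratchpad, logged tool calls) can be sketched as a small data structure. This is only an illustration under assumed names: `AgentTask`, its fields, and its methods are hypothetical and do not describe UPtrim's internals.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AgentTask:
    mode: str            # e.g. "reviewer", "researcher", "analyst", "coder"
    memory_slice: tuple  # only the facts this agent is allowed to read
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "running"
    audit_log: list = field(default_factory=list)

    def call_tool(self, tool, scratchpad, note):
        """Log a tool call and share a note with collaborating agents."""
        if self.status != "running":
            raise RuntimeError(f"task {self.task_id} is {self.status}")
        self.audit_log.append({"tool": tool, "note": note})
        scratchpad.append((self.task_id, note))

    def kill(self):
        """Stop the task; further tool calls are refused."""
        self.status = "killed"

    def resume(self):
        """Allow a stopped task to continue under the same task ID."""
        self.status = "running"
```

Because every call goes through `call_tool`, the audit log and the scratchpad stay in step, and a killed task keeps its ID and history so it can be inspected or resumed later.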
Everything, visible.
Memories, users, files, agents, routing, settings, tunnels, logs — one browser tab, running locally on your machine. Click the sidebar below. It actually works.
Thirty-four
new modules.
One release.
Hybrid cloud routing. Autonomous subagents. Semantic memory and document RAG. Image generation. And a real-time view of the whole pipeline.
Hybrid cloud + local routing
Small talk hits your local LLM. Hard problems go to Claude or GPT. OAuth means no raw API keys. Per-request cost tracking means no surprise bills. Your memory and files stay on your box regardless.
Real-time brain
Watch every memory injection, agent step, embedding score, and cloud token tick by in a live blueprint view.
SubAgents
Spawn autonomous agents for multi-step work with scoped memory, persistent task IDs, and a shared scratchpad.
Semantic memory & RAG
Bundled local embeddings catch paraphrases, not just keywords. Uploaded docs auto-inject with source attribution.
Conversation history
Every past chat, searchable by date, topic, or content. Exportable. Time-grouped. That thing from three weeks ago? Findable.
Prompt enhancement
A local model expands your short prompts into detailed specs before they hit the cloud. Same question, better answer.
Image generation
Ask for an image. UPtrim routes to your local Stable Diffusion, drops it inline, keeps a gallery. No third parties.
Knowledge graph
Memories as connected nodes. The LLM sees that Sarah leads Project Atlas — not just that both exist.
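As a rough illustration of why connected nodes help, the sketch below stores memories as relation triples and renders the edges around a node as sentences a model could read. The `KnowledgeGraph` class and its methods are hypothetical, not UPtrim's API.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph of (subject, relation, object) triples."""

    def __init__(self):
        # node -> list of (relation, other_node, node_is_subject)
        self._edges = defaultdict(list)

    def add(self, subject, relation, obj):
        # Index the edge from both ends so either node can find it.
        self._edges[subject].append((relation, obj, True))
        self._edges[obj].append((relation, subject, False))

    def context_for(self, node):
        """Render every edge touching `node` as a plain sentence."""
        lines = []
        for relation, other, node_is_subject in self._edges.get(node, []):
            if node_is_subject:
                lines.append(f"{node} {relation} {other}")
            else:
                lines.append(f"{other} {relation} {node}")
        return lines
```

After `graph.add("Sarah", "leads", "Project Atlas")`, asking for context on either "Sarah" or "Project Atlas" yields the sentence "Sarah leads Project Atlas", so the relationship travels with whichever node a prompt mentions.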
TrimScript plugins
Write .trim scripts that extract, inject, filter, and react. Hot-reload. Visual blueprint builder.
Six new features just landed.
v2.0 keeps shipping. Here's what came down recently — the meatiest additions since the last preview.
LLM Conductor
Decomposes every turn into typed subtasks, routes each to the best brain, runs them in parallel, assembles. Per-turn ledger logs cost + latency for every subtask.
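The route, run-in-parallel, assemble, and ledger steps can be sketched with `asyncio`. This is an assumption-based illustration: it takes already-decomposed subtasks as input, the handlers are stubs standing in for real model calls, and names such as `conduct` and `HANDLERS` are hypothetical.

```python
import asyncio
import time


async def local_stub(prompt):
    # Stand-in for a call to a local model.
    return f"[local] {prompt}"


async def cloud_stub(prompt):
    # Stand-in for a call to a cloud model.
    return f"[cloud] {prompt}"


# Hypothetical routing table: subtask kind -> (handler, cost per call in USD).
HANDLERS = {
    "chat": (local_stub, 0.0),
    "reasoning": (cloud_stub, 0.05),
}


async def run_subtask(kind, prompt, handlers, ledger):
    """Run one subtask on its routed handler and record cost and latency."""
    handler, cost = handlers[kind]
    start = time.perf_counter()
    result = await handler(prompt)
    ledger.append(
        {"kind": kind, "cost": cost, "latency_s": time.perf_counter() - start}
    )
    return result


async def conduct(subtasks, handlers):
    """Run all subtasks concurrently, then join results in input order."""
    ledger = []
    results = await asyncio.gather(
        *(run_subtask(kind, prompt, handlers, ledger) for kind, prompt in subtasks)
    )
    return " ".join(results), ledger
```

`asyncio.gather` returns results in the order the subtasks were submitted, so assembly stays deterministic even when a slow cloud call finishes last. Ledger entries, by contrast, are appended in completion order.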
Internal Parliament
5–8 topical personas auto-clustered from your memory. Debate in the background on idle GPU. Their transcript is decision-support context when you ask hard questions.
Codex Offload
Run OpenAI Codex as a subprocess from inside UPtrim, with full memory passed through. Pairs with Claude Code, so you pick your coding agent and keep your context.
Ghost Inbox
Unified feed for every async result, dream memory, search hit, and background task UPtrim ran for you. Like email for your AI's background work.
Capability Matrix v2
Live per-model scoring on cost, latency, quality, context window. The Conductor reads it — you see exactly why every call went where it did.
Context Compression
Long chats roll into running summaries instead of getting truncated. Message #287 still remembers what happened on #14 — nothing falls off the back of the context window.
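A rolling summary can be sketched in a few lines: once the chat grows past a window, the oldest messages are folded into a running summary rather than dropped. The function names and the placeholder summarizer below are assumptions for illustration; a real system would presumably ask a model to write the summary.

```python
def compress(messages, summary, keep_last, summarize):
    """Fold all but the newest `keep_last` messages into the running summary.

    Returns (new_summary, recent_messages). Nothing is discarded: overflow
    messages are handed to `summarize` together with the previous summary.
    """
    cut = len(messages) - keep_last
    if cut <= 0:
        return summary, messages
    overflow, recent = messages[:cut], messages[cut:]
    return summarize(summary, overflow), recent


def naive_summarize(summary, overflow):
    # Placeholder summarizer: keeps the first 40 characters of each message.
    notes = [message[:40] for message in overflow]
    return "; ".join(([summary] if summary else []) + notes)
```

Each turn, the prompt sent to the model would be the summary followed by the recent messages, which is how a late message can still draw on what was said much earlier.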
Start local.
Upgrade into hybrid.
Monthly billing. No annual lock-in. The free tier gives any local LLM a proper memory. Paid tiers unlock Ghost Agent, hybrid cloud routing, subagents, and the visual knowledge graph, and the limits rise as you move up.
- 5,000 persistent memories
- File upload & RAG
- Multi-backend routing
- Basic knowledge graph
- Agent mode & local image gen
- Secret Shield (key redaction)
- Ghost Agent · background enrichment
- Memory Lifecycle (active/fading/dormant)
- Persona Engine + intent classification
- Semantic recall + conversation branching
- Brain dashboard + remote HTTPS
- Parental controls · encrypted backup
- Hybrid cloud+local routing
- Ghost Pre-Search + sub-agent swarm
- Ghost Inbox · Capability Matrix v2 (new)
- Context Compression · rolling summaries (new)
- Prompt enhancement + history search
- Cost tracker · aliases · 100K facts
- LLM Conductor · Internal Parliament (new)
- Visual Knowledge Graph explorer
- Claude Code + Codex offload (new)
- n8n + MCP · Ghost Mesh
- Staleness UI + Ambient Task Tracker
- Unlimited everything · SLA
- Everything in Premium · for the whole office
- Central admin console · group permissions
- SSO-ready (SAML / OIDC)
- Hardened audit trail (SIEM export)
- Encrypted backups to your S3 / R2
- 99.5% uptime SLA · 24h response
- 2+ instances minimum · HA + failover
- On-prem & air-gapped deployments
- Multi-proxy federation across regions
- BYO KMS / HSM · data-residency rules
- SOC 2 / HIPAA / GDPR mappings
- 99.9% uptime SLA · dedicated CSM
Install it.
Point your chat at it.
Done.
No accounts. No credit card. No rewrites. Drop UPtrim on your machine, change one URL in your chat app, and start building a proper memory.