I've been building local agents and found debugging the RAG retrieval step frustrating. I often couldn't tell why the LLM was pulling specific context chunks, and console-logging raw vector arrays didn't help.
I built this tool to act as a standalone 'memory server' sitting on top of PostgreSQL with the pgvector extension. I wanted to avoid managing separate specialized vector DBs for smaller projects.
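To give a feel for what "sitting on top of pgvector" means in practice, here is a minimal sketch of a top-k similarity query issued through Prisma's raw-query escape hatch. The table name (`memory_chunks`), column names, and the use of `$queryRaw` are assumptions for illustration, not the tool's actual schema or code.

```typescript
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

interface MemoryChunk {
  id: number;
  content: string;
  similarity: number;
}

// Retrieve the k chunks closest to the query embedding using pgvector's
// cosine-distance operator (<=>); 1 - distance gives a similarity score.
async function retrieveChunks(queryEmbedding: number[], k = 5): Promise<MemoryChunk[]> {
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  return prisma.$queryRaw<MemoryChunk[]>`
    SELECT id, content, 1 - (embedding <=> ${vectorLiteral}::vector) AS similarity
    FROM memory_chunks
    ORDER BY embedding <=> ${vectorLiteral}::vector
    LIMIT ${k}
  `;
}
```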
The main feature is the visualizer dashboard. It shows the retrieval process in real time, displaying the raw chunks, their similarity scores, and how 'recency decay' influences the final ranking.
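As a rough sketch of how recency decay can feed into a final ranking: combine the similarity score with an age-based decay term and sort on the blended value. The exponential half-life form and the 0.7/0.3 weighting below are illustrative assumptions, not the tool's actual formula.

```typescript
interface ScoredChunk {
  content: string;
  similarity: number; // cosine similarity from the vector search, roughly 0..1
  createdAt: Date;
}

// Blend similarity with a recency term that halves every `halfLifeDays`.
function finalScore(chunk: ScoredChunk, halfLifeDays = 30): number {
  const ageDays = (Date.now() - chunk.createdAt.getTime()) / 86_400_000;
  const recency = Math.pow(0.5, ageDays / halfLifeDays);
  return 0.7 * chunk.similarity + 0.3 * recency;
}

function rank(chunks: ScoredChunk[]): ScoredChunk[] {
  return [...chunks].sort((a, b) => finalScore(b) - finalScore(a));
}
```

The dashboard's value is showing both inputs side by side, so you can see when an older, highly similar chunk loses out to a fresher but weaker match.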
The backend is Node.js/TypeScript using Prisma. It runs via Docker Compose.
Current limitation: the default config relies on OpenAI for embedding generation. The next priority is adding local embedding support via Ollama bindings so the entire stack can run offline.
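One way the swap could be structured is behind a small provider interface, with OpenAI and Ollama as interchangeable backends. The interface and model names below are assumptions sketched for illustration; they are not the tool's current code, though the two HTTP endpoints shown are the providers' public embedding APIs.

```typescript
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}

// Hosted default: OpenAI's embeddings endpoint.
const openAIProvider: EmbeddingProvider = {
  async embed(text) {
    const res = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
    });
    const json = await res.json();
    return json.data[0].embedding;
  },
};

// Planned offline path: a local Ollama instance serving an embedding model.
const ollamaProvider: EmbeddingProvider = {
  async embed(text) {
    const res = await fetch("http://localhost:11434/api/embeddings", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
    });
    const json = await res.json();
    return json.embedding;
  },
};
```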
The code is MIT licensed.