About LocalLens
LocalLens is an open-source reference implementation of retrieval-augmented generation (RAG) that runs entirely on the user's machine. Point it at any folder — personal notes, a codebase, a research dump — and it builds a private brain you can chat with. Every answer carries citations linking back to the exact source chunk, and no document, query, or response is ever sent to a third-party service.
Why it exists
Cloud RAG services hold three things that should stay with the user: the documents being indexed, the queries being asked, and the model responses being generated. Each of those is a leak surface. LocalLens is a working argument that the local-first trade — a smaller model and a single-user setup in exchange for none of your data leaving the machine — is the right trade for personal knowledge bases, regulated environments, and air-gapped research. It is also small enough to teach: the entire reference implementation is eight TypeScript files in src/, designed to be read in an afternoon.
How it works
File discovery, chunking, embedding, vector retrieval, prompt assembly, and completion all run in-process via the QVAC SDK from Tether. Chat uses Qwen3 1.7B (Q4-quantized), with a 600M-parameter fallback for slimmer hardware. Embeddings use GTE-Large in FP16. Brains and chunks persist as plain JSON at .locallens/store.json, which makes them easy to inspect, back up, and diff with git. Two entry points sit on top of the same core: a Bun CLI for fast iteration, and a Bun.serve HTTP server that backs a browser UI.
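The retrieval step in that pipeline can be illustrated with a minimal sketch. It assumes a brute-force scan ranked by cosine similarity, which is a common choice for small single-user stores; the source does not state which metric or index LocalLens actually uses. The `Chunk` shape, `cosineSimilarity`, and `topK` are hypothetical names for illustration, not the QVAC SDK API or the real store.json schema.

```typescript
// Hypothetical chunk record; the real .locallens/store.json schema may differ.
interface Chunk {
  id: string;
  source: string; // path of the file the chunk was cut from
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query embedding and keep the k best.
function topK(
  query: number[],
  chunks: Chunk[],
  k: number,
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional embeddings stand in for real model output.
const demoChunks: Chunk[] = [
  { id: "a", source: "notes/a.md", text: "alpha", embedding: [1, 0, 0] },
  { id: "b", source: "notes/b.md", text: "beta", embedding: [0, 1, 0] },
  { id: "c", source: "notes/c.md", text: "gamma", embedding: [0.9, 0.1, 0] },
];
const demoHits = topK([1, 0, 0], demoChunks, 2);
console.log(demoHits.map((h) => h.chunk.id).join(", ")); // prints "a, c"
```

A linear scan like this keeps the store a plain array that can round-trip through JSON unchanged. It is adequate for a personal folder but becomes slow as the chunk count grows.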
Who builds it
LocalLens is maintained by Marcus Souza (@souzavinny) and a small group of contributors. The application repository lives at github.com/souzavinny/locallens and the documentation site at github.com/souzavinny/locallens-docs. Both are released under the MIT license. Contributions — translations, extension recipes, architecture notes, bug reports — are welcome through GitHub issues and pull requests.
Principles
- Files never leave the machine. QVAC runs models locally; ingestion and search happen in-process. No upload, no telemetry, no third-party API key.
- Answers stay grounded. The prompt builder forces the model to cite retrieved chunks or to admit it does not know. Hallucinated paths are a regression bug, not an acceptable trade.
- Cloud-optional. Drop the network and LocalLens still works. Add a tunnel later if you want to share a brain.
- Small enough to read. Eight TypeScript files in src/, split by responsibility. The whole codebase fits in an afternoon.
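The "answers stay grounded" principle can be sketched as a prompt builder that numbers each retrieved chunk and tells the model to cite by number or decline. This is an illustration under assumptions: the `RetrievedChunk` shape, the `buildGroundedPrompt` name, and the instruction wording are all hypothetical and are not taken from the LocalLens source.

```typescript
// Hypothetical shape for a chunk returned by retrieval.
interface RetrievedChunk {
  source: string; // file path shown to the user as the citation target
  text: string;
}

// Assemble a prompt that restricts the model to the numbered sources.
function buildGroundedPrompt(question: string, chunks: RetrievedChunk[]): string {
  // Number the chunks so the model can cite them as [1], [2], ...
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source})\n${c.text}`)
    .join("\n\n");

  return [
    "Answer using only the numbered sources below.",
    "Cite each claim with its source number, for example [1].",
    'If the sources do not contain the answer, reply "I don\'t know."',
    "",
    "Sources:",
    context || "(none retrieved)",
    "",
    `Question: ${question}`,
  ].join("\n");
}

const demoPrompt = buildGroundedPrompt("Where is the store kept?", [
  { source: "docs/storage.md", text: "Brains persist at .locallens/store.json." },
]);
console.log(demoPrompt);
```

Because the citation numbers map back to file paths on the caller's side, a response can be checked after generation: any bracketed number outside the range of supplied chunks points to an ungrounded answer.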
Where to go next
Read the overview, then walk through the architecture and the build-from-scratch walkthrough. If you prefer a hands-on start, jump to the setup guide and run a sample brain in a terminal.