About LocalLens

LocalLens is an open-source reference implementation of retrieval-augmented generation (RAG) that runs entirely on the user's machine. Point it at any folder — personal notes, a codebase, a research dump — and it builds a private brain you can chat with. Every answer carries citations linking back to the exact source chunk, and no document, query, or response is ever sent to a third-party service.

Why it exists

Cloud RAG services hold three things that should stay with the user: the documents being indexed, the queries being asked, and the model responses being generated. Each of those is a leak surface. LocalLens is a working argument that the local-first trade — a smaller model and a single-user setup in exchange for none of your data leaving the machine — is the right trade for personal knowledge bases, regulated environments, and air-gapped research. It is also small enough to teach: the entire reference implementation is eight TypeScript files in src/, designed to be read in an afternoon.

How it works

File discovery, chunking, embedding, vector retrieval, prompt assembly, and completion all run in-process via the QVAC SDK from Tether. Chat uses Qwen3 1.7B (Q4-quantized), with a 600M fallback for slimmer hardware. Embeddings use GTE-Large FP16. Brains and chunks persist as plain JSON at .locallens/store.json, which makes them easy to inspect, back up, and diff with git. Two entry points sit on top of the same core: a Bun CLI for fast iteration, and a Bun.serve HTTP server that backs the browser UI.
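The chunking, retrieval, and prompt-assembly steps described above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not LocalLens's actual code: the Chunk shape, the function names (chunkText, cosine, topK, buildPrompt), and the chunk sizes are all hypothetical, and the two-dimensional vectors are toy stand-ins for real GTE-Large embeddings. No QVAC SDK calls appear here because this page does not describe that API.

```typescript
// Hypothetical shapes for illustration; not the real store.json schema.
interface TextChunk {
  id: string;     // "<source path>#<chunk index>", used as the citation target
  source: string; // file the chunk was cut from
  text: string;
}
interface Chunk extends TextChunk {
  embedding: number[];
}

// Split text into fixed-size character windows that overlap slightly,
// so a sentence cut at a boundary still appears whole in one chunk.
function chunkText(source: string, text: string, size = 200, overlap = 40): TextChunk[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: TextChunk[] = [];
  const step = size - overlap;
  for (let start = 0, i = 0; start < text.length; start += step, i++) {
    chunks.push({ id: `${source}#${i}`, source, text: text.slice(start, start + size) });
    if (start + size >= text.length) break; // this window already reached the end
  }
  return chunks;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force retrieval: score every chunk, keep the k most similar.
function topK(query: number[], chunks: Chunk[], k = 3): { chunk: Chunk; score: number }[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Assemble a prompt whose numbered context entries map back to chunk ids,
// so the model can cite [n] and the UI can resolve it to a source chunk.
function buildPrompt(question: string, hits: { chunk: Chunk }[]): string {
  const context = hits
    .map((h, i) => `[${i + 1}] (${h.chunk.id})\n${h.chunk.text}`)
    .join("\n\n");
  return `Answer using only the context below and cite sources as [n].\n\n${context}\n\nQuestion: ${question}`;
}

// Usage with toy 2-D embeddings (a real embedding model returns far more dimensions).
const store: Chunk[] = [
  { id: "notes/bun.md#0", source: "notes/bun.md", text: "Bun.serve starts an HTTP server.", embedding: [0.9, 0.1] },
  { id: "notes/git.md#0", source: "notes/git.md", text: "git diff shows changes between commits.", embedding: [0.1, 0.9] },
  { id: "notes/bun.md#1", source: "notes/bun.md", text: "The CLI entry point runs under Bun.", embedding: [0.7, 0.3] },
];
const hits = topK([1, 0], store, 2);
const prompt = buildPrompt("How do I start the server?", hits);
console.log(prompt);
```

A linear scan like this is usually adequate for a single-user store held in one JSON file; an approximate nearest-neighbour index only becomes worth its complexity at much larger chunk counts.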

Who builds it

LocalLens is maintained by Marcus Souza (@souzavinny) and a small group of contributors. The application repository lives at github.com/souzavinny/locallens and the documentation site at github.com/souzavinny/locallens-docs. Both are released under the MIT license. Contributions — translations, extension recipes, architecture notes, bug reports — are welcome through GitHub issues and pull requests.

Principles

Where to go next

Read the overview, then move on to the architecture and the build-from-scratch walkthrough. If you prefer a hands-on start, jump to the setup guide and run a sample brain in a terminal.