High-level pipeline

LocalLens takes a folder and a question and returns an answer with citations. Everything between those two endpoints is a small, fixed pipeline.

Indexing pipeline

Indexing runs once per brain, when you create it.

The steps in code:

`discoverTextDocuments(rootPath)` (`src/files.ts`) walks the folder, filters out everything that isn't a safe text file, and returns `LocalDocument[]`.
`chunkDocuments(documents, { brainId })` (`src/rag.ts`) calls `ragChunk` to split each document into ~220-token chunks with 40-token overlap, then returns `TextChunk[]`.
`QvacGateway.ingestChunks(workspace, chunks)` (`src/qvac.ts`) embeds each chunk with `GTE_LARGE_FP16` and writes them into a named QVAC workspace.
`LocalLensStore.saveBrain` and `saveChunks` (`src/store.ts`) persist the brain and its chunks to `.locallens/store.json`.

The first three steps turn the input folder into an embedded QVAC workspace. The fourth persists a record of the brain, so the next time you open the app you can ask questions without re-indexing.
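To make the "~220-token chunks with 40-token overlap" step concrete, here is a minimal sliding-window sketch. It is not the `ragChunk` implementation: it assumes whitespace-separated words stand in for tokens, and the `LocalDocument` and `TextChunk` field names, as well as the `chunkBySlidingWindow` helper itself, are hypothetical.

```typescript
// Hypothetical shapes; the real LocalDocument and TextChunk types may differ.
interface LocalDocument {
  path: string;
  text: string;
}

interface TextChunk {
  brainId: string;
  source: string;
  index: number;
  text: string;
}

// Assumption: whitespace-separated words approximate tokens.
// The real tokenizer used by ragChunk is not shown in this page.
function chunkBySlidingWindow(
  documents: LocalDocument[],
  brainId: string,
  chunkSize = 220,
  overlap = 40,
): TextChunk[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  // Each new chunk starts this many tokens after the previous one.
  const stride = chunkSize - overlap;
  const chunks: TextChunk[] = [];

  for (const doc of documents) {
    const tokens = doc.text.split(/\s+/).filter((t) => t.length > 0);
    let index = 0;
    for (let start = 0; start < tokens.length; start += stride) {
      const windowTokens = tokens.slice(start, start + chunkSize);
      chunks.push({
        brainId,
        source: doc.path,
        index,
        text: windowTokens.join(" "),
      });
      index += 1;
      // Stop once this window has reached the end of the document.
      if (start + chunkSize >= tokens.length) break;
    }
  }
  return chunks;
}
```

With these defaults, consecutive chunks share their last and first 40 words, so a sentence that straddles a chunk boundary still appears whole in at least one chunk.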

Question pipeline

Every chat round-trip walks the same four-step path.

In code:

`QvacGateway.search(workspace, question, 5)` (`src/qvac.ts`) embeds the question and runs `ragSearch`, returning the top 5 `SearchHit[]`.
`buildGroundedHistory(question, hits)` (`src/rag.ts`) produces a two-message `ChatMessage[]` containing the system rules and the numbered excerpts.
`QvacGateway.answer(history)` (`src/qvac.ts`) calls QVAC `completion()` with `stream: true` and yields each `contentDelta` as an `AsyncGenerator<string>`.
`LocalLensApp.askBrain` (`src/locallens.ts`) accumulates the stream and returns `{ answer, citations }`.
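The prompt-building and stream-accumulation steps can be sketched roughly as follows. This is an illustration under assumptions, not the code in `src/rag.ts` or `src/locallens.ts`: the `SearchHit` and `ChatMessage` fields, the wording of the system rules, the placement of the excerpts in the second message, the shape of `citations`, and the `fakeAnswer` stand-in for the streaming model call are all hypothetical.

```typescript
// Hypothetical shapes; the real SearchHit and ChatMessage types may differ.
interface SearchHit {
  source: string;
  text: string;
  score: number;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds a two-message history: rules first, then numbered excerpts plus the question.
function buildGroundedHistorySketch(
  question: string,
  hits: SearchHit[],
): ChatMessage[] {
  const excerpts = hits
    .map((hit, i) => `[${i + 1}] (${hit.source}) ${hit.text}`)
    .join("\n");
  return [
    {
      role: "system",
      content:
        "Answer only from the numbered excerpts. Cite them as [n]. " +
        "If they do not contain the answer, say so.",
    },
    {
      role: "user",
      content: `Excerpts:\n${excerpts}\n\nQuestion: ${question}`,
    },
  ];
}

// Accumulates streamed deltas into one answer and pairs it with the hit sources.
async function collectAnswer(
  stream: AsyncIterable<string>,
  hits: SearchHit[],
): Promise<{ answer: string; citations: string[] }> {
  let answer = "";
  for await (const delta of stream) {
    answer += delta;
  }
  return { answer, citations: hits.map((hit) => hit.source) };
}

// Stand-in for the streaming model call; it ignores the history and yields fixed deltas.
async function* fakeAnswer(_history: ChatMessage[]): AsyncGenerator<string> {
  yield "Invoices are due ";
  yield "in 30 days [1].";
}
```

Because the history is a pure function of the question and the hits, it can be inspected or unit-tested without loading any model.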

Where each step lives

Stage	Module	Key call
File discovery	`src/files.ts`	`discoverTextDocuments`
Chunking	`src/rag.ts`	`chunkDocuments` → `ragChunk`
Embedding & ingest	`src/qvac.ts`	`ingestChunks` → `ragIngest`
Persistence	`src/store.ts`	`saveBrain`, `saveChunks`
Retrieval	`src/qvac.ts`	`search` → `ragSearch`
Prompt	`src/rag.ts`	`buildGroundedHistory`
Completion	`src/qvac.ts`	`answer` → `completion`
Workflow	`src/locallens.ts`	`LocalLensApp.askBrain`

What the pipeline is not

No re-ingestion on the question path. Asking never touches the file system or the JSON store. It only reads from the QVAC workspace.
No second chat-model call per question. Each question costs one embedding call for the search step and one round-trip to the chat model, nothing more.
No conversational state. Each question is an independent retrieval round. The chat history is rebuilt from scratch every time, grounded in whatever search returned for this question.

That last property keeps follow-up answers grounded. If you want multi-turn dialogue, build it above buildGroundedHistory, not inside it. The RAG vs LLM page explains why.
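One way to layer multi-turn dialogue above the single-question path is to keep the turns in a wrapper and fold them into a standalone question before each call. The sketch below is only an illustration: `createConversation`, `toStandaloneQuestion`, and the `AskOnce` callback (standing in for something like `askBrain`) are hypothetical names, and the folding strategy is a deliberately naive assumption.

```typescript
interface Turn {
  question: string;
  answer: string;
}

interface Answer {
  answer: string;
  citations: string[];
}

// Stand-in for the single-question pipeline (search, prompt, completion).
type AskOnce = (question: string) => Promise<Answer>;

// Hypothetical helper: prefixes the follow-up with the most recent questions
// so that retrieval sees what "it" or "that" refers to.
function toStandaloneQuestion(
  turns: Turn[],
  followUp: string,
  maxTurns = 2,
): string {
  const recent = turns.slice(-maxTurns);
  if (recent.length === 0) return followUp;
  const context = recent
    .map((turn) => `Earlier question: ${turn.question}`)
    .join("\n");
  return `${context}\nFollow-up: ${followUp}`;
}

// Conversation state lives here, outside the grounded single-turn path.
function createConversation(askOnce: AskOnce) {
  const turns: Turn[] = [];
  return {
    async ask(followUp: string): Promise<Answer> {
      const question = toStandaloneQuestion(turns, followUp);
      const result = await askOnce(question);
      turns.push({ question: followUp, answer: result.answer });
      return result;
    },
  };
}
```

In this arrangement every turn still triggers its own retrieval, so each answer stays grounded in excerpts fetched for that turn rather than in earlier model output.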

Next

Request flow: a single question, traced end to end.
RAG vs LLM: what each side does and why they stay apart.
Why QVAC: the SDK surface in use, and what we'd be writing without it.
Source layout: where the code lives.
