Matrix extraction: filling research tables with local AI

A Matrix is a table where rows are sources and columns are extraction prompts. When you click Extract, the engine fills every empty or stale cell by sending a tightly scoped prompt to a local LLM and getting back structured JSON.

Overview

One cell at a time.

Matrix extraction is built around a simple promise: every cell value is grounded in a verbatim quote from a real page of the cited paper. To deliver that, the engine sequences each cell carefully — local retrieval, schema-enforced LLM call, robust parsing — and isolates failures so one bad cell never kills the run.

Setting	Value	Role
Chunks per cell	15	Source-scoped retrieval
Temperature	0.1	Near-deterministic
Max tokens	800	Output budget
Watchdog	300s	Hung-run timeout

Pipeline

Step by step.

1
Queue build
The view model walks every (source × column) pair and queues a cell if:
- the cell is empty, or
- it has an error from the previous run, or
- the column's prompt has changed since the cell was last extracted (detected via a stored prompt hash).
User edits are authoritative
Cells you have edited manually (isUserEdited == true) are always skipped. Your corrections are never overwritten.
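The queueing rules above can be sketched in a few lines. This is an illustrative Python sketch, not the app's code: the `Cell` fields, the `hash_prompt` helper, and the use of SHA-256 are assumptions made for the example; only the three queueing conditions and the `isUserEdited` skip come from the description.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    value: Optional[str] = None        # None means the cell is empty
    error: Optional[str] = None        # set when the previous run failed
    prompt_hash: Optional[str] = None  # hash of the prompt used at extraction time
    is_user_edited: bool = False


def hash_prompt(prompt: str) -> str:
    # Assumption: SHA-256 of the UTF-8 prompt text; the real hash is not documented.
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def needs_extraction(cell: Cell, column_prompt: str) -> bool:
    if cell.is_user_edited:
        return False                   # user edits are authoritative
    if cell.value is None:
        return True                    # empty cell
    if cell.error is not None:
        return True                    # failed on the previous run
    return cell.prompt_hash != hash_prompt(column_prompt)  # prompt changed


def build_queue(sources, columns, cells):
    """Return (source_id, column_name) pairs to extract, row by row."""
    queue = []
    for source_id in sources:
        for name, prompt in columns.items():
            cell = cells.get((source_id, name), Cell())
            if needs_extraction(cell, prompt):
                queue.append((source_id, name))
    return queue
```

Storing a hash rather than the full prompt keeps the staleness check cheap: editing a column's prompt changes the hash, so every cell in that column is re-queued unless you edited it by hand.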
2
Sequential per-cell run
Cells run one at a time, not in parallel. This keeps memory pressure on the local LLM low and lets you watch the matrix fill in row by row.
3
Model resolution
For each cell, the engine resolves which local Gemma 4 variant to use:
- First checks for a per-feature override (set in AI Settings → Per-feature Model → Matrix).
- Falls back to the global default model.
- Falls back to the first installed model if neither is set.
This lets you pair a smaller Gemma variant with Matrix (faster extraction across many cells) and a larger one with Evidence Scan (more careful reasoning on individual claims) — all on the same machine, all local.
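The fallback chain is simple enough to show directly. A minimal Python sketch, assuming model IDs are plain strings and that overrides live in a dictionary keyed by feature name; `resolve_model` and its parameters are hypothetical names, not the app's API.

```python
from typing import Optional, Sequence


def resolve_model(
    feature: str,
    overrides: dict[str, str],
    global_default: Optional[str],
    installed: Sequence[str],
) -> Optional[str]:
    """Pick a model ID: per-feature override, then global default, then first installed."""
    override = overrides.get(feature)
    if override:
        return override
    if global_default:
        return global_default
    return installed[0] if installed else None
```

For example, `resolve_model("matrix", {"matrix": "small"}, "large", ["large", "small"])` picks the Matrix override, while a feature with no override falls through to the global default. Whether the real engine also checks that an override is still installed is not stated, so the sketch does not.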
4
Source-scoped retrieval
The semantic search service is asked for the top 15 chunks from this source that match the column's prompt. This is the critical design choice: retrieval is per-source, not project-wide. Without it the LLM would frequently see chunks from the wrong paper.
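To make the scoping concrete, here is a rough Python sketch that filters to one source before ranking. It assumes each chunk carries a source ID and an embedding, and it uses plain cosine similarity as a stand-in for whatever ranking the semantic search service actually performs; `Chunk`, `cosine`, and `top_chunks_for_source` are illustrative names.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source_id: str
    page_index: int          # 0-indexed page
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks_for_source(chunks, source_id, query_embedding, k=15):
    """Filter to one source first, then rank by similarity and keep the top k."""
    scoped = [c for c in chunks if c.source_id == source_id]
    scoped.sort(key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return scoped[:k]
```

Filtering before ranking is the point: a highly similar chunk from another paper can never displace a weaker match from the paper the cell belongs to.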
5
Prompt construction
The engine builds two prompts:
- A system prompt that mandates verbatim quotes, 1-indexed page numbers, and explicit "N/A" (with confidence 0) when the source does not contain the answer.
- A user prompt with the paper's citation header, the column name + instruction, an output-type hint (text, number, list), and the 15 retrieved excerpts numbered with their page labels.
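A sketch of how the two prompts might be assembled. The exact wording and layout of the real prompts are not published, so the strings below are assumptions; only the required elements (verbatim quotes, 1-indexed pages, "N/A" with confidence 0, citation header, column name and instruction, output-type hint, numbered excerpts with page labels) come from the description.

```python
SYSTEM_PROMPT = (
    "Answer only from the provided excerpts. "
    "Quote the source verbatim in source_quote. "
    "Report source_page as a 1-indexed page number. "
    'If the excerpts do not contain the answer, return value "N/A" with confidence 0.'
)


def build_user_prompt(citation, column_name, instruction, output_type, excerpts):
    """excerpts is a list of (page_label, text) pairs, already ranked by retrieval."""
    lines = [
        f"Paper: {citation}",
        f"Column: {column_name}",
        f"Instruction: {instruction}",
        f"Output type: {output_type}",
        "",
        "Excerpts:",
    ]
    for number, (page_label, text) in enumerate(excerpts, start=1):
        lines.append(f"[{number}] (p. {page_label}) {text}")
    return "\n".join(lines)
```

Numbering the excerpts and printing their page labels gives the model what it needs to fill `source_page` without guessing.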

6
JSON-schema-enforced call

The local LLM is invoked via the bundled llama-cli binary with:

--temp 0.1          # near-deterministic output
-n 800              # token budget
-c 16384            # context window
--json-schema       # value, source_quote, source_page, confidence

A 300-second watchdog terminates hung processes.
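As a hedged illustration, the invocation could be driven from Python like this. The four schema field names come from the flag list above, but their types are guesses. The binary path, model path, and the helper names `build_argv` and `run_cell` are hypothetical. The sketch passes only a single prompt with `-p`, because how the system prompt reaches llama-cli is not described.

```python
import json
import subprocess

# Assumed schema: field names are from the flag list; the types are guesses.
CELL_SCHEMA = {
    "type": "object",
    "properties": {
        "value": {"type": "string"},
        "source_quote": {"type": "string"},
        "source_page": {"type": "integer"},
        "confidence": {"type": "number"},
    },
    "required": ["value", "source_quote", "source_page", "confidence"],
}


def build_argv(binary, model_path, prompt):
    return [
        binary,
        "-m", model_path,
        "-p", prompt,
        "--temp", "0.1",     # near-deterministic output
        "-n", "800",         # token budget
        "-c", "16384",       # context window
        "--json-schema", json.dumps(CELL_SCHEMA),
    ]


def run_cell(binary, model_path, prompt, timeout_s=300):
    """Run one extraction. On timeout, subprocess.run kills the child and raises TimeoutExpired."""
    result = subprocess.run(
        build_argv(binary, model_path, prompt),
        capture_output=True, text=True, timeout=timeout_s,
    )
    return result.stdout
```

Here the `timeout` argument plays the role of the watchdog: a hung process is killed after 300 seconds and surfaces as an exception the caller can record against that one cell.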

7
Robust JSON parsing
llama-cli appends chat-template tokens after the JSON body, so the engine walks the output character by character, tracking brace depth and string state, and slices out the first balanced JSON object. Braces inside quoted strings are ignored, and anything after the closing brace is discarded.
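The scanning technique can be sketched as follows. This is a minimal Python version of a brace-depth scanner that tracks string and escape state; the function name `first_json_object` is made up for the example, and the real parser may handle more edge cases.

```python
import json


def first_json_object(output: str):
    """Return the first balanced {...} object in output, parsed as JSON, or None."""
    start = None          # index of the opening brace of the candidate object
    depth = 0
    in_string = False
    escaped = False
    for i, ch in enumerate(output):
        if start is None:
            if ch == "{":
                start, depth = i, 1
            continue
        if in_string:
            if escaped:
                escaped = False        # this character was escaped, skip it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                try:
                    return json.loads(output[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None
```

Tracking string state matters because a quote such as `"n = {120}"` contains braces that must not change the depth count. Trailing template tokens are never reached, since the scan returns as soon as the first object closes.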
8
Cell merge
The parsed result is written back as a MatrixCell with the value, confidence, anchor (source quote + page index), model ID, prompt hash, and extraction timestamp. The cell is double-guarded so a user edit that came in mid-run cannot be overwritten.
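A sketch of the write-back, assuming a `MatrixCell` shaped like the description. The field names, the `Anchor` type, and `merge_result` are illustrative. The sketch shows only the write-time check for a user edit, which is one plausible half of the double guard (the other being the skip at queue time).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Anchor:
    quote: str
    page_index: int       # 0-indexed, unlike the model's 1-indexed source_page


@dataclass
class MatrixCell:
    value: Optional[str] = None
    confidence: float = 0.0
    anchor: Optional[Anchor] = None
    model_id: Optional[str] = None
    prompt_hash: Optional[str] = None
    extracted_at: Optional[datetime] = None
    is_user_edited: bool = False


def merge_result(current: MatrixCell, parsed: dict, model_id: str, prompt_hash: str) -> MatrixCell:
    # Re-check at write time: the user may have edited the cell while the model ran.
    if current.is_user_edited:
        return current
    return MatrixCell(
        value=parsed["value"],
        confidence=float(parsed["confidence"]),
        # Convert the 1-indexed page from the model to a 0-indexed anchor.
        anchor=Anchor(parsed["source_quote"], max(parsed["source_page"] - 1, 0)),
        model_id=model_id,
        prompt_hash=prompt_hash,
        extracted_at=datetime.now(timezone.utc),
    )
```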
9
Per-cell failure isolation
If one cell fails — timeout, malformed JSON, retrieval miss — only that cell shows an error indicator. The remaining cells continue. Cancel from the toolbar to stop the run mid-queue.
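The isolation pattern amounts to a sequential loop with a per-cell error boundary. A minimal Python sketch, where `extract` and `is_cancelled` are hypothetical callables supplied by the caller:

```python
def run_queue(queue, extract, is_cancelled=lambda: False):
    """Run cells one at a time; a failing cell records an error and the loop continues."""
    results, errors = {}, {}
    for key in queue:
        if is_cancelled():
            break                       # cancel stops the rest of the queue
        try:
            results[key] = extract(key)
        except Exception as exc:        # timeout, malformed JSON, retrieval miss
            errors[key] = str(exc)
    return results, errors
```

Because the `try` wraps a single cell rather than the whole loop, an exception costs exactly one cell. Cells with recorded errors are the ones the queue build picks up again on the next run.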

Library

Presets.

The Add Column menu ships with curated presets. Each is just a name, a prompt, and an output type; you can edit them or write your own.

Preset	What it asks for
Summary	Brief overall summary of the paper
Sample size	Number of participants or observations
Methodology	Study design, approach, instruments
Findings	Main quantitative or qualitative results
Limitations	Author-stated limitations or caveats
Theoretical framework	Underlying theory the paper builds on
Data source	Datasets, archives, or collection sites used

Output

What you get back.

Per cell:

- A short extracted value (the cell's display text).
- A verbatim source quote.
- A page number (1-indexed in the UI, 0-indexed internally on the anchor).
- A confidence score from 0 to 1.

Clicking a cell opens a detail modal with the quote rendered as a blockquote, and the anchor lets you jump straight to that page in the Reading Studio.

