Scanner

The scanner is zetl’s entry point for understanding a vault. It walks every Markdown file and standalone .spl file, extracting Wikilinks and Spindle Lisp (SPL) blocks in a single pass.

(given wikilink-extraction)
(given spl-extraction)

Wikilink extraction

The scanner recognises all common wikilink forms:

[[Page]] — basic link
[[Page|alias]] — aliased link
[[Page#heading]] — heading anchor
[[Page^block-id]] — block reference
![[Page]] — embed

Extracted links feed into the Link Graph for query and traversal.
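
The link forms listed above can be illustrated with a small regular-expression parser. This is a minimal sketch, not zetl's implementation (which is built on pulldown-cmark). The `Wikilink` type, the `WIKILINK_RE` pattern, and `parse_wikilinks` are hypothetical names introduced here for illustration only.

```python
import re
from typing import NamedTuple, Optional


class Wikilink(NamedTuple):
    target: str              # page name; empty for same-page links like [[#Syntax]]
    heading: Optional[str]   # text after "#", if any
    block: Optional[str]     # text after "^", if any
    alias: Optional[str]     # text after "|", if any
    embed: bool              # True when the link is prefixed with "!"


# Hypothetical pattern: optional "!", target, optional #heading,
# optional ^block, optional |alias, all inside [[ ... ]].
WIKILINK_RE = re.compile(
    r"(?P<embed>!)?\[\["
    r"(?P<target>[^\]\[|#^]*)"
    r"(?:#(?P<heading>[^\]\[|^]+))?"
    r"(?:\^(?P<block>[^\]\[|#]+))?"
    r"(?:\|(?P<alias>[^\]\[]+))?"
    r"\]\]"
)


def parse_wikilinks(text: str) -> list[Wikilink]:
    """Return every wikilink found in text, in document order."""
    links = []
    for m in WIKILINK_RE.finditer(text):
        target = m.group("target").strip()
        if not target and m.group("heading") is None:
            continue  # skip degenerate matches such as "[[]]"
        links.append(Wikilink(
            target=target,
            heading=m.group("heading"),
            block=m.group("block"),
            alias=m.group("alias"),
            embed=m.group("embed") is not None,
        ))
    return links
```

Calling `parse_wikilinks("See [[Page|alias]]")` yields a single link whose `target` is `Page` and whose `alias` is `alias`.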

Exclusion zones

The scanner skips wikilink-like patterns that appear inside:

Fenced code blocks (```)
Inline code (`)
HTML comments (<!-- -->)
YAML frontmatter (--- fences)

This prevents false positives from code examples or metadata.
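
One way to picture exclusion zones is to blank them out before looking for links. The sketch below is a regex approximation under simplifying assumptions (three-backtick fences only, single-backtick inline code, frontmatter only at the start of the file). The real scanner works on the pulldown-cmark parse instead, and the names `EXCLUDED`, `mask_exclusions`, and `find_links` are hypothetical.

```python
import re

# Hypothetical patterns for the four exclusion zones, tried in this order.
EXCLUDED = re.compile(
    r"\A---\n.*?\n---[ \t]*$"     # YAML frontmatter at the very start of the file
    r"|^`{3}.*?^`{3}[ \t]*$"      # fenced code block (three-backtick fences only)
    r"|<!--.*?-->"                # HTML comment
    r"|`[^`\n]+`",                # inline code
    re.DOTALL | re.MULTILINE,
)


def mask_exclusions(text: str) -> str:
    """Replace excluded zones with spaces, keeping newlines so offsets survive."""
    return EXCLUDED.sub(lambda m: re.sub(r"[^\n]", " ", m.group(0)), text)


def find_links(text: str) -> list[str]:
    """Return the inner text of each wikilink that lies outside an exclusion zone."""
    return re.findall(r"\[\[([^\]\[]+)\]\]", mask_exclusions(text))
```

Masking with spaces rather than deleting keeps character offsets and line numbers identical to the original file, which matters if links are later reported with their source position.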

SPL extraction

Fenced code blocks tagged ```spl are extracted verbatim, along with their source file and line number. Standalone .spl files anywhere in the vault are also picked up. Both feed into the Reasoning Engine with full Provenance.
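
As a rough illustration of verbatim extraction with provenance, the following sketch scans a Markdown string line by line and records each SPL block with its file and line. It is an assumption-laden stand-in for the real parser: `SplBlock` and `extract_spl_blocks` are hypothetical names, and recording the line of the opening fence (rather than the first line of the body) is a choice made here, not something the source specifies.

```python
from typing import NamedTuple

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fence


class SplBlock(NamedTuple):
    source: str  # path of the file the block came from
    line: int    # 1-based line of the opening fence (an assumption)
    text: str    # block body, verbatim


def extract_spl_blocks(markdown: str, source: str) -> list[SplBlock]:
    """Collect the body of every fenced block tagged spl, skipping other fences."""
    blocks: list[SplBlock] = []
    body = None        # lines collected for the current SPL block, if any
    start = 0
    in_other = False   # True while inside a fenced block that is not SPL
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if body is not None:
            if stripped == FENCE:
                blocks.append(SplBlock(source, start, "\n".join(body)))
                body = None
            else:
                body.append(line)
        elif in_other:
            in_other = stripped != FENCE
        elif stripped == FENCE + "spl":
            body, start = [], lineno
        elif stripped.startswith(FENCE):
            in_other = True
    return blocks
```

Tracking non-SPL fences separately prevents an `spl` tag that appears inside, say, a Python example from being mistaken for a real block.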

Merkle tree construction

During scanning, the scanner groups pulldown-cmark AST nodes into Merkle Tree leaf nodes (headings, paragraphs, code blocks, tables, lists, etc.) and computes a BLAKE3 hash for each. SPL blocks are hashed twice: a content hash over the raw text and an AST hash over the parsed form. Because the AST hash is unaffected by formatting-only edits, theory cache invalidation stays correct even when formatting changes. See architecture/Cache.
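
The difference between the two SPL hashes can be shown with a small sketch. Two substitutions are made here and should be read as assumptions: BLAKE2b from Python's standard library stands in for BLAKE3, and a token stream of parentheses and atoms stands in for the real SPL AST (it ignores string literals and comments, whose syntax is not described here). The function names are hypothetical.

```python
import hashlib
import re


def _digest(data: bytes) -> str:
    # Stand-in hash: BLAKE3 is not in the standard library, so BLAKE2b is used.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def content_hash(source: str) -> str:
    """Hash the raw text; any edit, including whitespace, changes the result."""
    return _digest(source.encode("utf-8"))


def ast_hash(source: str) -> str:
    """Hash the token stream; whitespace-only edits leave the result unchanged."""
    # For plain s-expressions the sequence of parens and atoms determines the tree.
    tokens = re.findall(r"[()]|[^\s()]+", source)
    return _digest("\x00".join(tokens).encode("utf-8"))
```

With this split, reindenting `(given spl-extraction)` changes `content_hash` but not `ast_hash`, so anything keyed on the AST hash stays valid.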

Implementation

The scanner is built on the pulldown-cmark crate for Markdown parsing and the ignore crate for .gitignore-aware file walking. It is deliberately read-only; see Local-first Design.

Performance

The scanner processes over 2,000 files per second on commodity hardware. Incremental scanning, backed by an mtime-based cache, reduces re-scan time by 99% in typical editing sessions. See Performance.
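
The idea behind an mtime-based cache can be sketched as follows: remember each file's modification time and re-scan only files whose time has changed. This is an illustration under stated assumptions, not zetl's cache format. `changed_files` is a hypothetical name, the `.md` extension is assumed for Markdown files, and `os.walk` does not honour .gitignore the way the real walker does.

```python
import os


def changed_files(root: str, cache: dict[str, int]) -> list[str]:
    """Return vault files whose mtime differs from the cache, updating the cache."""
    changed = []
    seen = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith((".md", ".spl")):
                continue  # only Markdown and standalone SPL files are scanned
            path = os.path.join(dirpath, name)
            seen.add(path)
            mtime = os.stat(path).st_mtime_ns
            if cache.get(path) != mtime:
                cache[path] = mtime
                changed.append(path)
    # Forget files that no longer exist so the cache does not grow without bound.
    for path in set(cache) - seen:
        del cache[path]
    return sorted(changed)
```

On the first call every file is reported; on later calls only files touched since the previous call are returned, which is what keeps an incremental re-scan small during an editing session.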

Leaf type	Source
Heading	`## Section Title`
Paragraph	Prose text blocks
SplBlock	```spl fenced code blocks
Code	Non-SPL fenced code blocks
Table	Markdown tables
List	Ordered/unordered lists
Blockquote	`>` block quotes
Frontmatter	YAML between `---` fences

Tag	Meaning
`+D`	Definitely provable — strict derivation, no defeating possible
`-D`	Definitely not provable
`+d`	Defeasibly provable — inferred, no active defeaters
`-d`	Defeasibly not provable — blocked or no derivation path

Command	Description
`links`	Forward links from a page, with configurable depth
`backlinks`	Pages linking to a target, with depth traversal
`path`	Shortest link path between any two pages
`export`	Full graph as JSON

Metric	Target	Measured (10k vault)
Cold-start full scan	< 200 ms	140 ms
Incremental re-index (1 file)	< 50 ms	8 ms
Graph query p99	< 10 ms	4 ms
Search (10k files)	< 2 sec	—
Reasoning (10k rules)	< 500 ms	—
TUI startup	< 200 ms	—
View startup	< 200 ms	—
Scroll frame time	< 16 ms	—
Merkle construction overhead	< 20%	—
Diff (2k pages, 50 changed)	< 500 ms	—
Watch event latency (p95)	< 500 ms	—

Element	Provenance
Fact	File path, line number, page name
Rule	File path, line number, page name, rule label
Conclusion	Full proof tree of contributing rules and facts
Grounding	Section hash or explicit source reference (see Drift Detection)

Form	Example	Description
Basic	`[[Cache]]`	Link to a page
Aliased	`[[Cache|caching layer]]`	Display text differs from target
Heading	`[[Cache#Design tension]]`	Link to a specific heading
Block	`[[Cache^summary]]`	Link to a block ID
Embed	`![[Cache]]`	Embed the target page inline
Self-heading	`[[#Syntax]]`	Link to heading in same page