Using the parse tree
A generated parser returns a parse tree — a typed object whose shape mirrors the grammar. This section covers the patterns for consuming that output.
The patterns fall into three categories that compose with each other:
- Traversal strategies — how you walk the tree: Interpreter, Tree walking
- Stage separation — how you structure the process around a traversal: Pipeline
- Module boundary — how you consistently expose the result to callers regardless of grammar changes: Facade
The three categories compose: you pick a traversal strategy based on what the computation needs, optionally wrap the stages in a pipeline to separate parsing, validation, and transformation with explicit typed boundaries, and optionally wrap the whole thing in a facade if callers should be isolated from the generated parser types.
Choosing a traversal strategy
| Strategy | Best for | Notes |
|---|---|---|
| Interpreter | Computing a single value from the input (e.g. arithmetic, template rendering) | No intermediate representation; direct recursion mirrors grammar structure; grammar changes ripple through all functions |
| Tree walking | Language tools — linters, formatters, translators, compilers, analysis passes | Parse once, walk many times; multiple independent passes over the same tree; childNodes handles traversal mechanics |
Use the interpreter when the result is a single computed value and the parse is not reused. Use tree walking for everything else — it is the right default for any non-trivial transformation or translation task.
Parse tree shape
Every node is a discriminated object with a kind field matching the rule name and
one field per item in that rule's body:
// Grammar rule: Date = Year, '-', Month, '-', Day;
export type DateNode = {
kind: 'Date'
year: YearNode // Year (non-terminal → named)
item1: string // '-' (terminal → item-n)
month: MonthNode // Month (non-terminal → named)
item3: string // '-' (terminal → item-n)
day: DayNode // Day (non-terminal → named)
}| Grammar construct | Generated TypeScript type |
|---|---|
| Terminal / char literal | string (field named itemn) |
| Non-terminal reference | XxxNode (field named after the rule, camelCase) |
| Duplicate non-terminals | Suffixed: year0, year1, … |
Alternation A | B | A | B (string alternatives deduplicated) |
Optional [A] | A | null |
Repetition {A} | A[] |
Sequence A, B, C | Flattened: non-terminals named, terminals use itemn |
CST and AST
The parse tree rdp-gen produces is a Concrete Syntax Tree (CST) — it mirrors
the grammar exactly, including every punctuation terminal, grouping rule, and
intermediate non-terminal. For simple grammars this is manageable to work with
directly, as the examples in this document do.
For more complex language tools it is common practice to first lower the CST into
an Abstract Syntax Tree (AST) — a simplified structure that keeps only the
semantically meaningful nodes and discards syntactic noise. For example, the
arithmetic CST has five levels (Expr → Term → Factor → Number → Digit); an AST
for the same input might collapse that to a simple tree of Add, Mul, and
Number nodes that is much easier to analyse or emit from.
Lowering is just the visitor pattern applied to produce a new tree rather than a value: write a visitor that walks the CST and builds your AST node by node, then apply your other passes to the AST instead. For the simple examples in this document the CST is manageable enough that a separate AST is not necessary.
Interpreting
When the goal is to compute a single result from the input — not to construct a data object — an interpreter mirrors the grammar's recursive structure directly in code. Each production rule maps to one function; the functions call each other exactly as the grammar rules reference each other. There is no intermediate representation: parsing and evaluation happen in one pass over the tree.
The recursive structure is easiest to see in a grammar diagram. For an arithmetic expression language:
Factor references Expr — the grammar is recursive — and the interpreter
functions are recursive in exactly the same way.
Generate the parser:
rdp-gen expr.ebnf --parser-name ExprParser --output src/ExprParser.tsEvaluator (src/expr.ts):
import {
ExprParser,
type ExprNode, type TermNode, type FactorNode, type NumberNode,
} from './ExprParser.js'
import { RDParserException } from '@configuredthings/rdp.js'
export class ExpressionError extends Error {}
export function evaluate(input: string): number {
try {
return evalExpr(ExprParser.parse(input))
} catch (e) {
if (e instanceof RDParserException) throw new ExpressionError(`invalid expression: "${input}"`)
throw e
}
}
function evalExpr(node: ExprNode): number {
let result = evalTerm(node.term)
for (const { item0: op, term } of node.item1) {
result = op === '+' ? result + evalTerm(term) : result - evalTerm(term)
}
return result
}
function evalTerm(node: TermNode): number {
let result = evalFactor(node.factor)
for (const { item0: op, factor } of node.item1) {
result = op === '*' ? result * evalFactor(factor) : result / evalFactor(factor)
}
return result
}
function evalFactor(node: FactorNode): number {
const inner = node.item0
// FactorNode.item0 is NumberNode (has `kind`) or a grouped expression (does not)
return 'kind' in inner ? evalNumber(inner) : evalExpr(inner.expr)
}
function evalNumber(node: NumberNode): number {
return parseInt([node.digit, ...node.item1].map((d) => d.item0).join(''), 10)
}Usage:
import { evaluate, ExpressionError } from './expr.js'
evaluate('3 + 4 * 2') // → 11
evaluate('(3 + 4) * 2') // → 14
evaluate('??') // throws ExpressionErrorA separate facade module is not needed here — the interpreter's unexported functions
already keep the parser and tree types private, and the result (number) is the
computation itself rather than a domain object to expose.
The interpreter is also the pattern that most directly mirrors how the generated parser itself works. Both have one function per grammar rule, with each function calling the others in exactly the shape the grammar describes:
Parser: #parse_expr() → #parse_term() → #parse_factor() → #parse_expr()
Interpreter: evalExpr() → evalTerm() → evalFactor() → evalExpr()
The call graphs are isomorphic. The only difference is what each function traverses — the parser consumes a character stream (building structure), the interpreter consumes the tree the parser produced (collapsing that structure back to a value). The grammar defines both processes simultaneously; the interpreter just makes the second one explicit.
Tree walking
When you are building a language tool — a linter, formatter, translator, compiler, or analysis pass — the tree itself is the domain. You need to traverse every node, and you may want several independent passes over the same tree. For complex tools, consider lowering the CST to an AST before applying your visitor passes.
There are three runtime pieces — Visitor<T> + visit() for partial dispatch,
Transformer<T> + transform() for exhaustive conversion — and one grammar-specific
function, childNodes(), always emitted alongside the generated parser.
rdp-gen expr.ebnf --parser-name ExprParser --output src/ExprParser.tsVisitor<ParseTree, T> and visit()
Visitor<ParseTree, T> is a mapped type that derives one optional handler per
kind value from your ParseTree union. You only declare handlers for the node
types you care about; unhandled nodes return undefined.
import { visit, type Visitor } from '@configuredthings/rdp.js'
import type { ParseTree } from './ExprParser.js'
// Handle exactly the node types you need — all others return undefined
const linter: Visitor<ParseTree, string | undefined> = {
Number: (n) => n.digit.item0 === '0' ? 'leading zero' : undefined,
}
const warning = visit(node, linter) // string | undefinedWhen you want TypeScript to enforce that every node kind is handled — so that a new
grammar rule produces a compile error rather than silently returning undefined —
use Required<Visitor<ParseTree, T>>:
// TypeScript error here if any case is missing
const printer: Required<Visitor<ParseTree, string>> = {
Expr: (n) => `expr`,
Term: (n) => `term`,
Factor: (n) => `factor`,
Number: (n) => `number`,
Digit: (n) => n.item0,
}childNodes()
childNodes(node) returns the direct ParseTree children of any node — every
named rule node reachable from its fields, with arrays and anonymous sequence
objects unwrapped. Terminal strings are not included.
import { childNodes } from './ExprParser.js'
import type { ParseTree } from './ExprParser.js'
childNodes(exprNode) // → [termNode, termNode, ...]
childNodes(digitNode) // → [] (leaf — no ParseTree children)With childNodes in hand, a full recursive walk is just a few lines:
function walk(node: ParseTree, fn: (n: ParseTree) => void): void {
fn(node)
for (const child of childNodes(node)) walk(child, fn)
}Worked example — collect all number literals
import { ExprParser, childNodes } from './ExprParser.js'
import { visit, type Visitor } from '@configuredthings/rdp.js'
import type { ParseTree, NumberNode } from './ExprParser.js'
export function collectNumbers(input: string): NumberNode[] {
const tree = ExprParser.parse(input)
const numbers: NumberNode[] = []
const collector: Visitor<ParseTree, void> = {
Number: (n) => { numbers.push(n) },
}
function walk(node: ParseTree): void {
visit(node, collector)
for (const child of childNodes(node)) walk(child)
}
walk(tree)
return numbers
}
collectNumbers('3 + 4 * 2') // → [NumberNode('3'), NumberNode('4'), NumberNode('2')]Because walk and collector are separate, you can compose multiple visitors
over a single traversal or run independent passes on the same tree:
function walk(node: ParseTree): void {
visit(node, linter) // first pass
visit(node, collector) // second pass — same traversal
for (const child of childNodes(node)) walk(child)
}Max depth — a traversal without visit()
For traversals that don't need per-kind dispatch, childNodes alone is enough:
function maxDepth(node: ParseTree): number {
const kids = childNodes(node)
return kids.length === 0 ? 1 : 1 + Math.max(...kids.map(maxDepth))
}Transformer<ParseTree, T> and transform()
Visitor is intentionally partial — you handle the nodes you care about and ignore
the rest. When you need the opposite guarantee — every node kind must be handled and
a missing case is a compile error — use Transformer. This makes it the right tool
for exhaustive conversion: translating a parse tree to a domain IR, emitting to
another format, or any case where an unhandled node kind is a bug rather than a
deliberate omission.
import { transform, type Transformer } from '@configuredthings/rdp.js'
import type { ParseTree } from './ExprParser.js'
const evaluator: Transformer<ParseTree, number> = {
Expr: (n) => { /* ... */ return 0 },
Term: (n) => { /* ... */ return 0 },
Factor: (n) => { /* ... */ return 0 },
Number: (n) => parseInt(n.digit.item0, 10),
Digit: (n) => parseInt(n.item0, 10),
// Omitting any key here is a TypeScript compile error
}
const result = transform(ExprParser.parse('2 + 3'), evaluator) // number, never undefinedVisitor<Tree, T> + visit() | Transformer<Tree, T> + transform() | |
|---|---|---|
| Handler keys | Optional (?) | Required |
| Unhandled node | Returns undefined | Compile error |
| Return type | T | undefined | T |
| Use when | Partial traversal, multiple passes | Exhaustive conversion |
--transformer generates a ready-to-fill Transformer object — see the
CLI reference.
Translating between two DSLs
When both endpoints are grammars you own, scaffold a transformer for each, define a
shared domain IR, and wire them together. The IR is the hub — every parser lowers into
it, every emitter reads from it:
rdp-gen format-a.ebnf --parser-name FormatAParser --transformer --output src/format-a-transformer.ts
rdp-gen format-b.ebnf --parser-name FormatBParser --transformer --output src/format-b-transformer.tsEach scaffold gives you a Transformer<FormatXTree, unknown>. Replace unknown with
your IR type and fill in the handlers:
// shared-ir.ts — define once, import everywhere
export type IR =
| { kind: 'foo'; value: string }
| { kind: 'bar'; left: IR; right: IR }// format-a-transformer.ts (scaffolded, then filled in)
import { transform, type Transformer } from '@configuredthings/rdp.js'
import { FormatAParser, type FormatATree, type RootNode, /* ... */ } from './FormatAParser.js'
import type { IR } from './shared-ir.js'
export const formatAToIR: Transformer<FormatATree, IR> = {
Root: (n) => ({ kind: 'foo', value: n.item0 }),
// ... one handler per rule
}The emitters go the other direction — Transformer<IR, string> — and don't need a
scaffold since the IR is your own type, not derived from a grammar:
// format-b-emitter.ts — hand-written, no scaffold needed
import { transform, type Transformer } from '@configuredthings/rdp.js'
import type { IR } from './shared-ir.js'
export const irToFormatB: Transformer<IR, string> = {
foo: (n) => n.value,
bar: (n) => `${transform(n.left, irToFormatB)} + ${transform(n.right, irToFormatB)}`,
}Round-trip:
import { FormatAParser } from './FormatAParser.js'
import { FormatBParser } from './FormatBParser.js'
import { formatAToIR } from './format-a-transformer.js'
import { formatBToIR } from './format-b-transformer.js'
import { irToFormatA } from './format-a-emitter.js'
import { irToFormatB } from './format-b-emitter.js'
const aToB = (input: string) =>
transform(transform(FormatAParser.parse(input), formatAToIR), irToFormatB)
const bToA = (input: string) =>
transform(transform(FormatBParser.parse(input), formatBToIR), irToFormatA)Each parser, lowering transformer, and emitter is independent — adding a third format means writing one new scaffold + one new emitter without touching the others. For a complete worked example see Translation: Arith to RPN and JSON.
Translating to and from JSON
A common use of Transformer is translating between a DSL and JSON. TypeScript has
first-class JSON support via JSON.parse and JSON.stringify, and RDPjs provides
JSONAST — a typed discriminated union over the six JSON value types — together with
toJSONAST and fromJSONAST to convert between raw JSON strings and the union.
JSONAST is the JSON output format, not a domain IR. When translating a DSL to JSON,
the chain is:
FormatAST → domain IR → JSONAST → JSON string
Transformer Transformer fromJSONAST()
For simple grammars where the JSON structure is the semantics you can translate
directly from FormatAST to JSONAST, skipping the domain IR step.
import { toJSONAST, fromJSONAST, type JSONAST } from '@configuredthings/rdp.js'
const ast = toJSONAST('{"x":1}') // JSONAST (kind: 'object')
const text = fromJSONAST(ast) // '{"x":1}'toJSONAST wraps JSON.parse and hides the any return type; fromJSONAST wraps
JSON.stringify. Transformer<JSONAST, string> handles the reverse path — one handler
per JSON kind:
import { transform, type Transformer, toJSONAST, fromJSONAST, type JSONAST } from '@configuredthings/rdp.js'
import { MyParser, type MyTree } from './MyParser.js'
// Format → JSONAST
const myFormatToJSON: Transformer<MyTree, JSONAST> = {
Root: (n) => ({ kind: 'object', entries: [ /* ... */ ] }),
Value: (n) => ({ kind: 'string', value: n.item0 }),
// ...
}
// JSONAST → Format string
const jsonToMyFormat: Transformer<JSONAST, string> = {
string: (n) => `"${n.value}"`,
number: (n) => String(n.value),
boolean: (n) => n.value ? 'true' : 'false',
null: () => 'null',
array: (n) => `[ ${n.items.map(i => transform(i, jsonToMyFormat)).join(', ')} ]`,
object: (n) => n.entries.map(e => `${e.key}: ${transform(e.value, jsonToMyFormat)}`).join('\n'),
}
const toJSON = (input: string) => fromJSONAST(transform(MyParser.parse(input), myFormatToJSON))
const toFormat = (input: string) => transform(toJSONAST(input), jsonToMyFormat)--transformer json generates ready-to-fill stubs for both directions — see the
CLI reference. For a complete worked example see
Translation: Arith to RPN and JSON.
The pipeline pattern
The interpreter treats parsing and computation as a single pass. In a traditional multi-pass compiler the work is split across discrete stages — parsing, semantic analysis, transformation, emission — each receiving the output of the previous one. The pipeline pattern brings the same idea to a single grammar: it separates the work into distinct stages with explicit types at each boundary:
This mirrors the parse → semantic analysis → transformation stages of a multi-pass compiler, and becomes valuable when:
- validation needs to accumulate errors rather than stopping at the first failure — each stage can collect all problems and return them together
- the validation rules are complex enough to deserve their own tests, independent of parsing and transformation
- the transformation logic needs to trust that the tree is already semantically valid (no defensive checks inside transform)
Stage types:
import type { ConfigNode, EntryNode } from './ConfigParser.js'
import { RDParserException } from '@configuredthings/rdp.js'
export interface AppConfig { port: number; workers: number; timeout: number }
export interface ConfigError { key: string; message: string }
// Stage 1 — syntax: string → ParseTree (throws on malformed input)
export function parse(input: string): ConfigNode {
try {
return ConfigParser.parse(input)
} catch (e) {
if (e instanceof RDParserException) throw new SyntaxError(e.message)
throw e
}
}
// Stage 2 — semantics: ParseTree → Result<ParseTree, ConfigError[]>
export function validate(
tree: ConfigNode,
): { ok: true; tree: ConfigNode } | { ok: false; errors: ConfigError[] } {
const errors: ConfigError[] = []
const seen = new Set<string>()
const allowed = new Set(['port', 'workers', 'timeout'])
for (const entry of tree.item0) {
const key = extractKey(entry)
if (!allowed.has(key)) errors.push({ key, message: `unknown key "${key}"` })
if (seen.has(key)) errors.push({ key, message: `duplicate key "${key}"` })
seen.add(key)
}
return errors.length > 0 ? { ok: false, errors } : { ok: true, tree }
}
// Stage 3 — transform: ParseTree → Domain (safe to call only after validate)
export function transform(tree: ConfigNode): AppConfig {
const entries = Object.fromEntries(
tree.item0.map((entry) => [extractKey(entry), extractValue(entry)]),
)
return {
port: entries['port'] ?? 3000,
workers: entries['workers'] ?? 1,
timeout: entries['timeout'] ?? 30,
}
}Composing the pipeline:
export function loadConfig(input: string): AppConfig {
const tree = parse(input)
const result = validate(tree)
if (!result.ok) throw new AggregateError(result.errors, 'config validation failed')
return transform(result.tree)
}Because validate and transform are separate exported functions, they can each
be tested in isolation — validate against known-invalid trees, transform against
known-valid ones — without needing to construct grammar input for every case.
// Grammar-specific helpers — shape depends on the Entry rule in config.ebnf:
// Entry = Ident, '=', Number;
// Ident = letter, {letter}; Number = digit, {digit};
function extractKey(entry: EntryNode): string {
return [entry.ident.letter, ...entry.ident.item1].map((c) => c.item0).join('')
}
function extractValue(entry: EntryNode): number {
return parseInt([entry.number.digit, ...entry.number.item1].map((d) => d.item0).join(''), 10)
}The facade pattern
The facade is a module boundary, not a traversal strategy. It wraps whichever traversal strategy you have chosen — interpreter, pipeline, or tree-walker — and hides the generated parser types behind a clean domain API. It is overkill for internal or single-use code; for anything with external callers it pays for itself immediately.
The problem it solves: parse trees are shaped like grammars, not domains. A
domain object is a type that makes sense in your application —
CalendarDate { year, month, day } — as opposed to a grammar artefact —
DateNode → YearNode → DigitNode → string. Callers typically want the former;
the parse tree gives them the latter.
Wrapping the parser in a facade module solves this by:
- keeping the generated class and its tree types private to the module
- exporting only clean domain types that callers can depend on
- translating
RDParserExceptioninto a meaningful application-level error
The generated parser can then be freely regenerated — rule renames, grammar refactors — without breaking any code outside the facade.
Adding semantic validation
Grammars express structure, not meaning. Syntax answers "is this well-formed?"; semantics answers "is this valid?". Checks like month and day ranges cannot be expressed in EBNF — they need to be enforced in code, after the tree walk:
export function parseDate(input: string): CalendarDate {
// ... parse and convert as before ...
if (d.month < 1 || d.month > 12) throw new InvalidDateError(input)
if (d.day < 1 || d.day > 31) throw new InvalidDateError(input)
return d
}Callers always receive either a valid CalendarDate or an InvalidDateError —
they never see a partially-valid tree. For more complex validation — accumulating
multiple errors, or keeping validation and transformation independently testable —
see the pipeline pattern.
Worked example — ISO date strings
This example shows the facade and
pipeline patterns working together. The pipeline gives
the semantic checks a clean home — the validate stage — keeping them separate
from the structural conversion in transform. The facade wraps everything behind
a single parseDate function so callers never see the generated types.
Grammar (date.ebnf):
Date = Year, '-', Month, '-', Day;
Year = Digit, Digit, Digit, Digit;
Month = Digit, Digit;
Day = Digit, Digit;
Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';
Generate the parser:
rdp-gen date.ebnf --parser-name DateParser --output src/DateParser.tsFacade module (src/date.ts):
import { DateParser, type DateNode, type DigitNode } from './DateParser.js'
import { RDParserException } from '@configuredthings/rdp.js'
// ── Domain types ──────────────────────────────────────────────────────────────
export class CalendarDate {
constructor(
readonly year: number,
readonly month: number,
readonly day: number,
) {}
static from(tree: DateNode): CalendarDate {
const { year, month, day } = tree
return new CalendarDate(
CalendarDate.#digits(year.digit0, year.digit1, year.digit2, year.digit3),
CalendarDate.#digits(month.digit0, month.digit1),
CalendarDate.#digits(day.digit0, day.digit1),
)
}
static #digits(...nodes: DigitNode[]): number {
return parseInt(nodes.map((n) => n.item0).join(''), 10)
}
}
export class InvalidDateError extends Error {
constructor(input: string) {
super(`not a valid ISO date: "${input}"`)
this.name = 'InvalidDateError'
}
}
// ── Public API ───────────────────────────────────────────────────────────────
export function parseDate(input: string): CalendarDate {
return DatePipeline.run(input)
}
// ── Implementation ────────────────────────────────────────────────────────────
class DatePipeline {
static run(input: string): CalendarDate {
const tree = DatePipeline.#parse(input) // Stage 1 — syntax
const result = DatePipeline.#validate(tree) // Stage 2 — semantics
if (!result.ok) throw new InvalidDateError(input)
return DatePipeline.#transform(result.tree) // Stage 3 — transform
}
static #parse(input: string): DateNode {
try {
return DateParser.parse(input)
} catch (e) {
if (e instanceof RDParserException) throw new InvalidDateError(input)
throw e
}
}
static #validate(tree: DateNode): { ok: true; tree: DateNode } | { ok: false } {
const d = CalendarDate.from(tree)
if (d.month < 1 || d.month > 12) return { ok: false }
if (d.day < 1 || d.day > 31) return { ok: false }
return { ok: true, tree }
}
static #transform(tree: DateNode): CalendarDate {
return CalendarDate.from(tree)
}
}Usage:
import { parseDate, CalendarDate, InvalidDateError } from './date.js'
// DateParser and the raw tree types are not visible here
const d: CalendarDate = parseDate('2024-03-15')
// → { year: 2024, month: 3, day: 15 }
parseDate('2024-13-01') // throws InvalidDateError (month out of range)
parseDate('not-a-date') // throws InvalidDateError (syntax error)Scaffolding
Scaffold flags emit a typed starter file wired up to your grammar — imports, entry points, stubs, and error handling already in place. Flags compose: pass multiple to combine strategies:
| Flag(s) | Generates |
|---|---|
--traversal interpreter | One eval{Rule}() function per rule + evaluate() entry point |
--traversal tree-walker | walk() utility + commented Visitor stubs for every rule |
--transformer | Transformer<ParseTree, unknown> object with one stub per rule + entry function |
--transformer json | Two Transformer objects (format → JSONAST and JSONAST → format) + round-trip helpers |
--traversal <strategy> --facade | Public parse{Base}(), {Base}Result class, {Base}Error — wraps the chosen traversal |
--traversal tree-walker --pipeline | Exported parse/validate/transform stages + load{Base}() combinator |
Because the patterns compose, you can run multiple scaffold commands for the same grammar:
for example, --traversal interpreter --facade for the public module and --transformer
for the implementation it wraps. See the
CLI reference,
or the arithmetic worked example for all patterns
applied to a complete grammar.