Bootstrapping

In compiler and parser theory, bootstrapping is the property where a tool can process its own input format — or, in the strongest form, where a compiler is used to compile itself.

rdp.js and self-description

rdp-gen reads grammar files (EBNF or ABNF) and generates TypeScript recursive-descent parsers. The parsers it generates extend RDParser — and so does the EBNF parser inside rdp-gen that reads those grammar files.

In other words: the grammar-file reader and the generated parsers are the same kind of object, built on the same base class. This means the EBNF format can be described using EBNF itself — the tool can describe its own input.

The EBNF meta-grammar

The EBNF format accepted by rdp-gen can be expressed as an ABNF grammar (src/grammars/ebnf-meta.abnf):

; ISO 14977 EBNF meta-grammar written in ABNF.
; Describes the EBNF syntax accepted by rdp-gen (ISO 14977).
;
; Rules are terminated by ";".  Concatenation uses ",".  Alternation uses "|".
; Grouping uses "()", optional uses "[]", repetition uses "{}".
; n * P is counted repetition.  A - B is the exception operator.
; Block comments (* ... *) are not covered by this meta-grammar.
;
; Test input: any valid ISO EBNF grammar (without (* *) comments), e.g.:
;   Greeting = 'hello' | 'world';

Grammar  = *Rule *ws
Rule     = *ws Name *ws "=" *ws Body *ws ";"
Body     = Sequence *(*ws "|" *ws Sequence)
Sequence = Term *(*ws "," *ws Term)
Term     = Primary [*ws "-" *ws Primary]
Primary  = "{" *ws Body *ws "}"
         / "[" *ws Body *ws "]"
         / "(" *ws Body *ws ")"
         / Integer *ws "*" *ws Primary
         / Literal
         / Name
Integer  = 1*DIGIT
Literal  = %x27 *SqChar %x27 / %x22 *DqChar %x22
SqChar   = EscSeq / %x20-26 / %x28-5B / %x5D-7E
DqChar   = EscSeq / %x20-21 / %x23-5B / %x5D-7E
EscSeq   = %x5C (%x6E / %x74 / %x72 / %x5C / %x27 / %x22)
Name     = NameF *NameR
NameF    = ALPHA / "_"
NameR    = ALPHA / DIGIT / "_" / "-"
ws       = %x20 / %x09 / %x0A / %x0D

This grammar validates the syntax of any EBNF grammar file. Load it in the playground via Load example → EBNF meta-grammar (bootstrapping), compile it, then paste a small EBNF grammar into the test input to verify it parses correctly:

A = 'hello' | 'world';

The ABNF meta-grammar

The ABNF format is self-describing — the grammar for ABNF can itself be written in ABNF (src/grammars/abnf-meta.abnf):

; ABNF meta-grammar written in ABNF — self-describing
; Describes the subset of ABNF syntax used by rdp-gen.
;
; Covers: rule definitions (=), alternation (/), double-quoted string literals,
; zero-or-more (*X), one-or-more (1*X), optional groups ([...]), and the
; DQUOTE core rule.  Does not cover char-range rules (%xNN-NN) or prose-val.
;
; Because this grammar is itself written in ABNF and describes ABNF, it
; demonstrates the "self-describing" property of the grammar format.
; Feed this file to rdp-gen to produce a TypeScript parser that reads ABNF.
;
; Test input: any valid ABNF rule, e.g.:
;   Greeting = "hello" / "world"

Grammar  = *ws Rule *(*ws Rule) *ws
Rule     = RuleName hsp "=" hsp Body
Body     = Sequence *(hsp "/" hsp Sequence)
Sequence = Term *(hsp Term)
Term     = Rep Element / Element
Rep      = ("1" "*") / "*"
Element  = Group / OptGrp / Str / RuleName
Group    = "(" *ws Body *ws ")"
OptGrp   = "[" *ws Body *ws "]"
Str      = DQUOTE *DChar DQUOTE
DChar    = %x20-21 / %x23-7E
RuleName = ALPHA *( ALPHA / DIGIT / "-" / "_" )
ws       = %x20 / %x09 / %x0A / %x0D
hsp      = *( %x20 / %x09 )

Generating a grammar-file parser with rdp-gen

Because the meta-grammars above are valid grammar files, you can feed them directly to rdp-gen. The repository includes a convenience script that generates all four parsers in one step:

npm run build && npm run generate:meta

This writes four files into src/parsers/generated/ (which is git-ignored — the files are generated artefacts, not source), covering all combinations of "which format is described" × "which format the description is written in":

Output fileParser classGrammar sourceWritten in
ebnf-in-ebnf-parser.tsEBNFInEBNFParserebnf-meta.ebnfEBNF (self-describing)
ebnf-in-abnf-parser.tsEBNFInABNFParserebnf-meta.abnfABNF
abnf-in-ebnf-parser.tsABNFInEBNFParserabnf-meta.ebnfEBNF
abnf-in-abnf-parser.tsABNFInABNFParserabnf-meta.abnfABNF (self-describing)

Both files are compiled to dist/ as part of the normal npm run build step — they are picked up automatically because tsconfig includes all of src/**/*.ts.

To generate a single parser manually (for example, with --observable for tracing support):

rdp-gen src/grammars/ebnf-meta.abnf \
        --parser-name EBNFInABNFParser \
        --observable \
        --output src/parsers/generated/ebnf-in-abnf-parser.ts

The resulting EBNFInABNFParser class is an RD parser that reads EBNF grammar files — generated by the very tool whose input format it describes.

The bootstrapping loop

Bootstrapping loop diagram showing ebnf-meta.abnf fed into rdp-gen at Stage 1, which emits EBNFMetaParser.ts, which can in turn parse EBNF files at Stage 2 — closing the loop
  1. Stage 0rdp-gen ships with a hand-written EBNFParser that reads EBNF files.
  2. Stage 1 — You use rdp-gen to compile ebnf-meta.abnf, producing a generated EBNFMetaParser.
  3. Stage 2EBNFMetaParser can read EBNF grammar files. If you wired it into rdp-gen in place of the hand-written parser, rdp-gen would now be (partially) self-hosted.

Why this matters

The bootstrapping property is not just a curiosity — it validates the expressiveness of the grammar language. If EBNF can describe itself, it can describe any context-free language whose structure fits the LL(1) model.

It also makes the parser generator easier to test: any valid EBNF grammar string that passes the meta-grammar can be assumed to have correct syntax before being compiled, giving you an early validation layer.

The bootstrap tests in src/__tests__/bootstrap.test.ts verify all four combinations (EBNF-in-EBNF, EBNF-in-ABNF, ABNF-in-EBNF, ABNF-in-ABNF) using generateParser directly — no interpreter involved.

Try it in the playground

  1. Open the playground.
  2. In the Grammar editor, choose Load example → EBNF meta-grammar (bootstrapping) and click Compile.
  3. In the Test input box, type a small EBNF grammar — for example:
    Greeting = 'hello' | 'hi';
    Name     = Letter, {Letter};
    Letter   = 'a' | 'b' | 'c';
    
  4. The interpreter confirms the grammar text is syntactically valid EBNF.
  5. Click Debug to step through the parse trace and watch each production match character by character.