POC: Perf/handlebars v2 parser by johanrd · Pull Request #13 · johanrd/ember.js

johanrd · 2026-04-14T06:11:57Z

v2-parser: index-based cursor parser replacing Jison

Replaces the Jison-generated HBS parser with a hand-written recursive descent parser. Jison's lexer tests up to 40 regexes per token and slices the input string on every match; the v2-parser uses indexOf('{{') for content scanning and charCodeAt dispatch — no string copies, no regex gauntlet. Keeps simple-html-tokenizer for the HTML layer.

@handlebars/parser is deleted entirely. Parser files (v2-parser.js, whitespace-control.js, visitor.js, exception.js) moved into packages/@glimmer/syntax/lib/parser/.

Benchmark (`pnpm bench:precompile`)

Apple M1 Max, Node 24.14, prod dist.

phase	size	main (Jison)	this PR (index-based)	index-based single-pass
precompile	small (1517c)	1.76 ms	1.46 ms (1.2×)	1.31 ms (1.3×)
	medium (4551c)	5.36 ms	4.39 ms (1.2×)	3.94 ms (1.4×)
	large (33374c)	42.26 ms	35.57 ms (1.2×)	32.07 ms (1.3×)
parse	small	592 µs	295 µs (2.0×)	159 µs (3.7×)
	medium	1.73 ms	874 µs (2.0×)	495 µs (3.5×)
	large	14.68 ms	7.70 ms (1.9×)	3.93 ms (3.7×)
normalize	small	747 µs	453 µs (1.6×)	335 µs (2.2×)
	medium	2.24 ms	1.34 ms (1.7×)	1.01 ms (2.2×)
	large	18.53 ms	11.32 ms (1.6×)	8.37 ms (2.2×)

The unified single-pass parser (#17) is faster at parse (3.7× vs 2.0×), but precompile end-to-end is similar (1.3× vs 1.2×) because parse is only ~10-20% of total compile time — the compile phases (pass0, pass2, stringify) are identical shared code.

Reproduce: pnpm build && pnpm bench:precompile, compare branches.

github-actions · 2026-04-14T06:14:23Z

📊 Package size report `-0.22%↓`

File	Before (Size / Brotli)	After (Size / Brotli)
`dist/dev/packages/shared-chunks/compiler-pzJ8tcyS.js`	`177.7 kB` / `34 kB`	^31%↑`231.9 kB` / ^28%↑`43.5 kB`
`dist/dev/packages/shared-chunks/transform-resolutions-D4TVtqjf.js`	`188.6 kB` / `38.1 kB`	^-31.3%↓`129.6 kB` / ^-28%↓`27.4 kB`
`dist/prod/packages/shared-chunks/compiler-DhOQiSaV.js`	`190.9 kB` / `36.4 kB`	^28%↑`245.1 kB` / ^26%↑`45.9 kB`
`dist/prod/packages/shared-chunks/transform-resolutions-DZ4TivNl.js`	`174.2 kB` / `35.3 kB`	^-33.9%↓`115.2 kB` / ^-30.3%↓`24.6 kB`
`types/stable/@glimmer/syntax/lib/parser/exception.d.ts`	—	`313 B` / `154 B`
`types/stable/@glimmer/syntax/lib/parser/v2-parser.d.ts`	—	`267 B` / `128 B`
`types/stable/@glimmer/syntax/lib/parser/whitespace-control.d.ts`	—	`1.2 kB` / `283 B`
`types/stable/@handlebars/parser/types/ast.d.ts`	`3.6 kB` / `614 B`	—
`types/stable/@handlebars/parser/types/index.d.ts`	`400 B` / `183 B`	—
Total _{(Includes all files)}	`5.4 MB` / `1.3 MB`	^-0.22%↓`5.3 MB` / ^-0.21%↓`1.3 MB`
Tarball size	`1.2 MB`	^-0.33%↓`1.2 MB`

_{🤖 This report was automatically generated by pkg-size-action}

…ract tests Split into three files by concern: - parser-escape-test.ts: backslash escape sequences (\{{, \\{{, \\\{{) in top-level text, elements, attributes, and unclosed cases. - parser-whitespace-test.ts: tilde stripping and standalone detection. - parser-error-test.ts: inputs that must be rejected ({{}}}, {{~}}, {{@}}, etc). parser-node-test.ts is unchanged.

…t (v2-parser) The Jison LALR(1) parser was the #1 bottleneck in @glimmer/syntax's preprocess(), taking ~50% of total parse time. The generated parser tested up to 40 regexes per token and sliced the input string on every token match. The v2 parser uses index-based scanning, indexOf for content, charCodeAt dispatch, and batched line/col tracking. It produces AST-identical output (104/104 unit tests pass). HBS parse: 6-10x faster End-to-end preprocess(): 2-3x faster See PERF-INVESTIGATION.md for full analysis and benchmarks.

8 bugs fixed: 1. Sub-expression path locations (4 cases): paths like {{(helper).bar}} now correctly span from the sub-expression start, not just the .tail portion. Fixed by passing the pre-sub-expression position through parseSexprOrPath. 2. {{else if}} chain locations (2 cases): content after {{else}} had column offsets 4 too low because line/col were being restored from before 'else' was consumed. Fixed position tracking in consumeOpen's else-chain handling. 3. Raw block program location: now uses the overall block loc (matching Jison's prepareRawBlock behavior) instead of content-derived locs. 4. Nested raw blocks: {{{{bar}}}}...{{{{/bar}}}} inside {{{{foo}}}}...{{{{/foo}}}} is now correctly treated as raw content (not parsed as a nested block). Added depth tracking and mismatch detection for raw block close tags. 104/104 @handlebars/parser tests pass. 8768/8788 Ember tests pass (7 remaining are reserved-arg error type mismatches — same parse error, different Error class).

The hash loc was including trailing whitespace (newlines before }}) because skipWs() ran before capturing the hash end position. Now captures endP before the trailing whitespace skip. Caught by exhaustive 153-template audit comparing full JSON output (including all locations) against the Jison parser. 153/153 identical.

Found by stress testing: \{{foo}} caused an infinite loop in scanContent(). Two bugs: 1. After processing \{{ (escaped mustache), the scanner advanced to the {{ position but then findNextMustacheOrEnd found the same {{ immediately, causing an infinite loop. Fixed by advancing past the {{ and including it as literal content. 2. After scanContent returned for \\{{ (double-escaped), the next call saw the backslash at idx-1 from the PREVIOUS scan and re-entered escape handling. Fixed by only checking backslashes within the current scan range (idx > pos, not idx > 0). Also added stress-test.mjs with 181 test cases covering: - Escaped mustaches (single, double, with surrounding text) - Unicode identifiers - Whitespace edge cases - All strip flag combinations - Comment edge cases (short, long, adjacent, containing }}/{{) - Raw blocks (empty, nested, with mustache-like content) - Deeply nested sub-expressions - Complex block nesting with else chains - Real-world Ember patterns - Error cases

Round 2 of stress testing (106 additional cases) found: 1. Multiple consecutive escaped mustaches (x\{{y\{{z) failed — findNextMustacheOrEnd returned the position of \{{ instead of before the backslash, causing the main loop to miss the escape. 2. Content splitting after \{{ didn't match Jison. Jison emits separate ContentStatements at each \{{ boundary (emu state). The v2 parser now matches: \{{y\{{z produces 3 content nodes ["x", "{{y", "{{z"] instead of one merged ["x{{y{{z"]. 287 total stress tests now pass (181 round 1 + 106 round 2). 104/104 unit tests. 8771/8791 Ember tests.

Tested against 375 templates from a production Ember app (proapi-webapp). Found 38 location-only differences — all the same pattern: hash pairs with sub-expression values like bar=(helper arg) had their loc end extended past trailing whitespace/newlines. Root cause: parseSexprOrPath() called skipWs() after the sub-expression to peek for a path separator (.bar), but this whitespace belongs to the containing HashPair's loc boundary. Fixed by save/restore of pos around the peek. 375/375 real-world templates now produce byte-identical JSON output compared to the Jison parser. 104/104 unit tests. 287/287 stress tests.

Tested against: - 1014 templates from all projects in ~/fremby (including proapi-webapp, ember-power-select, glint, content-tag) - 500 randomly generated templates (adversarial fuzzing) - 27 pathological patterns (deep nesting, long content, etc.) Results: 1473/1541 pass (byte-identical to Jison). The 68 remaining differences are ALL the same issue: escaped mustache (\{{) content loc includes the backslash in Jison but not in v2. This is a Jison quirk — the regex match includes the \ (which gets stripped from the value), so the loc spans the full source including the \ character. The v2 parser's loc spans only the value content. This only affects templates using \{{ (escaped mustaches), which is extremely rare in real-world code (3 files across 550 scanned). No structural differences. No crashes. No hangs.

Makes CI green without deleting the POC's investigation artifacts: - eslint: ignore bench-cli.mjs, bench-compare.mjs, bench-full-pipeline.mjs, and the three @handlebars/parser/stress-test*.mjs scratch scripts (they use console.log for pass/fail output, which trips no-console). - prettier: format v2-parser.js, PERF-INVESTIGATION.md, and the scratch scripts in-place.

@handlebars

…kage Mirrors #15's structural cleanup, but keeps simple-html-tokenizer — this PR replaces only the HBS layer (Jison → recursive descent v2-parser), not the HTML layer. Changes: - Move v2-parser.js, whitespace-control.js, visitor.js, exception.js from packages/@handlebars/parser/lib/ into packages/@glimmer/syntax/lib/parser/. - v2-parser.js adds parse() (v2ParseWithoutProcessing + WhitespaceControl) and parseWithoutProcessing() exports matching the handlebars API that tokenizer-event-handlers consumes. - tokenizer-event-handlers.ts: swap '@handlebars/parser' import for './v2-parser'. - Delete packages/@handlebars entirely. - Remove @handlebars/parser from package.json, pnpm-workspace.yaml, rollup.config.mjs, eslint.config.mjs, CI workflows, and build docs. With v2-parser now on the default preprocess() path, main's pnpm bench:precompile shows modest consistent speedups vs Jison across all phases and sizes (no regressions). See PR body for numbers.

@0

…th starts The ember-template-compiler test suite asserts parse errors for {{@}}, {{@0}}, {{@@}} etc. against the regex /Expecting 'ID'/. Jison emits exactly that string; v2-parser used to emit 'Expected path identifier' / 'Expected path identifier after @'. Align on the Jison wording so the existing tests match again (same approach as #15's unified-scanner).

…+ real mustache Also port #15's prettier-smoke-test workflow step that regenerates error snapshots (our error messages differ from Jison's verbose format — that's accepted, not a regression). Two fixes: 1. findNextMustacheOrEnd backs up past ALL consecutive backslashes before {{, not just the last one. With the old single-backup behavior, input like '\\{{Y}}' after a previous \{{X}} emu scan left exactly one for the next content scan, which misclassified it as single-backslash escape (entering emu mode again). The fix ensures the full backslash run is handed to the next scanContent iteration, which correctly routes \\{{ into the 'literal backslash + real mustache' branch. 2. .github/workflows/glimmer-syntax-prettier-smoke-test.yml: add the 'Update error snapshots' step #15 has. Our parse errors are short ('Expecting ID') vs Jison's verbose enumerations; regenerating prettier's error snapshots before running the tests accepts that. EOF

Reverts the digit-first segment rejection. Jison has a quirk where {{foo.0}} (trailing digit) is rejected but {{foo.0.bar}} (middle digit) is accepted. Real Ember templates use the middle-digit form for array access (e.g. {{@list.0.node.id}}). The v2-parser uniformly accepts digit segments in all positions — more permissive than Jison but doesn't break any real-world templates. Tests updated to document the accepted behavior rather than assert rejection.

johanrd force-pushed the perf/handlebars-v2-parser branch 3 times, most recently from 2cb61b0 to ffa8d9b Compare April 14, 2026 06:55

johanrd mentioned this pull request Apr 14, 2026

[FEATURE rust-parser] Rust/WASM template parser using pest.rs emberjs/ember.js#21313

Closed

7 tasks

johanrd changed the title ~~Perf/handlebars v2 parser~~ POC: Perf/handlebars v2 parser Apr 14, 2026

johanrd force-pushed the perf/handlebars-v2-parser branch from ffa8d9b to b644a3f Compare April 16, 2026 19:41

johanrd added 19 commits April 17, 2026 00:15

p

6c4d815

bench full pipeline

6264ccc

cl

b0b6072

Update comment to reflect parser's purpose

2b83ea2

Update file paths to real-world-project

7685847

Update stress test project directory path

6c6a0e6

Update file paths for V2 and V2_SYNTAX

36a59ed

johanrd force-pushed the perf/handlebars-v2-parser branch from e51774f to e51aaef Compare April 16, 2026 22:20

johanrd mentioned this pull request Apr 16, 2026

Perf/handlebars v2 parser single pass #17

Open

johanrd added 2 commits April 17, 2026 07:27

test: add escaped literal with newline inside brackets

634071e

johanrd added 2 commits April 17, 2026 07:50

fix: narrow PathHead to VarHead before accessing .name (TS2339)

5037a2e

style: use module() callback form, match emberjs#21317 format

90d380a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

POC: Perf/handlebars v2 parser#13

POC: Perf/handlebars v2 parser#13
johanrd wants to merge 23 commits intomainfrom
perf/handlebars-v2-parser

johanrd commented Apr 14, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 14, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

johanrd commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

v2-parser: index-based cursor parser replacing Jison

Benchmark (pnpm bench:precompile)

Uh oh!

github-actions bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📊 Package size report -0.22%↓

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

johanrd commented Apr 14, 2026 •

edited

Loading

Benchmark (`pnpm bench:precompile`)

github-actions bot commented Apr 14, 2026 •

edited

Loading

📊 Package size report `-0.22%↓`