
[v2.0.39] Add Typst math output format with typed AST serializer #410

Draft
OlgaRedozubova wants to merge 69 commits into master from dev/olga/typst-math-ast-refactor


Conversation


@OlgaRedozubova OlgaRedozubova commented Apr 7, 2026

Add native Typst math output format (LaTeX/MathML → Typst)

Summary

  • Add SerializedTypstVisitor that converts MathJax's internal MathML tree into native Typst math syntax
  • LaTeX → Typst and MathML → Typst conversion via TexConvertToTypstData() and MathMLConvertToTypstData()
  • Typed AST intermediate layer between MathML tree traversal and Typst string output — all 22 handlers return typed nodes, escaping centralized in a single serializer with context registry

Continues and supersedes #405.

Specifications:

  • Initial implementation — feature design, symbol mapping, escape handling, dual output
  • AST refactor — typed AST layer, delimiter fixes, bracket scoping, code cleanup

Features

  • Full token handling: mi, mo, mn, mtext, mspace with font variants, operator detection, context-aware spacing
  • Script/limit constructs: msub, msup, msubsup, munder, mover, munderover, mmultiscripts with movablelimits-aware placement
  • Structural elements: mfrac, msqrt, mroot, mtable (matrix/cases/equation arrays), menclose, mphantom
  • Delimiter handling: per-cell bracket scoping, nested chevron pairing, bare brace escaping, opening-bracket-as-script-base separation
  • Equation numbering: auto-numbered and \tag{} equations, numcases/subnumcases grid layout with labels
  • Symbol mapping: 500+ Unicode → Typst symbol mappings
  • Dual output: block (typst) and inline (typst_inline) variants

Architecture

MathML tree → markUnpairedBrackets (per-cell scoping)
  → AST dispatcher (7 pattern priorities)
  → 22 handlers return TypstMathNode
  → serializeTypstMath with context-aware escaping
  → { typst, typst_inline }
  • 16 AST node types with const enum discriminants
  • 7 escape contexts mapped to function names
  • Per-cell bracket pairing for table environments (mtr/mlabeledtr in SCOPE_BOUNDARIES)
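
The handler-to-serializer split above can be illustrated with a much-reduced sketch. The node shapes and the names `SketchNode`, `escapeStringLiteral`, and `serializeNode` are hypothetical and are not the PR's actual `TypstMathNode` definitions; the sketch also uses string discriminants for portability, whereas the PR uses const enum discriminants. It only shows the idea that handlers return typed nodes and escaping happens in one place.

```typescript
// Hypothetical, much-reduced node set (the PR describes 16 node types).
type SketchNode =
  | { kind: "symbol"; value: string }
  | { kind: "text"; value: string }
  | { kind: "func"; name: string; args: SketchNode[] }
  | { kind: "seq"; children: SketchNode[] };

// The only place where string-literal escaping happens in this sketch.
function escapeStringLiteral(s: string): string {
  return s.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Turns a typed node into a Typst math string.
function serializeNode(node: SketchNode): string {
  switch (node.kind) {
    case "symbol":
      return node.value;
    case "text":
      return `"${escapeStringLiteral(node.value)}"`;
    case "func":
      return `${node.name}(${node.args.map((a) => serializeNode(a)).join(", ")})`;
    case "seq":
      return node.children.map((c) => serializeNode(c)).join(" ");
  }
}

// A handler would return a node like this instead of a pre-escaped string.
const fracNode: SketchNode = {
  kind: "func",
  name: "frac",
  args: [
    { kind: "symbol", value: "a" },
    { kind: "text", value: 'x"y' },
  ],
};
console.log(serializeNode(fracNode));
```

Because handlers never concatenate escaped strings themselves, a change to an escaping rule in a design like this touches only the serializer.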

Test plan

  • 924 Typst tests passing (LaTeX→Typst + MathML→Typst)
  • Verify complex aligned/eqnarray formulas render correctly in Typst
  • Verify \left...\right. one-sided delimiters in multi-line environments
  • Verify bare \{...\} and \langle...\rangle output

Implement SerializedTypstVisitor that converts MathJax's internal MathML
tree into native Typst math syntax, enabling direct LaTeX → Typst and
MathML → Typst conversion.

Key features:
- Full token handling: mi, mo, mn, mtext, mspace with font variants,
  operator detection, and context-aware spacing
- Script/limit constructs: msub, msup, msubsup, munder, mover,
  munderover, mmultiscripts with movablelimits-aware placement
- Structural elements: mfrac, msqrt, mroot, mtable (matrix/cases/
  equation arrays with alignment), mfenced, menclose, mphantom
- Delimiter handling: paired/unpaired bracket detection via pre-serialization
  tree walk, lr() wrapping, abs/norm/floor/ceil shorthand with separator
  fallback to lr()
- Equation numbering: auto-numbered and \tag{} equations, numcases/
  subnumcases grid layout with per-row counters and labels
- Symbol mapping: 500+ Unicode → Typst symbol mappings including arrows,
  relations, accents, Greek, operators, geometry, and suits
- Escape handling: unified scanner for comma/semicolon/colon escaping
  in function calls, string literal skipping, bracket depth tracking
- Dual output: block (typst) and inline (typst_inline) variants
- Context menu integration for copying Typst math output

Architecture:
- Modular handler files: token-handlers, script-handlers,
  structural-handlers, table-handlers
- Shared utilities: common.ts, consts.ts, types.ts, escape-utils.ts,
  bracket-utils.ts, typst-symbol-map.ts
- Strict TypeScript typing throughout, no any casts
Extract serializeThousandSepChain into common.ts, replacing duplicated
chain logic in index.ts and structural-handlers.ts. Add tree mutation
order comment, forLatex/include_typst comments, and remove test logging.
…content spacing

- isDerivativePattern now checks for actual prime chars (′ ″ ‴) instead of any mo
- Move SCRIPT_NODE_KINDS and PRIME_CHARS to consts.ts, remove duplicate SCRIPT_PARENT_KINDS
- Revert post-content loop to needsTokenSeparator (fixes tau_(i,j)(t) regression)
- Add comment about prevNode in \left...\right delimiter handling
- Add test for f^{(n)}(a) → f^((n))(a) (TeXAtom derivative, no space)
- \left.\right\} now produces lr(mat(delim: #none, ...) \}) instead of losing the brace
- Mismatched pairs like \left[\right) use lr() wrapping instead of mat(delim:)
- Matched pairs (same char or standard open→close) still use compact mat(delim: ...)
- Add tests for \left.\right\} in align* and mismatched \left[\right) on array
Two MathJax patterns for \not:
- Overlay: TeXAtom(REL) > mpadded[width=0] > mtext(⧸) — detected in
  mrow/inferredMrow loops, next sibling wrapped in cancel()
- Combining char: U+0338 appended to mi/mo text — stripped and wrapped
  in cancel() in token handlers
- Fix cancel() loss on early returns in mo handler (multiword/namedOp)
- Add tests for \not 7,60 and \not k + \not q
…t-handling API

- Add two-pass escapeUnpairedBrackets (reuses scanBracketTokens + findUnpairedIndices)
- Integrate into escapeContentSeparators for all function-call arguments
- Integrate replaceUnpairedBrackets into escapeCasesSeparators for consistent API
- Remove manual replaceUnpairedBrackets calls from table-handlers
- Add unit tests for escape-utils
… lr()

The mrow handler incorrectly delegated to mtable when ANY child was a
table, even when other content (arrows, operators) sat alongside it.
Now hasTableChild is true only when mtable is the sole content child.
The mtable handler also checks parent delimiters only for sole-content
case, preventing double lr() wrapping.

Extract getContentChildren into common.ts and containsTable as a
standalone helper to eliminate duplication between the two handlers.
…call parsing

In Typst math mode, identifier( is parsed as a function call (see
typst/typst#7274). Insert a space before ( when the preceding token
is a multi-char name not in TYPST_BUILTIN_OPS (e.g. emptyset, sigma,
Gamma, psi). Single-char identifiers (f, g) and built-in operators
(sin, cos, ln, arg) keep no space.
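
A minimal sketch of the spacing rule described above, assuming a token is available as a plain string. `BUILTIN_OPS_SKETCH` is a hypothetical four-entry stand-in for `TYPST_BUILTIN_OPS`, and `needsSpaceBeforeParen` and `joinBeforeParen` are illustrative names, not functions from this PR.

```typescript
// Hypothetical subset; the real TYPST_BUILTIN_OPS set is larger.
const BUILTIN_OPS_SKETCH: Set<string> = new Set(["sin", "cos", "ln", "arg"]);

// True when "name(" would be parsed by Typst as a call to an unintended function.
function needsSpaceBeforeParen(prevToken: string): boolean {
  const isIdentifier = /^[A-Za-z][A-Za-z.]*$/.test(prevToken);
  if (!isIdentifier) return false;
  if (prevToken.length === 1) return false; // f(x), g(x) stay compact
  return !BUILTIN_OPS_SKETCH.has(prevToken); // sin(x) stays compact
}

// Joins a token with a following parenthesized group.
function joinBeforeParen(prevToken: string, parenGroup: string): string {
  return needsSpaceBeforeParen(prevToken)
    ? `${prevToken} ${parenGroup}`
    : prevToken + parenGroup;
}

console.log(joinBeforeParen("sigma", "(x)")); // multi-char, not built in
console.log(joinBeforeParen("sin", "(x)")); // built-in operator
console.log(joinBeforeParen("f", "(x)")); // single-char identifier
```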
…rsing

Add escapeColon to escapeContentSeparators so word: inside any Typst
function call becomes word : (space prevents named-arg syntax).
Apply escapeContentSeparators to abs(), norm(), floor(), ceil()
content which previously had no escaping.
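
The colon rule can be sketched with a single regular expression. `escapeColonSketch` is a hypothetical name, and this version assumes the content has no string literals to skip; the PR's scanner is described as handling those.

```typescript
// Inserts a space between an identifier and a directly following colon,
// so that "word:" is not read as a named argument inside a function call.
function escapeColonSketch(content: string): string {
  return content.replace(/([A-Za-z_][A-Za-z0-9_]*):/g, "$1 :");
}

console.log(`abs(${escapeColonSketch("g: K_0")})`);
```

Content that already has a space before the colon is left unchanged, since the pattern only matches a colon immediately after an identifier character.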
…content

- Extract resolveDelimiterMo helper to access texClass on delimiter nodes
- Reject ‖ pairing when opener has CLOSE texClass (surrounding pair context)
- Reject ‖ pairing when content contains PUNCT (comma between standalone ‖)
- Reject ‖ pairing spanning entire row when content has REL operator (=)
- Apply escapeContentSeparators to bare delimiter func-call content (norm,
  floor, ceil) to prevent commas/semicolons/colons breaking Typst parsing
- Add explicit isFuncCall flag to BARE_DELIM_PAIRS instead of endsWith('(')
- Add 4 test cases: standalone ‖, complex ‖ with comma/number/variable
…ith ; separators

Typst ignores \\ linebreaks inside mat() cells. When aligned/gathered
environments are nested inside a matrix or cases cell, convert them to
mat(delim: #none, ...) using ; row separators instead.

- Add isInsideMatrixCell() recursive parent-chain walker
- Wrap nested mat() in display() for block output to reset scriptlevel
- Propagate typst_inline (without display()) through cell/row pipeline
- Determine alignment from column usage: gathered→center, rl-pairs→right/left
- Extract buildMatExpr helper to deduplicate block/inline mat construction
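
As a rough illustration of the conversion above, the following sketch builds the `mat(delim: #none, ...)` form from already-serialized cells. The function names and the `string[][]` input are assumptions made for the example; the signature of the PR's `buildMatExpr` is not shown in this description.

```typescript
interface DualOutput {
  typst: string;
  typst_inline: string;
}

// Cells in a row are separated by ", " and rows by "; ".
function buildMatExprSketch(rows: string[][]): string {
  const body = rows.map((cells) => cells.join(", ")).join("; ");
  return `mat(delim: #none, ${body})`;
}

// Block output gets display() to reset scriptlevel; inline output does not.
function buildNestedMatSketch(rows: string[][]): DualOutput {
  const mat = buildMatExprSketch(rows);
  return { typst: `display(${mat})`, typst_inline: mat };
}

console.log(buildNestedMatSketch([["a", "b"], ["c", "d"]]));
```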
…gh lr()

- Wrap cases() and plain matrices in display() when inside a mat() cell
  to prevent Typst scriptlevel reduction (block only, not inline)
- Route eqnArrays with rowlines/columnlines through mat() format to
  preserve augment: #(hline/vline); add stroke: (dash: "dashed") when
  all separator lines are dashed
- Propagate typst_inline through structural-handlers lr() path by
  building parallel contentInline and extracting buildLrExpr() helper
- Extract computeAugment() and buildEqnArrayAsMat() helpers to
  deduplicate augment computation and eqnArray-as-mat construction
- Detect eqnArray-with-lines parents in isInsideMatrixCell()
- Cache isInsideMatrixCell() result to avoid redundant parent walks
- Use separate needsSpaceBetweenNodes() calls for block/inline content
…lose

Brackets inside these nodes are now paired independently from brackets
outside, preventing false pairing when content is split across Typst
function-call arguments (e.g. \sqrt( arg ) where ( and ) end up in
different scopes). Each child of a scope-boundary node is processed
as a separate pairing scope. SCOPE_BOUNDARIES set is module-level.
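
The effect of independent pairing scopes can be shown with a small stack-based sketch. `findUnpairedSketch` and `findUnpairedPerScope` are hypothetical names operating on plain string tokens; the PR's `markUnpairedBrackets` works on the MathML tree instead.

```typescript
const CLOSE_TO_OPEN: Map<string, string> = new Map([
  [")", "("],
  ["]", "["],
  ["}", "{"],
]);
const OPENERS: Set<string> = new Set(["(", "[", "{"]);

// Returns the indices of brackets that have no partner within one scope.
function findUnpairedSketch(tokens: string[]): number[] {
  const stack: number[] = []; // indices of openers still waiting for a close
  const unpaired: number[] = [];
  tokens.forEach((tok, i) => {
    if (OPENERS.has(tok)) {
      stack.push(i);
    } else if (CLOSE_TO_OPEN.has(tok)) {
      const top = stack.length > 0 ? stack[stack.length - 1] : -1;
      if (top >= 0 && tokens[top] === CLOSE_TO_OPEN.get(tok)) {
        stack.pop();
      } else {
        unpaired.push(i);
      }
    }
  });
  return unpaired.concat(stack).sort((a, b) => a - b);
}

// Each scope (for example, each child of a scope-boundary node) is paired alone.
function findUnpairedPerScope(scopes: string[][]): number[][] {
  return scopes.map((scope) => findUnpairedSketch(scope));
}

console.log(findUnpairedSketch(["(", "x", "y", ")"]));
console.log(findUnpairedPerScope([["(", "x"], ["y", ")"]]));
```

When the same tokens are split across two scopes, both brackets are reported as unpaired, which is the situation the serializer then has to escape.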
…mrows

- Detect \left.\aligned\right\} as cases(reverse: #true, ...) for
  eqnArray-like tables (displaystyle rows); regular arrays keep matrix form
- Add hasTableFirst in structural-handlers: \left\{ table extra \right.
  lets the table inherit { as cases(), extra content follows outside
- Add isFirstWithInvisibleClose in table-handlers so the table picks up
  the open delimiter from the parent mrow when close is invisible
- Track contentInline in the hasTableChild/hasTableFirst mrow branch so
  typst_inline propagates correctly when children return differing inline
- Add tests for reverse cases and cases() + stretch() patterns
- Digits before ( (e.g. .4() are no longer treated as function calls;
  only ASCII letters qualify (isFuncCallParen)

- When a supposed function-call ( has no matching ), backtrack so the
  for-loop re-scans the range and picks up any [, ], {, } inside
- Use non-whitespace check for spacing around symbol names (paren.l,
  bracket.r, etc.) instead of \w — fixes missing space after quoted
  strings ("л"paren.l) and other non-\w tokens
- Extend RE_WORD_CHAR, RE_WORD_DOT_END, RE_WORD_START with \p{L} for
  Unicode letter support
- Move RE_ASCII_LETTER, RE_TRAILING_WS, RE_LEADING_WS to consts.ts
- Add tests: unpaired brackets across matrix rows with digits, letters,
  real functions, and inner brackets inside failed function-call scans
escapeLrSemicolons now also escapes colons after identifiers (g: → g :),
matching the behavior already present in escapeCasesSeparators and
escapeContentSeparators. Without this, lr(g: K_0 ]) would be parsed
by Typst as a named argument.

Add tests for colon escaping in lr(), abs(), and general lr() paths.
MathJax splits \mathrm{टेक} into individual mi nodes per character,
breaking Devanagari/Arabic combining sequences. serializeCombiningMiChain
merges consecutive non-Latin mi nodes with the same mathvariant into a
single font-wrapped quoted string. Known math symbols (∂, ψ, ∅) are
excluded via typstSymbolMap lookup. Uses Unicode script properties
(\p{Script=Latin}) for robust Latin vs non-Latin classification.
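
A simplified sketch of the merging step, assuming the mi contents arrive as an array of single characters. `mergeNonLatinRuns` and `isNonLatinSketch` are hypothetical names; the mathvariant comparison and font wrapping described above are left out.

```typescript
// Built with the RegExp constructor so the Unicode property escapes are
// resolved at runtime.
const RE_LATIN_SKETCH = new RegExp("\\p{Script=Latin}", "u");
const RE_LETTER_OR_MARK = new RegExp("[\\p{L}\\p{M}]", "u");

// A letter or combining mark that does not belong to the Latin script.
function isNonLatinSketch(ch: string): boolean {
  return RE_LETTER_OR_MARK.test(ch) && !RE_LATIN_SKETCH.test(ch);
}

// Merges consecutive non-Latin characters into one quoted string.
// Characters in knownSymbols (a stand-in for a symbol-map lookup) stay bare.
function mergeNonLatinRuns(chars: string[], knownSymbols: Set<string>): string[] {
  const out: string[] = [];
  let run = "";
  const flush = (): void => {
    if (run.length > 0) {
      out.push(`"${run}"`);
      run = "";
    }
  };
  for (const ch of chars) {
    if (isNonLatinSketch(ch) && !knownSymbols.has(ch)) {
      run += ch;
    } else {
      flush();
      out.push(ch);
    }
  }
  flush();
  return out;
}

// Three code points of a Devanagari word become one string.
console.log(mergeNonLatinRuns(Array.from("\u091F\u0947\u0915"), new Set<string>()));
```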
… bases

Remove overline/underline from RE_SPECIAL_FN_CALL — they do not imply
below/above placement like overbrace/underbrace do. Add overbracket/
underbracket which were missing. Now \underset{...}{\underline{x}}
correctly produces limits(underline(x))_(...).
- escapeLrBrackets: escapes bare bracket chars matching the lr() delimiter
  type so Typst doesn't auto-scale inner brackets (e.g. \left[ [...] \right]
  → lr([ \[...\] ])). Only same-type brackets are escaped.
- isSyntaxParen: renamed from isFuncCallParen, now also skips _() and ^()
  script grouping parens in scanBracketTokens.
- Fix RE_SPECIAL_FN_CALL: remove overline/underline (they don't imply
  below/above placement), add overbracket/underbracket.
Wrap #box(stroke:...) and #circle(inset:...) with #align(center, ...)
for block display so they center like LaTeX \boxed and \enclose{circle}.
Inline variant remains unwrapped. Add integral.surf (\oiint), slash.o
(\oslash), lt.approx (\lessapprox), gt.approx (\gtrapprox) to symbol map.
Rewrite escapeUnbalancedParens to use scanBracketTokens + findUnpairedIndices
instead of single-pass scanExpression — handles both unbalanced ( and )
(previously only )). Add mover/munder to SCOPE_BOUNDARIES so brackets
inside accents don't pair with brackets outside. Remove dead
escapeUnbalancedCloseParen option from scanExpression.

Fixes \overline(x), \underline(x), \hat(x) producing unescaped parens.
Replace overline(")"content) / underline(")"content) with
overline(lr(\) content)) / underline(lr(\) content)) so the )
delimiter auto-scales via lr() instead of rendering at fixed size.
- \xcancel → cancel(cross: #true, ...) when both diagonal strikes present
- Script children (sub/sup) of msub/msup/msubsup are now separate scopes
  in markUnpairedBrackets, while base stays in parent scope — fixes
  \cancelto{5(y}x) where ( in script paired with ) outside
- safeFormatScript wrapper in script-handlers applies escapeUnbalancedParens
  to ^(…)/_(…) content; removes escape-utils import from common.ts
\underset and \overset create munder/mover without accentunder/accent
attributes — they must use the general limits() path, not accent handlers.
Previously \underset{\rightarrow}{r} produced attach(r, b: arrow.r)
instead of limits(r)_(arrow.r).
MathJax builds \longrightleftharpoons and \longleftrightarrows from
mover with harpoon/arrow pieces. Detect these patterns via
CONSTRUCTED_LONG_ARROWS map and emit single Typst symbols
(harpoons.rtlb, arrows.lr).

Flatten mover(munder(...), over) via unwrapToScriptNode so
\stackrel{k_1}{\underset{k_2}{...}} produces limits(base)_(k_2)^(k_1)
instead of nested limits(limits(base)_(k_2))^(k_1).
… notation matching

- menclose with border-side notation (left/right/top/bottom combos from
  \begin{array}{|l|}\hline) now generates #box(stroke: (...)) with per-side
  strokes instead of overline()/underline()
- Cap vline augment indices at actual column count to prevent out-of-bounds
  when column spec has more columns than data cells
- Refactor menclose notation checks from String.includes() to Set-based
  word-boundary safe matching via parseNotation()/hasNotation()
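
The motivation for Set-based matching is easy to reproduce: a substring test on the notation attribute gives false positives for values such as `roundedbox`. The sketch below assumes the notation attribute is a whitespace-separated list; the signatures of the PR's `parseNotation()` and `hasNotation()` are not shown here, so these are illustrative versions with a `Sketch` suffix.

```typescript
// Splits a menclose notation attribute into a set of whole words.
function parseNotationSketch(notation: string): Set<string> {
  return new Set(notation.trim().split(/\s+/).filter((w) => w.length > 0));
}

// Whole-word membership test.
function hasNotationSketch(notations: Set<string>, name: string): boolean {
  return notations.has(name);
}

const rounded = parseNotationSketch("roundedbox");
console.log("roundedbox".includes("box")); // substring test matches by accident
console.log(hasNotationSketch(rounded, "box")); // whole-word test does not
```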
…airs

serializeRange used needsTokenSeparator which lacks the script+bracket
spacing check. Switched to needsSpaceBetweenNodes so that e.g.
\|L_N^n(\Delta S)\|_\infty produces norm(L_N^n (Delta S)) with a space
before ( to prevent Typst from parsing n( as a function call.
…ntent

Narrow gathered-like detection so that gathered directly inside align*
(sole cell content) keeps \\-separated rows, while gathered with siblings
in an aligned cell becomes display(mat(delim: #none, ...)).

Also propagate typst_inline through eqnArray row-building so that
display() never leaks into the inline variant.
…ymbol)

\underleftarrow, \underrightarrow, \underleftrightarrow now produce
limits(...)_arrow.l etc. without extra parens around the symbol.
- Sync TYPST_MATH_OPERATORS / TYPST_BUILTIN_OPS, remove non-built-in ops
- Extract TEX_ATOM and MLABELEDTR constants into consts.ts
- Rename SCRIPT_KINDS → SUBSCRIPT_KINDS for clarity
- Extract escapeTypstString into common.ts, remove duplicate
- Add needsTokenSeparator in handleAll for regular children
- Add findTypstSymbol guard in big-delimiter pattern
- Add console.warn in getBigDelimInfo/resolveDelimiterMo catch blocks
- Fix duplicate "Pattern 6" → "Pattern 7" comment
- Fix close variable shadowing in mrow handler (→ closeMapped)
- scanExpression: string concat → parts.push() + join()
- Extract escapeAtPositions helper, use in 3 escape functions
- findUnpairedIndices: single-pass instead of two passes
- Add JSDoc to ANCESTOR_MAX_DEPTH and SHALLOW_TREE_MAX_DEPTH
- Add /pr-specs to .npmignore
- Add edge-case tests (empty, invalid, deep nesting, separators, etc.)
- Update PR spec
mfrac handler now checks the actual delimiter on the parent mrow
via unwrapToMoText (TeXAtom > inferredMrow > mo). Only ) produces
binom(); { and [ produce mat(delim: "{"/"[", ...; ...).
…e tests

- Return empty typst with error field for merror nodes instead of serializing
  error text into typstmath/typstmath_inline (findMerror in toTypstData)
- Clean up tree mutations after serialization using removeProperty() instead
  of setProperty(undefined) or any-cast delete
- Deduplicate TYPST_MATH_OPERATORS: single source in consts.ts, imported by
  token-handlers.ts; TYPST_BUILTIN_OPS built as union with TYPST_MATH_FUNCTIONS
- Remove unused optionTypst parameter and type toTypstData with MathNode
- Rename SUBSCRIPT_KINDS → IDOTSINT_SCRIPT_KINDS for clarity
- Make RE_TAG_EXTRACT_G local to extractTagFromConditionCell (no lastIndex leak)
- Remove console.warn from catch blocks (silent error handling for library)
- Extract BOX_STROKE/BOX_INSET constants for boxed/circle/border styling
- Remove extra parens wrapping toTypstData arrow function
- Add edge-case tests: 15-level frac, unbalanced delimiters, unknown command,
  escape injection; error tests verify error field presence
- Update PR spec with error handling section
…ypstData

- README: add typst to format list, HTML output tags, JSON output examples,
  include_typst in all options blocks, conversion examples with real object format
- Changelog: add [2.0.39] entry for Typst math format
- index.ts: extract ITypstConvertResult interface, type TexConvertToTypstData
Underscores are invalid in HTML tag names. Renamed to hyphenated form
in tag generation, parsing, and documentation.
- Add include_typst: true to node-examples (math, tabular, tabular_include/not_include_sub_math)
- Add typst checkbox and option to react-app form.jsx
- Fix react-app webpack 5 build: add react-app-rewired + path-browserify
  polyfill for postcss's path dependency
- Add MMD_TYPES const (abstract, theorem, proof, align) to consts.ts
- Tag tokens with token.meta.mmd_type in begin-align, block-rule, mdPluginText
- Export MMD_TYPES from index.tsx for external consumers
…hML→Typst

- Move AsciiMath conversion from render to parse stage via convertAsciiMathToHtml
  (consistent with convertMathToHtml pattern for LaTeX/MathML tokens)
- Add TypesetAsciiMath method returning {html, data}; keep AsciiMathToSvg
  as deprecated backward-compatible wrapper returning string
- Add MathMLConvertToTypstData for direct MathML→Typst conversion
- Add include_typst support to OuterDataMathMl and OuterDataAscii pipelines
- Add tests for MathML→Typst, TypesetAsciiMath, full pipeline, backward compat
Semicolons in LaTeX math are punctuation, but in Typst math ; inside ()
creates arrays/vectors. Escape ; → \; in the mo handler (token-handlers.ts)
as the single source of truth — this fixes all contexts: nested builtins
sin(cos(a;b)), operatorname op("mf")(a;b), subscripts x_(a;b), and any
future cases. Remove now-dead tryBuiltinOpParensPattern and helpers.
… backslash handling

- mo handler: bare `"` now emits `quote.double` (Typst symbol name) instead of
  `\"` which broke quote pairing across the expression
- typst-symbol-map: updated `"` entry to `quote.double`
- mtext handler: targeted backslash escaping — only double `\` before `"` and at
  end-of-string; leaves MathJax-preserved raw LaTeX (e.g. `\geq` in numcases) intact
- Added 16 edge-case tests (quote.double in scripts/frac/sqrt/matrix/cases,
  multi-special-char combos, mtext backslash edges)
- Updated PR spec to document new escaping approach
MathJax can produce msub/msup with comma as the base (e.g. LaTeX ,_{-}).
A bare comma in Typst math is a separator, so ,_- inside lr() with
non-standard delimiters (chevron.l, \(, |) caused "unexpected underscore"
compilation errors. Wrapping in quotes makes it a text atom: ","_-

Added safeScriptBase() helper in script-handlers.ts, applied to msub,
msup, and msubsup handlers. Added 6 test cases including the full
real-world formula that triggered the bug.
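
A minimal sketch of the comma case, assuming the base has already been serialized to a string. The signature is hypothetical; the actual `safeScriptBase()` may take a node and may cover more bases than a comma.

```typescript
// A bare comma is a separator in Typst math, so it cannot carry a script.
// Quoting it turns it into a text atom.
function safeScriptBaseSketch(base: string): string {
  return base === "," ? '","' : base;
}

console.log(`${safeScriptBaseSketch(",")}_-`);
console.log(`${safeScriptBaseSketch("x")}_-`);
```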
\not( produced cancel((), \not[ produced cancel([) — unbalanced
grouping syntax broke Typst compilation. Added wrapCancel() helper
with escapeUnbalancedParens + escapeContentSeparators. Applied in
token-handlers, structural-handlers, and index.ts.
Introduce a typed intermediate representation (TypstMathNode) between
MathJax MathML tree traversal and Typst string output. All 22 handlers
return typed AST nodes; escaping is centralized in a single serializer
with a context registry. raw() usage reduced from 56 call sites to 1
static constant.

Architecture:
- 16 AST node types with const enum discriminants
- Discriminated unions: ArgValueType (8), FuncArgKind (2)
- Escape context registry (7 contexts) maps function names to escaping rules
- AST dispatcher with 7 documented pattern priorities for inferred mrow
- Per-cell bracket scoping (mtr/mlabeledtr/mstyle in SCOPE_BOUNDARIES)
  prevents orphaned brackets in aligned/eqnarray cross-row Typst output

Delimiter handling:
- Bare \{/\} escaped as \{/\} (paired) or brace.l/brace.r (unpaired in tables)
- Nested chevron pairing via nesting depth tracking
- mstyle in SCOPE_BOUNDARIES (fixes \Varangle hidden bracket)
- Opening bracket separated from script base: [^circ -> [""^(compose)
- One-sided lr() preserved for \left...\right. with escaped delimiters

File structure (src/mathjax/serialized-typst/ast/):
  types.ts, builders.ts, serialize.ts, serialize-context.ts,
  dispatcher.ts, token-handlers.ts, script-handlers.ts,
  structural-handlers.ts, table-handlers.ts, table-builders.ts,
  table-helpers.ts

924 tests passing.
@OlgaRedozubova OlgaRedozubova self-assigned this Apr 7, 2026
matchBraceAnnotation matched already-annotated braces and added extra
args: \underset{x}{\underbrace{y}_{z}} produced underbrace(y, z, x).
buildLimitBase skipped limits() for annotated braces via isSpecialFnCall,
placing underset text as right-side subscript instead of below.

- matchBraceAnnotation: skip if brace already has annotation (2+ positional args)
- buildLimitBase: wrap annotated braces in limits() for correct placement
- Add tests for underset+underbrace, overset+overbrace, nested braces, chemistry

928 tests passing.
- Add escapeTypstContent() in common.ts: escapes * _ ` $ # < @ ~ [ ] { }
  and comment starts // /* for Typst content-mode [...] blocks
- Add sanitizeTypstLabel() in common.ts: encodes invalid label chars as
  _XX hex (e.g. space→_20, <→_3C) preserving uniqueness and <label> syntax
- Apply escapeTypstContent in serializeTagContent and numcases tag builder
- Apply sanitizeTypstLabel in LabelNode serializer and buildFigureTag/
  buildAutoTagWithLabel
- Update RE_CONTENT_SPECIAL to include all Typst content-mode specials
- Add tests for tag with URL //, tag with # ~, labels with special chars

932 tests passing.
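
The `_XX` hex encoding can be sketched as follows. The set of characters treated as valid (ASCII letters, digits, `.`, `:`, `-`) is an assumption for this example, as is the choice to encode `_` itself so that distinct inputs cannot collide; the PR's own rules may differ.

```typescript
// Replaces every character outside the assumed-valid set with "_" followed
// by the uppercase hex value of its UTF-16 code unit (at least two digits).
function sanitizeLabelSketch(key: string): string {
  return key.replace(/[^A-Za-z0-9.:-]/g, (ch) => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "_" + (hex.length < 2 ? "0" + hex : hex);
  });
}

console.log(`<${sanitizeLabelSketch("eq one")}>`);
console.log(`<${sanitizeLabelSketch("a<b")}>`);
```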
… content-mode specials

Labels:
- Pass MathJax tags.labels map through SerializedTypstVisitor → ITypstMathSerializer
  → getLabelKey, avoiding tree walk to set properties
- getLabelKey falls back to labels map when data-label-key is absent (bare display
  math where MathJax getTag() is never called)
- Deduplicate tagLabels extraction in OuterData functions
- sanitizeTypstLabel encodes invalid chars as _XX hex for Typst <label> syntax
- Add test for \tag + \label on bare display math

Content escaping:
- Add escapeTypstContent() for Typst content-mode [...] blocks: escapes
  * _ ` $ # < @ ~ [ ] { } and comment starts // /*
- Update RE_CONTENT_SPECIAL to include all Typst content-mode specials
- Apply in serializeTagContent and numcases tag builder

Brace annotation:
- matchBraceAnnotation skips already-annotated braces (2+ positional args)
- buildLimitBase wraps annotated braces in limits() for underset/overset
- Add tests for underset+underbrace, overset+overbrace, chemistry

933 tests passing.
…ath \label

- Add sanitizeLabel() in labels.ts: encodes invalid label chars as _XX hex
- Add ILabel.sanitizedKey field, auto-populated in addIntoLabelsList
- Pass MathJax tags.labels through SerializedTypstVisitor → ITypstMathSerializer
  → getLabelKey fallback for bare display math where getTag() isn't called
- Remove sanitizeTypstLabel re-export, use sanitizeLabel directly
- Deduplicate tagLabels extraction in OuterData functions
- Revert mathjax.ts getTag patch to original (labels map handles bare math)

933 tests passing.
- isDuplicateLabel uses getLabelByKeyFromLabelsList: if key already in
  global labelsList (added by previous equation), skip label emission
  to prevent duplicate <label> errors in Typst
- addIntoLabelsList auto-populates sanitizedKey via sanitizeLabel()
- sanitizeLabel() moved to labels.ts as shared utility (not Typst-specific)
- Remove sanitizeTypstLabel re-export, import sanitizeLabel directly

933 tests passing.
… in Typst output

- Extend fence balance check in tryBareDelimiterPattern to track all bracket
  types ([], ⟨⟩, ⌊⌋, ⌈⌉), preventing |...| from swallowing content across
  unmatched ⟩ or [ in bra-ket notation
- Skip unpaired brackets (marked by markUnpairedBrackets) in fence balance
  to avoid false rejections
- Use inline variants when code-mode blocks (#align, #box) have siblings
  in inferred mrow or eqnArray cells, keeping math flow intact
- Add space before [ and { after FuncCall/Delimited nodes to prevent Typst
  trailing-content-block parsing (frac(1,2)[x] → frac(1,2) [x])
- Escape all ASCII bracket types inside lr() with non-bracket delimiters
  (|, ‖, ⟨, etc.) to prevent unintended auto-scaling
- Escape colon after any non-space char (not just word chars) to prevent
  H_+: from being parsed as a named argument in mat()
- Add 17 regression tests for bra-ket, bordered arrays, boxed in aligned,
  and trailing content block edge cases
\Bigg[ \bigg( \Big( \big( ... \big) \Big) \bigg) \Bigg] was incorrectly
pairing \Bigg[ with \big) (the first CLOSE), ignoring inner nesting.
Track open/close depth so inner \bigg( ... \bigg) pairs are skipped
and \Bigg[ correctly matches \Bigg].
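
The depth tracking described above amounts to the following loop. `DelimToken` and `findMatchingClose` are hypothetical; the real code works with MathJax nodes and their texClass rather than a flat token list.

```typescript
interface DelimToken {
  text: string;
  role: "open" | "close" | "other";
}

// Returns the index of the close that matches tokens[openIndex], or -1.
function findMatchingClose(tokens: DelimToken[], openIndex: number): number {
  let depth = 0; // number of inner openers not yet closed
  for (let i = openIndex + 1; i < tokens.length; i++) {
    const role = tokens[i].role;
    if (role === "open") {
      depth++;
    } else if (role === "close") {
      if (depth === 0) return i;
      depth--; // this close belongs to an inner pair
    }
  }
  return -1;
}

// Shape of \Bigg[ \bigg( \Big( x \Big) \bigg) \Bigg]
const bigTokens: DelimToken[] = [
  { text: "[", role: "open" },
  { text: "(", role: "open" },
  { text: "(", role: "open" },
  { text: "x", role: "other" },
  { text: ")", role: "close" },
  { text: ")", role: "close" },
  { text: "]", role: "close" },
];
console.log(findMatchingClose(bigTokens, 0));
```

Without the depth counter, the search would stop at the first close at index 4, which is the incorrect pairing this commit fixes.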
Characters like ṭ (U+1E6D), é, ç, ö, č, ά decompose into base +
combining mark in NFD. Typst math mode cannot shape them as single
glyphs, producing "yielded more than one glyph" errors. Detect via
NFD normalization length and wrap in text() ("ṭ") instead of bare
symbol. Standard Latin and Greek (a, α) are unaffected.
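
The NFD length check can be sketched in a few lines. `decomposesInNfd` and `wrapIfDecomposable` are hypothetical names, and the quoted-string output is a simplification of whatever text wrapping the handler actually emits.

```typescript
// True when NFD normalization splits the character into more code points,
// i.e. a base letter plus one or more combining marks.
function decomposesInNfd(ch: string): boolean {
  return Array.from(ch.normalize("NFD")).length > Array.from(ch).length;
}

// Emits a quoted text string for decomposable characters, a bare symbol otherwise.
function wrapIfDecomposable(ch: string): string {
  return decomposesInNfd(ch) ? `"${ch}"` : ch;
}

console.log(wrapIfDecomposable("\u1E6D")); // t with dot below
console.log(wrapIfDecomposable("a"));
console.log(wrapIfDecomposable("\u03B1")); // Greek alpha
```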