Skip to content

ENH: Rewrite ArrayCalculator filter with new expression engine#1567

Draft
joeykleingers wants to merge 19 commits intoBlueQuartzSoftware:developfrom
joeykleingers:worktree-ArrayCalculatorRewrite
Draft

ENH: Rewrite ArrayCalculator filter with new expression engine#1567
joeykleingers wants to merge 19 commits intoBlueQuartzSoftware:developfrom
joeykleingers:worktree-ArrayCalculatorRewrite

Conversation

@joeykleingers
Copy link
Copy Markdown
Contributor

@joeykleingers joeykleingers commented Mar 23, 2026

Summary

Complete rewrite of the ArrayCalculator filter, replacing ~65 utility class files with a consolidated expression engine in a single algorithm file pair.

What's better about the new version

  • Dramatically simpler codebase: ~65 separate operator/utility class files (each with .hpp/.cpp) consolidated into ArrayCalculator.hpp/cpp. Adding a new operator is one line in a registry table instead of two new files.
  • Robust hand-written tokenizer replacing the fragile regex-based parser that was difficult to maintain and debug.
  • Static operator registry table with all 23 operators defined as data, replacing 30+ class files with deep inheritance hierarchies where the actual logic was a one-line lambda.
  • Clean shunting-yard RPN evaluator with proper precedence, associativity, and trig mode handling.
  • 7 bugs fixed from the original implementation: null-check-after-dereference, off-by-one in component extraction, dead code, non-unique scratch names, and more.
  • All index variables use usize — no integer overflow risk with large arrays.
  • CalculatorParameter changed from MutableDataParameter to ValueParameter — removes the framework-rendered data path selector that is no longer needed.

New features

  • Arrays from anywhere in the DataStructure: No longer restricted to a single AttributeMatrix. Bare array names are auto-searched across the entire DataStructure. If ambiguous, users can disambiguate with quoted full paths (e.g., "CellData/Confidence Index").
  • Selected group as priority hint: When a selected_group is present (from old pipelines), it takes priority for bare name resolution, ensuring backward compatibility.
  • Built-in constants: pi and e resolve to std::numbers::pi and std::numbers::e with full double precision.
  • Modulo operator: % for std::fmod(a, b).
  • Tuple+component indexing: Array[T, C] extracts a single scalar value at tuple T, component C. Broadcasts like a literal when combined with arrays.
  • Sub-expression component access: (ArrayA + ArrayB)[0] extracts component 0 from the result of a computed expression.
  • Sub-expression tuple+component access: (ArrayA + ArrayB)[T, C] extracts a single scalar value at tuple T, component C from a computed expression result. Broadcasts like a literal.
  • Quoted full-path references: "DataContainer/CellData/Array Name" resolves directly by DataPath.

Backward compatibility

  • All existing unit tests pass unchanged (3689 assertions) — this was the hard gate before any old code was removed.
  • Old pipeline JSON loads and executes identically: The selected_group key is still read from pipeline JSON. When present, it acts as a priority hint for bare name resolution, producing the same behavior as the old filter.
  • FromSIMPLJson conversion still works: Legacy SIMPL pipelines convert correctly.
  • Error code numeric values preserved: External code checking specific error codes continues to work.
  • parametersVersion bumped to 2: fromJsonImpl handles both version 1 and version 2 JSON seamlessly.
  • Backward-compatible pipeline tested via nxrunner with non-empty selected_group — all steps execute successfully.

Known issue: TBB crash on exit in Debug builds

These changes surface a pre-existing crash-on-exit in Debug builds only. The crash occurs in a TBB worker thread (tbb::detail::r1::rml::private_worker::run()) jumping to unmapped memory during exit() / __cxa_finalize_ranges. It does not occur in Release builds.

Root cause: Debug builds link libtbb_debug.12.4.dylib (from vcpkg/simplnx), but VTK is built in Release and links libtbb.12.4.dylib (release). Both TBB runtimes are loaded into the same process simultaneously. During shutdown, the release TBB's worker threads try to execute tasks whose code has been unloaded, crashing at an invalid address.

How to fix: Ensure VTK and simplnx link the same TBB variant. Options:

  1. Build VTK in Debug mode for Debug builds so it also links libtbb_debug.
  2. Configure the Debug build to use release TBB (libtbb instead of libtbb_debug) so it matches VTK.
  3. Add a TBB finalization call (e.g., oneapi::tbb::finalize()) in DREAM3D-NX's shutdown path to drain worker threads before libraries are unloaded.

This issue is not caused by the ArrayCalculator changes — it existed before but was not triggered consistently. The code changes in this PR shift the shared library layout enough to make the timing race reproducible.

Companion PR

This PR must be merged together with the DREAM3D-NX companion PR BlueQuartzSoftware/DREAM3DNX#1087 — the DREAM3D-NX widget changes depend on the CalculatorParameter base class change (MutableDataParameterValueParameter) and the new engine API.

Test Plan

  • All 8 existing + new ArrayCalculatorFilter test cases pass (100%)
  • Backward compatibility pipeline tested via nxrunner
  • DREAM3D-NX Release build compiles and runs
  • No crash on exit in Release builds

Signed-off-by: Joey Kleingers joey.kleingers@bluequartz.net

@joeykleingers joeykleingers added the enhancement New feature or request label Mar 23, 2026
@joeykleingers joeykleingers force-pushed the worktree-ArrayCalculatorRewrite branch from 0f6e3d6 to 272aa73 Compare March 23, 2026 16:52
@imikejackson imikejackson force-pushed the worktree-ArrayCalculatorRewrite branch 2 times, most recently from 0b8317f to 35231a3 Compare April 7, 2026 23:09
joeykleingers and others added 19 commits April 7, 2026 19:15
Rewrite ArrayCalculator.hpp with new enum classes (CalculatorErrorCode,
CalculatorWarningCode, TokenType), value structs (Token, OperatorDef,
CalcValue, RpnItem), parser class (ArrayCalculatorParser), and algorithm
class (ArrayCalculator). The .cpp provides stub implementations so the
algorithm files compile independently; ArrayCalculatorFilter.cpp will
break until it is updated to use the new API in the next task.
…it tests

Single-pass character scanner that produces tokens for numbers, identifiers,
quoted strings, operators, brackets, and commas with position tracking.
…lator

Populate getOperatorRegistry() with the complete table of binary infix
operators (+, -, *, /, %, ^), unary math functions (abs, sqrt, ceil,
floor, exp, ln, log10), trig functions (sin, cos, tan, asin, acos, atan),
and binary functions (log, root, min, max).  log10 is placed before log
in the vector so prefix-based identifier matching resolves correctly.
Add the core parsing logic that converts tokens into typed ParsedItems
ready for shunting-yard conversion. Includes multi-word identifier
merging, identifier resolution with priority chain (operator > constant
> selected group array > unique DataStructure array), bracket indexing
for component and tuple+component extraction, minus sign disambiguation
(unary negative vs binary subtraction), WrapFunctionArguments for
multi-arg functions, and validation (matched parentheses, consistent
tuple counts and component shapes).
…Calculator

Add the final engine components to complete the ArrayCalculator rewrite:
- Shunting-yard algorithm at the end of parse() converts ParsedItems to RPN
- RPN stack evaluator in evaluateInto() with unary/binary ops, scalar
  broadcasting, trig angle unit conversion, and component extraction
- CopyResultFunctor for type-dispatched output with scalar fill support
- ArrayCalculator::operator()() wires the parser to the filter algorithm
Relax CalculatorParameter::validate() to accept empty or non-existent
selected groups (the path is only a resolution hint). Rewrite
preflightImpl() to use ArrayCalculatorParser::parseAndValidate() and
executeImpl() to delegate to the ArrayCalculator algorithm class. Remove
old ICalculatorArray include and bump parametersVersion to 2.
…yCalculator

Replace CalculatorItem:: enum references with CalculatorErrorCode:: and
CalculatorWarningCode:: in ArrayCalculatorTest.cpp.  Remove the old
CalculatorItem.hpp include since the test now uses the new
ArrayCalculator.hpp header exclusively.

Add validation in the parser to emit the same legacy error codes:
- OperatorNoOpeningParen / OperatorNoClosingParen for bare functions
- OperatorNoLeftValue / OperatorNoRightValue for binary operators
- TooManyArguments / NotEnoughArguments for function argument counts
- NoPrecedingUnaryOperator for commas in non-function parens
- AmbiguousNameWarning for operator symbols matching array names
…ameter

The selected group is no longer a required data path -- it is just a
resolution hint. Switching to ValueParameter removes the framework-rendered
data path selector from the UI.
Signed-off-by: Joey Kleingers <joey.kleingers@bluequartz.net>
Adds TupleComponentExtract as a new RPN item type that extracts a single
scalar value from a computed sub-expression result at a specific tuple
and component index. Produces a Number that broadcasts like a literal.

Signed-off-by: Joey Kleingers <joey.kleingers@bluequartz.net>
Signed-off-by: Michael Jackson <mike.jackson@bluequartz.net>
Specifies the CalcBuffer RAII sentinel pattern for eliminating
unnecessary temp DataArray allocations in the ArrayCalculator evaluator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7 tasks: CalcBuffer RAII class, data-free parser, evaluator rewrite
with CalcBuffer stack, OutputDirect optimization, and full test
verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce CalcBuffer, a move-only RAII sentinel that wraps Float64Array
references (borrowed for zero-copy, owned for temps with automatic
cleanup via DataStructure::removeData()). This eliminates unnecessary
memory allocations:

- Parser is now data-free: produces RPN items with DataPath/scalar
  metadata instead of allocating temporary Float64Arrays. The
  m_TempDataStructure, m_IsPreflight, createScalarInTemp(), and
  copyArrayToTemp() members are removed from the parser class.
- Evaluator uses a CalcBuffer stack with RAII cleanup: intermediates
  are freed the instant they are consumed by an operator.
- Float64 input arrays are borrowed (zero-copy) instead of copied.
- The last RPN operator writes directly into the output DataArray
  when the output type is float64 (OutputDirect optimization).
- Array[C] bracket indexing now emits ArrayRef + ComponentExtract
  RPN items (unified with sub-expression indexing) instead of
  extracting component data during parsing.

All 3,629 existing test assertions pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fallback

Replace silent no-ops with std::runtime_error throws when write() or
fill() is called on a read-only Borrowed CalcBuffer. Replace null
pointer dereference fallback in array() with a throw.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@imikejackson imikejackson force-pushed the worktree-ArrayCalculatorRewrite branch from 35231a3 to faf7d11 Compare April 7, 2026 23:15
@joeykleingers joeykleingers marked this pull request as draft April 8, 2026 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants