Skip to content

Add multi-index apply_time_correction with einsum contraction (variant)#104

Open
jon-proximafusion wants to merge 1 commit into
developfrom
d1s-multi-index-einsum
Open

Add multi-index apply_time_correction with einsum contraction (variant)#104
jon-proximafusion wants to merge 1 commit into
developfrom
d1s-multi-index-einsum

Conversation

@jon-proximafusion

@jon-proximafusion jon-proximafusion commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Summary

This is a faster variant of the multi-index apply_time_correction from #103. Like #103, it adds support for an iterable of time indices (returning a list of derived tallies, one per index) and reads/reshapes the tally arrays once instead of per call.

The difference is in the summed (sum_nuclides=True) path: the TCF-weighted reduction over the parent-nuclide axis is evaluated as an einsum contraction (BLAS-backed) rather than a Python-level broadcast-multiply-and-reduce. Variances combine in quadrature, so std_dev contracts the squared values.

Trade-off vs #103

The einsum contraction sums in a different order than the explicit broadcast-and-reduce, so results match develop to machine precision (~1e-15 relative) rather than bit-for-bit. This is far below Monte Carlo statistical noise and well within regression-test tolerances, but it is the reason this is offered as a separate PR.

The two PRs are mutually exclusive; only one should be merged.

Behavior

Identical public API to #103:

  • Scalar index → single Tally (unchanged API).
  • Iterable indexlist[Tally], one per index, in the order requested.
  • When sum_nuclides=True, the derived result leaves sum/sum_sq unset (the public accessors return None for any derived tally, matching develop). Keeping the per-nuclide arrays would only waste two full-array multiplies per index and would be shaped inconsistently with the popped ParentNuclideFilter, breaking Tally.sparse.
  • Contracting one index at a time keeps each result bit-for-bit identical to a scalar call for that same index (the ~1e-15 difference is relative to develop, not between scalar and multi-index calls here).

Testing

Adds test_apply_time_correction_multi_index, which checks each element of a multi-index call against the corresponding scalar call (mean, std_dev, filters, sum/sum_sq), that unordered/partial index sequences are honored, that scalar input still returns a single Tally, and that the original tally is left unmodified.

Fixes # (issue)

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

Variant of the multi-index apply_time_correction that evaluates the summed
TCF-weighted sum over the parent-nuclide axis with an einsum contraction
instead of a broadcast-multiply-and-reduce. This is substantially faster for
large mesh tallies but shifts mean/std_dev by ~1e-15 relative (floating-point
summation order), so results match develop to machine precision rather than
bit-for-bit -- well below Monte Carlo statistical noise and regression-test
tolerances.

Same multi-index API and sum/sum_sq handling as the bitwise variant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants