Add CeedOperatorCompositeSetSequential and fix edge cases in CUDA gen at points assembly by zatkins-dev · Pull Request #1930 · CEED/libCEED

zatkins-dev · 2026-02-18T21:15:32Z

These are changes needed to support functionality in Ratel (https://gitlab.com/micromorph/ratel/-/merge_requests/1213).

First, I need to be able to turn off the parallel sub-operator functionality.

The particular use case is that I need to project a quantity from different mesh elements to the same set of points. The points here are "shared" between the two sides - they have the same quadrature weights and areas, etc. I then need to use both of those at-points vectors as inputs to the "real" QFunctions which use each of the interpolated values to compute the physics at quadrature points. These "real" QFunctions then interpolate back to each of the two sets of mesh elements. That's a use-case that's fundamentally incompatible with the way that our at points operators work, which group points by which element they are inside of. We strictly need 2 "setup" operators (one for each set of mesh elements) that just interp to points, then 2 "output" operators (again, one for each set of mesh elements) to compute the physics and interp-transpose back to the mesh nodes. There is never a situation where I would only run the "setup" or "output" -- they are explicitly coupled. But, the math is perfectly well-defined if that sequence is allowed.
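The ordering requirement described above can be sketched with a small stand-alone model. This is not libCEED code: `ToyComposite`, `ToyContext`, and every `Toy*` function are hypothetical names invented for illustration, and the reverse-order branch is only a stand-in for "no ordering guarantee" (the real non-sequential path runs sub-operators in parallel). The sketch assumes two "setup" stages that fill shared at-points buffers and two "output" stages that read them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_POINTS 4
#define TOY_MAX_SUB_OPS 8

// Shared at-points storage: written by the "setup" stages, read by the "output" stages
typedef struct {
  double u_a[TOY_NUM_POINTS];  // values interpolated from mesh element set A
  double u_b[TOY_NUM_POINTS];  // values interpolated from mesh element set B
  double out_a;                // stand-in for the result returned to the nodes of set A
  double out_b;                // stand-in for the result returned to the nodes of set B
} ToyContext;

typedef void (*ToySubOperator)(ToyContext *ctx);

// Hypothetical composite: a list of sub-operators plus a "sequential" flag
typedef struct {
  ToySubOperator sub_ops[TOY_MAX_SUB_OPS];
  size_t         num_sub_ops;
  bool           is_sequential;
} ToyComposite;

void ToyCompositeAddSub(ToyComposite *op, ToySubOperator sub_op) {
  assert(op->num_sub_ops < TOY_MAX_SUB_OPS);
  op->sub_ops[op->num_sub_ops++] = sub_op;
}

void ToyCompositeApply(const ToyComposite *op, ToyContext *ctx) {
  if (op->is_sequential) {
    // Sequential: sub-operators run in the order they were added
    for (size_t i = 0; i < op->num_sub_ops; i++) op->sub_ops[i](ctx);
  } else {
    // Stand-in for "no ordering guarantee": run in reverse to expose the dependency
    for (size_t i = op->num_sub_ops; i > 0; i--) op->sub_ops[i - 1](ctx);
  }
}

// "Setup" stages: interpolate each side to the shared points (constants here)
void ToySetupA(ToyContext *ctx) {
  for (int p = 0; p < TOY_NUM_POINTS; p++) ctx->u_a[p] = 1.0;
}
void ToySetupB(ToyContext *ctx) {
  for (int p = 0; p < TOY_NUM_POINTS; p++) ctx->u_b[p] = 2.0;
}

// "Output" stages: combine both interpolated values and reduce back to each side
void ToyOutputA(ToyContext *ctx) {
  ctx->out_a = 0.0;
  for (int p = 0; p < TOY_NUM_POINTS; p++) ctx->out_a += ctx->u_a[p] * ctx->u_b[p];
}
void ToyOutputB(ToyContext *ctx) {
  ctx->out_b = 0.0;
  for (int p = 0; p < TOY_NUM_POINTS; p++) ctx->out_b += ctx->u_a[p] + ctx->u_b[p];
}

// Builds setup-then-output and returns the result for side A
double ToyRun(bool is_sequential) {
  ToyContext   ctx = {0};
  ToyComposite op  = {.num_sub_ops = 0, .is_sequential = is_sequential};

  ToyCompositeAddSub(&op, ToySetupA);
  ToyCompositeAddSub(&op, ToySetupB);
  ToyCompositeAddSub(&op, ToyOutputA);
  ToyCompositeAddSub(&op, ToyOutputB);
  ToyCompositeApply(&op, &ctx);
  return ctx.out_a;
}
```

In this model `ToyRun(true)` sees both point buffers filled before the output stages read them, while `ToyRun(false)` reads buffers that are still zero. That is the sense in which the "setup" and "output" sub-operators are coupled and only make sense as an ordered sequence.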

Second, the linear diagonal assembly (and possibly the full assembly as well) was trying to write to passive outputs (particularly those with CeedEvalNone), which is incorrect. This fixes that by adding an additional flag to the QFunction builder function.

Specifically, previously the point loop would have something like this:

        // EvalMode: none
        const CeedInt comp_stride_out_1 = 1;
        WritePoint<num_comp_out_1, comp_stride_out_1, max_num_points>(data, elem, i, points.num_per_elem[elem], indices.outputs[1], r_s_out_1, d_out_1);

Here, d_out_1 is a passive (CeedEvalNone) output. We should instead just not include inactive outputs in the QFunction build routine.
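The idea of leaving passive outputs out of the generated point loop can be sketched as follows. This is a stand-alone toy, not the cuda-gen builder: `ToyOutputField`, `ToyBuildPointWrites`, the `skip_passive` flag, and the emitted text are all hypothetical, and the real builder's flag name and signature are not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical description of one QFunction output field
typedef struct {
  bool is_active;  // true if the field maps to the operator's active output vector
} ToyOutputField;

// Emits one placeholder "write" line per output into `code`.
// When `skip_passive` is set (the assembly case in this sketch), passive outputs are left out.
// Returns the number of write lines emitted.
int ToyBuildPointWrites(const ToyOutputField *fields, int num_fields, bool skip_passive, char *code, size_t code_len) {
  int    num_emitted = 0;
  size_t used        = 0;

  assert(code_len > 0);
  code[0] = '\0';
  for (int i = 0; i < num_fields; i++) {
    if (skip_passive && !fields[i].is_active) continue;  // never touch passive outputs
    int n = snprintf(code + used, code_len - used, "write r_s_out_%d -> d_out_%d\n", i, i);
    if (n < 0 || (size_t)n >= code_len - used) break;    // stop if the buffer is full
    used += (size_t)n;
    num_emitted++;
  }
  return num_emitted;
}

// Returns true if the emitted code mentions `name`
bool ToyCodeMentions(const char *code, const char *name) { return strstr(code, name) != NULL; }
```

With two outputs where index 1 is passive, building with `skip_passive = true` emits a single write and no reference to `d_out_1`, which mirrors the behavior this change is after for assembly.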

jrwrigh · 2026-02-18T21:39:47Z

> First, I need to be able to turn off the parallel sub-operator functionality.

Could you not implement that via calls to separate CeedOperators? I'm not sure how much I like adding the backdoor here (though admittedly it's just a gut feeling).

I guess part of the dislike is that now the order of CeedOperator additions gives an implicit notion of ordering. Granted, CeedQFunctionAddInput() already has that implicit notion of ordering, so it's not unprecedented.

backends/cuda-gen/ceed-cuda-gen-operator.c

jeremylt · 2026-02-18T21:44:14Z

Can you outline for posterity here why non-sequential is important? Also probably should be in the docs why you might need to do this

jeremylt · 2026-02-18T21:46:35Z

Codecov is correct - this should be tested, if we can cook a representative reason why we would want this

interface/ceed-operator.c

zatkins-dev · 2026-02-18T22:38:41Z

> > First, I need to be able to turn off the parallel sub-operator functionality.
>
> Could you not implement that via calls to separate CeedOperators? I'm not sure how much I like adding the backdoor here (though admittedly it's just a gut feeling).
>
> I guess part of the dislike is that now the order of CeedOperator additions gives an implicit notion of ordering. Granted, CeedQFunctionAddInput() already has that implicit notion of ordering, so it's not unprecedented.

That causes issues when, for example, I need to run one suboperator first in a MatCeed MatMult. In order to do that with separate operators, I would have to overload the MatCeed functions, when all I really need is to know that the suboperators will execute in order.

zatkins-dev · 2026-02-18T22:44:08Z

The particular use case is that I need to project two quantities from different mesh elements to the same set of points. I then need to use both of those as inputs to the "real" qfunctions -- one that interpolates the computed output to each of the two sets of mesh nodes. That's a use-case that's fundamentally incompatible with the way that our at points operators work -- it requires 2 "setup" operators (one for each set of mesh elements) that just interp to points, then 2 "output" operators (again, one for each set of mesh elements) to compute the physics and interp-transpose back to the mesh nodes. But, the math is perfectly well-defined if that sequence is allowed.

zatkins-dev · 2026-02-18T22:45:59Z

> Codecov is correct - this should be tested, if we can cook a representative reason why we would want this

Yeah, I didn't add tests yet. I think that splitting the input and output parts of the operator like described above (but with only one mesh element set) would be a fine test.

backends/cuda-gen/ceed-cuda-gen-operator.c

jeremylt · 2026-02-19T17:25:28Z

Was that ref bug present in HIP or Sycl?

zatkins-dev · 2026-02-19T18:15:38Z

> Was that ref bug present in HIP or Sycl?

good question -- there were actually a few more cases I missed (none in Sycl though, I don't think it ever got the field ordering optimization)
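For readers unfamiliar with this class of bug, here is a generic sketch of how a field-ordering optimization can lead to a wrong field index. It is an assumption about the general failure mode, not a reproduction of the actual cuda/ref or hip/ref code: `ToyApplyFields`, `field_order`, and `use_mapped_index` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

// Visits fields in an optimized order and stores one value per field.
// `field_order[i]` is the field visited at loop position `i`.
// The correct variant indexes storage by the mapped field; the buggy variant
// reuses the loop position, which is only right when the order is the identity.
void ToyApplyFields(const int *field_order, const double *scale, int num_fields, bool use_mapped_index, double *out) {
  for (int i = 0; i < num_fields; i++) {
    const int field  = field_order[i];
    const int target = use_mapped_index ? field : i;

    out[target] = scale[field];
  }
}
```

With `field_order = {2, 0, 1}` and `scale = {10, 20, 30}`, the mapped variant leaves each value in its own field's slot, while the loop-position variant scatters them. A backend without the reordering only ever sees the identity order, so the two variants coincide there.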

zatkins-dev · 2026-02-19T19:22:50Z

Hmm, I'm realizing that this doesn't fix all my issues, especially when assembling. I have some additional ideas I'm workshopping

interface/ceed-operator.c

zatkins-dev · 2026-02-19T23:19:28Z

I still think that this MR is useful, but I think I need a more fundamental solution long-term. See #1931 for my best idea.

zatkins-dev requested a review from jeremylt February 18, 2026 21:15

zatkins-dev self-assigned this Feb 18, 2026

zatkins-dev added interface at-points labels Feb 18, 2026

jeremylt reviewed Feb 18, 2026

jeremylt reviewed Feb 18, 2026

zatkins-dev commented Feb 19, 2026

zatkins-dev force-pushed the zach/add-sequential-and-at-points-fixes branch from 059c0f3 to 4543c65 Compare February 19, 2026 16:55

zatkins-dev force-pushed the zach/add-sequential-and-at-points-fixes branch from f237a56 to dad8f31 Compare February 19, 2026 18:14

zatkins-dev force-pushed the zach/add-sequential-and-at-points-fixes branch 2 times, most recently from a366d8a to b681510 Compare February 19, 2026 19:01

jeremylt reviewed Feb 19, 2026

jeremylt reviewed Feb 19, 2026

zatkins-dev added 4 commits February 19, 2026 16:20

be39585 operator(composite): add interface to force sequential execution of sub-operators

745f16d backends(cuda/gen,hip/gen): fix at-points assembly to avoid writing to non-active outputs

736f144 backends(cuda/ref,hip/ref): fix use of incorrect output field index

11a8069 ci: add sequential composite operator test

zatkins-dev force-pushed the zach/add-sequential-and-at-points-fixes branch from b681510 to 11a8069 Compare February 19, 2026 23:21

jeremylt approved these changes Feb 20, 2026

jeremylt merged commit a49e5d5 into main Feb 20, 2026
31 checks passed

jeremylt deleted the zach/add-sequential-and-at-points-fixes branch February 20, 2026 14:54

3 participants