Threaded mesh topology construction improvements #4083

Merged
garth-wells merged 38 commits into main from
garth/mesh-hot-loops
Feb 28, 2026

Member

@garth-wells garth-wells commented Feb 11, 2026

dolfinx::sort_perm didn't respect the BITS template parameter. This PR fixes that, which gives a significant performance boost in some cases.

@garth-wells garth-wells marked this pull request as ready for review February 15, 2026 11:01
Contributor

@schnellerhase schnellerhase left a comment

Removes thread spawning for num_threads=1, recovering the previous sequential execution path. The new sorting algorithm also affects the non-threaded execution path.

const common::IndexMap& vertex_index_map)
auto build_entity_list
= [](std::span<std::int32_t> entity_list,
std::span<std::int32_t> entity_list_sorted, auto&& cell_idx,
Contributor

Restrict to std::int32_t ranges.

int num_entities_per_cell = cell_type_entities[k].size();
std::vector<std::jthread> threads(num_threads);
for (int i = 0; i < num_threads; ++i)
if (num_threads > 1)
Contributor

This should allow compile-time deduction of the no-threading configuration. Resolvable with #3716.

Member Author

No need to overcomplicate - it will have no measurable runtime effect.

Contributor

This is not about performance; it keeps the functionality optional. When num_threads is passed around explicitly, this needs to be deduced from the associated type.

Member Author

I don't follow. The if/else switch isn't inside a hot loop, so what's the issue?

Contributor

It would be good to keep threading support optional. We cannot (and should not) change function signatures when the configuration changes, so this would need to be indicated by the type of the argument.

Member Author

It is optional now - '0' is no thread launch.

std::vector<std::int32_t> sort_order(
entity_list_sorted.size() / num_vertices_per_entity, 0);
std::iota(sort_order.begin(), sort_order.end(), 0);
boost::sort::parallel_stable_sort(
Contributor

Adds a new library dependency on Boost.Sort. sort_by_perm in particular avoids lexicographical_compare - what is the performance impact?

Member Author

Significantly faster (by a factor of several).

@garth-wells garth-wells changed the title from "Threaded mesh improvements and extensions" to "Fix radix sort interface and threaded mesh improvements and extensions" Feb 22, 2026
@garth-wells garth-wells changed the title from "Fix radix sort interface and threaded mesh improvements and extensions" to "Threaded mesh topology construction improvements" Feb 28, 2026
@garth-wells garth-wells added this pull request to the merge queue Feb 28, 2026
Merged via the queue into main with commit 1218736 Feb 28, 2026
18 of 19 checks passed
@garth-wells garth-wells deleted the garth/mesh-hot-loops branch February 28, 2026 11:14
2 participants