
Add integration with Paralight#656

Open
gendx wants to merge 16 commits into rust-lang:master from gendx:paralight

Conversation

@gendx commented Oct 11, 2025

Paralight is a lightweight parallelism library tuned for indexed structures such as slices. Given that the internal representation of hashbrown's hash tables is a slice of buckets (that each optionally contain a value), it's a good fit to integrate with (gendx/paralight#5).

This pull request is here to iterate on the design. As the integration needs access to the raw hash table representation, it's done here in the hashbrown crate (similarly to Rayon's integration).

@clarfonthey (Contributor)

I think that it's fair to do this given the presence of a Rayon implementation also (although, IMHO, we shouldn't be including these implementations in this crate…) but I also think that regardless of what is done, any primitives needed to make this work should be added to HashTable directly so that people can code their own versions of this.

For example, at one point I was contemplating offering an API that directly provided access to the &[MaybeUninit<T>] and &[Tag] slices in the table, and I think such an API might be helpful here too. Note that the tags and items in the hash table are actually in reverse order to each other, since one uses negative offsets from the central pointer and the other one uses positive offsets.
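The mirrored tag/item layout described above can be sketched with a toy model. This is an illustration of the idea only (two slices running in opposite directions from a conceptual center), not hashbrown's actual code or pointer arithmetic:

```rust
// Illustrative sketch (not hashbrown's real layout code): control tags for
// bucket i live at negative offsets from a central pointer and items at
// positive offsets, so the tag slice is reversed relative to the item slice.
fn main() {
    let n = 4usize;
    // Model the allocation as [tag_{n-1}, ..., tag_0 | item_0, ..., item_{n-1}].
    let tags_reversed: Vec<u8> = (0..n as u8).rev().collect(); // tag_3..tag_0
    let items: Vec<u32> = (0..n as u32).map(|i| i * 10).collect();
    for i in 0..n {
        // The tag for bucket i sits at mirrored index (n - 1 - i) in the tag slice.
        let tag = tags_reversed[n - 1 - i];
        assert_eq!(tag as usize, i);
        assert_eq!(items[i], (i as u32) * 10);
    }
    println!("layout check passed");
}
```

An API exposing the raw slices would have to document this mirroring, since indexing the tag slice naively would pair each tag with the wrong item.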

@Amanieu (Member) commented Oct 20, 2025

I talked about this in person with @gendx at EuroRust. The main issue with the current implementation is that it returns an iterator of Option<&(mut) T> instead of &(mut) T. I think this should be addressed before Paralight support is added to this crate.

A mid-level API that just exposes buckets like #613 might work for par_iter but would be insufficient for par_iter_mut and into_par_iter. You would need a more complete low-level API as proposed in #545 for that, but it would fundamentally be unstable since it exposes too much of the internal layout of the hash table.
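The Option-vs-reference concern can be shown with a toy bucket array. The names and types here are illustrative, not Paralight's or hashbrown's actual API:

```rust
// Toy model of the API concern: a bucket array where empty buckets are None.
// A naive per-bucket iterator yields Option<&T>; the desired parallel
// iterator should skip empty buckets and yield &T directly.
fn main() {
    let buckets: Vec<Option<u32>> = vec![Some(1), None, Some(3), None];
    // Per-bucket view: yields Option<&u32>, forcing every caller to filter.
    let per_bucket: Vec<Option<&u32>> = buckets.iter().map(|b| b.as_ref()).collect();
    assert_eq!(per_bucket.len(), 4); // one entry per bucket, full or not
    // Desired shape: the iterator itself filters, yielding only &u32.
    let items: Vec<&u32> = buckets.iter().filter_map(|b| b.as_ref()).collect();
    assert_eq!(items, vec![&1, &3]);
    println!("{items:?}");
}
```

The `filter_map` step is trivial for a shared iterator, which is why a mid-level bucket API suffices for par_iter; producing owned items or mutable references while tracking which buckets were consumed is what requires the lower-level API.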

@gendx force-pushed the paralight branch 2 times, most recently from fa5e8c8 to a386031 on January 14, 2026 at 10:11
@gendx (Author) commented Jan 14, 2026

With Paralight v0.0.10, it's now possible to define parallel iterators where each bucket produces an Option<Self::Item> without needing Item to contain an option. This addresses @Amanieu's comment and makes the API much nicer to use. This change is therefore now ready for review.

Some general remarks:

  • Regarding Send/Sync bounds, I've created dedicated wrapper types around RawTable, because the default bounds were not suitable:
    • into_par_iter() requires Send items but the wrapper needs to be Sync to be shared with worker threads.
    • HashMap::par_iter_mut() requires a Sync key (to produce &K) and a Send value (to produce &mut V) and again the wrapper needs to be Sync.
    • In all cases, the Allocator shouldn't have any particular Send nor Sync bounds, as Paralight iterators don't deallocate (nor allocate) the table on other threads. Only the Drop implementation will deallocate in the into_par_iter() case, and none of the wrappers have new Send implementations anyway.
  • I've added a new support function RawTable::deallocate_cleared_table() to directly deallocate the table without clearing the control bytes when the iterator is dropped (after having fetched all the items*). This should be more efficient than using the pre-existing clear_no_drop() (although using that would work too).
  • Paralight is still in the 0.0.x version. It's already usable but the API changes often between versions as more iterator adaptors become supported, hence I haven't committed to publishing a 0.x version yet. That said I don't envision further changes to the IntoParallelSource traits for the time being. Happy to leave this pull request open and keep updating it until Paralight v0.1 is ready (but feedback on the design is valuable to progress towards version 0.1).

*Internally, the Paralight execution engine ensures that each bucket index is passed once and only once to the fetch_item() and cleanup_item_range() calls. While it's possible to bypass the execution engine and directly manipulate and drop a SourceDescriptor with safe code (definitely not the intended use of the API), the worst outcome is a memory leak (not dropping all the items in the map), which is not unsound.
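The wrapper-type approach for the Send/Sync bounds can be sketched as follows. All names here (`Table`, `IntoParSource`) are hypothetical stand-ins for the real pub(crate) types; only the shape of the `unsafe impl Sync` bound is the point:

```rust
use std::marker::PhantomData;

// Toy stand-in for the raw table: buckets that may be empty.
struct Table<T> {
    buckets: Vec<Option<T>>,
}

// Hypothetical wrapper shared by reference with worker threads for
// into_par_iter(): the items only need to be Send (they are moved out),
// but the wrapper itself must be Sync to be shared across threads.
struct IntoParSource<T> {
    table: Table<T>,
    _marker: PhantomData<T>,
}

// SAFETY (illustrative): in the real integration, the execution engine
// guarantees each bucket is fetched by exactly one thread, so sharing
// &IntoParSource is sound whenever T: Send. This sketch does not enforce
// that invariant; it only shows the shape of the bound.
unsafe impl<T: Send> Sync for IntoParSource<T> {}

fn main() {
    let src = IntoParSource {
        table: Table { buckets: vec![Some(1), None, Some(2)] },
        _marker: PhantomData,
    };
    let total: i32 = src.table.buckets.iter().flatten().sum();
    assert_eq!(total, 3);
    println!("{total}");
}
```

Note that the default auto-trait derivation would instead require `T: Sync` for the wrapper to be Sync, which is exactly why the manual (unsafe) implementation with a `T: Send` bound is needed.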

@gendx marked this pull request as ready for review on January 14, 2026 at 10:42
@clarfonthey (Contributor)

Haven't fully read the code, but I would like to reiterate my opinion that instead of offering implementations of existing parallel helper traits, we should, at a minimum, make all the primitives necessary to implement them available to any crate. Then, ideally, those crates can just provide their own implementations and APIs.

In particular, what APIs are needed to accomplish this that you think should be added?

@gendx (Author) commented Feb 20, 2026

In particular, what APIs are needed to accomplish this that you think should be added?

Practically, "the code is the contract". At the moment, the integration relies on several internal-only APIs on internal-only types:

  • Some APIs of RawTable (currently pub(crate) struct):
    • RawTable::num_buckets()
    • RawTable::is_bucket_full()
    • RawTable::bucket()
  • Some APIs of Bucket (currently pub(crate) struct):
    • Bucket::as_ref()
    • Bucket::as_mut()
    • Bucket::read()
  • New APIs:
    • deallocate_cleared_table() - this is mostly a performance optimization over RawTable::clear_no_drop() (see 875dbd1), yet I still think it's worth doing (users that opt into a parallelism framework do it for performance reasons, so having avoidable performance overhead would be unfortunate).
  • Some more flexible Send/Sync trait bounds, but as shown in the code these can be implemented by external wrappers. However, whether these wrappers are sound or not may depend on the internals, I think.
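The primitives listed above can be sketched as a trait surface over a toy backing store. The method names mirror the RawTable/Bucket methods mentioned, but this is a hypothetical shape, not hashbrown's actual public API:

```rust
// Hypothetical trait capturing the raw-table primitives a consumer crate
// would need: bucket count, occupancy check, and per-bucket access.
trait RawBuckets {
    type Item;
    fn num_buckets(&self) -> usize;
    fn is_bucket_full(&self, index: usize) -> bool;
    fn bucket(&self, index: usize) -> Option<&Self::Item>;
}

// Toy implementation over a Vec<Option<T>> to show the intended contract;
// the real table stores tags and MaybeUninit items, not Options.
impl<T> RawBuckets for Vec<Option<T>> {
    type Item = T;
    fn num_buckets(&self) -> usize {
        self.len()
    }
    fn is_bucket_full(&self, index: usize) -> bool {
        self[index].is_some()
    }
    fn bucket(&self, index: usize) -> Option<&T> {
        self[index].as_ref()
    }
}

fn main() {
    let table = vec![Some("a"), None, Some("c")];
    assert_eq!(table.num_buckets(), 3);
    assert!(table.is_bucket_full(0));
    assert!(!table.is_bucket_full(1));
    assert_eq!(table.bucket(2), Some(&"c"));
    println!("ok");
}
```

In the real crate these would be unsafe or lifetime-sensitive methods (e.g. Bucket::read() moves a value out), so the safe trait above only conveys the surface, not the invariants a consumer would have to uphold.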

However, I think it's important to agree on the high-level approach to handle integration between crates before diving too much into the technical details. Fundamentally, there's only a handful of ways to allow this kind of integration. I don't really have a preference as long as integration is technically feasible - ultimately each provider crate decides which approach they want to pursue.

  1. Hashbrown (the "provider" crate) implements the external traits of Paralight (the "consumer" crate). (What this pull request proposes so far)
  • Pros: The internals of the provider crate don't need to be exposed in the public API. This means that a semver version bump isn't needed each time internals change. Each time internal details are changed in the provider crate, there is little chance of breaking the integration because the provider crate is aware of these integrations.
  • Cons: If there are N consumer crates, the provider crate needs to implement N integrations. Each new consumer needs to ask permission from the provider crate to integrate with it. Changing internal details in the provider crate means making sure that's compatible with all the N consumers (additional friction). Each time a consumer crate changes its API (or simply publishes a semver-breaking version), the provider crate needs to be updated accordingly.
  2. The provider crate exposes its internals and the consumer crate implements its own traits directly. (What #545, "Refactoring proposal: cleaning up the internal APIs", paves the way for)
  • Pros: Consumer crates don't need to ask for each integration, the provider crate doesn't need to provide support for each integration (however, sometimes the provider crate should expose new APIs to support new use cases).
  • Cons: The public API surface of the provider crate is larger. Unsafe APIs are exposed, which means that invariants must be well documented throughout the code base and that all the ways they could be (mis)used by external consumers must be considered. There is less flexibility to change the now-public internals, as one needs to think about how changes impact downstream consumers. The chances of semver breakage are higher in each release, which can lead to ecosystem fragmentation as more semver-incompatible versions circulate. Each time the provider crate changes its API, the consumer crates need to be updated accordingly.
  3. No agreement is found and no integration happens. (The status quo)
  • Pros: Less coupling between crates means more flexibility for provider crates to change their internal design.
  • Cons: I don't think that's good for the ecosystem, as it in general reduces the expressiveness of Rust for users who would benefit from such integrations. It can also lead to crates being forked to add integrations anyway, and ecosystem fragmentation.
  4. Splitting a provider crate into a high-level hashbrown API and a lower-level hashbrown-raw API?
  • I'm not sure about practical implications (how would this play with the orphan rule?), but this might improve things slightly over option 2.

@clarfonthey (Contributor)

I think it's also worth pointing out that the main purpose of hashbrown as a crate is to offer the low-level raw API, since it's the implementation behind the standard library HashMap and HashSet. While there are some issues with giving those no_std support, ultimately, the endgame seems to be that the main purpose of hashbrown is the HashTable API and its internals, to be accessed by crates that want finer-grained control over the actual table structure.

And, since HashMap and HashSet still exist with mostly the same API as the standard library versions, downstream dependencies of providers like paralight would only need to switch from std::collections::HashMap to hashbrown::HashMap if they wanted to use something that requires access to these APIs.

3 participants