Hash data structure(s) design document #9
Open
jrhemstad wants to merge 9 commits into NVIDIA:main from jrhemstad:design-document
221dce2 Create design.md (jrhemstad)
e8b4460 Update design.md (jrhemstad)
f10529d Update design.md (jrhemstad)
fada531 Update design.md (jrhemstad)
bcb19eb Update design.md (jrhemstad)
c604746 Update design.md (jrhemstad)
aeca41d Update design.md (jrhemstad)
9d8922a Update design.md (jrhemstad)
896f60c Update design.md (jrhemstad)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| Hash-map-like data structures lie on a spectrum from high-performance, bare-metal structures with restricted features to more convenient, | ||
| full-featured structures that may be less performant. | ||
| cuCollections will likely have several classes that are on different points on this spectrum. | ||
|
|
||
| # HashArray (TODO Naming) | ||
|
|
||
| ## Summary | ||
|
|
||
| Lowest-level, highest performance "bare metal" data structure with limited feature set. | ||
|
|
||
| - Fixed-size | ||
| - Keys limited to native integral types where `sizeof(Key)` is no larger than the widest `atomicCAS` (64 bits) | ||
| - Insert/Find/Erase | ||
| - Storage for "Erased" values cannot be reclaimed | ||
| - Uses sentinel values to indicate empty/erased cells | ||
|
|
||
| ```c++ | ||
| #include <thrust/pair.h> | ||
| template <typename Key, typename Value, typename Hash, typename KeyEqual, typename Allocator> | ||
| class HashArray { | ||
|  public: | ||
|   using value_type = thrust::pair<Key, Value>; | ||
| }; | ||
| ``` | ||
|
|
||
| ## Keys and Values | ||
|
|
||
| Key types are limited to native, integral types to allow bitwise equality comparison (i.e., no floating-point keys). | ||
|
|
||
| ### DECISION REQUIRED: Integral Values vs Arbitrary Values | ||
|
|
||
| #### Integral, Packable Values | ||
|
|
||
| Require `Value` to be an integral, "packable" type. | ||
|
|
||
| "Packable" key/value types are those where `sizeof(Key) + sizeof(Value)` is no larger than the widest `atomicCAS` (64 bits). | ||
|
|
||
| Requires an Array of Structs (AoS) layout. | ||
|
|
||
| - Pros: | ||
| - Performance: enables update of key/value in a single atomicCAS operation (assumes AoS layout) | ||
| - Find/Insert/Erase can be concurrent | ||
|
|
||
| - Cons: | ||
| - Least flexible | ||
| - Requires user to specify `EMPTY/ERASED` sentinels for both `Key` and `Value` | ||
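The single-CAS update described above can be sketched on the host with `std::atomic` standing in for the device `atomicCAS`. This is only an illustration under assumptions: the trait `is_packable_v`, the helpers `pack`, `unpack_key`, `unpack_value`, and `try_claim_slot`, and the sentinel constants are hypothetical names that do not appear in the design, and 32-bit keys and values are chosen arbitrarily.

```c++
#include <atomic>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: a pair is "packable" when both types are integral and
// together fit into one 64-bit compare-and-swap word.
template <typename Key, typename Value>
constexpr bool is_packable_v =
    std::is_integral<Key>::value && std::is_integral<Value>::value &&
    (sizeof(Key) + sizeof(Value) <= sizeof(std::uint64_t));

// Assumed sentinels marking an empty cell (the design leaves these to the user).
constexpr std::uint32_t EMPTY_KEY   = 0xFFFFFFFFu;
constexpr std::uint32_t EMPTY_VALUE = 0xFFFFFFFFu;

// Key goes in the high 32 bits, value in the low 32 bits.
inline std::uint64_t pack(std::uint32_t key, std::uint32_t value) {
  return (static_cast<std::uint64_t>(key) << 32) | value;
}
inline std::uint32_t unpack_key(std::uint64_t cell) {
  return static_cast<std::uint32_t>(cell >> 32);
}
inline std::uint32_t unpack_value(std::uint64_t cell) {
  return static_cast<std::uint32_t>(cell);
}

// Writes key and value together in one compare-and-swap. Succeeds only if
// the cell still holds the EMPTY pair, so no reader can observe a half-written cell.
inline bool try_claim_slot(std::atomic<std::uint64_t>& cell,
                           std::uint32_t key, std::uint32_t value) {
  std::uint64_t expected = pack(EMPTY_KEY, EMPTY_VALUE);
  return cell.compare_exchange_strong(expected, pack(key, value));
}
```

Because the key and value travel in the same word, this sketch needs no intermediate state between "empty" and "filled", which is what allows find, insert, and erase to run concurrently.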
|
|
||
| #### Arbitrary Values | ||
|
|
||
| `Value` can be any device constructible type. | ||
|
|
||
| Can use either AoS or SoA. | ||
|
|
||
| - Pros: | ||
| - Flexible | ||
| - Sentinels only required for `Key` `EMPTY/ERASED/(FILLING)` | ||
|
|
||
| - Cons: | ||
| - Potentially Less Performant: | ||
| - `atomicCAS` key w/ dependent write for value (placement new) | ||
| - Concurrent insert/find/erase requires additional sentinel for FILLING state | ||
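A rough host-side sketch of the "CAS the key, then write the value" sequence follows, again with `std::atomic` standing in for `atomicCAS`. The `Slot` type, `try_insert`, `try_read`, and the sentinel values are hypothetical. For brevity the value is written by assignment into a default-constructed member rather than by placement new, which is a simplification of what the design describes.

```c++
#include <atomic>
#include <cstdint>
#include <string>

// Assumed key sentinels; FILLING marks a cell whose value is still being written.
constexpr std::uint32_t EMPTY_KEY   = 0xFFFFFFFFu;
constexpr std::uint32_t FILLING_KEY = 0xFFFFFFFEu;

template <typename Value>
struct Slot {
  std::atomic<std::uint32_t> key{EMPTY_KEY};
  Value value{};
};

template <typename Value>
bool try_insert(Slot<Value>& slot, std::uint32_t key, Value const& value) {
  std::uint32_t expected = EMPTY_KEY;
  // Step 1: claim the cell by moving it from EMPTY to FILLING.
  if (!slot.key.compare_exchange_strong(expected, FILLING_KEY)) return false;
  // Step 2: the dependent write; only the thread that won the CAS gets here.
  slot.value = value;
  // Step 3: publish the real key. The release store makes the value visible
  // to any reader that later observes this key with an acquire load.
  slot.key.store(key, std::memory_order_release);
  return true;
}

template <typename Value>
bool try_read(Slot<Value> const& slot, std::uint32_t key, Value& out) {
  // A cell that is EMPTY, FILLING, or holds another key is not a match.
  if (slot.key.load(std::memory_order_acquire) != key) return false;
  out = slot.value;
  return true;
}
```

The extra FILLING state is the cost of arbitrary values: without it, a concurrent reader could match the key before the value has been written.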
|
|
||
| ## Layout | ||
|
|
||
| ### DECISION NEEDED: Array of Structs vs Struct of Arrays | ||
|
|
||
| Layout largely determined by decision on integral vs. arbitrary `Value`s. | ||
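To make the two candidate layouts concrete, here is a minimal host-side sketch. The type names `PairCell`, `AosStorage`, and `SoaStorage` are hypothetical, and 32-bit keys and values are assumed only for illustration.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Array of Structs: each key sits next to its value, so a packable pair
// occupies one contiguous 64-bit cell.
struct PairCell {
  std::uint32_t key;
  std::uint32_t value;
};

struct AosStorage {
  std::vector<PairCell> cells;
  explicit AosStorage(std::size_t capacity) : cells(capacity) {}
};

// Struct of Arrays: keys and values live in separate arrays that share an
// index, so a key-only probe never touches value memory.
struct SoaStorage {
  std::vector<std::uint32_t> keys;
  std::vector<std::uint32_t> values;
  explicit SoaStorage(std::size_t capacity) : keys(capacity), values(capacity) {}
};

static_assert(sizeof(PairCell) == sizeof(std::uint64_t),
              "an AoS cell of two 32-bit fields should fit one 64-bit word");
```

Only the AoS form places key and value in the same word, which is why the integral, packable option requires it.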
|
|
||
| ## Operations | ||
|
|
||
| ### `insert` | ||
| ```c++ | ||
| /** | ||
| * @brief Attempts to insert a key, value pair into the map. | ||
| * | ||
| * Returns an iterator, boolean pair. | ||
| * | ||
| * If the new key is already present in the map, the iterator points to | ||
| * the location of the existing key and the boolean is `false` indicating | ||
| * that the insert did not succeed. | ||
| * | ||
| * If the new key was not present, the iterator points to the location | ||
| * where the insert occurred and the boolean is `true` indicating that the | ||
| * insert succeeded. | ||
| * | ||
| * @param v The pair to insert | ||
| * @param key_equal Binary predicate to use for equality comparison between keys | ||
| * @return Iterator, Boolean pair. Iterator is to the location of the | ||
| * newly inserted pair, or the existing pair that prevented the insert. | ||
| * Boolean indicates insert success. | ||
| */ | ||
| template <typename KeyEqual = std::equal_to<Key>> | ||
| thrust::pair<iterator, bool> insert(value_type const& v, KeyEqual key_equal = KeyEqual{}); | ||
| ``` | ||
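The contract documented for `insert` can be illustrated with a small host-side sketch. Several things here are assumptions rather than part of the design: linear probing, an identity hash, a slot index standing in for the iterator (with `slots.size()` playing the role of `end()`), and the names `Slot` and `insert_pair`. The non-atomic value write after the key CAS is exactly the window that the packable or FILLING options are meant to close.

```c++
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::uint32_t EMPTY_KEY = 0xFFFFFFFFu;  // assumed sentinel

struct Slot {
  std::atomic<std::uint32_t> key{EMPTY_KEY};
  std::uint32_t value{0};
};

// Returns {slot index, inserted?}, mirroring the {iterator, bool} pair.
inline std::pair<std::size_t, bool> insert_pair(std::vector<Slot>& slots,
                                                std::uint32_t key,
                                                std::uint32_t value) {
  std::size_t const capacity = slots.size();
  assert(capacity > 0);
  std::size_t index = key % capacity;  // identity hash, for illustration only
  for (std::size_t probes = 0; probes < capacity; ++probes) {
    std::uint32_t expected = EMPTY_KEY;
    if (slots[index].key.compare_exchange_strong(expected, key)) {
      slots[index].value = value;  // dependent write after winning the cell
      return {index, true};
    }
    // On failure, `expected` holds the key that already occupies the cell.
    if (expected == key) return {index, false};
    index = (index + 1) % capacity;  // occupied by another key: probe onward
  }
  return {capacity, false};  // table is full
}
```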
| ### `find` | ||
| ```c++ | ||
| /** | ||
| * @brief Searches the map for the specified key. | ||
| * | ||
| * @param k The key to search for | ||
| * @param key_equal Binary predicate to use for equality comparison between keys | ||
| * @return An iterator to the key if it exists, else map.end() | ||
| */ | ||
| template <typename KeyEqual = std::equal_to<Key>> | ||
| const_iterator find(Key const& k, KeyEqual key_equal = KeyEqual{}); | ||
| ``` | ||
|
|
||
| ### `erase` | ||
| ```c++ | ||
| /** | ||
| * @brief Erases the specified key (if it exists). | ||
| * | ||
| * @param k The key to erase | ||
| * @param key_equal Binary predicate to use for equality comparison between keys | ||
| * @return `true` if `k` existed and was removed, | ||
| * `false` if `k` does not exist | ||
| */ | ||
| template <typename KeyEqual = std::equal_to<Key>> | ||
| bool erase(Key const& k, KeyEqual key_equal = KeyEqual{}); | ||
| ``` | ||
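The interaction between `find` and `erase` under sentinel-based deletion can be sketched as follows. This is a single-threaded, keys-only illustration that assumes linear probing and an identity hash. The names `find_slot` and `erase_key` and the sentinel values are hypothetical, and a slot index again stands in for the iterator.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t EMPTY_KEY  = 0xFFFFFFFFu;  // never-used cell (assumed)
constexpr std::uint32_t ERASED_KEY = 0xFFFFFFFEu;  // tombstone (assumed)

// Returns the index holding `key`, or keys.size() as the end() stand-in.
inline std::size_t find_slot(std::vector<std::uint32_t> const& keys,
                             std::uint32_t key) {
  std::size_t const capacity = keys.size();
  assert(capacity > 0);
  std::size_t index = key % capacity;  // identity hash, for illustration only
  for (std::size_t probes = 0; probes < capacity; ++probes) {
    if (keys[index] == key) return index;
    if (keys[index] == EMPTY_KEY) break;  // a never-used cell ends the probe
    index = (index + 1) % capacity;       // ERASED or another key: keep probing
  }
  return capacity;
}

// Marks the cell as ERASED instead of EMPTY so that probe sequences passing
// through it stay intact.
inline bool erase_key(std::vector<std::uint32_t>& keys, std::uint32_t key) {
  std::size_t const index = find_slot(keys, key);
  if (index == keys.size()) return false;
  keys[index] = ERASED_KEY;
  return true;
}
```

In this sketch the tombstone is never turned back into an empty cell, which corresponds to the summary's note that storage for erased values cannot be reclaimed.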
|
|
||
|
|
||
| # HashMap, Bryce version from SC libcu++ talk (TODO Name) | ||
|
|
||
| Higher-level, with more features: | ||
| - Arbitrary key/value types | ||
| - Per-bucket status byte/bit(s) | ||
| - EMPTY, FILLING, FILLED, DELETED | ||
| - Fixed Size? | ||
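One way to picture the per-bucket status is as a small state machine kept beside the key and value. The sketch below is a host-side illustration only: `Status`, `Bucket`, `try_fill`, and `try_delete` are hypothetical names, a whole byte is assumed for the status, and `std::atomic` stands in for device atomics.

```c++
#include <atomic>
#include <cstdint>
#include <string>

// The four states listed above, stored in one byte per bucket.
enum class Status : std::uint8_t { EMPTY, FILLING, FILLED, DELETED };

template <typename Key, typename Value>
struct Bucket {
  std::atomic<Status> status{Status::EMPTY};
  Key key{};
  Value value{};
};

// EMPTY -> FILLING -> FILLED. Only the thread that wins the first transition
// writes the key and value, so both may be arbitrary types.
template <typename Key, typename Value>
bool try_fill(Bucket<Key, Value>& bucket, Key const& key, Value const& value) {
  Status expected = Status::EMPTY;
  if (!bucket.status.compare_exchange_strong(expected, Status::FILLING)) return false;
  bucket.key = key;
  bucket.value = value;
  bucket.status.store(Status::FILLED, std::memory_order_release);
  return true;
}

// FILLED -> DELETED. Buckets in any other state are left untouched.
template <typename Key, typename Value>
bool try_delete(Bucket<Key, Value>& bucket) {
  Status expected = Status::FILLED;
  return bucket.status.compare_exchange_strong(expected, Status::DELETED);
}
```

Keeping the state outside the key is what removes the need for reserved key sentinels, at the cost of extra storage per bucket.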