Parallel algorithms #120
Discussion with the US group
Mention unit testing
Move building a library and linking it to a C++ executable to the main level
Co-authored-by: Victor Eijkhout <eijkhout@tacc.utexas.edu>
| @@ -0,0 +1,124 @@ | |||
| ## Module name: Asynchronous programming | |||
It seems that this PR combines two topic introductions. Could you please replace the asynchronous programming one here with a dummy so that your other PR can properly introduce that topic?
| #### Points to cover | ||
| * The header `<execution>` needs to be included | ||
| * The first argument of the algorithm is the execution policy |
The execution policies themselves and their semantics should also be covered.
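For illustration, the two points above (including `<execution>` and passing the policy as the first argument) could look roughly like the following sketch. The helper names `sum_of_squares` and `sorted_copy` are hypothetical and not part of the PR. With GCC, building code that uses the parallel policies typically also requires linking against TBB (for example with `-ltbb`).

```cpp
#include <algorithm>
#include <cassert>
#include <execution>   // provides std::execution::seq, par, par_unseq
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: sum of the squares 1^2 + ... + n^2.
// The execution policy is forwarded as the FIRST argument of the algorithm.
template <class Policy>
long long sum_of_squares(Policy&& policy, int n) {
    assert(n >= 0);
    std::vector<long long> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1LL);  // 1, 2, ..., n
    return std::transform_reduce(std::forward<Policy>(policy),
                                 v.begin(), v.end(),
                                 0LL,
                                 std::plus<>{},
                                 [](long long x) { return x * x; });
}

// Hypothetical helper: returns a sorted copy, requesting parallel execution.
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(std::execution::par, v.begin(), v.end());
    return v;
}
```

Calling `sum_of_squares(std::execution::seq, 100)` and `sum_of_squares(std::execution::par, 100)` gives the same integer result; only the way the work may be scheduled differs.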
| _These are important topics that are not expected to be covered but provide | ||
| guidance where one can continue to investigate this topic in more depth._ | ||
| * If the implementation cannot parallelize or vectorize (e.g. due to lack of resources), all standard execution policies can fall back to sequential execution. |
This sounds more like a point to cover and should be added above, since students should be aware of it when using the policies.
| * None of the execution policies guarantees reproducibility. This is obvious for the parallel execution policies, but even `std::execution::seq` can execute the iterations in any order.
This could also be a caveat or a point to cover in the main section.
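One way to make the reproducibility caveat concrete for students: floating-point addition is not associative, so any algorithm that is free to regroup or reorder a sum can return a different result. The sketch below deliberately uses no execution policy. It fixes two summation orders by hand (the function names are hypothetical) so that the effect is deterministic, whereas a parallel reduction may pick any grouping from run to run.

```cpp
#include <cassert>
#include <vector>

// Sum strictly from the first element to the last.
double sum_left_to_right(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

// Sum strictly from the last element to the first: one of many
// alternative orders that a reordering reduction would be allowed to use.
double sum_right_to_left(const std::vector<double>& v) {
    double s = 0.0;
    for (auto it = v.rbegin(); it != v.rend(); ++it) s += *it;
    return s;
}

// Returns true if the two summation orders disagree for the given data.
bool order_changes_result(const std::vector<double>& v) {
    assert(!v.empty());
    return sum_left_to_right(v) != sum_right_to_left(v);
}
```

For the input `{1e20, -1e20, 1.0}` the left-to-right sum is `1.0`, while the right-to-left sum is `0.0`, because `1.0` is lost when it is added to `-1e20` first.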
| * Nvidia supports running algorithms with `std::execution::par` on Nvidia GPUs. However, that is not yet in the C++ standard and only works with Nvidia's HPC compiler. | ||
| * Currently, GCC implements the parallel algorithms using Intel's TBB library. You can limit the number of threads used by creating a `tbb::global_control` object, e.g. `tbb::global_control gc(tbb::global_control::max_allowed_parallelism, nthreads);`, provided by the header `tbb/tbb.h`. The limit applies only while that object is alive.
This seems to me more like additional information than an advanced feature of the topic. I'm not sure where best to put it.
Co-authored-by: Florian Sattler <vuld3r@gmail.com>
Attempt for parallel algorithms by @diehlpk and @VictorEijkhout