Implement parallel hpx::uninitialized_relocate_* algorithms for overlapping ranges#6879
Conversation
|
Can one of the admins verify this patch? |
|
@isidorostsa I believe this is another PR for you to look at. Would you have the time? |
|
@ArivoliR Could you please rebase your branch onto master (and resolve the conflicts)? |
51e050b to
d4fd325
Compare
|
|
||
| // test_uninitialized_relocate_overlap_forward( | ||
| // hpx::execution::par(hpx::execution::task)) | ||
| // .get(); |
There was a problem hiding this comment.
Do you plan to re-enable these tests?
There was a problem hiding this comment.
No sir, those were only added temporarily for debugging. I’ll be removing them in the next commit. Thank you for pointing it out
There was a problem hiding this comment.
Wouldn't it be useful to have those tests functioning?
There was a problem hiding this comment.
Hi @hkaiser,
I’ve split the original overlap test into two clearer pieces as guided by Isidoros:
-
A sequenced-only semantics test that validates the expected behavior for overlapping ranges under execution::seq.
-
A policy comparison test that checks that execution::par (and par_unseq) produce the same observable results as the sequenced implementation.
For now I’ve left par(task) out and added a TODO. I haven’t added a test for the async behavior yet, since it requires explicitly handling the task/sender return path. I’d prefer to add a dedicated test for that once I’ve spent more time understanding the expected testing pattern.
There was a problem hiding this comment.
You can ignore the sender&receiver part for now. I'd focus on the async/future implementation which should be easier to do. However I'd be fine for this ending up in a separate PR.
There was a problem hiding this comment.
I don’t have a working test for the async/future-based path yet, and I’m still figuring out the right way to implement it. I’d prefer to follow up with a separate PR once I figure it out in sometime. If you’re okay with it, I’d be happy to close this PR here once the current commits are approved
d4fd325 to
ac3fc68
Compare
2c7490c to
fec562e
Compare
|
There are still some minor things left:
The other reported problems are unrelated. |
345ef7f to
4df7159
Compare
4df7159 to
5fa5d27
Compare
@hkaiser I believe this PR is ready for merge. I’m rebasing it now, please let me know if youd' like any further changes. |
5fa5d27 to
e851773
Compare
|
In this PR, ranges that overlap in any direction will be handled by the sequential version of the code, instead of just the ones that overlap in the direction that would work with forward iteration. This means it is still a bug to call this with a range overlapping in the wrong direction, but at least its a bug in the sequential code. In the case of relocation this will either end up destroying objects and moving from them afterwards (bug) or if the type is trivially relocatable it will keep memcpying the same object (not as bad, but still a bug). In the parallel code case it would do this but also with data races. All in all, I think we should keep the logic of the check as it was before, since this change is not protecting us from any bugs, but looks like it is. But we should rewrite it in the style you wrote it, as it is clearer than what we had before. For bug identification we should HPX_ASSERT that the ranges are not overlapping on either direction if we decide to go for the parallel version. When we implement the safe parallel version we can change the selection logic to fit this. |
|
Hi, I've been going through this PR as part of researching the GSoC project on "Implement parallel hpx::uninitialized_relocate_ algorithms for overlapping ranges." Really helpful context here. @isidorostsa - If I'm understanding correctly, the current check falls back to sequential for all overlaps including d_first > first cases where forward parallel chunking would actually be safe. The fix should preserve full parallelism when the shift direction is safe, only serialize when there is a genuine hazard, and HPX_ASSERT for the truly undefined cases in the parallel path. Would appreciate any correction if I’ve misunderstood something. |
dcf51bc to
6d6d8ca
Compare
|
@ArivoliR, thanks for working more on this. What is the state of this PR? What does it achieve compared to master? |
|
@ArivoliR any comments to @isidorostsa's question? |
The PR essentially adds HPX_ASSERT(!overlap) before any parallel algorithm is executed in all three functions. This catches wrong-direction overlaps in debug builds before they reach parallel execution. There is no other major change. Pushing fix for |
2fc4adb to
3dad9a8
Compare
…ap detection Signed-off-by: ArivoliR <arivoli2005@gmail.com>
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
…structors For non-trivially relocatable types with a potentially-throwing move constructor, the parallel chain algorithm cannot guarantee the same move/destruction counts as sequential execution (P1144 requirement). Fall back to sequential in that case. Signed-off-by: ArivoliR <arivoli2005@gmail.com>
…ackward Signed-off-by: ArivoliR <arivoli2005@gmail.com>
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
0c2c61c to
9a6a0b8
Compare
hkaiser
left a comment
There was a problem hiding this comment.
LGTM, thanks!
Please resolve the merge conflicts, however.
There are a few errors sir, still working on them. Will push the fixes soon when I'm done and resolve the conflicts as well |
9a6a0b8 to
6d0817f
Compare
|
To be specific, there was a clang-tidy error since there was a pointer to a non-trivially-copyable type. |
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
64c2d72 to
aebb292
Compare
aebb292 to
bd5d836
Compare
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
7ab62e5 to
6f1cc34
Compare
Signed-off-by: ArivoliR <arivoli2005@gmail.com>
6f1cc34 to
d5080d9
Compare
Up to standards ✅🟢 Issues
|
Fixes #6878
Proposed Changes
dpositions, the range is split intodindependent chains where chaincprocesses indicesc, c+d, c+2d, ....Any background context you want to provide?
P1144 Object relocation in terms of move plus destroy
GSoC Project Wiki
Checklist
Not all points below apply to all pull requests.