Changes to OpenMP scripts to extract arguments from iom_put#3373
Changes to OpenMP scripts to extract arguments from iom_put#3373LonelyCat124 wants to merge 36 commits intomasterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #3373 +/- ##
=======================================
Coverage 99.96% 99.96%
=======================================
Files 389 389
Lines 54541 54591 +50
=======================================
+ Hits 54522 54572 +50
Misses 19 19 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Transformation failing on nemo5, and maybe not for structure of array: |
|
ITs all pass now, the remaining question is whether there's any performance degredation - I don't think there should be but its possible this is being done "too widely". One for either @sergisiso or @arporter to review. |
sergisiso
left a comment
There was a problem hiding this comment.
@LonelyCat124 The changes look good, but I would like that the transformation provide more feedback in order to understand why it hasn't given a performance improvement. Also, see if it can be more generic.
| if call.symbol.name == "iom_put": | ||
| arg = call.arguments[1] | ||
| dtype = arg.datatype | ||
| if isinstance(dtype, ArrayType) and isinstance(arg, Operation): | ||
| try: | ||
| DataNodeToTempTrans().apply(arg) | ||
| except TransformationError: | ||
| pass |
There was a problem hiding this comment.
The integration tests don't show any performance advantage which is not what we expected (we are changing from multiple gpu->cpu reads to one, and to maybe preventing the data touched from the gpu to be brought back):
- I can check with a grep how many more loops are offloaded but for the places that it was not applied, could you add as preceding comment the reason why not (if not all transformation errors provide useful information, a verbose option like other transformation have can help)
- There is nothing specific of iom_put, other than we know it is a common pattern. We want to avoid touching things from the CPU as much as possible, could this be applied to all subroutine calls (not functions)?
There was a problem hiding this comment.
I've added a preceding comment now. I'll try generalising it as well.
There was a problem hiding this comment.
This currently causes stuff to fail, I'll try to see if I can get my VPN to start working again and see if I can try manually building NEMO5 to find the cause.
There was a problem hiding this comment.
This has shown a few issues with the DataNodeToTempTrans (partly because some things are Statements that I didn't think, e.g. an IfBlock's condition).
There was a problem hiding this comment.
Fixed the bugs now
|
The DataNode2Temp has a: This may be a problem as walk returns self, and iom_put is not pure. I suppose this was meant for the calls in the arguments and we just forgot to skip self? |
@sergisiso I don't think so. We're calling the transformation on the argument, not the call so the iom_put is the parent. |
|
@sergisiso This is ready for another look now at last. |
sergisiso
left a comment
There was a problem hiding this comment.
@LonelyCat124 The implementation is getting closer, and the generated NEMO diff seems to make sense, but it is a bit puzzling that the performance is not getting better. I will have a look with the profiler but I won't stop the PR while I do it.
| node.argument_by_name("l"). | ||
| datatype.intrinsic, |
There was a problem hiding this comment.
Keep this in the same line.
There was a problem hiding this comment.
Wasn't done? There is also the same unecessary line breaks in lines 3775, 4458, 4730 and 4755
There was a problem hiding this comment.
Ah my bad, I reverted some commits to the file I think and so this was also reverted.
There was a problem hiding this comment.
Fixed those now.
| try: | ||
| DataNodeToTempTrans().apply(arg) | ||
| except TransformationError as err: | ||
| call.append_preceding_comment( |
There was a problem hiding this comment.
Also, consider doing the same but inside the transformation with a verbose option.
There was a problem hiding this comment.
Done, should be covered by tests too.
There was a problem hiding this comment.
Thanks, now you can delete this in favour of the option.
| if isinstance(self.intrinsic.return_type, Callable): | ||
| try: | ||
| return self.intrinsic.return_type(self) | ||
| except TypeError as err: |
There was a problem hiding this comment.
I know we discussed this before but can you remind me why we don't simply use:
try:
return self.intrinsic.return_type(self)
except (TypeError, AttributeError):
return UnresolvedType()
I know for debugging the more explicit error could help, but from a user point of view it doesn't matter that much it would be better to get the UnresolvedType to communicate that we can't do it but let it continue.
There was a problem hiding this comment.
I think I'd prefer to be more precise as from a user point of view I think it should never reach the InternalError, so its mostly being defensiveness from a development standpoint to avoid reaching things that are unexpected, but I'm happy to change it if you would prefer.
There was a problem hiding this comment.
Ok (it still feels not ideal that we compare against the error message, hopefully we can improve on this in the furture)
|
@sergisiso This is ready for another look, I am going to set ITs going again since output code changes will likely happen. |
There was a problem hiding this comment.
@LonelyCat124 I am happy with the current implementation and bringing the remaining rough edges to separate issues/TODOs. There is just final clean up to do before merging it.
| # Extract any array operations from iom_put calls to temporary | ||
| # expressions that can be parallelised. |
| MaximalProfilingOutsideDirectivesTrans().apply(children) | ||
|
|
||
|
|
||
| def iom_put_argument_to_temporary(calls: list[Call]): |
There was a problem hiding this comment.
Added. I also have slightly changed the behaviour so I will rerun integration.
| try: | ||
| DataNodeToTempTrans().apply(arg) | ||
| except TransformationError as err: | ||
| call.append_preceding_comment( |
There was a problem hiding this comment.
Thanks, now you can delete this in favour of the option.
| allocatable_datatype.shape]) | ||
| # If any of the bound information aren't static then we need | ||
| # to create an allocatable array. | ||
| is_static = True |
There was a problem hiding this comment.
maybe r/is_static/has_static_bounds/ to not confuse it with symbols with static interface
There was a problem hiding this comment.
Yeah thats probably better.
| node.argument_by_name("l"). | ||
| datatype.intrinsic, |
There was a problem hiding this comment.
Wasn't done? There is also the same unecessary line breaks in lines 3775, 4458, 4730 and 4755
| if isinstance(self.intrinsic.return_type, Callable): | ||
| try: | ||
| return self.intrinsic.return_type(self) | ||
| except TypeError as err: |
There was a problem hiding this comment.
Ok (it still feels not ideal that we compare against the error message, hopefully we can improve on this in the furture)
| # Is this reachable? Tested via monkeypatch as there may be | ||
| # some edge case I can't think of. |
There was a problem hiding this comment.
I would prefer a comment like "This should never happen, so propagate as an InternalError" instead.
There was a problem hiding this comment.
Yeah, thats neater, the previous comment probably should have been labelled as FIXME, updated now.
|
@sergisiso I've set integration running again, assuming no issues arise this is ready for another look. If the tests fail on treesitter related issues then I think we should wait until #3408 is merged. |
No description provided.