[NFC] Refactor delta debugging to use coroutines #8657

Open
tlively wants to merge 3 commits into main from ddb-coroutine

Conversation

@tlively
Member

@tlively tlively commented Apr 29, 2026

Add a generator utility in a new `support/coroutine.h` header and use it to refactor away the callback in the delta debugging utility. Now the utility is a struct providing access to the test and working sets as well as `accept()` and `reject()` methods that cause the test and working sets to be updated appropriately. Rather than being refactored into an explicit state machine, the implementation of the delta debugging algorithm remains readable straight-line code that does a `co_yield` whenever it is ready to return control to the user. It `co_yield`s a pointer to a local state object that exposes all the information that the delta debugging utility exposes in its public API. This local object stays alive across suspend points. When the delta debugging algorithm is complete, we suspend the coroutine one final time and make sure never to resume it, which ensures the state remains alive and available after delta debugging has finished. It will ultimately be cleaned up when the outer `DeltaDebugger` struct is cleaned up.
@tlively tlively requested a review from a team as a code owner April 29, 2026 05:08
@tlively tlively requested review from aheejin, kripken and stevenfontanella and removed request for a team April 29, 2026 05:08
@tlively
Member Author

tlively commented Apr 29, 2026

This is an alternative to #8651. It took some iteration, but I'm pretty happy with how it turned out. The arcane C++ coroutine nonsense is pretty well encapsulated in coroutine.h and the delta debugging implementation is essentially just as readable as before.

@kripken
Member

kripken commented Apr 29, 2026

What about compiler support for coroutines - https://en.cppreference.com/cpp/compiler_support suggests clang on windows may not be done yet, but perhaps that page is out of date?

@tlively
Member Author

tlively commented Apr 29, 2026

Looks like this is still a problem :( https://clang.llvm.org/cxx_status.html#:~:text=Clang%2017-,Coroutines,-P0912R5. But I think the ABI problem only affects 32-bit x86, and it doesn't look like we do any 32-bit releases at all. Maybe this is good enough for us? Here's the relevant LLVM bug: llvm/llvm-project#59382

@kripken
Member

kripken commented Apr 29, 2026

If it only affects 32-bit Windows I think we are ok here. But the wording in those links is a little ambiguous to me about whether that is the case.

@tlively
Member Author

tlively commented Apr 29, 2026

The opening post on the LLVM issue says "The 32-bit Windows ABI passes objects of non-trivially-copyable class type by value on the stack" and never mentions any other problems, so I think we're good. (And I asked an expert internally and he confirmed this understanding.)
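If we ever wanted to make that assumption explicit at compile time, a guard could look roughly like the sketch below. This is only an illustration and is not part of this PR. `__cpp_impl_coroutine` and `__cpp_lib_coroutine` are the standard feature-test macros; the constant and function names are hypothetical, and the 32-bit x86 Windows condition encodes the reading of the LLVM issue given in this thread rather than anything verified independently.

```cpp
#include <version>

// Standard feature-test macros for language and library coroutine support.
#if defined(__cpp_impl_coroutine) && defined(__cpp_lib_coroutine)
inline constexpr bool hasCoroutines = true;
#else
inline constexpr bool hasCoroutines = false;
#endif

// Assumption from this thread: the Clang ABI problem is limited to
// 32-bit x86 Windows targets.
#if defined(__clang__) && defined(_WIN32) &&                                  \
  (defined(_M_IX86) || defined(__i386__))
inline constexpr bool isClangWin32X86 = true;
#else
inline constexpr bool isClangWin32X86 = false;
#endif

// Hypothetical helper a build could static_assert on.
constexpr bool coroutinesUsable() {
  return hasCoroutines && !isClangWin32X86;
}
```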

Member

@kripken kripken left a comment

Sounds good about compiler support!

if (working.empty()) {
  finished = true;
  co_yield &state;
  co_return;
Member

Why does this need to yield before returning? Isn't the output in the right place already?

Member Author

A tricky thing here is that we need to prevent the coroutine from ever returning because we depend on its local state staying live for the lifetime of the outer DeltaDebugger. So we yield before returning here and below, then make sure we never resume the coroutine again.

Member

I see, thanks, that's what I was missing. Please document that, it is indeed tricky...

Member

Or, could we std::move the final state from the coroutine?

Member Author

Unfortunately there is not a great way to do that. This is by far the simplest approach I tried. Will add comments.

Comment thread src/support/coroutine.h Outdated
    return false;
  }
  PromiseType* await_resume() const noexcept { return promise; }
};
Member

Maybe add some docs for these classes? I'm not really sure what "GetPromise" means or does just from this code (which seems so generic as to do almost nothing but store a "promise"..?)

Member Author

All these methods are well-known to the compiler and configure the suspending and resuming behavior of our Generator utility. Unfortunately this is just a bunch of unavoidable boilerplate that doesn't do anything interesting (or comprehensible to non-experts). I'll document the interesting user-exposed methods, but for most of this there's not anything more to say than // Unavoidable boilerplate.

@tlively tlively requested review from juj and sbc100 April 29, 2026 20:20
@tlively
Member Author

tlively commented Apr 29, 2026

@sbc100, @juj, we'd like to land this PR introducing coroutine usage in Binaryen. But my understanding is that coroutines require Xcode 16 or above to be stable. Can we find a way to make that acceptable?
