Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion scripts/fuzz_opt.py
Original file line number Diff line number Diff line change
Expand Up @@ -2524,7 +2524,7 @@ def handle(self, wasm):
TrapsNeverHappen(),
CtorEval(),
Merge(),
# Split(), # https://github.com/WebAssembly/binaryen/issues/8510
Split(),
RoundtripText(),
ClusterFuzz(),
Two(),
Expand Down
83 changes: 78 additions & 5 deletions src/ir/module-splitting.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,12 @@
// placeholder function (and eventually to the original secondary
// function), allocating a new table slot for the placeholder if necessary.
//
// 4. Replace all references to each secondary module's functions in the
// primary module's and each other secondary module's table segments with
// references to imported placeholder functions.
// 4. Pre-create trampolines for secondary functions referenced in global
// initializers but not yet replace the function references. Replace all
// other references to each secondary module's functions in the primary
// module's and each other secondary module's table segments with
// references to trampoline functions that call imported placeholder
// functions.
//
// 5. Rewrite direct calls from primary functions to secondary functions to be
// indirect calls to their placeholder functions (and eventually to their
Expand All @@ -48,6 +51,9 @@
// import them in the secondary modules. If possible, move those module
// items instead to the secondary modules.
//
// 9. Delete unused trampoline functions pre-created in 3, since some global
// initializers end up referencing functions the same module.
//
// Functions can be used or referenced three ways in a WebAssembly module: they
// can be exported, called, or referenced with ref.func. The above procedure
// introduces a layer of indirection to each of those mechanisms that removes
Expand All @@ -73,7 +79,6 @@
// from the IR before splitting.
//
#include "ir/module-splitting.h"
#include "ir/export-utils.h"
#include "ir/find_all.h"
#include "ir/module-utils.h"
#include "ir/names.h"
Expand Down Expand Up @@ -336,6 +341,7 @@ struct ModuleSplitter {
void exportImportCalledPrimaryFunctions();
void setupTablePatching();
void shareImportableItems();
void cleanupUnusedTrampolines();

ModuleSplitter(Module& primary, const Config& config)
: config(config), primary(primary), tableManager(primary),
Expand All @@ -348,6 +354,7 @@ struct ModuleSplitter {
exportImportCalledPrimaryFunctions();
setupTablePatching();
shareImportableItems();
cleanupUnusedTrampolines();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to update the big comment at the top of the file to include this new step.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: 8501ed6

}
};

Expand Down Expand Up @@ -508,7 +515,9 @@ Name ModuleSplitter::getTrampoline(Name funcName) {
primary, std::string("trampoline_") + funcName.toString());
it->second = trampoline;

// Generate the call and the function.
// Generate the call and the function. We generate a direct call here, but
// this will be converted to a call_indirect in
// indirectCallsToSecondaryFunctions.
Comment on lines +518 to +520
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by comment improvement. Not related to this PR

std::vector<Expression*> args;
for (Index i = 0; i < oldFunc->getNumParams(); i++) {
args.push_back(builder.makeLocalGet(i, oldFunc->getLocalType(i)));
Expand Down Expand Up @@ -559,6 +568,21 @@ static void walkSegments(Walker& walker, Module* module) {
}

void ModuleSplitter::indirectReferencesToSecondaryFunctions() {
// Pre-create trampolines for secondary functions referenced in global
// initializers. We don't mutate the globals yet (that happens later in
// shareImportableItems), but we must create the trampolines now so they
// are processed correctly by indirectCallsToSecondaryFunctions and
// setupTablePatching.
for (auto& global : primary.globals) {
if (global->init) {
for (auto* ref : FindAll<RefFunc>(global->init).list) {
if (allSecondaryFuncs.count(ref->func)) {
getTrampoline(ref->func);
}
}
}
}

// Turn references to secondary functions into references to thunks that
// perform a direct call to the original referent. The direct calls in the
// thunks will be handled like all other cross-module calls later, in
Expand Down Expand Up @@ -1222,6 +1246,55 @@ void ModuleSplitter::shareImportableItems() {
}
}

void ModuleSplitter::cleanupUnusedTrampolines() {
// We pre-created trampolines for all functions referenced in global
// initializers' ref.funcs in indirectReferencesToSecondaryFunctions.
// But after splitting module items, if a trampoline is not used in the
// primary module, it means it was created for a global initializer but that
// global ended up moving to the same secondary module as the function
// referenced in the global initializer's ref.func. We can remove them here.
// TODO Remove placeholders and table elements created by unused trampolines
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix conservatively creates extra trampolines, but then we have to clean them up afterwards. What would it look like to fix the bug by instead moving the globals to secondary modules earlier or being more precise about which globals need to use trampolines in their initializers? That way we wouldn't have to clean up afterwards.

Copy link
Copy Markdown
Member Author

@aheejin aheejin Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long delay!

I think this is doable, but this basically means we split shareImportableItems into two methods, one for globals and the other for tables/memorys/tags, or, one for global/memory/tags and the other for tables, because, we can't move table analysis before setupTablePatching. I think the current way can be a cleaner compromise than having two split functions of shareImporTableItems. WDYT? If you feel strongly with the splitting globals first, I can try.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to handle globals separately from the rest of the items handled in shareImportableItems. We're now moving globals into different modules, which makes them more similar to functions than they are to the rest of the items here. My intuition is that moving globals to secondary modules right after we move functions to secondary modules and before we start creating any trampolines at all will be the simplest solution because we wouldn't have to treat globals or their initializers any differently from other module-level code after that.

std::unordered_set<Name> usedFuncs;

struct FuncrefCollector : public PostWalker<FuncrefCollector> {
std::vector<Name>& used;
FuncrefCollector(std::vector<Name>& used) : used(used) {}
void visitRefFunc(RefFunc* curr) { used.push_back(curr->func); }
};

ModuleUtils::ParallelFunctionAnalysis<std::vector<Name>> analysis(
primary, [&](Function* func, std::vector<Name>& used) {
if (!func->imported()) {
FuncrefCollector(used).walkFunction(func);
}
});

for (auto& [_, usedList] : analysis.map) {
usedFuncs.insert(usedList.begin(), usedList.end());
}

std::vector<Name> moduleCodeUsed;
FuncrefCollector(moduleCodeUsed).walkModuleCode(&primary);
usedFuncs.insert(moduleCodeUsed.begin(), moduleCodeUsed.end());

for (auto& ex : primary.exports) {
if (ex->kind == ExternalKind::Function) {
usedFuncs.insert(*ex->getInternalName());
}
}

std::vector<Name> trampolinesToRemove;
for (auto& [_, trampoline] : trampolineMap) {
if (!usedFuncs.count(trampoline)) {
trampolinesToRemove.push_back(trampoline);
}
}
for (auto& name : trampolinesToRemove) {
primary.removeFunction(name);
primaryFuncs.erase(name);
}
}

} // anonymous namespace

Results splitFunctions(Module& primary, const Config& config) {
Expand Down
43 changes: 0 additions & 43 deletions test/lit/wasm-split/global-funcref.wast

This file was deleted.

51 changes: 51 additions & 0 deletions test/lit/wasm-split/global-reffunc.wast
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
;; RUN: wasm-split %s -all -g -o1 %t.1.wasm -o2 %t.2.wasm --keep-funcs=keep
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was renamed from global-funcref.wast‎, and the expectations were rewritten. Previously I tried to put each line of expectation above the corresponding code line, but it's not easy when we have many other things like imports, exports, and trampolines.

;; RUN: wasm-dis %t.1.wasm | filecheck %s --check-prefix PRIMARY
;; RUN: wasm-dis %t.2.wasm | filecheck %s --check-prefix SECONDARY

;; When a split global ($a here)'s initializer contains a ref.func of a split
;; function, we should NOT create any trampolines, and the split global should
;; direclty refer to the function.

(module
(global $a funcref (ref.func $split))
(global $b funcref (ref.func $keep))

(func $keep)

(func $split
(drop
(global.get $a)
)
(drop
(global.get $b)
)
)
)

;; PRIMARY: (module
;; PRIMARY-NEXT: (type $0 (func))
;; PRIMARY-NEXT: (import "placeholder.deferred" "0" (func $placeholder_0))
;; PRIMARY-NEXT: (table $0 1 funcref)
;; PRIMARY-NEXT: (elem $0 (i32.const 0) $placeholder_0)
;; PRIMARY-NEXT: (export "table" (table $0))
;; PRIMARY-NEXT: (export "keep" (func $keep))
;; PRIMARY-NEXT: (func $keep
;; PRIMARY-NEXT: )
;; PRIMARY-NEXT: )

;; SECONDARY: (module
;; SECONDARY-NEXT: (type $0 (func))
;; SECONDARY-NEXT: (import "primary" "table" (table $timport$0 1 funcref))
;; SECONDARY-NEXT: (import "primary" "keep" (func $keep (exact)))
;; SECONDARY-NEXT: (global $a funcref (ref.func $split))
;; SECONDARY-NEXT: (global $b funcref (ref.func $keep))
;; SECONDARY-NEXT: (elem $0 (i32.const 0) $split)
;; SECONDARY-NEXT: (func $split
;; SECONDARY-NEXT: (drop
;; SECONDARY-NEXT: (global.get $a)
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: (drop
;; SECONDARY-NEXT: (global.get $b)
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: )
53 changes: 53 additions & 0 deletions test/lit/wasm-split/global-reffunc2.wast
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
;; RUN: wasm-split %s -all -g -o1 %t.1.wasm -o2 %t.2.wasm --split-funcs=split1,split2
;; RUN: wasm-dis %t.1.wasm | filecheck %s --check-prefix PRIMARY
;; RUN: wasm-dis %t.2.wasm | filecheck %s --check-prefix SECONDARY

;; Global $g1 is used (exported) in the primary module so it can't move, and
;; global $g2 is only used in the secondary module so it will move there.

(module
(global $g1 funcref (ref.func $split1))
(global $g2 funcref (ref.func $split2))
(export "g1" (global $g1))

(func $split1
(unreachable)
)

(func $split2
(drop
(global.get $g2)
)
)
)

;; PRIMARY-NEXT: (module
;; PRIMARY-NEXT: (type $0 (func))
;; PRIMARY-NEXT: (import "placeholder.deferred" "0" (func $placeholder_0))
;; PRIMARY-NEXT: (import "placeholder.deferred" "1" (func $placeholder_1))
;; PRIMARY-NEXT: (global $g1 funcref (ref.func $trampoline_split1))
;; PRIMARY-NEXT: (table $0 2 funcref)
;; PRIMARY-NEXT: (elem $0 (i32.const 0) $placeholder_0 $placeholder_1)
;; PRIMARY-NEXT: (export "g1" (global $g1))
;; PRIMARY-NEXT: (export "table" (table $0))
;; PRIMARY-NEXT: (func $trampoline_split1
;; PRIMARY-NEXT: (call_indirect (type $0)
;; PRIMARY-NEXT: (i32.const 0)
;; PRIMARY-NEXT: )
;; PRIMARY-NEXT: )
;; PRIMARY-NEXT: )

;; SECONDARY-NEXT: (module
;; SECONDARY-NEXT: (type $0 (func))
;; SECONDARY-NEXT: (import "primary" "table" (table $timport$0 2 funcref))
;; SECONDARY-NEXT: (global $g2 funcref (ref.func $split2))
;; SECONDARY-NEXT: (elem $0 (i32.const 0) $split1 $split2)
;; SECONDARY-NEXT: (func $split1
;; SECONDARY-NEXT: (unreachable)
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: (func $split2
;; SECONDARY-NEXT: (drop
;; SECONDARY-NEXT: (global.get $g2)
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: )
;; SECONDARY-NEXT: )
6 changes: 3 additions & 3 deletions test/lit/wasm-split/ref.func.wast
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@

;; SECONDARY: (import "primary" "prime" (func $prime (exact (type $0))))

;; SECONDARY: (elem $0 (i32.const 0) $second-in-table $second)
;; SECONDARY: (elem $0 (i32.const 0) $second $second-in-table)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on the trampoline creating order. Not meaningful


;; SECONDARY: (elem declare func $prime)

Expand Down Expand Up @@ -109,13 +109,13 @@
;; (but we will get a placeholder, as all split-out functions do).
)
)
;; PRIMARY: (func $trampoline_second-in-table (type $0)
;; PRIMARY: (func $trampoline_second (type $0)
;; PRIMARY-NEXT: (call_indirect $1 (type $0)
;; PRIMARY-NEXT: (i32.const 0)
;; PRIMARY-NEXT: )
;; PRIMARY-NEXT: )

;; PRIMARY: (func $trampoline_second (type $0)
;; PRIMARY: (func $trampoline_second-in-table (type $0)
;; PRIMARY-NEXT: (call_indirect $1 (type $0)
;; PRIMARY-NEXT: (i32.const 1)
;; PRIMARY-NEXT: )
Expand Down
Loading