Describe the bug
Hi, after building xpu_timer, the relevant libraries could not be found until I packaged and installed them to /usr/local/lib and /usr/local/bin. Even after doing all this, the generated matmul_dp8.bin seems to be problematic.
Running xpu_timer_gen_tracing_timeline also seems to produce an SVG that does not properly reflect the timeline information. Is there more documentation, or a communication group, where these issues can be discussed?
"Stack count is low (8). Did something go wrong?" is displayed after the calltrace is generated.
To Reproduce
Build according to the xpu_timer documentation
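In case it helps narrow down the library lookup problem described above, here is a minimal sketch of how to check whether the dynamic loader can resolve a library and whether a matching file exists under /usr/local/lib. The library name `xpu_timer` used below is only a placeholder assumption, since the actual file names produced by the build are not listed in this report, and `library_visible` is a hypothetical helper, not part of xpu_timer.

```python
import ctypes.util
import os


def library_visible(name, extra_dirs=("/usr/local/lib",)):
    """Check loader resolution and on-disk presence for lib<name>*.

    `name` is the bare library name without the "lib" prefix or suffix.
    """
    # find_library returns None when the loader's search does not find the library.
    resolved = ctypes.util.find_library(name)

    # Separately list files that merely exist on disk in the given directories.
    on_disk = []
    for directory in extra_dirs:
        if os.path.isdir(directory):
            for entry in sorted(os.listdir(directory)):
                if entry.startswith("lib" + name):
                    on_disk.append(os.path.join(directory, entry))

    return {"resolved": resolved, "on_disk": on_disk}


if __name__ == "__main__":
    # "xpu_timer" is a placeholder; substitute the real library name from the build.
    print(library_visible("xpu_timer"))
```

If a file shows up under `on_disk` while `resolved` is empty, the file is installed but the loader is not picking it up, which usually points at the loader cache or search path rather than at the build itself.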
Logs or Screenshots


Expected behavior
A normal timeline is generated, with the SVG correctly reflecting the timeline information.
APP Info :
- xpu_timer: 0.4.0-rc
- Torch: 2.4.0
- CUDA: 12.5
- NCCL: 2.22.3
HARDWARE Info :