Feat: sm90 support mxfp8 fp8 groupgemm #362

Open
zhangxiaolei123456 wants to merge 17 commits into deepseek-ai:main from zhangxiaolei123456:feat/sm90-mxfp8-fp8-main

zhangxiaolei123456 commented Jun 16, 2026

Accuracy test:

python -m pytest tests/test_sm90_mxfp8_fp8.py -q

Result: OK

Performance test:

python -m pytest tests/test_sm90_mxfp8_fp8.py::test_m_grouped_mxfp8_vs_fp8_perf_contiguous_and_masked -q -s
| kernel | active M | MXFP8 (us) | FP8 (us) | MXFP8 TFLOPS | FP8 TFLOPS | speedup | diff |
|---|---|---|---|---|---|---|---|
| contiguous | 512 | 23 | 31 | 45.9 | 35.0 | 1.31x | 0.0462 |
| masked | 320 | 22 | 32 | 30.4 | 21.0 | 1.45x | 0.0229 |
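
For readers checking the numbers: the speedup column appears to match the ratio of the two TFLOPS columns. The sketch below recomputes it from the values reported above. The helper name `speedup_from_tflops` is made up for this illustration and is not part of the test file, and deriving speedup from the TFLOPS ratio is an assumption about how the test computes it.

```python
# Rows copied from the results above:
# (kernel, MXFP8 TFLOPS, FP8 TFLOPS, reported speedup)
rows = [
    ("contiguous", 45.9, 35.0, 1.31),
    ("masked", 30.4, 21.0, 1.45),
]


def speedup_from_tflops(mxfp8_tflops: float, fp8_tflops: float) -> float:
    """Hypothetical helper: throughput ratio rounded to two decimals."""
    return round(mxfp8_tflops / fp8_tflops, 2)


for kernel, mxfp8_tflops, fp8_tflops, reported in rows:
    derived = speedup_from_tflops(mxfp8_tflops, fp8_tflops)
    print(f"{kernel}: derived {derived:.2f}x, reported {reported:.2f}x")
```

The ratio of the two time columns is only roughly the same (for example 31 / 23 for the contiguous row), which would be consistent with the times being printed as whole microseconds.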
