Merged
2 changes: 1 addition & 1 deletion qa/L0_jax_unittest/test.sh
@@ -45,7 +45,7 @@ NVTE_JAX_CUSTOM_CALLS="false" python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini
 # Exercise the docs/examples/jax tutorials. The multi-GPU tests are
 # skipped at runtime when fewer than 4 devices are visible, so this is safe on
 # single-GPU runners.
-python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini -v --junitxml=$XML_LOG_DIR/pytest_docs_examples_jax.xml $TE_PATH/docs/examples/jax/ || test_fail "docs/examples/jax"
+CUDA_VISIBLE_DEVICES=0 python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini -v --junitxml=$XML_LOG_DIR/pytest_docs_examples_jax.xml $TE_PATH/docs/examples/jax/ || test_fail "docs/examples/jax"
Comment on lines 45 to +48
Contributor

P2 The comment above still reflects the old rationale — that multi-GPU tests would self-skip when fewer than 4 devices are visible. Now that CUDA_VISIBLE_DEVICES=0 actively enforces single-GPU visibility, that explanation is stale and slightly misleading. Updating it prevents future readers from removing the env var under the mistaken belief that runtime skipping is a sufficient guard.

Suggested change
# Exercise the docs/examples/jax tutorials. The multi-GPU tests are
# skipped at runtime when fewer than 4 devices are visible, so this is safe on
# single-GPU runners.
python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini -v --junitxml=$XML_LOG_DIR/pytest_docs_examples_jax.xml $TE_PATH/docs/examples/jax/ || test_fail "docs/examples/jax"
CUDA_VISIBLE_DEVICES=0 python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini -v --junitxml=$XML_LOG_DIR/pytest_docs_examples_jax.xml $TE_PATH/docs/examples/jax/ || test_fail "docs/examples/jax"
# Exercise the docs/examples/jax tutorials. CUDA_VISIBLE_DEVICES=0 restricts
# JAX to a single GPU so that distributed tests are not attempted on
# multi-GPU runners.
CUDA_VISIBLE_DEVICES=0 python3 -m pytest -c $TE_PATH/tests/jax/pytest.ini -v --junitxml=$XML_LOG_DIR/pytest_docs_examples_jax.xml $TE_PATH/docs/examples/jax/ || test_fail "docs/examples/jax"

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
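
For context, here is a minimal sketch of the scoping behaviour the suggested comment relies on: a `VAR=value` prefix applies only to the command it precedes, so the rest of test.sh keeps whatever device visibility the runner provides. The `inside` and `after` variables are hypothetical names used only for this illustration, and the leading `unset` is an assumption made to start the demo from a clean state; neither is part of test.sh.

```shell
# Assumption for the demo only: start with the variable unset.
unset CUDA_VISIBLE_DEVICES

# The prefix form sets the variable for this one child process only.
inside=$(CUDA_VISIBLE_DEVICES=0 sh -c 'echo "${CUDA_VISIBLE_DEVICES:-unset}"')

# Back in the calling shell, the variable is still unset.
after="${CUDA_VISIBLE_DEVICES:-unset}"

echo "inside=$inside after=$after"
```

This is why the prefix form is preferable to an `export` here: only the docs/examples/jax invocation is restricted to device 0, and later commands in the script are unaffected.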


if [ $RET -ne 0 ]; then
echo "Error: some sub-tests failed: $FAILED_CASES"