Commit 59bc44c

minor doc update
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent 50b6b7e commit 59bc44c

1 file changed

Lines changed: 13 additions & 0 deletions

File tree

examples/megatron_bridge/README.md

@@ -172,6 +172,19 @@ torchrun --nproc_per_node 8 distill.py \
To run the distillation script on a Slurm cluster for multi-node training, you just need to use `python` instead of `torchrun` and set the number of nodes with the `#SBATCH --nodes=<num_nodes>` directive in your Slurm script.
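That setup might look like the following Slurm batch script. This is a minimal sketch, not a script from this repository: the node count, task layout, time limit, and distillation arguments are illustrative placeholders.

```bash
#!/bin/bash
#SBATCH --nodes=2            # number of nodes for multi-node training
#SBATCH --ntasks-per-node=8  # one task per GPU
#SBATCH --gpus-per-node=8
#SBATCH --time=04:00:00

# Use `python` instead of `torchrun`; srun spawns one task per GPU across all nodes.
srun python distill.py \
    <distillation_arguments>  # same arguments as the single-node torchrun invocation
```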
### Convert Megatron checkpoint to Hugging Face format
To convert the Megatron checkpoint from the last iteration (or any intermediate iteration) to Hugging Face format, you need the pruned model config (the `--output_hf_path` from the `prune_minitron.py` script) and the distilled Megatron checkpoint directory (`<distill_output_dir>/checkpoints/iter_<iter_number>`). Then run the following command:
```bash
uv run python /opt/Megatron-Bridge/examples/conversion/convert_checkpoints.py export \
    --hf-model <path_to_pruned_hf_ckpt> \
    --megatron-path <distill_output_dir>/checkpoints/iter_<iter_number> \
    --hf-path <path_to_save_distilled_hf_ckpt>
```
For more details, you can refer to the checkpoint conversion scripts in the [Megatron-Bridge README](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/conversion).
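As a quick sanity check after the export, you can verify that the output directory has the standard Hugging Face checkpoint layout: a `config.json` plus weight files (`*.safetensors` or `pytorch_model*.bin`). The `check_hf_ckpt` helper below is an illustrative sketch, not part of the repository's scripts.

```bash
# Succeeds (exit 0) only if the directory contains config.json
# and at least one safetensors or pytorch_model*.bin weight file.
check_hf_ckpt() {
    local d="$1"
    [ -f "$d/config.json" ] || return 1
    ls "$d"/*.safetensors >/dev/null 2>&1 || ls "$d"/pytorch_model*.bin >/dev/null 2>&1
}
```

Run it as `check_hf_ckpt <path_to_save_distilled_hf_ckpt> && echo "looks like a HF checkpoint"`.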
## Quantization
TODO
