From 1f06e19b2481d916c4748695b192ec3a3d9425c0 Mon Sep 17 00:00:00 2001
From: Xing Han Lu <21180505+xhluca@users.noreply.github.com>
Date: Wed, 30 Aug 2023 18:44:36 -0400
Subject: [PATCH] Fix typo in pytorch-ddp-accelerate-transformers.md

---
 pytorch-ddp-accelerate-transformers.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pytorch-ddp-accelerate-transformers.md b/pytorch-ddp-accelerate-transformers.md
index 07ac39f725..8815b7f5f2 100644
--- a/pytorch-ddp-accelerate-transformers.md
+++ b/pytorch-ddp-accelerate-transformers.md
@@ -173,7 +173,7 @@ The optimizer needs to be declared based on the model *on the specific device* (
 Lastly, to run the script PyTorch has a convenient `torchrun` command line module that can help. Just pass in the number of nodes it should use as well as the script to run and you are set:
 
 ```bash
-torchrun --nproc_per_nodes=2 --nnodes=1 example_script.py
+torchrun --nproc_per_node=2 --nnodes=1 example_script.py
 ```
 
 The above will run the training script on two GPUs that live on a single machine and this is the barebones for performing only distributed training with PyTorch.
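
For reference, the corrected `--nproc_per_node` flag tells `torchrun` how many processes to launch on each node (typically one per GPU), while `--nnodes` sets the number of machines. The patch does not include `example_script.py` itself; the following is only a minimal, hypothetical sketch of the kind of DDP training script such a command would launch, assuming one process per GPU and the NCCL backend, not the script from the post being patched.

```python
# Hypothetical example_script.py: a minimal DDP training step.
# torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each spawned process.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # Initialize the default process group from the env:// variables set by torchrun
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Place a toy model on this process's GPU and wrap it in DDP
    model = torch.nn.Linear(10, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    # One dummy training step per process; DDP all-reduces gradients across ranks
    inputs = torch.randn(8, 10).cuda(local_rank)
    targets = torch.randn(8, 1).cuda(local_rank)
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launching it with the corrected command, `torchrun --nproc_per_node=2 --nnodes=1 example_script.py`, would spawn two such processes on a single machine, each bound to its own GPU.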