fix all2all support #201
| The fix looks good to me. Thanks for the fix. Let's wait until pytorch/pytorch#149485 is merged. | 
| Hi, @shengfukevin Then, can this Param PR be merged now? | 
| @shengfukevin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. | 
6b86d24 to 113c54c
  
    | @TaekyungHeo has updated the pull request. You must reimport the pull request before landing. | 
113c54c to 21efb25
  
    | @TaekyungHeo has updated the pull request. You must reimport the pull request before landing. | 
21efb25 to 83e653f
  
    | @TaekyungHeo has updated the pull request. You must reimport the pull request before landing. | 
| @shengfukevin please help review. thanks! | 
| will get it done this week. | 
| @shengfukevin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. | 
Summary
Fix replay support for all2all.
Depends on pytorch/pytorch#149485.
Test Plan
Constructed a 4-rank case that invokes torch.distributed.all_to_all() and torch.distributed.all_to_all_single(), then dumped the trace and replayed it.
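
For reference, here is a rough pure-Python model of the data movement that a 4-rank all_to_all / all_to_all_single exchange performs, which is what a replay has to reproduce. It is not the test script from this PR and does not call torch.distributed. The helper names (model_all_to_all, model_all_to_all_single) are hypothetical, plain lists stand in for tensors, and equal split sizes are assumed.

```python
from typing import List

WORLD_SIZE = 4  # matches the 4-rank case described in the test plan


def model_all_to_all(inputs: List[List[list]]) -> List[List[list]]:
    """Model of all_to_all (hypothetical helper, not a torch API).

    inputs[src][dst] is the buffer rank `src` sends to rank `dst`.
    Returns outputs where outputs[dst][src] is what `dst` received from `src`.
    """
    world = len(inputs)
    return [[inputs[src][dst] for src in range(world)] for dst in range(world)]


def model_all_to_all_single(inputs: List[list]) -> List[list]:
    """Model of all_to_all_single with equal splits (an assumption here).

    inputs[src] is one flat buffer per rank, cut into `world` equal chunks;
    chunk `dst` goes to rank `dst`. Each rank concatenates the chunks it
    receives in source-rank order.
    """
    world = len(inputs)
    outputs = []
    for dst in range(world):
        received = []
        for src in range(world):
            chunk = len(inputs[src]) // world
            received.extend(inputs[src][dst * chunk:(dst + 1) * chunk])
        outputs.append(received)
    return outputs


# Rank r holds [4r, 4r+1, 4r+2, 4r+3] before the single-buffer exchange.
single_in = [list(range(r * WORLD_SIZE, (r + 1) * WORLD_SIZE))
             for r in range(WORLD_SIZE)]
single_out = model_all_to_all_single(single_in)

# Rank src sends the one-element buffer [10 * src + dst] to rank dst.
list_in = [[[10 * src + dst] for dst in range(WORLD_SIZE)]
           for src in range(WORLD_SIZE)]
list_out = model_all_to_all(list_in)
```

In this model both collectives amount to a transpose over (source rank, destination rank), so a replayed trace can be sanity-checked by comparing each rank's output against the transposed inputs.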