Remove FSDP1 support: use FSDP2 exclusively #4260

behroozazarkhalili · 2025-10-11T14:43:35Z

Summary

With FSDP2 now stable, this PR removes all FSDP1-specific code to reduce maintenance burden.

Changes

Code Changes

Removed _sync_fsdp1_params_to_vllm() method from RLOO, GRPO, and Online DPO trainers
Removed fsdp_version detection and conditional logic in all trainers
Replaced FSDP1 version checks with direct calls to _sync_fsdp2_params_to_vllm()
All vLLM weight synchronization now uses FSDP2 methods exclusively
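
The shape of this change can be sketched as follows. This is a minimal illustration, not the TRL source: the `BeforeTrainer` and `AfterTrainer` classes, the `sync` method, and the way `fsdp_version` is stored are all hypothetical. Only the two `_sync_fsdp*_params_to_vllm` method names come from this PR, and their bodies here are placeholders.

```python
class BeforeTrainer:
    """Hypothetical stand-in for a trainer before this PR (not the TRL class)."""

    def __init__(self, fsdp_version):
        # Assumption: the detected FSDP version is kept on the instance.
        self.fsdp_version = fsdp_version
        self.calls = []

    def _sync_fsdp1_params_to_vllm(self, module):
        # Placeholder body; the real method pushed gathered weights to vLLM.
        self.calls.append("fsdp1")

    def _sync_fsdp2_params_to_vllm(self, module):
        # Placeholder body; the real method pushes weights to vLLM.
        self.calls.append("fsdp2")

    def sync(self, module):
        # Conditional dispatch on the FSDP version (the logic this PR removes).
        if self.fsdp_version == 1:
            self._sync_fsdp1_params_to_vllm(module)
        else:
            self._sync_fsdp2_params_to_vllm(module)


class AfterTrainer:
    """Hypothetical stand-in for a trainer after this PR (not the TRL class)."""

    def __init__(self):
        self.calls = []

    def _sync_fsdp2_params_to_vllm(self, module):
        # Placeholder body; the real method pushes weights to vLLM.
        self.calls.append("fsdp2")

    def sync(self, module):
        # No version detection: the FSDP2 path is called directly.
        self._sync_fsdp2_params_to_vllm(module)
```

With this shape, callers no longer need to know which FSDP version is active; there is a single synchronization path to test and maintain.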

Configuration Changes

Deleted trl/accelerate_configs/fsdp1.yaml
Deleted examples/accelerate_configs/fsdp1.yaml

Documentation Changes

Updated docs/source/clis.md to remove fsdp1 from predefined config profiles table

Technical Details

FSDP1 Implementation (Removed):

Used recursive post-order traversal of FSDP modules
Required FSDP.summon_full_params(module, recurse=False, writeback=False) for each module
Manually tracked visited parameters to avoid duplication

FSDP2 Implementation (Now Standard):

Uses module.state_dict() which automatically handles all parameters
No manual recursion or parameter tracking needed
Simpler and more maintainable code
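
The contrast between the two approaches can be sketched with a toy module tree. This is a pure-Python illustration under stated assumptions, not the removed or retained TRL code: `ToyModule`, `sync_fsdp1_style`, `sync_fsdp2_style`, and the `send` callback are hypothetical names, and no real FSDP or vLLM calls are made. Where the FSDP1 code would have entered `FSDP.summon_full_params(module, recurse=False, writeback=False)`, the sketch only marks the spot with a comment.

```python
class ToyModule:
    """Hypothetical stand-in for a wrapped module (not torch.nn.Module)."""

    def __init__(self, params=None, children=None):
        self.params = dict(params or {})      # parameters owned by this module
        self.children = dict(children or {})  # named submodules

    def state_dict(self, prefix=""):
        # Flat mapping of every parameter in the tree, keyed by dotted name.
        out = {}
        for name, value in self.params.items():
            out[prefix + name] = value
        for child_name, child in self.children.items():
            out.update(child.state_dict(prefix + child_name + "."))
        return out


def sync_fsdp1_style(module, send, prefix="", visited=None):
    """Recursive post-order traversal with manual duplicate tracking."""
    if visited is None:
        visited = set()
    # Post-order: handle children before the module itself.
    for child_name, child in module.children.items():
        sync_fsdp1_style(child, send, prefix + child_name + ".", visited)
    # The FSDP1 code would gather this module's full parameters here, via
    # FSDP.summon_full_params(module, recurse=False, writeback=False).
    for name, value in module.params.items():
        full_name = prefix + name
        if full_name in visited:
            continue  # duplicate guard; never triggers in this toy tree
        visited.add(full_name)
        send(full_name, value)


def sync_fsdp2_style(module, send):
    """Single pass over state_dict(); no recursion or tracking in caller code."""
    for name, value in module.state_dict().items():
        send(name, value)


model = ToyModule({"bias": 0.5}, {"layer": ToyModule({"weight": 1.0})})
sent_v1, sent_v2 = {}, {}
sync_fsdp1_style(model, sent_v1.__setitem__)
sync_fsdp2_style(model, sent_v2.__setitem__)
```

Both functions deliver the same set of named parameters in this toy; the second simply delegates the traversal to `state_dict()`, which is the simplification described above.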

Files Changed

trl/trainer/rloo_trainer.py
trl/trainer/grpo_trainer.py
trl/trainer/online_dpo_trainer.py
docs/source/clis.md
trl/accelerate_configs/fsdp1.yaml (deleted)
examples/accelerate_configs/fsdp1.yaml (deleted)

With FSDP2 now stable, this commit removes all FSDP1-specific code to reduce maintenance burden.

Changes:
- Removed _sync_fsdp1_params_to_vllm() method from RLOO, GRPO, and Online DPO trainers
- Removed fsdp_version detection and conditional logic in all trainers
- Replaced FSDP1 version checks with direct calls to _sync_fsdp2_params_to_vllm()
- Deleted fsdp1.yaml config files from trl/accelerate_configs and examples/accelerate_configs
- Updated docs/source/clis.md to remove fsdp1 from predefined config profiles table

Technical Details:
- FSDP1 used recursive post-order traversal with FSDP.summon_full_params()
- FSDP2 uses module.state_dict() which covers all parameters automatically
- All vLLM weight synchronization now uses FSDP2 methods exclusively

HuggingFaceDocBuilderDev · 2025-10-11T14:46:10Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2025-10-28T21:14:41Z

It seems QLoRA isn't supported with FSDP2, so let's put this on hold.

qgallouedec and others added 3 commits October 15, 2025 16:32

lil rename

669be75

Merge branch 'main' into remove-fsdp1-support

5c0e6ed

Merge branch 'main' into remove-fsdp1-support

7053a9f

behroozazarkhalili enabled auto-merge (squash) October 16, 2025 02:29

Merge branch 'main' into remove-fsdp1-support

e950546

qgallouedec mentioned this pull request Oct 30, 2025

Remove support for FSDP1 #4387

Open
