huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

83 Open 2,122 Closed

Pull requests list

docs: Add RapidFire AI integration guide

#4340 opened Oct 26, 2025 by kamran-rapidfireAI • Draft

5 tasks

[vllm] update comment about communication group host ip

#4337 opened Oct 24, 2025 by kashif

5 tasks

Update SFT QLoRA notebook with **14B** model on free Colab

#4336 opened Oct 24, 2025 by sergiopaniego

5 tasks

Added custom prepare_model_for_kbit_training to save VRAM

#4335 opened Oct 24, 2025 by sergiopaniego

5 tasks

feat(trainer): add PAPOTrainer for preference-based optimization

#4334 opened Oct 24, 2025 by SolarWindRider

4 tasks done

Update Reducing Memory Consumption guide with more details

#4332 opened Oct 24, 2025 by sergiopaniego

5 tasks

Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests

#4331 opened Oct 23, 2025 by albertvillanova

Implement CI test workflow for experimental module

#4330 opened Oct 23, 2025 by albertvillanova

Move tests of experimental GRPO with replay buffer to tests/experimental

#4329 opened Oct 23, 2025 by albertvillanova

Use explicit tiny-Qwen2_5_VL model_id parameter in CI tests

#4325 opened Oct 23, 2025 by albertvillanova

wip - env

#4320 opened Oct 22, 2025 by qgallouedec

5 tasks

refactor: simplify parameter freezing in modeling_base.py

#4305 opened Oct 20, 2025 by Ki-Seki

2 of 5 tasks

GRPO: ScaleRL -> Support casting LM Head to FP32

#4303 opened Oct 18, 2025 by pramodith

4 of 5 tasks

[SFT] Log mean token accuracy from Liger kernel

#4302 opened Oct 18, 2025 by kashif

5 tasks

Tool call

#4300 opened Oct 18, 2025 by qgallouedec • Draft

5 tasks

Add CISPO loss option and documentation

#4298 opened Oct 16, 2025 by gustavorubim

feat: Add Multi-Token Prediction (MTP) support to SFTTrainer

#4290 opened Oct 15, 2025 by KLGR123

fix CI issue for vlm_gemma_3n model

#4278 opened Oct 15, 2025 by kaixuanliu

[SFT] add support for unified conversion logic for both images and videos

#4264 opened Oct 13, 2025 by kashif

Remove FSDP1 support: use FSDP2 exclusively

#4260 opened Oct 11, 2025 by behroozazarkhalili

Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)

#4257 opened Oct 11, 2025 by FabianSchuetze

1 of 3 tasks

Online-dpo-ben

#4252 opened Oct 10, 2025 by burtenshaw • Draft

5 tasks

[Utils] fix DataCollatorForChatML

#4231 opened Oct 8, 2025 by kashif • Draft

Add support for Python 3.14

#4225 opened Oct 8, 2025 by albertvillanova

Update max_length explanation for VLM trainers

#4220 opened Oct 7, 2025 by sergiopaniego

5 tasks
