[WIP] Large scale rejection sampling with VLLM #18
Not intended to merge.

This is a POC using DeepSeek R1 (70B distillation); for some reason I'm running into connectivity issues with Qwen for large-scale rejection sampling.
If you squint, it's a form of RL: you have a policy (the generated kernels) with a tradeoff between explore (generate completely random samples) and exploit (generate using the best existing samples). It's not gradient-based RL, but the infrastructure for that tends to be more complex, whereas with this approach you can rely purely on inference engines. A sketch of the loop is below.
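A minimal sketch of that explore/exploit policy, assuming hypothetical `generate` and `benchmark` helpers that stand in for the vLLM call and the kernel harness (neither is pinned down in this PR), and assumed values for the explore fraction and pool size:

```python
import random

EPSILON = 0.3    # explore fraction (assumed value, not from this PR)
POOL_SIZE = 8    # how many best kernels to keep as exemplars (also assumed)

def step(operation, best, generate, benchmark):
    """One rejection-sampling step. `generate` and `benchmark` are
    hypothetical callables: generate(prompt) -> kernel source, and
    benchmark(src) -> latency in ms, or None if the candidate fails."""
    if not best or random.random() < EPSILON:
        # Explore: a bare prompt, no conditioning on previous samples.
        prompt = f"Write a fast CUDA kernel for {operation}."
    else:
        # Exploit: seed the prompt with the best kernels found so far.
        exemplars = "\n\n".join(src for _, src in best[:3])
        prompt = (f"The fastest {operation} kernels so far:\n\n"
                  f"{exemplars}\n\nWrite an even faster variant.")
    src = generate(prompt)
    latency = benchmark(src)
    if latency is not None:              # rejection: drop failing candidates
        best.append((latency, src))
        best.sort(key=lambda t: t[0])    # lower latency is better
        del best[POOL_SIZE:]
    return best
```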
Run with:

```
python scripts/run_vllm_prototype.py --operations relu
```

Model: `deepseek-ai/DeepSeek-R1-Distill-Llama-70B`
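For reference, a sketch of how batched sampling against vLLM's OpenAI-compatible server might look, assuming the server was launched with `vllm serve deepseek-ai/DeepSeek-R1-Distill-Llama-70B` on the default port (the actual client code in this PR may differ):

```python
from openai import OpenAI

# Assumes `vllm serve deepseek-ai/DeepSeek-R1-Distill-Llama-70B` is
# running on the default port; the api_key is ignored by vLLM.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def generate(prompt: str, n: int = 8) -> list[str]:
    """Sample n candidate kernels in a single batched request."""
    resp = client.completions.create(
        model="deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
        prompt=prompt,
        n=n,
        max_tokens=2048,
        temperature=1.0,   # keep sampling hot for diversity
    )
    return [c.text for c in resp.choices]
```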
The core idea is that we keep a pool of the best kernels found so far: each new sample either explores (a fresh generation from scratch) or exploits (a generation seeded with kernels from the pool), and executors benchmark every candidate to decide whether it enters the pool.
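On the executor side, a candidate only enters the pool if it compiles, produces correct output, and gets timed. A plausible version of that filter for the relu case, with a hypothetical `compile_kernel` helper since the actual harness isn't shown here:

```python
import torch

def evaluate(candidate_src, compile_kernel):
    """Return mean latency in ms, or None to reject the candidate.
    `compile_kernel` is a hypothetical helper that turns CUDA source
    into a Python callable and raises on compilation failure."""
    try:
        kernel = compile_kernel(candidate_src)
    except Exception:
        return None                                   # reject: doesn't compile
    x = torch.randn(1 << 20, device="cuda")
    if not torch.allclose(kernel(x), torch.relu(x)):  # reference check for relu
        return None                                   # reject: wrong output
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(100):
        kernel(x)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / 100              # mean latency, ms
```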
An obvious problem is that utilization is insanely low for the executors, but that might be unavoidable given noisy-neighbor problems.
Prototyping on 8 GPUs, but the goal is for this to run on 1,000.
