72 changes: 72 additions & 0 deletions benchmarks/README.md
@@ -95,6 +95,30 @@
<td style="text-align: center;">✅</td>
<td><code>lmms-lab/LLaVA-OneVision-Data</code>, <code>Aeala/ShareGPT_Vicuna_unfiltered</code></td>
</tr>
<tr>
<td><strong>HuggingFace-MTBench</strong></td>
<td style="text-align: center;">✅</td>
<td style="text-align: center;">✅</td>
<td><code>philschmid/mt-bench</code></td>
</tr>
<tr>
<td><strong>HuggingFace-Blazedit</strong></td>
<td style="text-align: center;">✅</td>
<td style="text-align: center;">✅</td>
<td><code>vdaita/edit_5k_char</code>, <code>vdaita/edit_10k_char</code></td>
</tr>
<tr>
<td><strong>Spec Bench</strong></td>
<td style="text-align: center;">✅</td>
<td style="text-align: center;">✅</td>
<td><code>wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl</code></td>
</tr>
<tr>
<td><strong>Custom</strong></td>
<td style="text-align: center;">✅</td>
@@ -239,6 +263,42 @@
--num-prompts 2048
```

### Spec Bench Benchmark with Speculative Decoding

``` bash
VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
--speculative-config $'{"method": "ngram",
"num_speculative_tokens": 5, "prompt_lookup_max": 5,
"prompt_lookup_min": 2}'
```
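
The JSON passed to `--speculative-config` can also be generated instead of hand-written. The following is a minimal sketch, not part of vLLM: `build_serve_command` is a hypothetical helper, and it only assumes the flag accepts a single-line JSON string with the same keys as above.

```python
import json
import shlex


def build_serve_command(model: str, speculative_config: dict) -> str:
    """Return a shell-safe `vllm serve` command line (hypothetical helper)."""
    # Compact single-line JSON avoids embedding newlines in the shell argument.
    config_json = json.dumps(speculative_config, separators=(",", ":"))
    args = ["vllm", "serve", model, "--speculative-config", config_json]
    # shlex.quote wraps the JSON in single quotes so the shell passes it verbatim.
    return "VLLM_USE_V1=1 " + " ".join(shlex.quote(arg) for arg in args)


# Same n-gram settings as the command above.
ngram_config = {
    "method": "ngram",
    "num_speculative_tokens": 5,
    "prompt_lookup_max": 5,
    "prompt_lookup_min": 2,
}

command = build_serve_command("meta-llama/Meta-Llama-3-8B-Instruct", ngram_config)
print(command)
```

Generating the string this way keeps the quoting correct when the settings are changed from a script.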

SpecBench dataset: <https://github.com/hemingkx/Spec-Bench>.

Download the dataset using:

`wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl`

Run all categories:

``` bash
vllm bench serve \
--model meta-llama/Meta-Llama-3-8B-Instruct \
--dataset-name spec_bench \
--dataset-path "<YOUR_DOWNLOADED_PATH>/data/spec_bench/question.jsonl" \
--num-prompts -1
```

Available categories include `[writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag]`.
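
To see how many prompts each category contributes before choosing one, the downloaded file can be inspected directly. This is a minimal sketch that assumes each line of `question.jsonl` is a JSON object with a `category` key; `count_categories` is an illustrative helper, not part of `vllm bench`.

```python
import json
from collections import Counter


def count_categories(path: str) -> Counter:
    """Count prompts per category in a Spec-Bench style JSONL file.

    Assumes one JSON object per line with a "category" key; records
    without that key are counted under "<missing>".
    """
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            counts[record.get("category", "<missing>")] += 1
    return counts
```

For example, `count_categories("question.jsonl")` returns a `Counter` mapping each category name to its number of prompts.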

Run only a specific category like "summarization":

``` bash
vllm bench serve \
--model meta-llama/Meta-Llama-3-8B-Instruct \
--dataset-name spec_bench \
--dataset-path "<YOUR_DOWNLOADED_PATH>/data/spec_bench/question.jsonl" \
--num-prompts -1 \
--spec-bench-category "summarization"
```

### Other HuggingFaceDataset Examples

```bash
@@ -295,6 +355,18 @@
--num-prompts 80
```

`vdaita/edit_5k_char` or `vdaita/edit_10k_char`:

``` bash
vllm bench serve \
--model Qwen/QwQ-32B \
--dataset-name hf \
--dataset-path vdaita/edit_5k_char \
--num-prompts 90 \
--blazedit-min-distance 0.01 \
--blazedit-max-distance 0.99
```

### Running With Sampling Parameters

When using OpenAI-compatible backends such as `vllm`, optional sampling