diff --git a/benchmarks/README.md b/benchmarks/README.md index 957c2f988051..ee172642033d 100644 --- a/benchmarks/README.md +++ b/benchmarks/README.md @@ -95,6 +95,24 @@ become available. ✅ lmms-lab/LLaVA-OneVision-Data, Aeala/ShareGPT_Vicuna_unfiltered + + HuggingFace-MTBench + ✅ + ✅ + philschmid/mt-bench + + + HuggingFace-Blazedit + ✅ + ✅ + vdaita/edit_5k_char, vdaita/edit_10k_char + + + Spec Bench + ✅ + ✅ + wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl + Custom ✅ @@ -239,6 +257,43 @@ vllm bench serve \ --num-prompts 2048 ``` +### Spec Bench Benchmark with Speculative Decoding + +``` bash +VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \ + --speculative-config $'{"method": "ngram", + "num_speculative_tokens": 5, "prompt_lookup_max": 5, + "prompt_lookup_min": 2}' +``` + +[SpecBench dataset](https://github.com/hemingkx/Spec-Bench) + +Run all categories: + +``` bash +# Download the dataset using: +# wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl + +vllm bench serve \ + --model meta-llama/Meta-Llama-3-8B-Instruct \ + --dataset-name spec_bench \ + --dataset-path "/data/spec_bench/question.jsonl" \ + --num-prompts -1 +``` + +Available categories include `[writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag]`. + +Run only a specific category like "summarization": + +``` bash +vllm bench serve \ + --model meta-llama/Meta-Llama-3-8B-Instruct \ + --dataset-name spec_bench \ + --dataset-path "/data/spec_bench/question.jsonl" \ + --num-prompts -1 + --spec-bench-category "summarization" +``` + ### Other HuggingFaceDataset Examples ```bash @@ -295,6 +350,18 @@ vllm bench serve \ --num-prompts 80 ``` +`vdaita/edit_5k_char` or `vdaita/edit_10k_char`: + +``` bash +vllm bench serve \ + --model Qwen/QwQ-32B \ + --dataset-name hf \ + --dataset-path vdaita/edit_5k_char \ + --num-prompts 90 \ + --blazedit-min-distance 0.01 \ + --blazedit-max-distance 0.99 +``` + ### Running With Sampling Parameters When using OpenAI-compatible backends such as `vllm`, optional sampling