# Quicktour

> [!TIP]
> We recommend using the `--help` flag to get more information about the
> available options for each command:
> `lighteval --help`
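
Each subcommand accepts the same flag; for example, to list the options of the
`accelerate` backend, run:

```bash
lighteval accelerate --help
```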

Lighteval can be used with a few different commands, depending on the backend
you want to use:

- `lighteval accelerate`: evaluate models on CPU or one or more GPUs using [🤗
  Accelerate](https://github.com/huggingface/accelerate)
- `lighteval nanotron`: evaluate models in distributed settings using [⚡️
  Nanotron](https://github.com/huggingface/nanotron)
- `lighteval vllm`: evaluate models on one or more GPUs using [🚀
  VLLM](https://github.com/vllm-project/vllm)
- `lighteval endpoint`: evaluate models through inference endpoints:
  - `inference-endpoint`: evaluate models on one or more GPUs using [🔗
    Inference Endpoints](https://huggingface.co/inference-endpoints/dedicated)
  - `tgi`: evaluate models on one or more GPUs using [🔗 Text Generation Inference](https://huggingface.co/docs/text-generation-inference/en/index)
  - `openai`: evaluate models through the [🔗 OpenAI API](https://platform.openai.com/)

## Accelerate

To evaluate `GPT-2` on the Truthful QA benchmark, run:

```bash
lighteval accelerate \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0"
```

Here, the tasks specification (`"leaderboard|truthfulqa:mc|0|0"` above) is
either a comma-separated list of supported tasks, each in the format
`{suite}|{task}|{num_few_shot}|{0 or 1 to automatically reduce the number of
few-shot examples if the prompt is too long}`, or the path to a file listing
the tasks to run.
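
For example, a comma-separated list lets you run several benchmarks in one go.
A sketch, assuming `leaderboard|arc:challenge` is available as a task in your
version of lighteval:

```bash
lighteval accelerate \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0,leaderboard|arc:challenge|25|0"
```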

You can then evaluate a model using data parallelism on 8 GPUs as follows:

```bash
accelerate launch --multi_gpu --num_processes=8 -m \
    lighteval accelerate \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0"
```

Here, if `--override_batch_size` is set, it defines the batch size per device,
so the effective batch size is `override_batch_size * num_processes` (a
per-device batch size of 1 on 8 GPUs, for example, yields an effective batch
size of 8).

To evaluate a model using pipeline parallelism on 2 or more GPUs, run:

```bash
lighteval accelerate \
    "pretrained=gpt2,model_parallel=True" \
    "leaderboard|truthfulqa:mc|0|0"
```

This will automatically use 🤗 Accelerate to distribute the model across the GPUs.

### Model Arguments

The `model-args` argument takes a string representing a comma-separated list of
model arguments. The allowed arguments vary depending on the backend you use
(vllm or accelerate).
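
For instance, with the accelerate backend several arguments can be combined in
one string. `revision` and `dtype` are shown here as typical options; check
`lighteval accelerate --help` for the exact set supported by your version:

```bash
lighteval accelerate \
    "pretrained=gpt2,revision=main,dtype=float16,model_parallel=True" \
    "leaderboard|truthfulqa:mc|0|0"
```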

## Nanotron

To evaluate a model trained with nanotron on a single GPU:

```bash
torchrun --standalone --nnodes=1 --nproc-per-node=1 \
    src/lighteval/__main__.py nanotron \
    --checkpoint-config-path ../nanotron/checkpoints/10/config.yaml \
    --lighteval-config-path examples/nanotron/lighteval_config_override_template.yaml
```

The `nproc-per-node` argument should match the data, tensor, and pipeline
parallelism configured for the run: the total number of processes launched must
equal `dp * tp * pp`.
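
As a sketch, assuming the lighteval config override sets the hypothetical
values `dp=2`, `tp=2`, and `pp=1` (4 processes in total), the launch command
becomes:

```bash
# 2 (dp) * 2 (tp) * 1 (pp) = 4 processes on a single node
torchrun --standalone --nnodes=1 --nproc-per-node=4 \
    src/lighteval/__main__.py nanotron \
    --checkpoint-config-path ../nanotron/checkpoints/10/config.yaml \
    --lighteval-config-path examples/nanotron/lighteval_config_override_template.yaml
```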