Describe the bug
Padding in the transformers model is not efficient:
padding="max_length",  # we pad to the longest sequence
As the comment says, "we pad to the longest sequence", so it should be padding="longest" instead.
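For context, here is a minimal sketch (not lighteval's actual code) of how the two padding strategies behave with a Hugging Face tokenizer; the model name, example prompts, and max_length value are only illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

texts = ["short prompt", "a slightly longer prompt for comparison"]

# padding="max_length" pads every sequence up to max_length,
# regardless of the longest sequence actually present in the batch.
padded_to_max = tokenizer(texts, padding="max_length", max_length=1024, return_tensors="pt")
print(padded_to_max["input_ids"].shape)  # e.g. (2, 1024)

# padding="longest" only pads up to the longest sequence in the batch.
padded_to_longest = tokenizer(texts, padding="longest", return_tensors="pt")
print(padded_to_longest["input_ids"].shape)  # e.g. (2, 8), much smaller
```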
To Reproduce
lighteval accelerate \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0"
Expected behavior
The padded sequence length does not have to be the model's max length; padding to the longest sequence in the batch is enough.
Version info
0.8.0