I get a score of zero on gsm8k with lighteval for one of my models, but with lm_eval the same model does somewhat better, scoring at least in the 20s. Any idea why this could be happening? And how does lm_eval's gsm8k differ from lighteval's gsm8k?