Different results for gsm8k via lighteval compared to internal pipeline #202

@Mugariya

Hi,

I get a score of zero on gsm8k with lighteval for one of the models, but for the same model our internal pipeline gives noticeably better results, at least in the 20s. Any idea why this could be happening? And how does the lm_eval gsm8k task differ from the lighteval one?

Thanks
