Revisit Log-Mel spectrogram computation #568

@ggerganov

Last time I checked, the Log-Mel spectrogram computed by whisper.cpp was not exactly identical to the one produced by the OpenAI implementation.

I think the spectrograms produced by the 2 methods should be pretty close to each other, because the transcription obviously works correctly. Nevertheless, it would be useful to compare the spectrograms in more detail and see if we can make the C++ code match the Python code more closely. Eliminating any differences in the audio input would make it easier to compare the transcription results between the 2 codebases.
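As a starting point for such a comparison, here is a minimal sketch of how the two spectrograms could be diffed once both have been dumped for the same audio file. It assumes each spectrogram is available as a list of mel-bin rows with one value per frame; the function name `compare_spectrograms` and that layout are illustrative assumptions, not something either codebase provides, and dumping the data from whisper.cpp and from the Python code is not shown.

```python
from typing import Sequence, Tuple

Spectrogram = Sequence[Sequence[float]]  # assumed layout: [mel_bin][frame]


def compare_spectrograms(
    a: Spectrogram, b: Spectrogram
) -> Tuple[float, float, Tuple[int, int]]:
    """Return (max_abs_diff, mean_abs_diff, (mel_bin, frame) of the largest diff).

    Hypothetical helper: both inputs must have the same shape.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("spectrograms must have the same shape")

    max_diff = 0.0
    worst = (0, 0)
    total = 0.0
    count = 0
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            d = abs(x - y)
            total += d
            count += 1
            if d > max_diff:
                max_diff = d
                worst = (i, j)

    mean_diff = total / count if count else 0.0
    return max_diff, mean_diff, worst


if __name__ == "__main__":
    # Toy data standing in for the C++ and Python outputs.
    cpp_mel = [[0.0, 0.5], [1.0, -1.0]]
    py_mel = [[0.0, 0.5], [1.0, -0.5]]
    print(compare_spectrograms(cpp_mel, py_mel))
```

Reporting the position of the largest difference, not just its size, helps tell apart a systematic offset (for example a different window or filterbank) from a mismatch confined to the first or last frames, which would point at padding.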

This should be a good exercise for anyone looking to start contributing to the project, so feel free to open a PR or discuss your findings!
