Refactor inference processes & add new engines (FasterWhisper, vLLM) #141

Merged
merged 13 commits into from
Jul 23, 2025

Conversation

Collaborator

@ssh-meister ssh-meister commented Jul 6, 2025

Description

  1. Refactored inference-related processes into a separate group, mirroring the subgroup structure used in the NeMo repository (subgroup -> task, e.g., "asr", "nlp", etc.).
  2. Within each subgroup, the processors are further organized by the type of engine required to run them.

New processors added:

  1. FasterWhisperInference — based on SYSTRAN/faster-whisper
  2. vLLMInference — based on vllm-project/vllm

New post-processing processors:

  1. DetectWhisperHallucinationFeatures
  2. CleanQwenGeneration

Misc:

  1. Fixed docs build issues from Portuguese #77

Signed-off-by: Sasha Meister <[email protected]>
Determine if generation should be replaced with reference text based on
CER and uppercase ratio.
"""
chars = generation.replace(' ', '')
Collaborator

Why do we need chars here? Is it necessary to remove the spaces?

Collaborator Author

@lilithgrigoryan, thanks for the review!
This processor is used to select either the original text or a Qwen generation with restored punctuation.
One of the selection criteria is that if the model over-capitalizes the text (above a specified upper_case_threshold), we consider the generation poor.
To check this, we look only at non-space characters to compute the ratio of capital to lowercase letters.
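
The selection rule described in this reply can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names, the default threshold values, the strict `>` comparisons, the case-insensitive CER, and the choice to divide uppercase characters by all non-space characters are assumptions made for the example. Only upper_case_threshold and the space-stripping step come from the discussion above.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def uppercase_ratio(text: str) -> float:
    """Share of uppercase characters among non-space characters (assumed definition)."""
    chars = text.replace(' ', '')  # spaces are neither upper nor lower case
    if not chars:
        return 0.0
    return sum(c.isupper() for c in chars) / len(chars)


def should_use_reference(generation: str,
                         reference: str,
                         cer_threshold: float = 0.3,          # hypothetical default
                         upper_case_threshold: float = 0.5):  # hypothetical default
    """Return True when the generation looks poor and the reference text should be kept."""
    # Assumption: CER is computed case-insensitively against the reference.
    cer = edit_distance(generation.lower(), reference.lower()) / max(len(reference), 1)
    if cer > cer_threshold:
        return True  # generation drifted too far from the original text
    return uppercase_ratio(generation) > upper_case_threshold  # over-capitalized output


if __name__ == "__main__":
    print(should_use_reference("Hello, world.", "hello world"))
    print(should_use_reference("HELLO WORLD", "hello world"))
```

Stripping spaces before counting keeps the ratio from being diluted by whitespace, which is the point raised in the review question: a long generation with many short words would otherwise look less capitalized than it is.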

@ssh-meister ssh-meister merged commit 93cfc46 into main Jul 23, 2025
10 checks passed
@ssh-meister ssh-meister deleted the Inference branch July 23, 2025 10:04
2 participants