Background:
I've noticed that when processing audio files containing silent or non-speech segments, Whisper tends to hallucinate text. The hallucinations appear not only in the silent or non-speech segments themselves but also seem to degrade the transcription of the normal speech that follows.
Inquiry:
Given that this is an inherent issue with Whisper, would it be feasible to incorporate a VAD (voice activity detection) step, or a similar strategy, into Whisper-turbo? I am aware of projects such as WhisperX that take this kind of approach, and it seems to mitigate the problem effectively.
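To illustrate the kind of preprocessing I have in mind, here is a rough sketch only. It uses a plain RMS energy threshold as a stand-in for a real VAD model, and the function name `speech_regions`, the 30 ms frame size, and the 0.01 threshold are all my own assumptions, not anything from Whisper or WhisperX. The idea is to find the voiced spans first and pass only those to the transcriber.

```python
import math

def speech_regions(samples, sample_rate=16000, frame_ms=30, threshold=0.01):
    """Return (start_sec, end_sec) spans whose frame RMS energy meets the threshold.

    `samples` is a sequence of floats in [-1.0, 1.0]. This is a simple energy
    gate for illustration, not a trained voice activity detector.
    """
    frame_len = int(sample_rate * frame_ms / 1000)

    # Mark each full, non-overlapping frame as voiced or not.
    flags = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        flags.append(rms >= threshold)

    # Merge consecutive voiced frames into (start_frame, end_frame) spans.
    regions = []
    start = None
    for idx, voiced in enumerate(flags):
        if voiced and start is None:
            start = idx
        elif not voiced and start is not None:
            regions.append((start, idx))
            start = None
    if start is not None:
        regions.append((start, len(flags)))

    # Convert frame indices to seconds.
    frame_sec = frame_len / sample_rate
    return [(s * frame_sec, e * frame_sec) for s, e in regions]
```

Each returned span could then be cut out of the audio and transcribed on its own, so the model never sees the long silent stretches. An energy gate like this would not reject music or noise, which is why a proper VAD would be needed in practice.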
Thank you for your time and the incredible work on this project.