openai/
Whisper is a set of multilingual, robust speech recognition models trained by OpenAI that achieve state-of-the-art results in many languages. Whisper models were trained to predict approximate timestamps for speech segments (usually to within about one second), but they cannot natively predict word-level timestamps. This version adds word-timestamp prediction and provides a more accurate estimation of speech segments when transcribing with Whisper models.
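To illustrate the difference, here is a hypothetical sketch of segment-level versus word-level timestamps for a short utterance. The field names and overall shape are assumptions for illustration only, not the documented response schema.

```python
# Hypothetical illustration only: the field names ("text", "start", "end") and
# the overall shape are assumptions, not the documented output of this endpoint.

# Segment-level timestamps: one time range per recognized speech segment.
segment_level = [
    {"text": "hello world", "start": 0.0, "end": 1.2},
]

# Word-level timestamps: one time range per word within the segment.
word_level = [
    {"text": "hello", "start": 0.0, "end": 0.5},
    {"text": "world", "start": 0.6, "end": 1.2},
]
```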
Task
The task to perform (Whisper supports transcription and translation to English).
Initial Prompt
Optional text to provide as a prompt for the first window. (Default: empty)
Temperature
Temperature to use for sampling. (Default: 0)
Language
Language of the audio as a two-letter ISO 639-1 code (e.g. en, de, ja); if not set, the detected language is used.
Chunk Level
Chunk level of the returned timestamps, either 'segment' or 'word'.
Chunk Length S
Chunk length in seconds used to split the audio. (Default: 30; 1 ≤ chunk_length_s ≤ 30)
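Below is a minimal sketch of submitting an audio file with the parameters above over HTTP. It assumes the generic Deep Infra inference endpoint pattern and a placeholder model path; the exact model name, multipart field names, and response shape are not shown on this page and should be verified against the API reference.

```python
# A minimal sketch, assuming the generic Deep Infra inference endpoint pattern.
# The model path, multipart field names, and response shape are assumptions,
# not taken from this page -- verify them against the official API reference.
import requests

MODEL = "openai/whisper-large"  # placeholder: the exact model name is truncated on this page
API_URL = f"https://api.deepinfra.com/v1/inference/{MODEL}"
API_TOKEN = "YOUR_DEEPINFRA_TOKEN"  # replace with your API token

with open("speech.wav", "rb") as audio_file:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"bearer {API_TOKEN}"},
        files={"audio": audio_file},
        data={
            "task": "transcribe",     # task to perform
            "initial_prompt": "",     # optional prompt for the first window
            "temperature": 0,         # sampling temperature (default 0)
            "language": "en",         # ISO 639-1 code; omit to auto-detect
            "chunk_level": "word",    # 'segment' or 'word' timestamps
            "chunk_length_s": 30,     # 1-30 seconds per chunk
        },
        timeout=300,
    )

response.raise_for_status()
print(response.json())
```

With chunk_level set to 'word', each returned chunk should carry its own per-word start and end time rather than one range per segment.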