- transcribe: (default) task, transcribes the uploaded file.
- word_timestamps parameter: enables word-level timestamps.
- vad_filter parameter: enables voice activity detection filtering (only with Faster Whisper for now).

| Name | Values | Description |
|---|---|---|
| audio_file | File | Audio or video file to transcribe |
| output | text (default), json, vtt, srt, tsv | Output format |
| task | transcribe, translate | Task type - transcribe in source language or translate to English |
| language | en (default is auto recognition) | Source language code (see supported languages) |
| word_timestamps | false (default) | Enable word-level timestamps (Faster Whisper only) |
| vad_filter | false (default) | Enable voice activity detection filtering (Faster Whisper only) |
| encode | true (default) | Encode audio through FFmpeg before processing |
| diarize | false (default) | Enable speaker diarization (WhisperX only) |
| min_speakers | null (default) | Minimum number of speakers for diarization (WhisperX only) |
| max_speakers | null (default) | Maximum number of speakers for diarization (WhisperX only) |
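
As a rough illustration of how the parameters in the table might be combined into one request, the sketch below builds a request URL from keyword arguments. It assumes the options are passed as query-string parameters and that `audio_file` is uploaded separately as a multipart form field. The host `http://localhost:9000`, the `/asr` path, and the helper name `build_asr_url` are assumptions made for this example, not taken from the table.

```python
from urllib.parse import urlencode


def build_asr_url(base_url, **params):
    """Build a request URL from the table's parameters (hypothetical helper).

    Booleans are written as lowercase "true"/"false" and parameters set to
    None are left out, so the service falls back to its own defaults.
    """
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # omit unset parameters entirely
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = value
    return f"{base_url}?{urlencode(query)}"


# Assumed endpoint; adjust host, port, and path to your deployment.
url = build_asr_url(
    "http://localhost:9000/asr",
    task="transcribe",
    language="en",
    output="json",
    word_timestamps=True,  # Faster Whisper only
    vad_filter=True,       # Faster Whisper only
    encode=True,
)
print(url)
```

The resulting URL could then be used with any HTTP client that supports multipart uploads, for example `requests.post(url, files={"audio_file": f})` with the third-party `requests` package. Leaving `diarize`, `min_speakers`, and `max_speakers` unset keeps the WhisperX-only options out of the request.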