AudioTranscribeWhisperSettings interface
Settings for an audio transcription operation using the Whisper SDK (whisper.cpp). See: NorskTransform.audioTranscribeWhisper()
Signature:
export interface AudioTranscribeWhisperSettings extends ProcessorNodeSettings<AudioTranscribeWhisperNode>
Properties
| Property | Type | Description |
|---|---|---|
| | string | (Optional) Initial prompt to prime the model; this is in addition to prompting based on past transcription history. |
| | number | (Optional) Duration of audio to keep when clearing the buffer, to allow for partial-word recognition. Default: 400 ms. |
| | string | (Optional) Language setting for the Whisper model. Leave unset to auto-detect (requires a multi-language model). |
| | number | (Optional) Maximum number of tokens per segment. |
| | string | The file name of the GGML-format Whisper model. Information: https://github.com/ggerganov/whisper.cpp/blob/master/models/README.md. Model downloads: https://huggingface.co/ggerganov/whisper.cpp/tree/main |
| | boolean | (Optional) |
| | number | (Optional) Number of threads to use. Note that using a large number of threads rarely improves performance. |
| | number | Stream ID of the output subtitles. |
| | WhisperSamplingStrategy | (Optional) Greedy (default) or beam sampling strategy. |
| | boolean | (Optional) Experimental: speed up the audio by 2x using a phase vocoder. Can significantly reduce the quality of the output. |
| | number | (Optional) Duration of audio accumulated before each transcription step. Decreasing this value reduces latency but also reduces performance. Visualiser metrics are available to monitor the duration of each "step" operation; if a step is not clearly faster than the audio duration it covers, real-time output will not be attained and the workflow will back up. Default: 3000 ms. |
| | boolean | (Optional) Whether to suppress non-speech tokens. |
| | boolean | (Optional) Enable tiny-diarize, if the given model supports it. |
| | boolean | (Optional) Whether to translate non-English input to English, or leave the transcription in the source language. |
| | boolean | (Optional) Use the GPU if available. Where a GPU is available, enabling this may not increase performance; it may simply move load from the CPU to the GPU. |
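
The step-duration guidance above can be turned into a small check. The following is a minimal sketch, not the SDK's own API: `WhisperSettingsSketch` is a hypothetical local type, and its property names (`model`, `outputStreamId`, `stepMs`, `keepMs`, and so on) are assumptions chosen for illustration, so check the SDK typings for the real ones. The model file name and the 0.8 headroom factor are also assumptions. Only the two defaults (3000 ms step, 400 ms keep) come from the descriptions above.

```typescript
// Hypothetical local mirror of a subset of the settings described above.
// Property names are assumptions for illustration, not the SDK's typings.
interface WhisperSettingsSketch {
  model: string;          // file name of the GGML-format Whisper model (required)
  outputStreamId: number; // stream ID of the output subtitles (required)
  language?: string;      // leave unset to auto-detect with a multi-language model
  stepMs?: number;        // audio accumulated before each transcription step
  keepMs?: number;        // audio kept when the buffer is cleared
  numThreads?: number;    // a large thread count rarely helps
  translate?: boolean;    // translate non-English input to English
  useGpu?: boolean;       // may only move load from CPU to GPU
}

// Defaults as documented in the property descriptions.
const DEFAULT_STEP_MS = 3000;
const DEFAULT_KEEP_MS = 400;

// Assumed safety margin: treat a step as "clearly faster" than real time
// only if it uses at most 80% of the audio duration it covers.
const HEADROOM = 0.8;

// Fill in the documented defaults for the two duration settings.
function withDefaults(
  s: WhisperSettingsSketch
): WhisperSettingsSketch & { stepMs: number; keepMs: number } {
  return {
    ...s,
    stepMs: s.stepMs ?? DEFAULT_STEP_MS,
    keepMs: s.keepMs ?? DEFAULT_KEEP_MS,
  };
}

// Returns true when a measured per-step processing time (for example, read
// from the visualiser metrics) leaves enough headroom for real-time output.
function keepsUpWithRealTime(
  settings: WhisperSettingsSketch,
  measuredStepProcessingMs: number
): boolean {
  const { stepMs } = withDefaults(settings);
  return measuredStepProcessingMs < stepMs * HEADROOM;
}

// Example settings; the model file name is illustrative.
const exampleSettings: WhisperSettingsSketch = {
  model: "ggml-base.en.bin",
  outputStreamId: 1,
  language: "en",
  numThreads: 4,
};
```

With the default 3000 ms step, `keepsUpWithRealTime(exampleSettings, 1200)` passes the check, while a measured step time close to 3000 ms does not. Lowering the step duration tightens the budget proportionally, which is the latency versus throughput trade-off the description warns about.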