Tunable Parameters and Endpointing Constants
The Talkscriber API exposes several tunable parameters for Voice Activity Detection (VAD), along with a set of endpointing constants. Adjusting them lets you optimize performance and accuracy for a specific use case.
Voice Activity Detection (VAD) Parameters
- threshold: Sensitivity of the VAD (default: 0.55)
- min_speech_duration_ms: Minimum duration of speech in milliseconds (default: 250)
- max_speech_duration_s: Maximum duration of speech in seconds (default: infinity)
- min_silence_duration_ms: Minimum duration of silence in milliseconds (default: 1000)
- window_size_samples: Size of the VAD analysis window in samples (default: 512)
- speech_pad_ms: Padding around detected speech segments in milliseconds (default: 500)
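As a rough illustration of how these defaults might be handled on the client side, the sketch below collects the VAD parameters listed above in a Python dataclass and returns only the values that differ from the documented defaults. The names `VadParams` and `vad_overrides` are hypothetical and not part of the Talkscriber API, the range checks are assumptions (for example, that `threshold` is a probability between 0 and 1), and how the resulting values are passed to the service is not covered here.

```python
import math
from dataclasses import asdict, dataclass


@dataclass
class VadParams:
    """Hypothetical container; field names and defaults come from the list above."""
    threshold: float = 0.55
    min_speech_duration_ms: int = 250
    max_speech_duration_s: float = math.inf
    min_silence_duration_ms: int = 1000
    window_size_samples: int = 512
    speech_pad_ms: int = 500

    def validate(self) -> None:
        # Assumption: the threshold behaves like a probability in [0, 1].
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        # Assumption: durations and padding cannot be negative.
        for name in ("min_speech_duration_ms", "min_silence_duration_ms", "speech_pad_ms"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must not be negative")
        # Assumption: the maximum duration and window size must be positive.
        if self.max_speech_duration_s <= 0:
            raise ValueError("max_speech_duration_s must be positive")
        if self.window_size_samples <= 0:
            raise ValueError("window_size_samples must be positive")


def vad_overrides(params: VadParams) -> dict:
    """Validate params and return only the fields that differ from the defaults."""
    params.validate()
    defaults = asdict(VadParams())
    return {key: value for key, value in asdict(params).items() if value != defaults[key]}


# Example: shorten the silence needed to close a segment and raise the threshold.
tuned = VadParams(threshold=0.7, min_silence_duration_ms=600)
print(vad_overrides(tuned))
```

Returning only the changed values keeps the documented defaults in one place and makes it easy to see which settings a deployment actually tuned.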
Endpointing Constants
- eos_threshold: End-of-speech detection threshold (default: 0.3)
- split_threshold: Threshold for splitting audio segments (default: 14)
- max_gap: Maximum gap between speech segments in seconds (default: 1)
- hullicination_threshold: Threshold to filter out hallucinations (default: -2.5)
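The endpointing constants above can be kept in a single mapping and merged with per-deployment overrides. The following is a minimal sketch under stated assumptions: `ENDPOINTING_DEFAULTS` and `build_endpointing` are hypothetical names rather than Talkscriber API calls, and the keys use the spelling documented above, including `hullicination_threshold`. Rejecting unknown keys is a design choice that catches near-miss spellings before they are silently ignored.

```python
# Defaults copied from the list above; key names use the documented spelling.
ENDPOINTING_DEFAULTS = {
    "eos_threshold": 0.3,
    "split_threshold": 14,
    "max_gap": 1,
    "hullicination_threshold": -2.5,
}


def build_endpointing(overrides=None):
    """Merge overrides into the documented defaults, rejecting unknown keys."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(ENDPOINTING_DEFAULTS))
    if unknown:
        raise KeyError("unknown endpointing constant(s): " + ", ".join(unknown))
    # Later entries win, so overrides replace the defaults they name.
    return {**ENDPOINTING_DEFAULTS, **overrides}


# Example: allow a longer gap between speech segments, keep everything else.
settings = build_endpointing({"max_gap": 2})
print(settings)
```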
For customer-specific calibration, share test data with Talkscriber to fine-tune the parameters for