Tunable Parameters and Endpointing Constants
The Talkscriber API exposes tunable parameters for Voice Activity Detection (VAD), along with a set of endpointing constants. Adjusting them lets you trade off performance and accuracy for a specific use case.
Voice Activity Detection (VAD) Parameters
- threshold: Sensitivity of the VAD (default: 0.55)
- min_speech_duration_ms: Minimum duration of speech in milliseconds (default: 250)
- max_speech_duration_s: Maximum duration of speech in seconds (default: infinity)
- min_silence_duration_ms: Minimum duration of silence in milliseconds (default: 1000)
- window_size_samples: Size of the window for VAD analysis (default: 512)
- speech_pad_ms: Padding around detected speech segments in milliseconds (default: 500)
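To keep the units of these parameters straight (milliseconds, seconds, and samples are all in play), it can help to hold them in one typed object. The following is a minimal sketch only: the `VadParams` class and its `window_duration_ms` helper are hypothetical names, not part of the Talkscriber API, and the 16 kHz sample rate is an assumption used purely for the arithmetic. The field names and defaults are the ones listed above.

```python
import math
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VadParams:
    """Hypothetical container for the VAD parameters listed above."""

    threshold: float = 0.55
    min_speech_duration_ms: int = 250
    max_speech_duration_s: float = math.inf
    min_silence_duration_ms: int = 1000
    window_size_samples: int = 512
    speech_pad_ms: int = 500

    def __post_init__(self) -> None:
        # Durations are lengths of time, so negative values make no sense.
        for name in ("min_speech_duration_ms", "min_silence_duration_ms",
                     "speech_pad_ms", "max_speech_duration_s"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")
        if self.window_size_samples <= 0:
            raise ValueError("window_size_samples must be positive")

    def window_duration_ms(self, sample_rate_hz: int = 16000) -> float:
        # Convert the analysis window from samples to milliseconds.
        # The 16 kHz default is an assumption, not a documented value.
        return self.window_size_samples * 1000 / sample_rate_hz


# Override only what differs from the documented defaults.
params = VadParams(min_silence_duration_ms=600)
print(asdict(params))
print(params.window_duration_ms())  # 512 samples at 16 kHz is 32.0 ms
```

How these values are passed to the service is not covered in this section, so the sketch stops at building and validating the object.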
Endpointing Constants
- eos_threshold: End-of-speech detection threshold (default: 0.3)
- split_threshold: Threshold for splitting audio segments (default: 14)
- max_gap: Maximum gap between speech segments in seconds (default: 1)
- hullicination_threshold: Threshold to filter out hallucinations (default: -2.5)
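When only one or two constants need changing, a small merge helper avoids silently sending a misspelled key. This is a sketch under stated assumptions: `ENDPOINTING_DEFAULTS` and `with_overrides` are hypothetical names introduced here for illustration, and the keys are copied exactly as they are spelled in the list above.

```python
# Documented defaults, with key names spelled exactly as in the list above.
ENDPOINTING_DEFAULTS = {
    "eos_threshold": 0.3,
    "split_threshold": 14,
    "max_gap": 1,
    "hullicination_threshold": -2.5,
}


def with_overrides(overrides: dict) -> dict:
    """Return the defaults with `overrides` applied (hypothetical helper).

    Unknown keys raise KeyError instead of being passed through, so a
    typo in a parameter name is caught before any request is made.
    """
    unknown = set(overrides) - set(ENDPOINTING_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown endpointing constants: {sorted(unknown)}")
    return {**ENDPOINTING_DEFAULTS, **overrides}


# Change max_gap; every other constant keeps its documented default.
constants = with_overrides({"max_gap": 2})
print(constants)
```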
Smart Turn Detection Parameters
For advanced turn detection using machine learning, see the Smart Turn Detection documentation, which covers:
- enable_turn_detection: Enable/disable ML-based turn detection (default: false)
- turn_detection_timeout: Fallback timeout threshold in seconds (default: 0.6)
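One plausible reading of a "fallback timeout" is sketched below: the ML model's verdict is used when it signals a completed turn, and otherwise the turn ends once silence has lasted at least `turn_detection_timeout` seconds. This interpretation is an assumption, not documented behavior; `TurnDetectionSettings` and `turn_has_ended` are hypothetical names, and the Smart Turn Detection documentation remains the authority on the actual semantics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TurnDetectionSettings:
    """Hypothetical holder for the two parameters listed above."""

    enable_turn_detection: bool = False
    turn_detection_timeout: float = 0.6  # seconds


def turn_has_ended(settings: TurnDetectionSettings,
                   silence_s: float,
                   model_says_complete: Optional[bool]) -> bool:
    """Illustrative decision rule (an assumption, not the real service logic).

    `model_says_complete` is None when the model has given no verdict.
    """
    if not settings.enable_turn_detection:
        # With ML turn detection off, this helper never signals an end of
        # turn; endpointing is left to the other parameters.
        return False
    if model_says_complete:
        return True
    # Fallback: end the turn once silence reaches the timeout.
    return silence_s >= settings.turn_detection_timeout


settings = TurnDetectionSettings(enable_turn_detection=True)
print(turn_has_ended(settings, silence_s=0.2, model_says_complete=False))
print(turn_has_ended(settings, silence_s=0.8, model_says_complete=None))
```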
These customer-specific calibrations can be addressed by sharing test data with Talkscriber to fine-tune the parameters for