Smart Turn Detection
The Smart Turn Detection system uses a machine learning model to determine when a speaker has finished their turn in a conversation. It improves the accuracy of endpoint detection over time-based thresholds alone.
Configuration Parameters
enable_turn_detection
- Type: Boolean
- Default: false
- Description: Enables or disables the smart turn detection feature.
When enabled:
- The system uses an ML model to predict turn completion, in addition to the standard timeout-based method.
When disabled:
- The system relies only on turn_detection_timeout (in seconds).
- Example: if turn_detection_timeout = 0.6, a 0.6-second pause is treated as the end of the turn.
turn_detection_timeout
- Type: Float (seconds)
- Default: 0.6
- Description: The timeout threshold for end-of-speech detection. This value is stored as eos_threshold_seconds and used as a fallback when smart turn detection is disabled or when the ML model confidence is low.
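The interaction between the two parameters can be illustrated with a minimal sketch. It is not the system's actual implementation: the names TurnConfig, is_end_of_turn, and completion_probability are hypothetical, and treating any probability at or below 0.95 (the value listed under Prediction Interpretation) as "low confidence" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnConfig:
    """Hypothetical container for the two documented parameters."""
    enable_turn_detection: bool = False   # documented default
    turn_detection_timeout: float = 0.6   # documented default, in seconds
    # 0.95 comes from "Prediction Interpretation"; using it as the only
    # confidence cut-off is an assumption of this sketch.
    high_confidence: float = 0.95


def is_end_of_turn(
    config: TurnConfig,
    silence_seconds: float,
    completion_probability: Optional[float] = None,
) -> bool:
    """Decide whether the current pause ends the speaker's turn."""
    # Smart path: a confident model prediction triggers EOS immediately.
    if (
        config.enable_turn_detection
        and completion_probability is not None
        and completion_probability > config.high_confidence
    ):
        return True
    # Fallback path: feature disabled, no prediction, or low confidence.
    return silence_seconds >= config.turn_detection_timeout
```

With the feature disabled, only the length of the pause matters. With it enabled, a confident prediction can end the turn before the timeout elapses, and the timeout still applies when the prediction is not confident.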
2. Server Options
options = {
    "enable_turn_detection": True,
    "turn_detection_timeout": 0.6,
}
3. Environment Variables
EOS_THRESHOLD_SECONDS: Default timeout value in seconds (default: 0.6)
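One way the server option and the environment variable could be combined is sketched below. The helper resolve_eos_threshold is hypothetical, and the precedence it applies (an explicit option wins over the environment variable, which wins over the built-in default) is an assumption; this page does not state which source takes priority.

```python
import os
from typing import Mapping, Optional

DEFAULT_EOS_THRESHOLD_SECONDS = 0.6  # documented default


def resolve_eos_threshold(
    options: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> float:
    """Return the timeout in seconds (hypothetical helper).

    Assumed precedence: options, then EOS_THRESHOLD_SECONDS, then default.
    """
    env = os.environ if env is None else env
    # An explicit server option is assumed to take priority.
    if "turn_detection_timeout" in options:
        return float(options["turn_detection_timeout"])
    # Otherwise fall back to the environment variable, if it is set.
    raw = env.get("EOS_THRESHOLD_SECONDS")
    if raw is None or raw.strip() == "":
        return DEFAULT_EOS_THRESHOLD_SECONDS
    return float(raw)
```

Passing env explicitly keeps the helper easy to test; in normal use it reads from os.environ.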
Benefits
- Improved Accuracy: ML-based detection is more accurate than time-based thresholds alone
- Context Awareness: The model understands speech patterns and context
- Reduced False Positives: Better at distinguishing between natural pauses and actual turn completion
- Adaptive: Works across different speakers, languages, and speaking styles
- Fallback Safety: Traditional timeout still works as a safety net
Prediction Interpretation
- Probability > 0.95: Very high confidence - immediate EOS trigger
Best Practices
- Enable for Conversational AI: Most beneficial for interactive applications
- Tune Timeout: Adjust turn_detection_timeout based on your use case
- Test with Real Data: Validate performance with your specific audio characteristics
- Fallback Strategy: Always keep timeout as a safety net