We’re happy to kick off the New Year with a new Smart Turn release that brings two key improvements to the responsiveness of your AI voice agents.
Smart Turn is an open-source turn detection model that listens to raw audio and determines when a user has finished speaking. With Smart Turn, an AI voice agent can tell precisely when to respond, without interrupting the user or waiting unnecessarily.
As usual, all parts of the model are open: the weights, the datasets, and the training code.
What's new in v3.2
Short utterances
We’ve significantly improved the model’s handling of short utterances, for example single words like “yes” or “okay”. These samples are now miscategorized 40% less often according to our public benchmarks.
Two changes made this possible: first, a new dataset of short utterances, which we plan to expand over time; and second, a fix for a community-reported padding issue during training that was reducing accuracy.
Background noise
Smart Turn v3.2 is more robust to background ambience, thanks to the addition of realistic cafe/office noise to our training and testing datasets. As a result, the model performs better in real-world scenarios where the user’s audio isn’t studio-quality.
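To give a feel for what adding background noise to a speech sample can look like, here is a minimal sketch that mixes a noise clip into a speech clip at a chosen signal-to-noise ratio. This is an illustration only, not the code used to build the Smart Turn datasets: the function name mix_at_snr and the choice of SNR-based scaling are assumptions, and the actual pipeline lives in the training repo.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` so the result has roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; not part of Smart Turn or Pipecat.
    Both inputs are 1-D arrays of mono samples at the same sample rate.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if speech.size == 0 or noise.size == 0:
        return speech.copy()

    # Loop the noise if it is shorter than the speech, then trim to length.
    if noise.size < speech.size:
        repeats = int(np.ceil(speech.size / noise.size))
        noise = np.tile(noise, repeats)
    noise = noise[: speech.size]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        # Nothing meaningful to scale against; return the speech unchanged.
        return speech.copy()

    # SNR(dB) = 10 * log10(speech_power / noise_power), solved for the
    # noise power we want, then converted to an amplitude scale factor.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise
```

Lower SNR values make the background louder relative to the speaker. Note that the mixed signal can exceed the original amplitude range, so a real pipeline would likely also need to handle clipping before writing audio files.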
Usage
The new version is a drop-in replacement for v3.1, and as before, we’re shipping the model in 8 MB (CPU) and 32 MB (GPU) variants. The weights are available now on Hugging Face:
https://huggingface.co/pipecat-ai/smart-turn-v3/tree/main
As with v3.1, we’ll bundle the weights with the next Pipecat release for use with LocalSmartTurnAnalyzerV3. You can also use v3.2 with Pipecat right now by setting the smart_turn_model_path parameter in the LocalSmartTurnAnalyzerV3 constructor.
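As a rough sketch of that second option, the snippet below passes a local weights file to the analyzer through smart_turn_model_path. The import path reflects our understanding of the current Pipecat package layout and may differ between versions, build_analyzer is a hypothetical helper, and the weights filename in the usage note is a placeholder, so check the Hugging Face repo for the actual file you downloaded.

```python
from pathlib import Path


def build_analyzer(model_path: str):
    """Create a LocalSmartTurnAnalyzerV3 that loads weights from `model_path`.

    Hypothetical convenience wrapper; only `smart_turn_model_path` comes from
    the release notes.
    """
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"Smart Turn weights not found: {path}")

    # Imported lazily so a bad path is reported before Pipecat is loaded.
    # Assumed module path; verify it against your installed Pipecat version.
    from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import (
        LocalSmartTurnAnalyzerV3,
    )

    return LocalSmartTurnAnalyzerV3(smart_turn_model_path=str(path))
```

For example, build_analyzer("models/smart-turn-v3.2-cpu.onnx") would return an analyzer backed by the v3.2 weights, assuming a file with that placeholder name exists locally. The returned object can then be used wherever you already pass a LocalSmartTurnAnalyzerV3 in your pipeline.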
More information and benchmarks
For more details on how the model was trained, including our full training code, please see our GitHub repo:
https://github.com/pipecat-ai/smart-turn
We’ve released two new datasets, one used to train this release and one used to test it.
For accuracy benchmarks with the new test dataset, please see the following link:
https://huggingface.co/pipecat-ai/smart-turn-v3/tree/main/benchmarks
Stay in touch
We hope you enjoy the new model! If you have questions about Smart Turn or run into any issues, feel free to join our Discord server, or open a ticket on GitHub.