
Keep your voice agents focused on the next step by creating predefined conversation paths and dynamically generated flows with Pipecat Flows.
Daily’s modern, ergonomic APIs and high-level building blocks help you build next-generation social and gaming experiences.
Deliver real-time video and audio at the highest possible quality on infrastructure that scales horizontally and geographically. Media servers in 10 geographic regions and 30 availability zones deliver a "first hop" network latency of 13ms or less for 5 billion people.
Get full control over which audio and video tracks each participant sends or receives. Daily’s track subscription API allows you to manage call performance in busy rooms and build features like breakout rooms.
Daily’s integrated messaging layer facilitates real-time data exchange between clients, empowering dynamic, interactive UI experiences.
Build spatial audio experiences. Selectively subscribe to tracks, adjust volume levels based on proximity, and integrate audio into 3D worlds.
Build custom workflows and control camera, mic, and screen sharing with Daily’s roles and permissions APIs.
Leverage the most comprehensive suite of support tools, low-level metrics, logging capabilities, and data integrations with enterprise BI platforms.
With excellent docs, sample code, and a dedicated support team, Daily helps you build better apps in less time.
Direct access to multiple camera devices and video/audio tracks enables custom pre- and post-processing, augmented reality, and AI features.
Build worlds without limits. 100,000 active participants, real-time chat, flexible track subscriptions.
Daily’s SDKs give you CPU load metrics (even on the web) so you can build apps that adapt smoothly to all devices.

Pipecat Cloud, now generally available, is a managed, vendor-neutral platform for deploying and scaling open source voice agents with ultra-low latency, multi-region support, and enterprise-grade realtime infrastructure.

Smart Turn v3.2 improves AI voice agent turn detection with 40% better accuracy for short utterances and enhanced background noise handling. The model is open source, with full weights, datasets, and training code available on Hugging Face.