Learn about WebSocket support for our Text-to-Speech (TTS) API, how it works, and when to use it.
Our Text-to-Speech (TTS) API supports WebSocket communication, providing a real-time, low-latency streaming experience for applications that require instant speech synthesis. WebSockets allow continuous data exchange, making them ideal for use cases that demand uninterrupted audio generation.
WebSockets suit applications that need real-time speech synthesis because they avoid the connection setup delay that each traditional HTTP request incurs.
For voice assistants, chatbots, and live transcription services, WebSockets help keep audio playback smooth and response times low.
A persistent WebSocket connection reduces the need for repeated request-response cycles, significantly improving performance for applications requiring rapid audio generation.
By default, the WebSocket connection enforces a 20-second inactivity timeout. This means that if the client does not send any data within 20 seconds, the server will automatically close the connection to free up resources.
To support longer sessions for use cases where clients need more time (e.g., long pauses between messages), the timeout can be extended up to 60 seconds.
To change the timeout, include the timeout parameter in the WebSocket URL. For example, a value of 60 sets the inactivity timeout to 60 seconds. Valid values range from 20 (default) to 60 seconds.
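As a minimal sketch of setting this parameter from client code, the Python helper below adds or replaces a timeout value on a WebSocket URL and rejects values outside the documented 20 to 60 second range. The endpoint wss://api.example.com/tts/ws and the helper name with_timeout are placeholders invented for illustration, and passing the parameter in the query string is an assumption; check the WebSocket API documentation for the real URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MIN_TIMEOUT_S = 20  # default inactivity timeout, per the text above
MAX_TIMEOUT_S = 60  # maximum allowed inactivity timeout, per the text above


def with_timeout(ws_url: str, timeout_s: int = MIN_TIMEOUT_S) -> str:
    """Return ws_url with its timeout query parameter added or replaced."""
    if not MIN_TIMEOUT_S <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT_S} and {MAX_TIMEOUT_S} seconds"
        )
    parts = urlsplit(ws_url)
    # Keep every existing query parameter except a previous timeout value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "timeout"
    ]
    query.append(("timeout", str(timeout_s)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder endpoint, not the real API URL.
print(with_timeout("wss://api.example.com/tts/ws", 60))
```

Validating on the client gives an early, readable error. The server remains the authority on which values it accepts.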
The WebSocket TTS API is optimized to handle real-time text-to-speech conversions efficiently. A request proceeds as follows:
The client sends a WebSocket message containing the text to convert.
The API validates the request and retrieves the voice settings.
The text is split into chunks and processed in the background.
The client receives the responses over the same WebSocket connection.
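The four steps above can be illustrated with a small in-process simulation in Python. It is only a sketch under stated assumptions and not the API's actual message format or chunking logic: the JSON field names (text, voice_id, type, index), the VOICES table, and the sentence-based chunking rule are all invented for illustration, and no audio is synthesized.

```python
import json
import re
from typing import Iterator

# Hypothetical voice table; the real voice settings are held by the API.
VOICES = {"demo-voice": {"sample_rate": 24000}}


def split_into_chunks(text: str, max_chars: int = 80) -> list[str]:
    """Pack whole sentences into chunks of at most roughly max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def handle_message(raw: str) -> Iterator[str]:
    """Simulate the steps for one client message and yield JSON responses."""
    request = json.loads(raw)                    # step 1: client message arrives
    text = request.get("text", "")
    voice = VOICES.get(request.get("voice_id"))  # step 2: validate, look up voice
    if not text or voice is None:
        yield json.dumps({"type": "error", "message": "invalid request"})
        return
    for index, chunk in enumerate(split_into_chunks(text)):  # step 3: chunking
        # A real server would synthesize audio for the chunk here.
        yield json.dumps({"type": "chunk", "index": index, "text": chunk})  # step 4
    yield json.dumps({"type": "done"})


message = json.dumps({"text": "Hello there. How are you today?", "voice_id": "demo-voice"})
for response in handle_message(message):
    print(response)
```

Yielding one response per chunk mirrors the streaming idea: the client can start handling early chunks while later ones are still being processed.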
For implementation details, check our WebSocket API documentation.