ElevenLabs Voiceover - Text-to-Speech API

ElevenLabs integration

Powered by ElevenLabs technology, this API generates natural-sounding speech from text with customizable voice settings and multi-language support.

ElevenLabs Voiceover is a text-to-speech API that converts text into natural-sounding audio using AI voice synthesis. Generate professional voiceovers for videos, podcasts, presentations, e-learning content, and accessibility applications. The API supports up to 40,000 characters per request with the Turbo model and delivers high-quality audio output suitable for production use.

Key capabilities

AI model: eleven_turbo_v2_5, optimized for fast, high-quality generation (up to 40,000 characters per request)
Voice customization: Control stability (0-1), similarity boost (0-1), and speech speed (0.7-1.2x)
Speaker boost: Enhanced voice clarity and presence in generated audio
Multi-language support: UTF-8 text including accented letters and non-Latin scripts
Maximum text length: Up to 40,000 characters per request
Async processing: Webhook notifications or polling for task completion
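
Text longer than the per-request maximum has to be split before it is submitted. The following is a minimal sketch, assuming it is acceptable to split at sentence boundaries and create one task per chunk. `split_text` is a hypothetical helper and not part of the API, and it assumes the limit is counted in Python string characters.

```python
import re

MAX_CHARS = 40_000  # per-request maximum stated in this document


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split on whitespace that follows '.', '!' or '?'; punctuation is kept.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` of its own request, and the resulting audio files joined in order.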

Use cases

Video production: Create voiceovers for marketing videos, tutorials, and social media content
Podcast production: Generate intro/outro narration or read content aloud
E-learning: Convert educational materials to audio for accessibility and engagement
Accessibility: Provide audio versions of written content for visually impaired users
IVR systems: Generate professional voice prompts for phone systems
Audiobook creation: Convert written content into natural-sounding audio narration

Generate voiceover with ElevenLabs

Create voiceovers by submitting text to the API with a voice ID. The service returns a task ID for async polling or webhook notification.

POST /v1/ai/voiceover

Create a new voiceover generation task

GET /v1/ai/voiceover

List all voiceover tasks with status

GET /v1/ai/voiceover/{task-id}

Get task status and results by ID
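
As an illustration of the endpoints above, the sketch below builds the create and status requests with Python's standard library, without sending them. The host `https://api.example.com` and the `Authorization: Bearer` header are placeholders assumed for the example; this page does not state the real base URL or authentication scheme.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with the real host
VOICEOVER_PATH = "/v1/ai/voiceover"


def create_task_request(api_key: str, text: str, voice_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a voiceover task."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + VOICEOVER_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: auth scheme
        },
    )


def task_status_request(api_key: str, task_id: str) -> urllib.request.Request:
    """Build the GET request for one task's status and results."""
    return urllib.request.Request(
        f"{BASE_URL}{VOICEOVER_PATH}/{task_id}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: auth scheme
    )
```

Passing either object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the request easy to inspect and test.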

Parameters

Parameter	Type	Required	Default	Description
`text`	`string`	Yes	-	Text to convert to speech. UTF-8 supported, 1-40,000 characters
`voice_id`	`string`	Yes	-	ElevenLabs voice ID from the voice library
`model`	`string`	No	`"eleven_turbo_v2_5"`	AI model for speech synthesis
`stability`	`number`	No	`0.5`	Voice consistency: 0.0 (expressive) to 1.0 (stable)
`similarity_boost`	`number`	No	`0.2`	Voice matching: 0.0 (varied) to 1.0 (close match, may have artifacts)
`speed`	`number`	No	`1.0`	Speech rate: 0.7 (30% slower) to 1.2 (20% faster)
`use_speaker_boost`	`boolean`	No	`true`	Enable enhanced voice clarity and presence
`webhook_url`	`string`	No	-	URL for task completion notification
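
The ranges and defaults in the table can be checked on the client before a request is made. The sketch below is one way to do that; `build_payload` is a hypothetical helper, and it assumes the API accepts a flat JSON object keyed by the parameter names above and counts text length in string characters.

```python
from typing import Optional


def build_payload(
    text: str,
    voice_id: str,
    *,
    model: str = "eleven_turbo_v2_5",
    stability: float = 0.5,
    similarity_boost: float = 0.2,
    speed: float = 1.0,
    use_speaker_boost: bool = True,
    webhook_url: Optional[str] = None,
) -> dict:
    """Validate parameters against the documented ranges and return a request body."""
    if not 1 <= len(text) <= 40_000:
        raise ValueError("text must be 1-40,000 characters")
    if not voice_id:
        raise ValueError("voice_id is required")
    for name, value, low, high in (
        ("stability", stability, 0.0, 1.0),
        ("similarity_boost", similarity_boost, 0.0, 1.0),
        ("speed", speed, 0.7, 1.2),
    ):
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    payload = {
        "text": text,
        "voice_id": voice_id,
        "model": model,
        "stability": stability,
        "similarity_boost": similarity_boost,
        "speed": speed,
        "use_speaker_boost": use_speaker_boost,
    }
    # Optional field: only sent when the caller wants a completion notification.
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload
```

For example, `build_payload("Welcome to the course.", "21m00Tcm4TlvDq8ikWAM", speed=0.9)` returns a body with every default filled in, while `speed=1.5` raises `ValueError` before any network call is made.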

Frequently Asked Questions

What is ElevenLabs Voiceover and how does it work?

ElevenLabs Voiceover is a text-to-speech API that converts text into natural-sounding audio using AI voice synthesis. You submit text with a voice ID, receive a task ID immediately, then poll for results or receive a webhook notification when processing completes. The output is a high-quality audio file containing the synthesized speech.
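
The polling half of that flow could look roughly like the sketch below. The response shape is not described on this page, so the `status` field and the `COMPLETED` and `FAILED` values are assumptions, and `fetch_task` is a hypothetical callable that performs `GET /v1/ai/voiceover/{task-id}` and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    done_states: tuple = ("COMPLETED", "FAILED"),  # assumption: status names
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the task reaches a terminal state or the timeout passes."""
    deadline = clock() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in done_states:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` arguments exist only so the loop can be exercised without real waiting.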

What model is used for voiceover generation?

The API uses the eleven_turbo_v2_5 model, which is optimized for fast, high-quality speech synthesis. It supports up to 40,000 characters per request and suits both real-time applications and production use.

How do I find voice IDs for ElevenLabs voices?

Voice IDs are unique identifiers for each voice in the ElevenLabs voice library. You can find voice IDs in the ElevenLabs Voice Library. A common example voice ID is 21m00Tcm4TlvDq8ikWAM (Rachel - a calm, professional female voice).

What do the stability and similarity_boost parameters control?

stability (0-1) controls voice consistency: lower values produce more expressive, varied speech; higher values produce more consistent, stable output. similarity_boost (0-1) controls how closely the output matches the original voice sample: higher values match more closely but may introduce artifacts.

What languages does ElevenLabs Voiceover support?

ElevenLabs Voiceover supports multiple languages through UTF-8 text encoding, including accented letters (e.g., é, ñ, ü) and non-Latin scripts. The specific languages available depend on the voice selected; many voices support multiple languages natively.

How long does voiceover generation take?

Processing time depends on text length and model selection. The eleven_turbo_v2_5 model is optimized for speed, so shorter texts typically finish sooner than longer ones. For production workflows with longer texts, use webhooks instead of polling to receive completion notifications.
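
A webhook receiver for those notifications might be sketched as follows, using only the standard library. The notification format is not documented on this page, so the `task_id` and `status` field names are assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_notification(raw: bytes) -> tuple[str, str]:
    """Return (task_id, status) from a webhook body; field names are assumed."""
    data = json.loads(raw.decode("utf-8"))
    return str(data["task_id"]), str(data["status"])


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            task_id, status = parse_notification(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            self.send_response(400)  # malformed or unexpected body
            self.end_headers()
            return
        print(f"task {task_id} finished with status {status}")
        self.send_response(204)  # acknowledge quickly; do heavy work elsewhere
        self.end_headers()
```

Serving it locally is then `HTTPServer(("", 8000), WebhookHandler).serve_forever()` with `HTTPServer` imported from `http.server`; the URL of that server is what would be passed as `webhook_url`.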

What are the rate limits for voiceover generation?

Rate limits vary by subscription tier. See Rate Limits for current limits by tier. For high-volume production use, consider using webhooks for efficient task management.

Best practices

Voice selection: Choose a voice appropriate for your content and audience
Model: The eleven_turbo_v2_5 model provides fast, high-quality results for all use cases
Stability tuning: Start with 0.5, decrease for more expressive reads, increase for more consistent output
Speed adjustment: Use 0.7-0.9 for slower, clearer speech; 1.0 for a normal pace; 1.1-1.2 for faster narration
Text preparation: Use proper punctuation for natural pauses; avoid very long sentences
Production integration: Use webhooks instead of polling for scalable applications
Error handling: Implement retry logic with exponential backoff for 503 errors
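
The retry advice above can be sketched as follows. `send` is a hypothetical callable that performs one request and returns a `(status_code, body)` tuple; the attempt count, delays, and jitter are illustrative defaults, not values recommended by the API.

```python
import random
import time
from typing import Callable, Tuple


def send_with_backoff(
    send: Callable[[], Tuple[int, str]],
    *,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-503 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each attempt: 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    return status, body  # still 503 after the final attempt
```

Only 503 responses are retried here; other error codes are returned to the caller immediately, since retrying a rejected request is unlikely to help.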

Related APIs

Sound Effects: Generate sound effects from text descriptions
Audio Isolation: Extract specific sounds from audio or video files
Lip Sync: Synchronize lip movements with audio
OmniHuman 1.5: Generate human animations driven by audio
