Alibaba WAN Integration
WAN 2.5 is developed by Alibaba and delivers high-quality text-to-video generation with multiple resolution options and prompt expansion capabilities.
Key capabilities
- Multiple resolutions: Choose between 480p, 720p, or 1080p based on quality and speed requirements
- Flexible duration: Generate 5-second clips for fast iteration or 10-second videos for more developed action
- Prompt expansion: AI-powered prompt optimizer expands simple ideas into detailed video scripts
- Negative prompts: Exclude unwanted elements like blur, watermarks, or distortion from output
- Reproducible results: Reuse a seed value with identical parameters to regenerate a similar video
- Async processing: Webhook notifications or polling for task completion
- Maximum prompt length: 800 characters for main prompt, 500 characters for negative prompt
Resolution comparison
| Resolution | Best for | Processing speed | Output quality |
|---|---|---|---|
| 480p | Rapid prototyping, previews, mobile-first content | Fastest | Good |
| 720p | Social media, web content, balanced quality/speed | Medium | Better |
| 1080p | Marketing assets, professional content, high-detail scenes | Slowest | Best |
Use cases
- Social media content: Quick video clips for TikTok, Instagram Reels, and YouTube Shorts
- Marketing previews: Rapid concept visualization before full production
- Product demonstrations: Animated product showcases from text descriptions
- Educational content: Explainer videos and visual learning materials
- Creative exploration: Experimental motion and abstract visualizations
- Storyboarding: Visual previews for film and video pre-production
API operations
Generate videos by submitting a text prompt to the appropriate resolution endpoint. The service returns a task ID for async polling or webhook notification.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | - | Main description of the video including scene, characters, motion, camera moves, and style. Maximum 800 characters. |
| duration | string | No | "5" | Video length: "5" seconds (faster) or "10" seconds (more action) |
| negative_prompt | string | No | - | Elements to avoid in the output (e.g., “blurry, low quality, watermark”). Maximum 500 characters. |
| enable_prompt_expansion | boolean | No | true | AI optimizer expands shorter prompts into detailed scripts |
| seed | integer | No | Random | Seed for reproducibility (0 to 2147483647). Use same seed with identical parameters for similar results. |
| webhook_url | string | No | - | URL for async completion notifications |
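The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function name `build_payload` and the constant names are hypothetical, and only the parameter names, defaults, and limits come from the table.

```python
from typing import Any, Dict, Optional

# Limits taken from the parameter table; the constant names are illustrative.
MAX_PROMPT_CHARS = 800
MAX_NEGATIVE_PROMPT_CHARS = 500
MAX_SEED = 2147483647
ALLOWED_DURATIONS = ("5", "10")


def build_payload(
    prompt: str,
    duration: str = "5",
    negative_prompt: Optional[str] = None,
    enable_prompt_expansion: bool = True,
    seed: Optional[int] = None,
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments against the documented limits and return a request body.

    Hypothetical helper: it only builds a dict and does not call the API.
    """
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError('duration must be the string "5" or "10"')
    if negative_prompt is not None and len(negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
        raise ValueError(f"negative_prompt exceeds {MAX_NEGATIVE_PROMPT_CHARS} characters")
    if seed is not None and not (isinstance(seed, int) and 0 <= seed <= MAX_SEED):
        raise ValueError(f"seed must be an integer between 0 and {MAX_SEED}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "duration": duration,
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    # Optional fields are omitted when unset so the service applies its defaults.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload
```

Calling `build_payload("a cat on a windowsill", seed=42)` returns a dict that can be serialized as the JSON request body. Note that `duration` is a string, not a number.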
Frequently Asked Questions
What is WAN 2.5 Text-to-Video and how does it work?
WAN 2.5 Text-to-Video is an AI video generation API developed by Alibaba. You submit a text prompt describing your desired video, receive a task ID immediately, then poll for results or receive a webhook notification when processing completes. The model interprets your description and generates a video matching the scene, motion, and style you specified.
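The submit-then-poll flow described above can be sketched as a small loop. This is an illustration under assumptions: the status values `"COMPLETED"` and `"FAILED"`, the `status` field name, and the `fetch_status` callable are hypothetical placeholders, since the response schema is not given here. The caller supplies `fetch_status`, which would wrap the actual HTTP request for a task ID.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal state or the timeout elapses.

    The "status" key and its values are assumed names, not a documented schema.
    """
    deadline = clock() + timeout_s
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "COMPLETED":
            return task
        if status == "FAILED":
            raise RuntimeError(f"task {task_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} seconds")
        # Wait between checks so the polling itself stays within rate limits.
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. For production workloads the webhook route avoids the loop entirely.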
Which resolution should I choose: 480p, 720p, or 1080p?
Choose based on your use case: 480p is fastest and ideal for rapid prototyping or mobile-first content. 720p balances quality and speed, suitable for most social media and web content. 1080p delivers the highest quality for marketing assets and professional content but takes longer to process.
How long does WAN 2.5 Text-to-Video take to process?
Processing time depends on resolution and duration. 480p processes fastest, followed by 720p, then 1080p. A 5-second clip generates faster than a 10-second clip. For production workflows, use webhooks instead of polling.
What makes a good text-to-video prompt?
Be specific about scenes and visual details. Describe camera movements (zoom, pan, tilt), lighting, atmosphere, and subject actions. Example: “fluffy orange cat on wooden windowsill, looking at snow falling outside, soft warm lighting, slow camera zoom in” produces better results than “cat looking outside.”
What does prompt expansion do?
When enabled (default), the AI optimizer expands shorter prompts into detailed video scripts. This is useful when you have a simple idea but want richer video output. Disable it if you want precise control over exactly what the model generates.
What are the rate limits for WAN 2.5 Text-to-Video?
Rate limits vary by subscription tier. See Rate Limits for current limits.
How much does WAN 2.5 Text-to-Video cost?
See the Pricing page for current rates and subscription options.
What is the difference between WAN 2.5 and WAN 2.6?
WAN 2.5 Text-to-Video offers 480p, 720p, and 1080p resolution options with prompt expansion. WAN 2.6 provides enhanced quality and is available for both text-to-video and image-to-video workflows. Choose WAN 2.5 for more resolution flexibility; choose WAN 2.6 for the latest quality improvements.
Best practices
- Prompt writing: Be specific about scenes, camera movements, lighting, and subject actions. Detailed prompts produce better results than vague descriptions.
- Resolution selection: Start with 480p for rapid iteration, then switch to higher resolutions for final output.
- Duration choice: Use 5-second clips for quick previews; 10-second clips allow more complex motion and narrative development.
- Negative prompts: Include common issues to avoid: “blurry, low quality, watermark, text, distortion, extra limbs.”
- Reproducibility: Save the seed value if you like a result and want to generate variations with similar characteristics.
- Production integration: Use webhooks for scalable applications instead of polling.
- Error handling: Implement retry with exponential backoff for 503 errors during high-demand periods.
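The retry advice in the last item can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `send` callable (a hypothetical name) that performs one request and returns a `(status_code, body)` tuple. The delay doubles after each 503 response and is capped at `max_delay_s`.

```python
import time
from typing import Callable, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry with exponential backoff while it returns 503.

    Returns the first non-503 response, or the last 503 once attempts run out.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503 or attempt == max_attempts - 1:
            return status, body
        # Delays grow as 1s, 2s, 4s, ... up to the cap.
        sleep(min(max_delay_s, base_delay_s * 2 ** attempt))
    raise AssertionError("unreachable")
```

Only 503 responses are retried here. Other error codes are returned to the caller unchanged, since retrying them rarely helps. Adding random jitter to each delay is a common refinement when many clients retry at once.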
Related APIs
- WAN 2.6 Text-to-Video: Latest WAN generation with enhanced quality at 720p and 1080p
- WAN 2.5 Image-to-Video: Generate video from an existing image with WAN 2.5
- WAN 2.6 Image-to-Video: Latest WAN generation for image-to-video at 720p and 1080p
- LTX 2 Pro Text-to-Video: Alternative text-to-video model with different style characteristics