Kling 3 API

Kling 3 integration

Generate high-quality videos from text prompts or images using Kling’s latest V3 model with multi-shot support and advanced frame control.

Kling 3 is a dual-mode video generation API that creates professional-grade videos from either text descriptions or source images. It supports multi-shot mode for creating complex narratives with up to 6 scenes, first and end frame image control, and flexible durations from 3 to 15 seconds. Available in Pro and Standard tiers to balance quality and cost.

Key capabilities

Text-to-Video (T2V): Generate videos from text prompts up to 2500 characters
Image-to-Video (I2V): Use first_frame and/or end_frame images to control video start and end points
Multi-shot mode: Create videos with up to 6 scenes, each with custom prompts and durations (max 15 seconds total)
Flexible durations: 3-15 seconds with per-shot duration control in multi-shot mode
Element consistency: Pre-registered element IDs for consistent characters/styles across videos
CFG scale control: Adjust prompt adherence from 0 (creative) to 1 (strict), default 0.5
Negative prompts: Exclude unwanted elements, styles, or artifacts
Async processing: Webhook notifications or polling for task completion

Pro vs Standard

Feature	Kling 3 Pro	Kling 3 Standard
Quality	Higher fidelity, richer detail	Good quality, cost-effective
Speed	Standard processing	Faster processing
Best for	Premium content, marketing	High-volume, testing

Use cases

Marketing and advertising: Create multi-scene product narratives with consistent branding
Social media content: Generate vertical videos for TikTok, Instagram Reels, and YouTube Shorts
E-commerce: Animate product images with controlled start and end frames
Storyboarding: Turn scripts into multi-shot video sequences
Creative storytelling: Build narratives with scene-by-scene control

Generate videos with Kling 3

Create videos by submitting a text prompt (T2V) or images with a prompt (I2V) to the API. The service returns a task ID, which you can poll for status or pair with a webhook notification that fires when the task completes.

POST /v1/ai/video/kling-v3-pro

Generate video with Kling 3 Pro

POST /v1/ai/video/kling-v3-std

Generate video with Kling 3 Standard

GET /v1/ai/video/kling-v3

List all Kling 3 tasks

GET /v1/ai/video/kling-v3/{task-id}

Get task status by ID
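
The endpoints above can be wrapped in a pair of small request builders. This is a minimal sketch using only the Python standard library. The host `https://api.example.com` and the `Authorization: Bearer` header are placeholders, since this page states neither the base URL nor the authentication scheme.

```python
import json
import urllib.request

# Assumption: placeholder host; this page does not state the real base URL.
BASE_URL = "https://api.example.com"

# Endpoint paths as listed in this section.
GENERATE_PATHS = {
    "pro": "/v1/ai/video/kling-v3-pro",
    "std": "/v1/ai/video/kling-v3-std",
}
TASKS_PATH = "/v1/ai/video/kling-v3"


def build_generate_request(tier, payload, api_key):
    """Build (but do not send) a POST request for the given tier."""
    if tier not in GENERATE_PATHS:
        raise ValueError(f"unknown tier: {tier!r}")
    return urllib.request.Request(
        BASE_URL + GENERATE_PATHS[tier],
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the auth header name and scheme are hypothetical.
            "Authorization": f"Bearer {api_key}",
        },
    )


def build_status_request(task_id, api_key):
    """Build (but do not send) a GET request for one task's status."""
    return urllib.request.Request(
        f"{BASE_URL}{TASKS_PATH}/{task_id}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing either returned object to `urllib.request.urlopen` would send it. Building and sending are kept separate here so the request can be inspected or logged first.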

Parameters

Parameter	Type	Required	Default	Description
`prompt`	`string`	No	-	Text prompt describing the video (max 2500 chars). Required for T2V.
`negative_prompt`	`string`	No	-	Text describing what to avoid (max 2500 chars)
`image_list`	`array`	No	-	Reference images with `image_url` and `type` (first_frame/end_frame)
`multi_shot`	`boolean`	No	`false`	Enable multi-shot mode for multi-scene videos
`shot_type`	`string`	No	`customize`	Shot segmentation: `customize` (manual per-shot prompts) or `intelligence` (AI auto-segmentation)
`multi_prompt`	`array`	No	-	Shot definitions: `index` (0-5), `prompt`, `duration` per scene
`element_list`	`array`	No	-	Pre-registered element IDs for character/style consistency
`aspect_ratio`	`string`	No	`16:9`	Video ratio: `16:9`, `9:16`, `1:1`
`duration`	`integer`	No	`5`	Video duration in seconds, from 3 to 15
`cfg_scale`	`number`	No	`0.5`	Prompt adherence: 0 (creative) to 1 (strict)
`webhook_url`	`string`	No	-	URL for task completion notification
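
As an illustration of the table above, the following sketch checks a text-to-video request against the documented limits before it is sent. The function name `build_t2v_payload` is hypothetical, and the API may apply further validation that this page does not describe.

```python
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
MAX_PROMPT_CHARS = 2500


def build_t2v_payload(prompt, negative_prompt=None, aspect_ratio="16:9",
                      duration=5, cfg_scale=0.5, webhook_url=None):
    """Validate text-to-video parameters and return a JSON-ready dict."""
    if not prompt:
        raise ValueError("prompt is required for text-to-video")
    for name, text in (("prompt", prompt), ("negative_prompt", negative_prompt)):
        if text is not None and len(text) > MAX_PROMPT_CHARS:
            raise ValueError(f"{name} exceeds {MAX_PROMPT_CHARS} characters")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    if not 0 <= cfg_scale <= 1:
        raise ValueError("cfg_scale must be between 0 and 1")

    payload = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "cfg_scale": cfg_scale,
    }
    # Optional fields are only included when the caller supplies them.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload
```

For example, `build_t2v_payload("A red kite over a beach", aspect_ratio="9:16", duration=8)` yields a vertical 8-second request with the default `cfg_scale` of 0.5.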

Image list item

Field	Type	Description
`image_url`	`string`	Publicly accessible image URL (300x300 min, 10MB max, JPG/JPEG/PNG)
`type`	`string`	Image role: `first_frame` or `end_frame`
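
A small helper can assemble `image_list` from one or two frame URLs. This is a sketch; the helper name `build_image_list` and the example URLs in the usage note are hypothetical.

```python
def build_image_list(first_frame_url=None, end_frame_url=None):
    """Return an image_list array with a first frame, an end frame, or both."""
    if first_frame_url is None and end_frame_url is None:
        raise ValueError("provide first_frame_url, end_frame_url, or both")
    images = []
    if first_frame_url is not None:
        images.append({"image_url": first_frame_url, "type": "first_frame"})
    if end_frame_url is not None:
        images.append({"image_url": end_frame_url, "type": "end_frame"})
    return images
```

Calling `build_image_list("https://example.com/start.png", "https://example.com/end.png")` produces a two-item list that can be assigned to the `image_list` field of an image-to-video request.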

Multi-prompt item

Field	Type	Description
`index`	`integer`	Shot order (0-5)
`prompt`	`string`	Text prompt for this shot (max 2500 chars)
`duration`	`number`	Shot duration in seconds

Frequently Asked Questions

What is Kling 3 and how does it work?

Kling 3 is an AI video generation model that creates videos from text prompts (T2V) or images (I2V). You submit your request via the API, receive a task ID immediately, then poll for results or receive a webhook notification when processing completes. Typical generation takes 30-120 seconds depending on duration and complexity.
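
The polling side of this flow might look like the sketch below. This page does not document the response body, so the `status` key and the terminal values `COMPLETED` and `FAILED` are assumptions, as is the 5-second interval. The `fetch_status` argument is any zero-argument callable that returns the decoded task as a dict.

```python
import time


def poll_until_done(fetch_status, interval_seconds=5.0, timeout_seconds=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports a terminal state or time runs out."""
    # Assumption: these status values are hypothetical, not taken from this page.
    terminal = {"COMPLETED", "FAILED"}
    deadline = clock() + timeout_seconds
    while True:
        task = fetch_status()
        if task.get("status") in terminal:
            return task
        if clock() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting. In production the defaults are sufficient.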

What is multi-shot mode?

Multi-shot mode lets you create videos with up to 6 distinct scenes. Each scene can have its own prompt and duration. The total duration across all shots cannot exceed 15 seconds, and each shot must be at least 3 seconds. Enable it by setting `multi_shot` to `true` and defining the scenes in `multi_prompt`.
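
The limits in this answer can be checked before a request is submitted. The sketch below builds the multi-shot fields from a list of `(prompt, duration)` pairs. The helper name `build_multi_shot_payload` is hypothetical, and whether a top-level `duration` must also be sent in multi-shot mode is not stated here, so it is left out.

```python
MAX_SHOTS = 6
MIN_SHOT_SECONDS = 3
MAX_TOTAL_SECONDS = 15


def build_multi_shot_payload(shots):
    """Build multi-shot fields from (prompt, duration_seconds) pairs in order."""
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")
    for prompt, duration in shots:
        if duration < MIN_SHOT_SECONDS:
            raise ValueError(f"each shot must be at least {MIN_SHOT_SECONDS} seconds")
    total = sum(duration for _, duration in shots)
    if total > MAX_TOTAL_SECONDS:
        raise ValueError(f"total duration {total}s exceeds {MAX_TOTAL_SECONDS}s")
    return {
        "multi_shot": True,
        "shot_type": "customize",  # manual per-shot prompts
        "multi_prompt": [
            {"index": i, "prompt": prompt, "duration": duration}
            for i, (prompt, duration) in enumerate(shots)
        ],
    }
```

Taken together, a 3-second minimum per shot and a 15-second total leave room for at most five shots of minimum length, so a six-shot request is worth confirming against the live API.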

How do first_frame and end_frame work?

Use the `image_list` parameter to provide reference images. Set `type` to `first_frame` to use an image as the video’s starting point, or to `end_frame` for the ending point. You can use both to create a transition from one image to another.

What image formats does Kling 3 support?

Kling 3 accepts JPG, JPEG, and PNG images via publicly accessible URLs. Requirements: minimum 300x300 pixels, maximum 10MB file size, aspect ratio between 1:2.5 and 2.5:1.
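
These requirements can be checked locally before a URL is submitted. The sketch below assumes the caller already knows the image's pixel dimensions and file size, treats "10MB" as 10 × 1024 × 1024 bytes, and infers the format from the URL's file extension, which is only a heuristic. The function name `check_image` is hypothetical.

```python
from urllib.parse import urlparse

MIN_SIDE_PX = 300
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png")
MAX_RATIO = 2.5  # allowed aspect ratio range is 1:2.5 to 2.5:1


def check_image(url, width, height, size_bytes):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    # Heuristic: the extension in the URL path stands in for the real format.
    if not urlparse(url).path.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("format must be JPG, JPEG, or PNG")
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        problems.append("image must be at least 300x300 pixels")
    if size_bytes > MAX_BYTES:
        problems.append("file must be 10MB or smaller")
    # Compare by multiplication so a zero dimension cannot divide by zero.
    if width > MAX_RATIO * height or height > MAX_RATIO * width:
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues with an image in a single pass.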

What is cfg_scale and how should I set it?

CFG scale controls how closely the model follows your prompt. Use 0 for maximum creativity and artistic interpretation, 0.5 (default) for balanced results, or 1 for strict adherence to your prompt with less creative variation.

What is the difference between Pro and Standard?

Pro delivers higher fidelity with richer detail, ideal for premium content and marketing. Standard offers good quality with faster processing, suitable for high-volume generation and testing. Both share the same parameters and capabilities.

What are the rate limits for Kling 3?

Rate limits vary by subscription tier. See Rate Limits for current limits and quotas.

How much does Kling 3 cost?

Pricing varies based on model tier (Pro vs Standard) and video duration. See the Pricing page for current rates.

Best practices

Prompt clarity: Write detailed prompts specifying subject, action, camera movement, and atmosphere
Start simple: Begin with single-shot mode before attempting multi-shot sequences
Image quality: For I2V, use high-resolution source images with clear subjects (min 300x300)
Duration planning: For multi-shot, plan scene durations to stay within 15-second total limit
Element consistency: Use pre-registered elements for recurring characters across multiple videos
CFG tuning: Start with 0.5, decrease for more creativity, increase for prompt precision
Production integration: Use webhooks instead of polling for scalable applications
Error handling: Implement retry logic with exponential backoff for 503 errors
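
The error-handling recommendation above can be sketched as a small wrapper around any function that sends a request with `urllib.request.urlopen`. The attempt count and base delay are illustrative defaults, not values documented for this API.

```python
import time
import urllib.error


def with_retries(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 503, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Only 503 is retried; other errors and the final attempt propagate.
            if err.code != 503 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds. Adding random jitter to each delay is a common refinement when many clients retry at once.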

Related APIs

Kling 3 Omni: Kling 3 with video reference support for motion/style guidance
Kling 2.6 Pro: Previous generation with motion control capabilities
Kling O1: High-performance video generation
Runway Gen 4.5: Alternative video generation model
