Veo is Google's high-fidelity video generation model, capable of generating videos in a wide range of cinematic and visual styles. Veo captures the nuance of your prompts to render intricate details consistently across frames.
This guide shows you how to generate videos with Veo. For tips on writing video prompts, check out the Veo prompt guide.
Veo versions
The Gemini API offers two video generation models: Veo 3 and Veo 2. We recommend using Veo 3, the latest model, for its superior quality and audio generation capability.
Veo 3 is available in Preview, which may pose limitations for scaled production use. Veo 2 is Stable and offers a better production experience.
For detailed guidance on key feature differences between the models, review the Model version comparison section.
Generating videos from text
The code sample in this section uses Veo 3 to generate videos with integrated audio.
Python
import time

from google import genai
from google.genai import types

client = genai.Client()

operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",
    prompt="Panning wide shot of a purring kitten sleeping in the sunshine",
    config=types.GenerateVideosConfig(
        person_generation="allow_all",  # "allow_adult" and "dont_allow" are Veo 2 only
        aspect_ratio="16:9",  # "9:16" is Veo 2 only
    ),
)

# Poll until the long-running operation finishes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download each generated video and save it locally.
for n, generated_video in enumerate(operation.response.generated_videos):
    client.files.download(file=generated_video.video)
    generated_video.video.save(f"video{n}.mp4")
JavaScript
import { GoogleGenAI } from "@google/genai";
import { createWriteStream } from "fs";
import { Readable } from "stream";

const ai = new GoogleGenAI({});

async function main() {
  let operation = await ai.models.generateVideos({
    model: "veo-3.0-generate-preview",
    prompt: "Panning wide shot of a purring kitten sleeping in the sunshine",
    config: {
      personGeneration: "allow_all",
      aspectRatio: "16:9",
    },
  });

  // Poll until the long-running operation finishes.
  while (!operation.done) {
    await new Promise((resolve) => setTimeout(resolve, 10000));
    operation = await ai.operations.getVideosOperation({
      operation: operation,
    });
  }

  operation.response?.generatedVideos?.forEach(async (generatedVideo, n) => {
    const resp = await fetch(`${generatedVideo.video?.uri}&key=GEMINI_API_KEY`); // append your API key
    const writer = createWriteStream(`video${n}.mp4`);
    Readable.fromWeb(resp.body).pipe(writer);
  });
}

main();
Go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"time"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	videoConfig := &genai.GenerateVideosConfig{
		AspectRatio:      "16:9",
		PersonGeneration: "allow_all",
	}

	operation, _ := client.Models.GenerateVideos(
		ctx,
		"veo-3.0-generate-preview",
		"Panning wide shot of a purring kitten sleeping in the sunshine",
		nil,
		videoConfig,
	)

	// Poll until the long-running operation finishes.
	for !operation.Done {
		time.Sleep(20 * time.Second)
		operation, _ = client.Operations.GetVideosOperation(ctx, operation, nil)
	}

	// Download each generated video and write it to disk.
	for n, video := range operation.Response.GeneratedVideos {
		client.Files.Download(ctx, video.Video, nil)
		fname := fmt.Sprintf("video_%d.mp4", n)
		_ = os.WriteFile(fname, video.Video.VideoBytes, 0644)
	}
}
REST
# Use curl to send a POST request to the predictLongRunning endpoint.
# The request body includes the prompt for video generation.
curl "${BASE_URL}/models/veo-3.0-generate-preview:predictLongRunning" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -X "POST" \
  -d '{
    "instances": [{
        "prompt": "Panning wide shot of a purring kitten sleeping in the sunshine"
      }
    ],
    "parameters": {
      "aspectRatio": "16:9",
      "personGeneration": "allow_all"
    }
  }' | tee result.json | jq .name | sed 's/"//g' > op_name

# Obtain operation name to download video.
op_name=$(cat op_name)

# Check the status of the operation.
while true; do
  is_done=$(curl -H "x-goog-api-key: $GEMINI_API_KEY" "${BASE_URL}/${op_name}" | tee op_check.json | jq .done)

  if [ "${is_done}" = "true" ]; then
    cat op_check.json
    echo "** Attach API_KEY to download video, or examine error message."
    break
  fi

  echo "** Video ${op_name} is not ready yet! Checking again in 5 seconds..."

  # Wait 5 seconds before checking again.
  sleep 5
done
This code takes about a minute to run, though it may take longer if resources are constrained. Once it's done running, you should see a video of a sleeping kitten like the one we have here.
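Because generation time varies, it can help to bound how long you wait. The following is a minimal sketch, not part of the SDK: `wait_until_done` and its parameters are hypothetical names, and the 600-second default timeout is an arbitrary assumption. It accepts any callable that refreshes an operation, such as `client.operations.get` from the Python sample above.

```python
import time


def wait_until_done(get_operation, operation, poll_seconds=20,
                    timeout_seconds=600, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll `operation` until it reports done, or raise TimeoutError.

    `get_operation` is any callable that takes the current operation and
    returns a refreshed one (hypothetical helper, not an SDK function).
    """
    deadline = clock() + timeout_seconds
    while not operation.done:
        if clock() >= deadline:
            raise TimeoutError(
                f"operation not done after {timeout_seconds} seconds")
        sleep(poll_seconds)
        operation = get_operation(operation)
    return operation
```

With the Python sample above, the polling loop could then be written as `operation = wait_until_done(client.operations.get, operation)`.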
If you see an error message instead of a video, this means that resources are constrained and your request couldn't be completed. In this case, run the code again.
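Rerunning the request can be automated. This sketch assumes that a failed request surfaces as a Python exception, which this guide does not confirm; `run_with_retries`, the attempt count, and the delays are all hypothetical choices. It wraps any zero-argument callable that performs the generation.

```python
import time


def run_with_retries(generate, max_attempts=3, base_delay_seconds=30,
                     sleep=time.sleep):
    """Call `generate()` until it succeeds or `max_attempts` is reached.

    The delay doubles after each failed attempt (30s, 60s, ...).
    Catching every Exception is an assumption; narrow it to the error
    type your client actually raises.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return generate()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_seconds * 2 ** (attempt - 1))
```

Doubling the delay gives constrained resources progressively more time to recover before the next attempt.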
Generated videos are stored on the server for 2 days, after which they are removed. If you want to save a local copy of your generated video, you must run result() and save() within 2 days of generation.
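If you track when each video was generated, you can compute how long you have left to download it. This sketch assumes the 2-day window is exactly 48 hours counted from the generation time, which the guide does not spell out; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# From the guide: generated videos are removed after 2 days.
RETENTION = timedelta(days=2)


def download_deadline(generated_at):
    """Return the latest time at which the video should still exist."""
    return generated_at + RETENTION


def is_still_available(generated_at, now=None):
    """Return True while the retention window has not elapsed."""
    now = now or datetime.now(timezone.utc)
    return now < download_deadline(generated_at)
```

Use timezone-aware timestamps for both arguments so the comparison is well defined.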
Generating videos from images
The following code generates an image using Imagen, then uses the generated image as the starting frame for the generated video.
First, generate an image using Imagen:
Python
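# No trailing comma here: a trailing comma would turn the prompt into a tuple.
prompt = "Panning wide shot of a calico kitten sleeping in the sunshine"

imagen = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt=prompt,
    config=types.GenerateImagesConfig(
        aspect_ratio="16:9",
        number_of_images=1,
    ),
)

# The first generated image becomes the starting frame for Veo.
imagen.generated_images[0].image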
JavaScript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

const response = await ai.models.generateImages({
  model: "imagen-3.0-generate-002",
  prompt: "Panning wide shot of a calico kitten sleeping in the sunshine",
  config: {
    numberOfImages: 1,
  },
});

// you'll pass response.generatedImages[0].image.imageBytes to Veo
Go
package main

import (
	"context"
	"fmt" // fmt, os, and time are used by the video generation step that follows
	"log"
	"os"
	"time"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, nil)
	if err != nil {
		log.Fatal(err)
	}

	config := &genai.GenerateImagesConfig{
		AspectRatio:    "16:9",
		NumberOfImages: 1,
	}

	response, _ := client.Models.GenerateImages(
		ctx,
		"imagen-3.0-generate-002",
		"Panning wide shot of a calico kitten sleeping in the sunshine",
		config,
	)

	// you'll pass response.GeneratedImages[0].Image to Veo
}
Then, generate a video using the resulting image as the first frame:
Python
operation = client.models.generate_videos(
    model="veo-2.0-generate-001",
    prompt=prompt,
    image=imagen.generated_images[0].image,
    config=types.GenerateVideosConfig(
        person_generation="dont_allow",  # "dont_allow" or "allow_adult"
        aspect_ratio="16:9",  # "16:9" or "9:16"
        number_of_videos=2,
    ),
)

# Wait for videos to generate
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

for n, video in enumerate(operation.response.generated_videos):
    fname = f"with_image_input{n}.mp4"
    print(fname)
    client.files.download(file=video.video)
    video.video.save(fname)
JavaScript
import { GoogleGenAI } from "@google/genai";
import { createWriteStream } from "fs";
import { Readable } from "stream";

const ai = new GoogleGenAI({});

async function main() {
  // get image bytes from Imagen, as shown above
  let operation = await ai.models.generateVideos({
    model: "veo-2.0-generate-001",
    prompt: "Panning wide shot of a calico kitten sleeping in the sunshine",
    image: {
      imageBytes: response.generatedImages[0].image.imageBytes, // response from Imagen
      mimeType: "image/png",
    },
    config: {
      aspectRatio: "16:9",
      numberOfVideos: 2,
    },
  });

  while (!operation.done) {
    await new Promise((resolve) => setTimeout(resolve, 10000));
    operation = await ai.operations.getVideosOperation({
      operation: operation,
    });
  }

  operation.response?.generatedVideos?.forEach(async (generatedVideo, n) => {
    const resp = await fetch(
      `${generatedVideo.video?.uri}&key=GEMINI_API_KEY`, // append your API key
    );
    const writer = createWriteStream(`video${n}.mp4`);
    Readable.fromWeb(resp.body).pipe(writer);
  });
}

main();
Go
image := response.GeneratedImages[0].Image

videoConfig := &genai.GenerateVideosConfig{
	AspectRatio:    "16:9",
	NumberOfVideos: 2,
}

operation, _ := client.Models.GenerateVideos(
	ctx,
	"veo-2.0-generate-001",
	"A dramatic scene based on the input image",
	image,
	videoConfig,
)

for !operation.Done {
	time.Sleep(20 * time.Second)
	operation, _ = client.Operations.GetVideosOperation(ctx, operation, nil)
}

for n, video := range operation.Response.GeneratedVideos {
	client.Files.Download(ctx, video.Video, nil)
	fname := fmt.Sprintf("video_with_image_input_%d.mp4", n)
	_ = os.WriteFile(fname, video.Video.VideoBytes, 0644)
}
Veo model parameters
(Naming conventions vary by programming language.)
- prompt: The text prompt for the video. When present, the image parameter is optional.
- image: The image to use as the first frame for the video. When present, the prompt parameter is optional.
- negativePrompt: Text string that describes anything you want to discourage the model from generating.
- aspectRatio: Changes the aspect ratio of the generated video.
  - "16:9": Supported in Veo 3 and Veo 2.
  - "9:16": Supported in Veo 2 only (defaults to "16:9").
- personGeneration: Allow the model to generate videos of people. The following values are supported:
  - Text-to-video generation:
    - "allow_all": Generate videos that include adults and children. Currently the only available personGeneration value for Veo 3.
    - "dont_allow": Veo 2 only. Don't allow the inclusion of people or faces.
    - "allow_adult": Veo 2 only. Generate videos that include adults, but not children.
  - Image-to-video generation: Veo 2 only
    - "dont_allow": Don't allow the inclusion of people or faces.
    - "allow_adult": Generate videos that include adults, but not children.
  - See Limitations.
- numberOfVideos: Output videos requested.
  - 1: Supported in Veo 3 and Veo 2.
  - 2: Supported in Veo 2 only.
- durationSeconds: Veo 2 only. Length of each output video in seconds, between 5 and 8.
  - Not configurable for Veo 3, default setting is 8 seconds.
- enhancePrompt: Veo 2 only. Enable or disable the prompt rewriter. Enabled by default.
  - Not configurable for Veo 3, default prompt enhancer is always on.
See the Model version comparison table for a side-by-side look at parameter differences between Veo 3 and Veo 2.
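The per-model constraints listed above can be checked locally before you send a request. The following is a rough sketch under stated assumptions: `validate_veo_config` is a hypothetical helper (not part of any SDK), it uses the camelCase parameter names from this section as dictionary keys, and it detects the model family by the `veo-3` name prefix.

```python
def validate_veo_config(model, config, has_image=False):
    """Return a list of problems; an empty list means no issue was found.

    Hypothetical helper that encodes the constraints documented above.
    """
    is_veo3 = model.startswith("veo-3")
    problems = []

    if is_veo3 and has_image:
        problems.append("image-to-video generation is Veo 2 only")

    aspect = config.get("aspectRatio", "16:9")
    allowed_aspects = {"16:9"} if is_veo3 else {"16:9", "9:16"}
    if aspect not in allowed_aspects:
        problems.append(f"aspectRatio {aspect!r} is not supported by {model}")

    if "personGeneration" in config:
        if is_veo3:
            allowed_people = {"allow_all"}
        elif has_image:
            allowed_people = {"dont_allow", "allow_adult"}
        else:
            allowed_people = {"allow_all", "dont_allow", "allow_adult"}
        value = config["personGeneration"]
        if value not in allowed_people:
            problems.append(f"personGeneration {value!r} is not allowed here")

    count = config.get("numberOfVideos", 1)
    max_videos = 1 if is_veo3 else 2
    if not 1 <= count <= max_videos:
        problems.append(f"numberOfVideos must be between 1 and {max_videos}")

    if "durationSeconds" in config:
        if is_veo3:
            problems.append("durationSeconds is not configurable for Veo 3")
        elif not 5 <= config["durationSeconds"] <= 8:
            problems.append("durationSeconds must be between 5 and 8")

    if is_veo3 and "enhancePrompt" in config:
        problems.append("enhancePrompt is not configurable for Veo 3")

    return problems
```

Returning a list rather than raising on the first problem lets you report every mismatch at once.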
Specifications
Modalities |
|
Request latency |
|
Variable length generation |
|
Resolution | 720p |
Frame rate | 24fps |
Aspect ratio |
|
Input languages (text-to-video) | English |
Limitations |
Videos created by Veo are watermarked using SynthID, our tool for watermarking and identifying AI-generated content, and are passed through safety filters and memorization checking processes that help mitigate privacy, copyright and bias risks.
Veo prompt guide
This section of the Veo guide contains examples of videos you can create using Veo, and shows you how to modify prompts to produce distinct results.
Safety filters
Veo applies safety filters across Gemini to help ensure that generated videos and uploaded photos don't contain offensive content. Prompts that violate our terms and guidelines are blocked.
Prompt writing basics
Good prompts are descriptive and clear. To get the most out of Veo, start with identifying your core idea, refine your idea by adding keywords and modifiers, and incorporate video-specific terminology into your prompts.
The following elements should be included in your prompt:
- Subject: The object, person, animal, or scenery that you want in your video, such as cityscape, nature, vehicles, or puppies.
- Action: What the subject is doing (for example, walking, running, or turning their head).
- Style: Specify creative direction using specific film style keywords, such as sci-fi, horror film, film noir, or animated styles like cartoon.
- Camera positioning and motion: [Optional] Control the camera's location and movement using terms like aerial view, eye-level, top-down shot, dolly shot, or worm's-eye view.
- Composition: [Optional] How the shot is framed, such as wide shot, close-up, single-shot or two-shot.
- Focus and lens effects: [Optional] Use terms like shallow focus, deep focus, soft focus, macro lens, and wide-angle lens to achieve specific visual effects.
- Ambiance: [Optional] How the color and light contribute to the scene, such as blue tones, night, or warm tones.
- Implicit or explicit audio cues: [Veo 3 only] With Veo 3, you can provide cues for sound effects, ambient noise, and dialogue.
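The elements above can be assembled programmatically. This is only an illustrative sketch: `build_prompt` is a hypothetical helper, and the ordering and comma-separated format are assumptions, since Veo accepts free-form text and this guide does not prescribe a structure.

```python
def build_prompt(subject, action, style, *, camera=None, composition=None,
                 lens=None, ambiance=None, audio=None):
    """Join the prompt elements into one comma-separated prompt string.

    Subject, action, and style are required; the other elements are
    optional and skipped when empty.
    """
    parts = [composition, camera, f"{subject} {action}", style,
             lens, ambiance, audio]
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    subject="a purring kitten",
    action="sleeping in the sunshine",
    style="cinematic",
    camera="panning",
    composition="wide shot",
)
print(prompt)
```

Keeping each element as a separate argument makes it easy to vary one element at a time and compare the generated results.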
More tips for writing prompts
The following tips help you write prompts that generate your videos:
- Use descriptive language: Use adjectives and adverbs to paint a clear picture for Veo.
- Provide context: If necessary, include background information to help your model understand what you want.
- Reference specific artistic styles: If you have a particular aesthetic in mind, reference specific artistic styles or art movements.
- Utilize prompt engineering tools: Consider exploring prompt engineering tools or resources to help you refine your prompts and achieve optimal results. For more information, visit Introduction to prompt design.
- Enhance facial details in personal and group images: Make facial details the focus of the shot, for example by using the word portrait in the prompt.
Example prompts and output
This section presents several prompts, highlighting how descriptive details can elevate the outcome of each video.
Integrated audio
These videos demonstrate how you can prompt Veo 3's audio generation with increasing levels of detail.
Prompt | Generated output |
---|---|
More detail: A close up of two people staring at a cryptic drawing on a wall, torchlight flickering. "This must be the key," he murmured, tracing the pattern. "What does it mean though?" she asked, puzzled, tilting her head. Damp stone, intricate carvings, hidden symbols. A faint, eerie hum resonates in the background. |
![]() |
Less detail: Camping (Stop Motion): Camper: "I'm one with nature now!" Bear: "Nature would prefer some personal space". |
![]() |
Try out these prompts yourself to hear the audio! Try Veo 3
Icicles
This video demonstrates how you can use the elements of prompt writing basics in your prompt.
Prompt | Generated output |
---|---|
Close up shot (composition) of melting icicles (subject) on a frozen rock wall (context) with cool blue tones (ambiance), zoomed in (camera motion) maintaining close-up detail of water drips (action). |
![]() |
Man on the phone
These videos demonstrate how you can revise your prompt with increasingly specific details to get Veo to refine the output to your liking.
Prompt | Generated output |
---|---|
Less detail: The camera dollies to show a close up of a desperate man in a green trench coat. He's making a call on a rotary-style wall phone with a green neon light. It looks like a movie scene. |
![]() |
More detail: A close-up cinematic shot follows a desperate man in a weathered green trench coat as he dials a rotary phone mounted on a gritty brick wall, bathed in the eerie glow of a green neon sign. The camera dollies in, revealing the tension in his jaw and the desperation etched on his face as he struggles to make the call. The shallow depth of field focuses on his furrowed brow and the black rotary phone, blurring the background into a sea of neon colors and indistinct shadows, creating a sense of urgency and isolation. |
![]() |
Snow leopard
This example demonstrates the output Veo might generate for a simple prompt.
Prompt | Generated output |
---|---|
A cute creature with snow leopard-like fur is walking in winter forest, 3D cartoon style render. |
![]() |
Running snow leopard
This prompt has more detail and demonstrates generated output that might be closer to what you want in your video.
Prompt | Generated output |
---|---|
Create a short 3D animated scene in a joyful cartoon style. A cute creature with snow leopard-like fur, large expressive eyes, and a friendly, rounded form happily prances through a whimsical winter forest. The scene should feature rounded, snow-covered trees, gentle falling snowflakes, and warm sunlight filtering through the branches. The creature's bouncy movements and wide smile should convey pure delight. Aim for an upbeat, heartwarming tone with bright, cheerful colors and playful animation. |
![]() |
Examples by writing elements
These examples show you how to refine your prompts by each basic element.
Subject
This example shows you how to specify a subject description. The description can include a subject, or multiple subjects and actions. Here, our subject is "white concrete apartment building."
Prompt | Generated output |
---|---|
An architectural rendering of a white concrete apartment building with flowing organic shapes, seamlessly blending with lush greenery and futuristic elements |
![]() |
Context
This example shows you how to specify context. The background or context in which the subject will be placed is very important. Try placing your subject in a variety of backgrounds like on a busy street, or in outer space.
Prompt | Generated output |
---|---|
A satellite floating through outer space with the moon and some stars in the background. |
![]() |
Action
This example shows you how to specify action: what the subject is doing, such as walking, running, or turning their head.
Prompt | Generated output |
---|---|
A wide shot of a woman walking along the beach, looking content and relaxed towards the horizon at sunset. |
![]() |
Style
This example shows you how to specify style. You can add keywords to improve generation quality and steer it closer to intended style, such as shallow depth of field, movie still, minimalistic, surreal, vintage, futuristic, or double-exposure.
Prompt | Generated output |
---|---|
Film noir style, man and woman walk on the street, mystery, cinematic, black and white. |
![]() |
Camera motion
This example shows you how to specify camera motion. Options for camera motion include POV shot, aerial view, tracking drone view, or tracking shot.
Prompt | Generated output |
---|---|
A POV shot from a vintage car driving in the rain, Canada at night, cinematic. |
![]() |
Composition
This example shows you how to specify composition: how the shot is framed (wide shot, close-up, low angle, etc.).
Prompt | Generated output |
---|---|
Extreme close-up of an eye with city reflected in it. |
![]() |
Create a video of a wide shot of surfer walking on a beach with a surfboard, beautiful sunset, cinematic. |
![]() |
Ambiance
This example shows you how to specify ambiance. Color palettes play a vital role in photography, influencing the mood and conveying intended emotions. Try things like "muted orange warm tones," "natural light," "sunrise" or "sunset". For example, a warm, golden palette can infuse a romantic and atmospheric feel into a photograph.
Prompt | Generated output |
---|---|
A close-up of a girl holding adorable golden retriever puppy in the park, sunlight. |
![]() |
Cinematic close-up shot of a sad woman riding a bus in the rain, cool blue tones, sad mood. |
![]() |
Use reference images to generate videos
You can bring images to life by using Veo's image-to-video capability. You can use existing assets, or try Imagen to generate something new.
Prompt | Generated output |
---|---|
Bunny with a chocolate candy bar. |
![]() |
Bunny runs away. |
![]() |
Negative prompts
Negative prompts can be a powerful tool to help specify elements you don't want in the video. Describe what you want to discourage the model from generating after the phrase "Negative prompt". Follow these tips:
❌ Don't use instructive language or words like no or don't. For example, "No walls" or "don't show walls".
✅ Do describe what you don't want to see. For example, "wall, frame", which means that you don't want a wall or a frame in the video.
Prompt | Generated output |
---|---|
Generate a short, stylized animation of a large, solitary oak tree with leaves blowing vigorously in a strong wind. The tree should have a slightly exaggerated, whimsical form, with dynamic, flowing branches. The leaves should display a variety of autumn colors, swirling and dancing in the wind. The animation should use a warm, inviting color palette. |
![]() |
Generate a short, stylized animation of a large, solitary oak tree with leaves blowing vigorously in a strong wind. The tree should have a slightly exaggerated, whimsical form, with dynamic, flowing branches. The leaves should display a variety of autumn colors, swirling and dancing in the wind. The animation should use a warm, inviting color palette. With negative prompt: urban background, man-made structures, dark, stormy, or threatening atmosphere. |
![]() |
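The negative prompt tips above can be enforced with a small helper. This sketch is hypothetical: `build_negative_prompt` is not an SDK function, and treating "do not" as instructive alongside "no" and "don't" is an assumption. It joins the unwanted elements into a comma-separated string and rejects instructive wording.

```python
import re

# Words the tips above advise against; "do not" is an added assumption.
INSTRUCTIVE = re.compile(r"\b(no|don't|do not)\b", re.IGNORECASE)


def build_negative_prompt(unwanted):
    """Join unwanted elements into a string such as "wall, frame"."""
    cleaned = []
    for item in unwanted:
        item = item.strip()
        if not item:
            continue
        if INSTRUCTIVE.search(item):
            raise ValueError(
                f"describe what to avoid instead of instructing: {item!r}")
        cleaned.append(item)
    return ", ".join(cleaned)


negative = build_negative_prompt(["urban background", "man-made structures"])
print(negative)
```

The resulting string is what you would supply as the negativePrompt parameter.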
Aspect ratios
Gemini Veo video generation supports the following two aspect ratios:
Aspect ratio | Description |
---|---|
Widescreen or 16:9 | The most common aspect ratio for televisions, monitors, and mobile phone screens (landscape). Use this when you want to capture more of the background, like in scenic landscapes. |
Portrait or 9:16 (Veo 2 only) | Rotated widescreen. This aspect ratio has been popularized by short-form video applications, such as YouTube Shorts. Use this for portraits or tall objects with strong vertical orientations, such as buildings, trees, or waterfalls. |
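To relate these ratios to pixel dimensions, the arithmetic below derives a frame size from an aspect ratio and a short side of 720 pixels. The assumption that the 720p resolution listed under Specifications refers to the short side for both orientations is not confirmed by this guide, and `frame_size` is a hypothetical helper.

```python
def frame_size(aspect_ratio, short_side=720):
    """Return (width, height) for a ratio like "16:9" or "9:16".

    Assumes `short_side` is the shorter edge of the frame in pixels.
    """
    width_units, height_units = (int(x) for x in aspect_ratio.split(":"))
    if width_units >= height_units:
        # Landscape: the height is the short side.
        return (short_side * width_units // height_units, short_side)
    # Portrait: the width is the short side.
    return (short_side, short_side * height_units // width_units)


print(frame_size("16:9"))
print(frame_size("9:16"))
```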
Widescreen
This prompt is an example of the widescreen aspect ratio of 16:9.
Prompt | Generated output |
---|---|
Create a video with a tracking drone view of a man driving a red convertible car in Palm Springs, 1970s, warm sunlight, long shadows. |
![]() |
Portrait
This prompt is an example of the portrait aspect ratio of 9:16. This ratio is only available for Veo 2.
Prompt | Generated output |
---|---|
Create a video highlighting the smooth motion of a majestic Hawaiian waterfall within a lush rainforest. Focus on realistic water flow, detailed foliage, and natural lighting to convey tranquility. Capture the rushing water, misty atmosphere, and dappled sunlight filtering through the dense canopy. Use smooth, cinematic camera movements to showcase the waterfall and its surroundings. Aim for a peaceful, realistic tone, transporting the viewer to the serene beauty of the Hawaiian rainforest. |
![]() |
Model version comparison
We recommend using Veo 3 for the best performance, fidelity, and quality.
The following table describes the differences in features, specifications, and parameters between Veo 2 and the current state of the Veo 3 preview:
Model | Veo 3 | Veo 2 |
---|---|---|
Availability | Preview | Stable |
Audio | Audio with video (Always on) | No audio |
Generation | Text to video | Text and image to video |
Videos per request | 1 | 1 or 2 |
aspectRatio | 16:9 only | 16:9 or 9:16 |
personGeneration | allow_all only (not configurable) | allow_adult, dont_allow, or allow_all (text to video only) |
durationSeconds | Not configurable, 8 seconds only | 5-8 seconds |
enhancePrompt | Not configurable, always on | Enable (default) or disable |
You can migrate from Veo 2 to Veo 3 by updating the model name to use a Veo 3 model code, with minimal changes to parameters.
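As a rough illustration of those parameter changes, the sketch below adapts a Veo 2 configuration to the Veo 3 preview constraints in the table above. `migrate_config_to_veo3` is a hypothetical helper that uses the camelCase parameter names as dictionary keys. It returns notes describing every change so that nothing is altered silently, which matters most for personGeneration because the only Veo 3 value is less restrictive.

```python
def migrate_config_to_veo3(veo2_config):
    """Return (veo3_config, notes) without modifying the input dict."""
    config = dict(veo2_config)
    notes = []

    # These parameters are not configurable in Veo 3.
    for key in ("durationSeconds", "enhancePrompt"):
        if key in config:
            del config[key]
            notes.append(f"dropped {key}: not configurable in Veo 3")

    if config.get("aspectRatio", "16:9") != "16:9":
        config["aspectRatio"] = "16:9"
        notes.append("aspectRatio set to 16:9: the only Veo 3 option")

    if config.get("personGeneration", "allow_all") != "allow_all":
        config["personGeneration"] = "allow_all"
        notes.append("personGeneration set to allow_all: review before use")

    if config.get("numberOfVideos", 1) > 1:
        config["numberOfVideos"] = 1
        notes.append("numberOfVideos set to 1: Veo 3 returns one video")

    return config, notes


veo3_config, notes = migrate_config_to_veo3(
    {"aspectRatio": "9:16", "numberOfVideos": 2, "durationSeconds": 6}
)
print(veo3_config)
for note in notes:
    print("-", note)
```

Remember that Veo 3 is text-to-video only, so requests that pass an image as the first frame still need Veo 2.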
What's next
- Gain more experience generating AI videos with the Veo Colab.
- Check out cool examples using Veo 2 on the Google DeepMind site.