Image to Video

Updated: 2026-05

1. What You’ll Learn on This Page

This page covers image-to-video generation: producing short video clips from still images. Comfy Cloud includes the Wan 2.2 and LTX-2.3 video generation models, both of which are available even on the free plan.

In class, students gain a firsthand understanding of how video-generating AI works, and of its current limitations, through hands-on experimentation. That grounding prepares them to use Runway later as a tool whose inner workings they already understand.

2. How Video-Generating AI Works (Recap)

As mentioned in “Diffusion Mechanism,” here are the key points:

  • Image version: Removing 2D noise in latent space
  • Video version: Removing 3D noise (including the time axis) in latent space

Wan 2.2 and LTX-2.3 are essentially “3D diffusion models.” Concepts such as prompts, sampling, and CFG carry over directly from image generation, so the intuition you have built with images applies to video as well.
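The 2D-versus-3D distinction is easiest to see in the latent tensor shapes. A minimal sketch (the channel count, spatial downscale factor, and step size here are illustrative assumptions, not the models' actual values):

```python
import numpy as np

# Image diffusion: denoise a 2-D latent of shape (channels, H/8, W/8).
image_latent = np.random.randn(16, 80, 80)      # e.g. a 640x640 image

# Video diffusion: the same idea with a time axis added,
# shape (channels, frames, H/8, W/8).
video_latent = np.random.randn(16, 81, 80, 80)  # e.g. 81 frames at 640x640

def denoise_step(latent, predicted_noise, step_size=0.1):
    # Each step subtracts predicted noise over the whole tensor at once,
    # so appearance and motion are denoised jointly, not frame by frame.
    return latent - step_size * predicted_noise

video_latent = denoise_step(video_latent, np.random.randn(*video_latent.shape))
print(video_latent.shape)  # (16, 81, 80, 80)
```

The only structural change from image to video is that extra `frames` axis; everything else (prompting, sampling, CFG) is unchanged.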

3. Main Models (Comfy Cloud Built-In)

3.1 Wan 2.2

An open-source video generation model developed by Tongyi Lab (Alibaba).

  • Image-to-Video (i2v): 1 image + text prompt → video
  • Text-to-Video (t2v): text prompt → video
  • Standard 5-second clip, 640×640 or 720p
  • Official estimate: Approx. 11.4 credits per video (including 4-step fast LoRA)

3.2 LTX-2.3

A video generation model developed by Lightricks.

  • Lightweight and fast, with strong performance in lip-syncing and specific workflows
  • Includes LoRA models for talkvid and celebvhq
  • Similar credit consumption (10–15 cr per video)

3.3 Wan 2.2 Animate

A character replacement workflow built into Comfy Cloud templates.

  • Video + character image → Reconstruct the video using that character
  • Change the character’s appearance while preserving poses and movements

4. Basic i2v Workflow (WAN 2.2)

Comfy Cloud Templates > Getting Started > “1.2 Starter: From Image to Video” is an introductory workflow for learning.

Key Components:

  • Load Image — The starting still image
  • Load Wan 2.2 i2v Model
  • Load VAE / CLIP (for Wan)
  • Prompt Encoding
  • K-Sampler (Video Version) — Combined with 4-step fast LoRA
  • VAE Decoding (Video Version)
  • Save Video (mp4 output)
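ComfyUI workflows are stored as JSON graphs of nodes wired together by `[node_id, output_index]` links. The components above can be sketched as such a graph; the `class_type` names below are illustrative assumptions, not the exact identifiers used by the Comfy Cloud template:

```python
# Hypothetical ComfyUI-style JSON prompt for the i2v workflow.
# Link format: ["source_node_id", output_index].
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "start_frame.png"}},
    "2": {"class_type": "LoadWan22I2VModel", "inputs": {}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "gentle wind, leaves swaying"}},
    "4": {"class_type": "KSamplerVideo",
          "inputs": {"model": ["2", 0], "positive": ["3", 0],
                     "image": ["1", 0], "steps": 4}},
    "5": {"class_type": "VAEDecodeVideo", "inputs": {"samples": ["4", 0]}},
    "6": {"class_type": "SaveVideo",
          "inputs": {"frames": ["5", 0], "format": "mp4"}},
}

# Sanity check: every link points at an existing node.
for node_id, node in workflow.items():
    for value in node["inputs"].values():
        if isinstance(value, list):
            assert value[0] in workflow, f"dangling link in node {node_id}"
print("graph ok:", len(workflow), "nodes")
```

“Connecting nodes” in the template editor is exactly this: filling in each node's inputs with links to upstream outputs.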

Tip: Some templates include a learning guide, such as “Step 1 - Connect nodes.” Follow the instructions to complete the workflow.

5. Key Parameters (Video-Specific)

| Parameter | Description | Recommended Value |
| --- | --- | --- |
| frame_count | Total number of frames | 81 (5 seconds @ 16 fps) |
| fps | Frame rate | 16 |
| resolution | Output resolution | 640×640 (lightweight) / 1280×720 (standard) |
| steps | Sampling steps | 4 (with the 4-step fast LoRA) / 20 (standard) |
| prompt | Include movement instructions | “the woman turns her head slowly to the left”, etc. |

The key to writing prompts is to describe “movement,” not just a static scene. Examples include “a gentle breeze,” “ripples on the water,” or “a person walking.”
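The frame count and clip length are linked by a simple formula. A small sketch, assuming the common convention that a clip holds `fps × seconds + 1` frames (the extra frame being the starting image), which matches the “81 frames for a 5-second clip” figure:

```python
def frame_count(seconds, fps=16):
    """Frames needed for a clip of the given length.

    Assumption: fps * seconds + 1 frames, where the +1 is the start
    frame -- consistent with 81 frames for 5 seconds at 16 fps.
    """
    return fps * seconds + 1

print(frame_count(5))       # 81
print(frame_count(10))      # 161 -- roughly the practical upper limit
```

Doubling the duration roughly doubles the frame count, which is one reason long generations fail more often: there is simply twice as much spatiotemporal volume to keep coherent.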

6. Limitations of Video Generation (as of May 2026)

Limits that students should experience firsthand during class:

  • Duration: A realistic target is 5–10 seconds per generation; anything longer carries a high risk of failure.
  • Facial consistency: Faces may drift to look like different people midway through (Wan 2.2 suffers from this relatively less often).
  • Physics: The representation of gravity, inertia, and contact is still unreliable.
  • Multiple characters: Animating two or more characters at once is prone to failure.
  • Text: Legible text within videos (signs, subtitles) is virtually impossible.
  • Complex movements: Dance and combat scenes remain unstable.

While these issues have been partially addressed in top-of-the-line commercial models such as Runway Gen-4 and Sora 2, none of them are perfect. The design intent of this course is to ensure that students understand these limitations before proceeding to the Runway exercises.

7. Credit Usage

What can be done within the time available for class (based on a budget of 400 credits per student per month):

| Operation | Cost | Monthly Limit |
| --- | --- | --- |
| Wan 2.2 5-second video (standard) | approx. 11 cr | approx. 35 |
| LTX-2.3 5-second video | approx. 10–15 cr | approx. 25–40 |
| Wan 2.2 Animate (character replacement) | approx. 15–25 cr | approx. 16–26 |

When designing lessons, limit the number of videos to “one or two per student.” Conduct trial and error at the image stage, and move on to video production only once you are confident in the results.
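The monthly limits above follow directly from dividing the budget by the per-video cost. A quick sketch (the midpoint costs for the ranged entries are assumptions for illustration):

```python
import math

MONTHLY_BUDGET = 400  # credits per student per month

# Approximate per-video costs; midpoints assumed where the table
# gives a range (LTX: 10-15 cr, Animate: 15-25 cr).
costs = {
    "Wan 2.2 5-second video": 11.4,
    "LTX-2.3 5-second video": 12.5,
    "Wan 2.2 Animate": 20.0,
}

for name, cost in costs.items():
    videos = math.floor(MONTHLY_BUDGET / cost)
    print(f"{name}: ~{videos} videos per month")
# Wan 2.2 5-second video: ~35 videos per month
```

Planning with the upper end of each cost range is safer than the midpoint, since failed or discarded generations still consume credits.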

8. Tips for Writing Prompts

8.1 Describing Movement

  • Bad example: “a beautiful landscape” (describes only a still scene)
  • Good example: “a beautiful landscape, with a gentle breeze blowing through the grass and clouds drifting slowly”

8.2 Specifying Camera Movements

  • “static camera” — locked off, no camera movement
  • “slow zoom in” — gradual push toward the subject
  • “panning right” — the camera rotates to the right
  • “dolly forward” — the camera moves forward through the scene

The choice of camera movement can drastically change the look and feel of a clip.

8.3 Narrowing the Focus

Limiting the moving elements to a single one creates a sense of stability.

  • Instead of “the entire character moving,” have “only the character’s face move.”
  • Instead of “a scene where everything moves,” have “a scene where only the water’s surface ripples.”

9. Exercises (for Class Use)

Exercise A: Animating a Still Image

  • Select one of your images generated in C-1 or A-1
  • Load it into the Wan 2.2 i2v workflow
  • Prompt: Movement appropriate for the scene (wind, water, facial expressions, etc.)
  • Generate for 5 seconds and observe the results

Exercise B: The Effect of Prompts

  • Using the same static image as a starting point, vary only the motion prompts
  • “static, no movement” / “slow zoom in” / “gentle wind, leaves swaying”
  • Compare the differences in the results
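A controlled comparison like Exercise B means varying exactly one thing. A minimal sketch of the batch plan (the filename, field names, and fixed seed are hypothetical, not an actual Comfy Cloud API):

```python
base_image = "portrait_01.png"  # hypothetical file from Exercise A
motion_prompts = [
    "static, no movement",
    "slow zoom in",
    "gentle wind, leaves swaying",
]

# One job per prompt; image and seed stay fixed so any difference in
# the output comes only from the motion prompt.
jobs = [{"image": base_image, "prompt": p, "seed": 42}
        for p in motion_prompts]

for job in jobs:
    print(job["prompt"])
```

Fixing the seed (where the workflow exposes one) is what makes the side-by-side comparison meaningful.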

Exercise C: Finding the Limits

  • Intentionally try difficult movements
  • Examples: “a person running,” “two people having a conversation,” “text being displayed,” etc.
  • Observe where things break down → Understand the limits

This is an important experience before moving on to the Runway exercise.

10. Saving Videos

  • After the workflow runs, a video player will appear in the Save Video node
  • Click the three-dot menu in the upper-right corner of the player → select Download to save the MP4 file to your computer
  • You can also redownload past videos from the History panel

11. What’s Next

  • Prompt Play — Experiment with fun prompts
  • Algorithm Exposure — Experiments that peek inside the system
  • To Runway — An introduction to the world of video-generating AI and a bridge to Runway exercises