Updated Nov 17, 2026

Image to Video Models

Compare image-to-video models based on 203,134+ human preference votes.

Google Dominates, Alibaba Challenges

Google's Veo 3.1 series leads the image-to-video leaderboard. The "Audio" variants of Veo 3.1 hold the top two spots with ELO scores of ~1396, about 55 points ahead of the third-ranked model, which suggests that multimodal consistency between audio and video is becoming more important to user preference.

However, Google's lead is being challenged. Alibaba's Wan2.5-i2v-preview has surged to Rank #3 (ELO 1341), interrupting the run of US-based models at the top of the leaderboard. This marks a significant shift in the competitive landscape.
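To give a rough sense of what a gap like 1396 versus 1341 means, the sketch below converts two ratings into an expected head-to-head preference rate. It assumes the conventional Elo formula with a logistic base of 10 and a scale of 400; the leaderboard's actual rating method is not stated here, so treat the function name, the scale parameter, and the resulting number as illustrative only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of votes for model A over model B.

    Uses the conventional Elo expectation formula. The scale of 400 is an
    assumption; the leaderboard may use a different rating scheme.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the text above (the top score is approximate).
veo_audio_elo = 1396
wan_preview_elo = 1341

p = elo_expected_score(veo_audio_elo, wan_preview_elo)
print(f"Expected preference rate for the higher-rated model: {p:.1%}")
```

Under these assumptions, a 55-point gap corresponds to the higher-rated model being preferred in somewhat under 60% of direct comparisons, so the lead is real but far from a sweep.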

While the top of the leaderboard is dominated by proprietary models, Wan-v2.2 (Rank #20) offers an open-weight alternative under the Apache 2.0 license, though a significant performance gap remains between open and closed frontier models.

#1 Ranked Model: Veo 3.1, Google (Audio Enabled)
Top Challenger: Wan 2.5, Alibaba (Rank 3)
Highest Win Rate: 73%, Veo 3.1 Fast Audio
Total Votes: 203k+, Crowdsourced Data
Charts: Top 5 Models (ELO Score); Win Rate vs. Field; Top 15 Models by Organization
Global Leaderboard
Rank Model Organization ELO Score Votes License