Winning the 1X World Model Challenge
26 Nov, 2025
Aidan Scannell
Architecture: Spatio-temporal Transformer World model
Abstract
In this talk, I’ll share how our team won both the Outstanding Champion and Innovation awards in the ICCV 2025 phase of the 1X World Model Challenge. The competition introduces an open-source benchmark for real-world humanoid interaction, featuring two complementary tracks: sampling, focused on forecasting future image frames, and compression, focused on predicting future discrete latent codes. I’ll discuss how we adapted the video generation foundation model Wan-2.2 TI2V-5B for video-state-conditioned future frame prediction in the sampling track, and how we trained a Spatio-Temporal Transformer from scratch for the compression track, placing 1st in both tracks.
Event
Huawei AI Application Workshop
Location
Dublin, Ireland