
Goku: ByteDance's Advanced Video Generative Model
ByteDance, in collaboration with the University of Hong Kong, has introduced Goku, a video generative model that produces high-quality videos and images from text prompts. Goku is built on a rectified flow transformer architecture, which strengthens the interaction between video and image tokens and supports strong performance on both image and video generation tasks.
Goku was trained on a dataset of 160 million image-text pairs and 36 million video-text pairs. A shared encoder compresses images and videos into a unified representation, which a custom transformer then processes. Combined with Rectified Flow as the generative process, this design enables Goku to produce consistent, high-quality outputs.
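For readers unfamiliar with Rectified Flow, the general idea can be sketched in a few lines of Python. This is a toy illustration of the technique, not Goku's implementation: the vectors, the function names (`interpolate`, `velocity_target`, `euler_sample`), and the "oracle" that stands in for a trained network are all assumptions made for the example. The model is trained to predict the constant velocity along a straight line from a noise sample to a data sample, and generation integrates that velocity from noise toward data.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Training target along the straight path: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
x1 = [0.5, -1.0, 2.0]                       # toy stand-in for a data sample
x0 = [random.gauss(0.0, 1.0) for _ in x1]   # Gaussian noise sample


# Hypothetical stand-in for a trained network: returns the true velocity.
def oracle(x, t):
    return velocity_target(x0, x1)


sample = euler_sample(oracle, x0, steps=10)
```

With the oracle velocity, the Euler steps carry the noise sample to the data sample along a straight line. In a real system a neural network would approximate the velocity from many noise-data pairs, so the paths would only be approximately straight.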
The model supports text-to-video, image-to-video, and text-to-image generation. On the VBench benchmark, Goku-T2V, the text-to-video variant, scored 84.85, surpassing several leading commercial models. The result reflects Goku's ability to follow prompts accurately and produce high-quality output.
Goku also has a specialized version, Goku+, optimized for creating realistic advertising content. It can generate videos of humans with natural movements and expressions, which could significantly reduce the cost of advertising production. Its capabilities extend to virtual digital human videos and interactive product showcase videos, underscoring its versatility in digital content creation.