- Generate Impressive Videos with Text Instructions: A Review of OpenAI Sora, Stable Diffusion, Lumier...
Generate Impressive Videos with Text Instructions: A Review of OpenAI Sora, Stable Diffusion, Lumiere and Comparable Models
Authors : Enis Karaarslan, Omer Aydin
Pages : 1-18
Doi:10.36227/techrxiv.170862194.43871446/v1
View : 210 | Download : 0
Publication Date : 2024-02-22
Preprint Type : Research
Abstract :Generating videos from text presents a significant challenge in computer science. However, recent technological advancements in text-to-video AI have shown considerable progress in this area. Realistic videos and data-driven physics engines are expected to drive new developments in the field. This study aims to review generative artificial intelligence approaches in text-to-video synthesis, exploring large language models and AI models. The architectures and models of OpenAI Sora, Stable Diffusion, and Lumiere are examined. While these advancements are exciting, numerous challenges must be addressed on the path to achieving Artificial General Intelligence (AGI). Challenges related to the trustworthiness of generated content, data transparency, preventing misuse, and protecting human rights are explored. The need to reduce environmental and computational costs to ensure sustainability is also discussed. Text-to-video AI holds the potential to revolutionize various creative sectors, including filmmaking, advertising, graphic design, and game development, as well as industries such as social media, influencer marketing, and educational technology. Its potential and future implications are also considered.Keywords : Generative Artificial Intelligence, Large Language Model, text-to-video AI, space-time transformer, diffusion model