Hypothesis

Leitersdorf thinks the consistency issue might be partially solved in the model's next version, which will allow users to start generating worlds based on a video of an environment rather than an image.

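Most people assume that AI world models should generate complex scenes from text or simple images, but the author hints that the future direction is to generate environments from video input. This view challenges the prevailing paradigm in AI generation: it suggests that video may be a better base input for world models than static images, which runs against the industry consensus that text is the primary input.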

non-consensus ai-input-paradigm future-directions
