Hypothesis

Diffusion models also waste resources when the desired output is only a few tokens long. They have to do far more parallel work to whittle the output down to, say, five tokens, which an autoregressive model produces from beginning to end in just five steps.

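The article objectively points out the limitations of diffusion models when generating short text, which shows a balanced perspective. It is worth looking more closely at how efficient diffusion models are across different task lengths, and at whether Google has optimized for this limitation.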
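
To make the short-output argument concrete, here is a rough back-of-the-envelope sketch. It counts token positions processed, not real latency or FLOPs. The canvas size of 256 tokens and the 16 denoising steps are hypothetical numbers chosen for illustration, not figures from the article, and the function names are invented for this sketch. It also assumes the diffusion model refines a fixed-size canvas on every step, while the autoregressive model processes one new position per step.

```python
def autoregressive_positions(output_tokens: int) -> int:
    """Positions processed when emitting one token per step.

    Assumes each step handles only the single new position
    (for example, with cached earlier positions).
    """
    return output_tokens


def diffusion_positions(canvas_tokens: int, denoise_steps: int) -> int:
    """Positions processed when every denoising step touches the whole canvas.

    Assumes a fixed canvas size regardless of the final output length.
    """
    return canvas_tokens * denoise_steps


# Hypothetical setting: a five-token answer.
ar_work = autoregressive_positions(output_tokens=5)
diff_work = diffusion_positions(canvas_tokens=256, denoise_steps=16)

print(f"autoregressive: {ar_work} positions")
print(f"diffusion:      {diff_work} positions")
print(f"ratio:          {diff_work / ar_work:.1f}x")
```

Under these assumptions the diffusion cost does not shrink with the answer length, so the gap widens as the output gets shorter. Raising `output_tokens` toward the canvas size narrows it, which is one way to explore the efficiency question across task lengths.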

model-limitation balanced-view

Tags

Annotators

URL