Hypothesis

The recurrent structure is optimized for iterative composition — running a reasoning chain forward — but does not inherently improve the storage of rote facts. This maps to an observable characteristic of Mythos: it reasons exceptionally well about novel problems it has never seen, but its factual recall can be inconsistent.

这一发现揭示了循环模型的一个关键局限性：它们在推理方面表现出色，但在记忆方面可能不如传统Transformer。这一反直觉的观察表明，不同架构可能适用于不同类型的任务，挑战了'通用架构解决所有问题'的观点。这也解释了为什么Mythos在某些事实性任务上表现不佳。

memorization-vs-reasoning architecture-limitation

Tags

Annotators

URL