The depth of recursion becomes a tunable compute axis at inference time, requiring no retraining. A small model, by recursively invoking itself, can iterate toward answers that neither it nor any of its workers could reach in a single pass.
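A minimal sketch of depth as an inference-time compute axis. The `recursive_solve` helper, the `call_model` callable, and the refinement prompt are all illustrative assumptions, not the author's actual interface; the point is only that the same fixed model runs more passes as `depth` grows.

```python
from typing import Callable

def recursive_solve(task: str, call_model: Callable[[str], str], depth: int) -> str:
    """Answer `task`, letting deeper levels refine shallower levels' drafts.

    `depth` is the tunable compute knob: depth 0 is a single plain pass;
    each extra level reads the previous level's answer and revises it.
    (Hypothetical interface, for illustration only.)
    """
    if depth == 0:
        return call_model(task)
    # Worker pass at depth-1 produces a draft the current level can read and revise.
    draft = recursive_solve(task, call_model, depth - 1)
    return call_model(f"{task}\nDraft answer:\n{draft}\nRevise and improve.")

# Toy stand-in model so the sketch runs without any real LLM:
# it just records each call and labels its answer with the call count.
calls: list[str] = []
def toy_model(prompt: str) -> str:
    calls.append(prompt)
    return f"answer(level={len(calls)})"

out = recursive_solve("2+2?", toy_model, depth=3)
# depth=3 means four model calls: one base pass plus three refinement passes.
```

Raising `depth` at inference time buys more passes from the same frozen weights, which is the sense in which recursion depth substitutes for model scale.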
Most people assume a model's capability is bounded by its scale and training data, so improving performance requires a larger model or retraining. The author argues instead that a small model, by recursively calling itself, can dynamically extend its capability at inference time, reaching results a single forward pass cannot, with no retraining. This challenges the industry consensus that scale equals capability, suggesting small models may achieve breakthrough performance through self-reading mechanisms.