Hypothesis

GLM-5.1 did not plateau after 50 or 100 submissions, but continued to find meaningful improvements over 600+ iterations with 6,000+ tool calls, ultimately reaching 21.5k QPS—roughly 6× the best result achieved in a single 50-turn session.

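Surprisingly, GLM-5.1 kept improving for more than 600 iterations on the vector database optimization task, reaching roughly 6× the best result from a single 50-turn session. This breaks the pattern of conventional models hitting a performance plateau early. Such sustained, long-running optimization is extremely rare among AI models and demonstrates a breakthrough in handling long-horizon tasks.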

surprising long-horizon-optimization ai-capabilities

Tags
