21 Matching Annotations
  1. Last 7 days
    1. the robustness of these reasoning behaviors remains underexplored

      「推理行为的鲁棒性尚未被充分探索」——这句话是整个推理模型研究领域的集体盲点声明。过去两年,测试时计算(test-time compute)、长思维链(CoT)、o1/R1 类推理模型吸引了巨大关注,但几乎所有评测都在「孤立问题」环境下进行。在真实 Agent 部署场景中,「能否保持推理深度」这个最基本的可靠性问题,直到这篇论文才开始被系统研究。

    1. we seek a posteriori error estimators whose constants do not blow up as 𝜀→0.

      「ε→0 时常数不爆炸」这个需求揭示了传统方法的致命弱点:大多数能量估计方法在对流占主导(扩散系数 ε 趋于零)时,误差估计常数会以 ε⁻¹ 或更高阶发散,使估计器在实际问题中完全失效。本文的关键贡献正是构造了在整个对流-扩散谱(从抛物型到双曲型)上均匀有效的估计器——这在偏微分方程数值分析中是一个长期未解决的难题。

  2. Aug 2023
    1. In computing, the robustness principle is a design guideline for software that states: "be conservative in what you do, be liberal in what you accept from others". It is often reworded as: "be conservative in what you send, be liberal in what you accept". The principle is also known as Postel's law, after Jon Postel, who used the wording in an early specification of TCP.

      https://en.wikipedia.org/wiki/Robustness_principle

      Robustness principle: be conservative in what you do, be liberal in what you accept from others.

  3. May 2023
    1. This is insightful application of Postel's law en.wikipedia.org/wiki/Robustness_principle. It remains wrong to write software that assumes local parts of email addresses are case-insensitive, but yes, given that there is plenty of wrong software out there, it is also less than robust to require case sensitivity if you are the one accepting the mail.
    1. A flaw can become entrenched as a de facto standard. Any implementation of the protocol is required to replicate the aberrant behavior, or it is not interoperable. This is both a consequence of applying the robustness principle, and a product of a natural reluctance to avoid fatal error conditions. Ensuring interoperability in this environment is often referred to as aiming to be "bug for bug compatible".
  4. Feb 2022
    1. Deepti Gurdasani. (2022, January 29). Going to say this again because it’s important. Case-control studies to determine prevalence of long COVID are completely flawed science, but are often presented as being scientifically robust. This is not how we can define clinical syndromes or their prevalence! A thread. [Tweet]. @dgurdasani1. https://twitter.com/dgurdasani1/status/1487366920508694529

  5. Jan 2021
  6. Oct 2020
  7. Sep 2020
  8. Aug 2020
    1. Ray, E. L., Wattanachit, N., Niemi, J., Kanji, A. H., House, K., Cramer, E. Y., Bracher, J., Zheng, A., Yamana, T. K., Xiong, X., Woody, S., Wang, Y., Wang, L., Walraven, R. L., Tomar, V., Sherratt, K., Sheldon, D., Reiner, R. C., Prakash, B. A., … Consortium, C.-19 F. H. (2020). Ensemble Forecasts of Coronavirus Disease 2019 (COVID-19) in the U.S. MedRxiv, 2020.08.19.20177493. https://doi.org/10.1101/2020.08.19.20177493

  9. Jun 2020
  10. Feb 2019
    1. Are All Layers Created Equal?

      Google的这文2个 idea 很简单:一个是在 trained 网络各层的参数分别换回训练前的初始参数而观察相应各层的鲁棒性;另一个是把上一个 idea 基础上把那套初始参数再从某分布中随机取一次瞅效果。此 paper 的严谨的验证试验过程是最值得学习的~[并不简单]

  11. Nov 2018
    1. Is Robustness the Cost of Accuracy? -- A Comprehensive Study on the Robustness of 18 Deep Image Classification Models

      这文帅了~ 信息丰富 超多的图~ 让人眼前一亮~

      探讨了18个模型的鲁棒性和准确率。结论很多,如模型构架是影响鲁棒性和准确率的重要因素(似乎是废话);相似模型构架基础上增加“深度”对鲁棒性的提升很微弱;有些模型(Vgg类)的表现出很强的对抗样本迁移性。。。