8 Matching Annotations
  1. May 2026
    1. AI Village gives multiple AI agents their own computer environments and a shared group chat, then tasks them with open-ended real-world goals like fundraising, organizing events, making games, and gaining subscribers.

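      This case shows open-world evaluation in practice. The cost of roughly US$50,000 per year indicates that this kind of evaluation demands substantial resources. Compared with traditional benchmarks, it is closer to real application scenarios, but for that reason it is also more expensive and hard to run at scale.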

  2. Apr 2026
    1. The model reportedly scored 93.9% on SWE-bench Verified and 77.8% on SWE-bench Pro, but its strongest signal came from real-world results, including uncovering a 27-year-old flaw in OpenBSD, a 16-year-old vulnerability in FFmpeg, and autonomously chaining Linux kernel exploits without human input.

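      These striking vulnerability discoveries suggest that AI has surpassed traditional security tools and can autonomously find flaws that went undetected for decades. In particular, the ability to autonomously chain Linux kernel exploits demonstrates AI's revolutionary potential in cybersecurity, which could fundamentally change how security research and vulnerability remediation are done.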

  3. Feb 2022
    1. Deepti Gurdasani. (2022, January 10). Lots of people dismissing links between COVID-19 and all-cause diabetes. An association that’s been shown in multiple studies- whether this increase is due to more diabetes or SARS2 precipitating diabetic keto-acidosis allowing these to be diagnosed is not known. A brief look👇 [Tweet]. @dgurdasani1. https://twitter.com/dgurdasani1/status/1480546865812840450

  4. Jan 2022
    1. Olson, S. M., Newhams, M. M., Halasa, N. B., Price, A. M., Boom, J. A., Sahni, L. C., Pannaraj, P. S., Irby, K., Walker, T. C., Schwartz, S. P., Maddux, A. B., Mack, E. H., Bradford, T. T., Schuster, J. E., Nofziger, R. A., Cameron, M. A., Chiotos, K., Cullimore, M. L., Gertz, S. J., … Randolph, A. G. (2022). Effectiveness of BNT162b2 Vaccine against Critical Covid-19 in Adolescents. New England Journal of Medicine. https://doi.org/10.1056/NEJMoa2117995

  5. Dec 2021
  6. May 2021
  7. Jan 2018