30 Matching Annotations
  1. Jun 2025
  2. May 2025
    1. In short, the argument for a nonzero risk of a paperclip maximizer scenario rests on assumptions that may or may not be true, and it is reasonable to think that research can give us a better idea of whether these assumptions hold true for the kinds of AI systems that are being built or envisioned. For these reasons, we call it a ‘speculative’ risk, and examine the policy implications of this view in Part IV.

      This isn't a real objection

    2. including being able to examine the internals of the target system (how useful this advantage is depends on how the system is designed and how much we invest in interpretability techniques)

      Okay but interpretability is really hard

    3. But, in the normal technology view, deception is a mere engineering problem, albeit an important one, to be addressed during development and throughout deployment. Indeed, it is already a standard part of the safety evaluation of powerful AI models.74

      bruh

    4. Long before a system would be granted access to consequential decisions, it would need to demonstrate reliable performance in less critical contexts. Any system that interprets commands over-literally or lacks common sense would fail these earlier tests.

      I think this is a fair enough point, but the general assumption is that an AGI or ASI could persuade its way past these earlier tests

    5. In the view of AI as normal technology, catastrophic misalignment is (by far) the most speculative of the risks that we discuss. But what is a speculative risk—aren’t all risks speculative? The difference comes down to the two types of uncertainty, and the correspondingly different interpretations of probability.

      "we just decided to define this risk as unimportant out of our random assumption"

    6. Therefore, there is no straightforward reason to expect arms races between countries. Note that, since our concern in this section is accidents, not misuse, cyberattacks against foreign countries are out of scope.

      I think I just need to figure out whether or not AI progress will actually pan out

    7. Despite shrill U.S.-China arms race rhetoric, it is not clear that AI regulation has slowed down in either country.60 [60. Matt Sheehan. 2023. China’s AI regulations and how they get made. https://carnegieendowment.org/2023/07/10/china-s-ai-regulations-and-how-they-get-made-pub-90117.] In the U.S., 700 AI-related bills were introduced in state legislatures in 2024 alone, and dozens of them have passed.61 [61. Heather Curry. 2024. 2024 state summary on AI. BSA TechPost (October 2024). https://techpost.bsa.org/2024/10/22/2024-state-summary-on-ai/.]

      But governments aren't currently incentivized to react to the future possibility of AGI. Right now, they're just listening to public complaints about dumb stuff like AI art. The public does not generally view AI as an arms race yet, which is why governments haven't responded as if one were underway. However, if it did begin to seem more like an arms race, it is totally imaginable that governments would change direction.

    8. In short, AI arms races might happen, but they are sector specific, and should be addressed through sector-specific regulations.

      Bruh they just misinterpreted "arms race" and explained why their misinterpretation wasn't problematic

    9. One important caveat: We explicitly exclude military AI from our analysis, as it involves classified capabilities and unique dynamics that require a deeper analysis, which is beyond the scope of this essay.

      I'm pretty sure this is the whole reason why people are actually worried

    10. …scrutiny, and it remains to be seen how much its safety attitude will cost the company.53 [53. Jonathan Stempel. 2024. Tesla must face vehicle owners’ lawsuit over self-driving claims. Reuters (May 2024). https://www.reuters.com/legal/tesla-must-face-vehicle-owners-lawsuit-over-self-driving-claims-2024-05-15/.] We think that these correlations are causal. Cruise’s license being revoked was a big part of the reason that it fell behind Waymo, and safety was also a factor in Uber’s self-driving failure.54

      I feel like this paper might just be a series of bad analogies

    11. We often hear that all that is needed to build AGI is scaling, or generalist AI agents, or sample-efficient learning.

      On Twitter? Who is this a response to, exactly?

    12. Now that large language models can arguably pass it while only weakly meeting the expectations behind the test, its significance has waned.

      "When we find that an AI has passed a benchmark, we can either assume the benchmark wasn't any good, or we can concede that the AI is actually getting smarter." It sounds like they're just generalizing various forms of benchmarks.

    13. It is more likely that we will continue to see a gradual increase in the role of automation in AI development than a singular, discontinuous moment when recursive self-improvement is achieved.

      I think this just misunderstands a lot of the predictions

    14. …particularly Graphics Processing Units. Computational and cost limits continue to be relevant to new paradigms, including inference-time scaling. New slowdowns may emerge: Recent signs point to a shift away from the culture of open knowledge sharing in the industry.

      Argument: we might get bottlenecked on hardware and compute. I don't think so, but idk. This isn't really a probability estimate, it's just a vague phrase. I guess the paper isn't really trying to do much realistic forecasting, though

    15. …but also organizations and institutions, can adapt to technology. This is a trend that we have also seen for past general-purpose technologies: Diffusion occurs over decades, not years

      I guess I agree with generally slow diffusion for mainstream society, but I can imagine certain corporations skyrocketing in capability as they learn more and more about it.

    16. …safety-critical areas, AI adoption is slower than popular accounts would suggest. For example, a study made headlines due to the finding that, in August 2024, 40% of U.S. adults used generative AI.13 [13. Alexander Bick, Adam Blandin, and David J. Deming. 2024. The Rapid Adoption of Generative AI. National Bureau of Economic Research.] But, because most people used it infrequently, this only translated to 0.5%-3.5% of work hours.

      I wonder what they think about the possibility of an arms race
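
      Separately, the 40% → 0.5%-3.5% translation checks out as rough arithmetic. A back-of-envelope sketch, using my own illustrative usage numbers and assuming the truncated figure refers to the share of work hours (which is what the Bick et al. study measures):

      ```python
      # Sketch: how 40% adoption can still amount to only ~1% of work hours
      # when most adopters use generative AI infrequently. All intensity
      # numbers below are hypothetical, chosen only to illustrate the arithmetic.

      adoption_rate = 0.40                    # share of adults who used generative AI at all

      light_share, light_hours = 0.75, 0.25   # assume 75% of adopters: ~15 min/week
      heavy_share, heavy_hours = 0.25, 3.0    # assume 25% of adopters: ~3 h/week

      work_week_hours = 40.0

      avg_hours_per_adopter = light_share * light_hours + heavy_share * heavy_hours
      share_of_work_hours = adoption_rate * avg_hours_per_adopter / work_week_hours

      print(f"{share_of_work_hours:.1%} of work hours")  # ~0.9%, inside the 0.5%-3.5% range
      ```

      So the headline adoption number and the small work-hours share aren't in tension; the gap is just usage intensity.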

    17. These limits are often enforced through regulation, such as the FDA’s supervision of medical devices, as well as newer legislation such as the EU AI Act, which puts strict requirements on high-risk AI.10 [10. Jamie Bernardi et al. 2024. Societal adaptation to advanced AI. arXiv (May 2024). Retrieved from http://arxiv.org/abs/2405.10295; Center for Devices and Radiological Health. 2024. Regulatory evaluation of new artificial intelligence (AI) uses for improving and automating medical practices. FDA (June 2024). https://www.fda.gov/medical-devices/medical-device-regulatory-science-research-programs-conducted-osel/regulatory-evaluation-new-artificial-intelligence-ai-uses-improving-and-automating-medical-practices; “Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying down Harmonised Rules on Artificial Intelligence and Amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act) (Text with EEA Relevance),” June 2024, http://data.europa.eu/eli/reg/2024/1689/oj/eng.] In fact, there are (credible) concerns that existing regulation of high-risk AI is so onerous that it may lead to “runaway bureaucracy”.11 [11. Javier Espinoza. 2024. Europe’s rushed attempt to set the rules for AI. Financial Times (July 2024). https://www.ft.com/content/6cc7847a-2fc5-4df0-b113-a435d6426c81; Daniel E. Ho and Nicholas Bagley. 2024. Runaway bureaucracy could make common uses of AI worse, even mail delivery. The Hill (January 2024). https://thehill.com/opinion/technology/4405286-runaway-bureaucracy-could-make-common-uses-of-ai-worse-even-mail-delivery/.] Thus, we predict that slow diffusion will continue to be the norm in high-consequence tasks.

      To adequately measure this, we need to understand the rate at which AI limits are being imposed, and I don't know that an opinion piece from a year ago is going to fully do that.

    18. In the case of generative AI, even failures that seem extremely obvious in hindsight were not caught during testing. One example is the early Bing chatbot “Sydney” that went off the rails during extended conversations; the developers evidently did not anticipate that conversations could last for more than a handful of turns.8 [8. Kevin Roose. 2023. A Conversation With Bing’s Chatbot Left Me Deeply Unsettled. The New York Times (February 2023). https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html.] Similarly, the Gemini image generator was seemingly never tested on historical figures.9 [9. Dan Milmo and Alex Hern. 2024. ‘We definitely messed up’: why did Google AI tool make offensive historical images? The Guardian (March 2024). https://www.theguardian.com/technology/2024/mar/08/we-definitely-messed-up-why-did-google-ai-tool-make-offensive-historical-images.] Fortunately, these were not highly consequential applications.

      Is the whole argument just going to be "AI has made mistakes in the past, and it will continue to do so"?

    19. Interpretability and auditing methods will no doubt improve so that we will get much better at catching these issues, but we are not there yet.

      Not there yet? bruh. 2021.

    20. In other words, in this broad set of domains, AI diffusion lags decades behind innovation. A major reason is safety—when models are more complex and less intelligible, it is hard to anticipate all possible deployment conditions in the testing and validation process. A good example is Epic’s sepsis prediction tool which, despite having seemingly high accuracy when internally validated, performed far worse in hospitals, missing two thirds of sepsis cases and overwhelming physicians with false alerts.6

      Sure. But an AGI that can take on any new task and perform extremely well at it with little training data (the definition of AGI) could be integrated extremely quickly, even while current LLMs are weak. The article cited here is from 2021, so current AI may already be much easier to integrate effectively.
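
      Worth unpacking, though, why "seemingly high accuracy" and "missing two thirds of sepsis cases" can coexist: at low prevalence, even a classifier with decent-looking aggregate metrics produces mostly false alerts. A rough sketch with hypothetical numbers (not Epic's or the study's actual figures):

      ```python
      # Illustrative base-rate arithmetic for a rare-event alert system.
      # Prevalence, sensitivity, and specificity below are assumptions chosen
      # to mirror the reported pattern (roughly two thirds of cases missed,
      # many false alerts), not the actual evaluation numbers.

      patients = 10_000
      prevalence = 0.03       # assume 3% of admissions develop sepsis
      sensitivity = 0.33      # assume the tool catches about a third of true cases
      specificity = 0.88      # assume it clears 88% of non-sepsis patients

      sepsis_cases = patients * prevalence
      non_sepsis = patients - sepsis_cases

      caught = sepsis_cases * sensitivity
      missed = sepsis_cases - caught
      false_alerts = non_sepsis * (1 - specificity)

      precision = caught / (caught + false_alerts)

      print(f"missed sepsis cases: {missed:.0f} of {sepsis_cases:.0f}")   # ~201 of 300
      print(f"false alerts: {false_alerts:.0f}")                          # ~1164
      print(f"share of alerts that are real: {precision:.1%}")            # ~7.8%
      ```

      So "high accuracy" on a validation set says little about how usable the alerts are once the base rate and deployment population shift.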

    21. the impact of AI is materialized not when methods and capabilities improve, but when those improvements are translated into applications and are diffused through productive sectors of the economy.

      But surely the impact happens in some areas much faster than others

    22. This essay has the unusual goal of stating a worldview rather than defending a proposition. The literature on AI superintelligence is copious. We have not tried to give a point-by-point response to potential counter arguments, as that would make the paper several times longer. This paper is merely the initial articulation of our views; we plan to elaborate on them in various follow ups.

      So... you have no argument?

    23. It rejects technological determinism, especially the notion of AI itself as an agent in determining its future.

      Technological determinism is an interesting concept. Should we reject technological determinism? I would imagine that Yudkowsky would respond that an AI could persuade its creators to do anything; the AI box experiment is often cited as evidence of this. Unless the authors can give adequate reasons to reject technological determinism, I don't know that I can agree with this paper. In particular, I think the history of women's rights kinda shows technological determinism at work.
