34 Matching Annotations
  1. Mar 2024
  2. Jan 2024
    1. Hubinger, et al. "SLEEPER AGENTS: TRAINING DECEPTIVE LLMS THAT PERSIST THROUGH SAFETY TRAINING". arXiv: 2401.05566v3. Jan 17, 2024.

      Very disturbing and interesting results from a team of researchers from Anthropic and elsewhere.

  3. Oct 2023
    1. Wu, Prabhumoye, Yeon Min, Bisk, Salakhutdinov, Azaria, Mitchell and Li. "SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning". arXiv preprint arXiv:2305.15486v2, May, 2023.

    1. Zecevic, Willig, Singh Dhami and Kersting. "Causal Parrots: Large Language Models May Talk Causality But Are Not Causal". In Transactions on Machine Learning Research, Aug, 2023.

    1. "The Age of AI has begun: Artificial intelligence is as revolutionary as mobile phones and the Internet." Bill Gates, March 21, 2023. GatesNotes.

    1. Feng, 2022. "Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis"

      Shared and found via Gowthami Somepalli (@gowthami@sigmoid.social) on Mastodon: StructureDiffusion: Improve the compositional generation capabilities of text-to-image #diffusion models by modifying the text guidance using a constituency tree or a scene graph.

    1. Training language models to follow instructions with human feedback

      Original paper for discussion of the Reinforcement Learning from Human Feedback (RLHF) algorithm.

    1. LaMDA: Language Models for Dialog Applications

      "LaMDA: Language Models for Dialog Applications." Google's introduction of the LaMDA v1 large language model.

  4. Jul 2023
    1. Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, and Quoc V. Le. "Towards a Human-like Open-Domain Chatbot." Google Research, Brain Team.

      Defined the SSA (Sensibleness and Specificity Average) metric for chatbots, the basis for the SSI metric used in Google's LaMDA paper.

  5. Apr 2023
    1. The Annotated S4: "Efficiently Modeling Long Sequences with Structured State Spaces," Albert Gu, Karan Goel, and Christopher Ré.

      A new approach to sequence modeling, offered as an alternative to transformers.

    1. "Efficiently Modeling Long Sequences with Structured State Spaces." Albert Gu, Karan Goel, and Christopher Ré. Department of Computer Science, Stanford University.

    1. Bowman, Samuel R. "Eight Things to Know about Large Language Models." arXiv, (2023). https://arxiv.org/abs/2304.00612v1.


      The widespread public deployment of large language models (LLMs) in recent months has prompted a wave of new attention and engagement from advocates, policymakers, and scholars from many fields. This attention is a timely response to the many urgent questions that this technology raises, but it can sometimes miss important considerations. This paper surveys the evidence for eight potentially surprising such points: 1. LLMs predictably get more capable with increasing investment, even without targeted innovation. 2. Many important LLM behaviors emerge unpredictably as a byproduct of increasing investment. 3. LLMs often appear to learn and use representations of the outside world. 4. There are no reliable techniques for steering the behavior of LLMs. 5. Experts are not yet able to interpret the inner workings of LLMs. 6. Human performance on a task isn't an upper bound on LLM performance. 7. LLMs need not express the values of their creators nor the values encoded in web text. 8. Brief interactions with LLMs are often misleading.

      Found via: Taiwan's Gold Card draws startup founders, tech workers | Semafor

    1. It was only by building an additional AI-powered safety mechanism that OpenAI would be able to rein in that harm, producing a chatbot suitable for everyday use.

      This isn't true. The Stochastic Parrots paper outlines other avenues for reining in the harms of language models such as the GPT models.

  6. Mar 2023
    1. Ganguli, Deep, Askell, Amanda, Schiefer, Nicholas, Liao, Thomas I., Lukošiūtė, Kamilė, Chen, Anna, Goldie, Anna et al. "The Capacity for Moral Self-Correction in Large Language Models." arXiv, (2023). https://arxiv.org/abs/2302.07459v2.


      We test the hypothesis that language models trained with reinforcement learning from human feedback (RLHF) have the capability to "morally self-correct" -- to avoid producing harmful outputs -- if instructed to do so. We find strong evidence in support of this hypothesis across three different experiments, each of which reveal different facets of moral self-correction. We find that the capability for moral self-correction emerges at 22B model parameters, and typically improves with increasing model size and RLHF training. We believe that at this level of scale, language models obtain two capabilities that they can use for moral self-correction: (1) they can follow instructions and (2) they can learn complex normative concepts of harm like stereotyping, bias, and discrimination. As such, they can follow instructions to avoid certain kinds of morally harmful outputs. We believe our results are cause for cautious optimism regarding the ability to train language models to abide by ethical principles.

    1. Every Egyptology student learns in the first semester that the Egyptian word p.t (pronounced: pet) means "heaven". The collection of attestations in the dictionary's archive comprises about 6,000 slips. In ordering this material, one learns that the Egyptian heaven has gates and paths, waters and shores, sides, supports, and chapels. This makes it tangible that the Egyptians, on hearing the word "heaven", thought of something completely different from what modern Westerners do, namely a mythical space in which gods and spirits of the dead dwell. The lexicographic evaluation of such comprehensive material is therefore about far more than determining the basic meaning of a banal word. Here a part of the Egyptian worldview unfolds in its richness and its strangeness; and naturally it is precisely the frequent words that denote the key concepts of pharaonic culture. The widespread misconception that what is frequent is uninteresting thus turns things exactly on their head.

      This is a fantastic example of context creation for a dead language as well as for creating proper historical context.

    2. In looking at the uses of and similarities between Wb and TLL, I can't help but think that these two zettelkasten represented the state of the art for Large Language Models and some of the ideas behind ChatGPT.

    1. Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. FAccT ’21. New York, NY, USA: Association for Computing Machinery, 2021. https://doi.org/10.1145/3442188.3445922.

      Would the argument here for stochastic parrots also apply to, or could it be abstracted to, Markov monkeys?

  7. Feb 2023
  8. Jan 2023
  9. Dec 2022
    1. natural-language processing is going to force engineers and humanists together. They are going to need each other despite everything. Computer scientists will require basic, systematic education in general humanism: The philosophy of language, sociology, history, and ethics are not amusing questions of theoretical speculation anymore. They will be essential in determining the ethical and creative use of chatbots, to take only an obvious example.
    1. Houston, we have a Capability Overhang problem: Because language models have a large capability surface, these cases of emergent capabilities are an indicator that we have a ‘capabilities overhang’ – today’s models are far more capable than we think, and our techniques available for exploring the models are very juvenile. We only know about these cases of emergence because people built benchmark datasets and tested models on them. What about all the capabilities we don’t know about because we haven’t thought to test for them? There are rich questions here about the science of evaluating the capabilities (and safety issues) of contemporary models. 
  10. Apr 2021
  11. Aug 2020
    1. It might be instructive to think about what it would take to create a program which has a model of eighth grade science sufficient to understand and answer questions about hundreds of different things like “growth is driven by cell division”, and “What can magnets be used for” that wasn’t NLP led. It would be a nightmare of many different (probably handcrafted) models. Speaking somewhat loosely, language allows for intellectual capacities to be greatly compressed. From this point of view, it shouldn’t be surprising that some of the first signs of really broad capacity- common sense reasoning, wide ranging problem solving etc., have been found in language based programs- words and their relationships are just a vastly more efficient way of representing knowledge than the alternatives.

      DePonySum asks us to consider what you would need to program to be able to answer a wide range of eighth-grade science questions (e.g. "What can magnets be used for?"). The answer is that you would need a whole slew of separate, probably handcrafted, models.

      Language, they say, is a way to compress intellectual capacities.

      It is then no surprise that common sense reasoning and wide-ranging problem solving were first found in language models. Words and their relationships are probably a very efficient way of representing knowledge.