- Feb 2023
-
arxiv.org
-
- Generate instructions via an LLM
- using GPT-3
- with good experimental data (rough sketch of the idea below)
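- What I take "generate instructions via an LLM" to mean: feed a few seed instructions to the model and ask it to propose new ones. A minimal sketch; the prompt wording, model name and seed tasks are my own placeholders, not from the paper, and it assumes the pre-1.0 `openai` Python client.

```python
# Bootstrap new task instructions from a small seed pool by prompting the model.
# Prompt wording, model name and seeds are illustrative placeholders.
import openai

openai.api_key = "YOUR_KEY"

seed_instructions = [
    "Write a short summary of the following article.",
    "Translate the following sentence into French.",
    "List three uses for a magnet.",
]

def propose_new_instructions(seeds, n=5):
    prompt = (
        "Here are some task instructions:\n"
        + "\n".join(f"- {s}" for s in seeds)
        + f"\nCome up with {n} new, different task instructions:\n-"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",   # stand-in for whichever GPT-3 model was used
        prompt=prompt,
        max_tokens=200,
        temperature=0.9,            # diversity matters more than precision here
    )
    text = resp["choices"][0]["text"]
    return [line.lstrip("- ").strip() for line in text.splitlines() if line.strip()]

print(propose_new_instructions(seed_instructions))
```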
-
- Sep 2022
-
www.zylstra.org
- Aug 2022
-
escapingflatland.substack.com
-
https://web.archive.org/web/20220810205211/https://escapingflatland.substack.com/p/gpt-3
Blogged a few first associations at https://www.zylstra.org/blog/2022/08/communicating-with-gpt-3/. Prompt design for narrative research may be a useful experience here. 'Interviewing' GPT-3 as a Luhmann-style conversation with a system? Can we ditch our notes for GPT-3? GPT-3 as an interface to the internet. Fascinating essay, need to explore.
-
- Oct 2020
-
www.youtube.com
-
A few weeks back (I forget if I'm even getting the acronym right or wrong) somebody brought up this GPT-3 or some sort of AI library, and there was an interesting article that started floating around that was entirely written by this algorithm. I don't know if others have read it, but it's kind of interesting, because it was an AI trying to convince you, in its own writing, that it will never be capable of, say, taking over the world or something like that. I'll try and find it and share it if anyone's interested.
-
- Aug 2020
-
slatestarcodex.com
-
A machine learning researcher writes me in response to yesterday’s post, saying: I still think GPT-2 is a brute-force statistical pattern matcher which blends up the internet and gives you back a slightly unappetizing slurry of it when asked.
What a machine learning researcher wrote to Scott Alexander.
-
But this should be a wake-up call to people who think AGI is impossible, or totally unrelated to current work, or couldn’t happen by accident. In the context of performing their expected tasks, AIs already pick up other abilities that nobody expected them to learn. Sometimes they will pick up abilities they seemingly shouldn’t have been able to learn, like English-to-French translation without any French texts in their training corpus. Sometimes they will use those abilities unexpectedly in the course of doing other things. All that stuff you hear about “AIs can only do one thing” or “AIs only learn what you program them to learn” or “Nobody has any idea what an AGI would even look like” are now obsolete.
Scott Alexander claims that the results shown by GPT-2 render statements like "AI can only do one thing", "AI only learns what you teach it", and "no one knows what AGI would look like" obsolete.
-
Wittgenstein writes: “The limits of my language mean the limits of my world”. Maybe he was trying to make a restrictive statement, one about how we can’t know the world beyond our language. But the reverse is also true; language and the world have the same boundaries. Learn language really well, and you understand reality. God is One, and His Name is One, and God is One with His Name. “Become good at predicting language” sounds like the same sort of innocent task as “become good at Go” or “become good at Starcraft”. But learning about language involves learning about reality, and prediction is the golden key. “Become good at predicting language” turns out to be a blank check, a license to learn every pattern it can.
Because language is an isomorphic mapping of the world, learning to predict language means learning to predict the patterns that occur in the world.
-
Imagine you prompted the model with “What is one plus one?” I actually don’t know how it would do on this problem. I’m guessing it would answer “two”, just because the question probably appeared a bunch of times in its training data. Now imagine you prompted it with “What is four thousand and eight plus two thousand and six?” or some other long problem that probably didn’t occur exactly in its training data. I predict it would fail, because this model can’t count past five without making mistakes. But I imagine a very similar program, given a thousand times more training data and computational resources, would succeed. It would notice a pattern in sentences including the word “plus” or otherwise describing sums of numbers, it would figure out that pattern, and it would end up able to do simple math. I don’t think this is too much of a stretch given that GPT-2 learned to count to five and acronymize words and so on.
This is also borne out in my own tests. It does well on easy calculations, the likes of which the model must have seen or could easily have learnt; on more exotic ones, not so much.
What is interesting is that what predicts whether GPT-3 can do a calculation is not the calculation's difficulty but the likelihood that it occurred in its training data. A quick probe of that hunch is sketched below.
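A quick (not rigorous) way to probe this: compare accuracy on sums the model has almost certainly seen against sums it almost certainly has not. The prompts and model name are placeholders, and this assumes the pre-1.0 `openai` Python client.

```python
# Compare accuracy on "common" sums (likely in the training data) vs. "exotic" ones.
import openai

openai.api_key = "YOUR_KEY"

def ask(question):
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Q: {question}\nA:",
        max_tokens=16,
        temperature=0,  # deterministic; we only care whether the answer is right
    )
    return resp["choices"][0]["text"].strip()

# Crude check: the expected digits just have to appear in the completion.
common = [("What is 1 plus 1?", "2"), ("What is 2 plus 2?", "4")]
exotic = [("What is 4008 plus 2006?", "6014"), ("What is 731 plus 4689?", "5420")]

for label, cases in [("common", common), ("exotic", exotic)]:
    correct = sum(expected in ask(q) for q, expected in cases)
    print(f"{label}: {correct}/{len(cases)} correct")
```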
-
Again, GPT-2 isn’t good at summarizing. It’s just surprising it can do it at all; it was never designed to learn this skill. All it was designed to do was predict what words came after other words. But there were some naturally-occurring examples of summaries in the training set, so in order to predict what words would come after the words tl;dr, it had to learn what a summary was and how to write one.
Whatever occurs naturally in GPT-2/3's dataset it will learn how to do, whether that is summarization, translation into French, etc. The tl;dr trick is sketched below.
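The "tl;dr" trick the quote describes is literally just appending "tl;dr:" to the text and letting next-word prediction do the rest. A minimal sketch; the model name is a placeholder and it assumes the pre-1.0 `openai` Python client.

```python
# Summarize by continuing the text after a "tl;dr:" marker, as in the quote.
import openai

openai.api_key = "YOUR_KEY"

def tldr(article):
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=article.strip() + "\n\ntl;dr:",
        max_tokens=60,
        temperature=0.3,
    )
    return resp["choices"][0]["text"].strip()

print(tldr(
    "GPT-2 was trained only to predict the next word, yet because the web "
    "contains many 'tl;dr' summaries, that same objective forces it to learn "
    "roughly what a summary looks like."
))
```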
-
A very careless plagiarist takes someone else’s work and copies it verbatim: “The mitochondria is the powerhouse of the cell”. A more careful plagiarist takes the work and changes a few words around: “The mitochondria is the energy dynamo of the cell”. A plagiarist who is more careful still changes the entire sentence structure: “In cells, mitochondria are the energy dynamos”. The most careful plagiarists change everything except the underlying concept, which they grasp at so deep a level that they can put it in whatever words they want – at which point it is no longer called plagiarism.
When you plagiarize a piece of text and change everything about it except the underlying concept, it is no longer plagiarism.
-
-
deponysum.com
-
It might be instructive to think about what it would take to create a program which has a model of eighth grade science sufficient to understand and answer questions about hundreds of different things like “growth is driven by cell division”, and “What can magnets be used for” that wasn’t NLP led. It would be a nightmare of many different (probably handcrafted) models. Speaking somewhat loosely, language allows for intellectual capacities to be greatly compressed. From this point of view, it shouldn’t be surprising that some of the first signs of really broad capacity- common sense reasoning, wide ranging problem solving etc., have been found in language based programs- words and their relationships are just a vastly more efficient way of representing knowledge than the alternatives.
DePonySum asks us to consider what you would need to program in order to answer a wide range of eighth-grade science questions (e.g. "What can magnets be used for?") without leading with NLP. The answer is that you would need a whole slew of separately built and optimized models.
Language, they say, is a way to compress intellectual capacities.
It is then no surprise that common-sense reasoning and wide-ranging problem solving first show up in language models: words and their relationships are probably a very efficient way of representing knowledge. A toy version of the contrast is sketched below.
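A toy version of the contrast: the hand-crafted route needs a separate entry per fact and fails outside its table, while the language-model route is a single prompt template. The rules and model name here are illustrative, and the sketch assumes the pre-1.0 `openai` Python client.

```python
import openai

openai.api_key = "YOUR_KEY"

# Route 1: hand-crafted knowledge -- one entry per fact, no coverage beyond the table.
HANDCRAFTED = {
    "what drives growth": "Growth is driven by cell division.",
    "what can magnets be used for": "Picking up iron objects, compasses, electric motors.",
}

def answer_handcrafted(question):
    return HANDCRAFTED.get(question.lower().rstrip("?"), "No rule for this question.")

# Route 2: language model -- one prompt template covers whatever the model picked up.
def answer_lm(question):
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Answer this eighth-grade science question.\nQ: {question}\nA:",
        max_tokens=60,
        temperature=0,
    )
    return resp["choices"][0]["text"].strip()

print(answer_handcrafted("Why do objects fall?"))  # falls outside the table
print(answer_lm("Why do objects fall?"))
```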
-
-
julian.digital
-
With a strong enough NLP engine behind the command line interface, the possibilities become endless: Add that New York Times article to your Pocket queue or send it directly to your Kindle to read it later. Re-assign Jira tickets directly from Superhuman or send them to your to-do list. Pay invoices or send money to a friend.
Julian Lehr offers an interesting idea. If you can process emails directly, without needing to open them, and if you can do so with a text-based user interface powered by an NLP engine, you've got something very powerful on your hands.
This is especially interesting because, with the advent of GPT-3, it is actually getting closer to reality. A rough sketch of the parse-and-dispatch idea follows.
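The sketch: the free-text command goes to the model, which returns a structured intent the client then routes. The action names, JSON schema and dispatcher are hypothetical stand-ins (not from Julian Lehr's post), and it assumes the pre-1.0 `openai` Python client.

```python
# Turn a free-text command into a structured intent, then dispatch it.
# Action names, schema and dispatcher are hypothetical stand-ins.
import json
import openai

openai.api_key = "YOUR_KEY"

ACTIONS = {"save_to_pocket", "send_to_kindle", "reassign_ticket", "pay_invoice"}

def parse_command(command):
    prompt = (
        "Convert the command into JSON with keys 'action' and 'target'. "
        f"Allowed actions: {sorted(ACTIONS)}.\n"
        f"Command: {command}\nJSON:"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=60,
        temperature=0,
    )
    return json.loads(resp["choices"][0]["text"].strip())

def dispatch(intent):
    # A real client would call the Pocket/Jira/banking APIs here.
    if intent.get("action") not in ACTIONS:
        raise ValueError(f"unrecognised action: {intent}")
    print(f"would run {intent['action']} on {intent['target']!r}")

dispatch(parse_command("Add that New York Times article to my Pocket queue"))
```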
-