- Nov 2024
-
interconnected.org
-
That development time acceleration of 4 days down to 20 minutes… that’s equivalent to about 10 years of Moore’s Law cycles. That is, using generative AI like this is equivalent to computers getting 10 years better overnight. That was a real eye-opening framing for me. AI isn’t magical, it’s not sentient, it’s not the end of the world nor our saviour; we don’t need to endlessly debate “intelligence” or “reasoning.” It’s just that… computers got 10 years better.
For [[Matt Webb]], the project using GPT-3 to extract data from web pages saved him 4 days of work, compared to the 20 minutes it took to write the GPT-3 instructions (ignoring that GPT-3 then ran overnight). He says that's about 10 years of Moore's law happening to him all at once. 'Computers got 10 years better' is an enticing thought and framing. It probably depends on the use case; others will lose 10 years of their time making sense of generated nonsense. (Cf. the #pke24 experiments I did with text generation: none of it was usable, because enough was wrong that I couldn't trust anything.) Sticking to specific niches it's probably true: [[Waar AI al redelijk goed in is 20201226155259]], which turns the issue into the time needed to spot those niches for yourself.
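A back-of-the-envelope check of that framing, with my own assumptions rather than Webb's: 8-hour working days and an 18-month Moore's-law doubling period.

```python
import math

# 4 working days of manual work vs. 20 minutes writing GPT-3 instructions.
# Assumptions (mine, not Webb's): 8-hour working days, 18-month doubling period.
manual_minutes = 4 * 8 * 60   # 1920 minutes
llm_minutes = 20

speedup = manual_minutes / llm_minutes   # ~96x
doublings = math.log2(speedup)           # ~6.6 doublings
years = doublings * 18 / 12              # ~9.9 years of Moore's law

print(f"{speedup:.0f}x speedup ~ {doublings:.1f} doublings ~ {years:.1f} years")
```

With those assumptions the arithmetic indeed lands at roughly 10 years.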
-
I was one of the first people to use gen-AI for data extraction instead of chatbots
[[Matt Webb]] used GPT-3 in February 2023 to extract data from a bunch of web pages. He suggests it was the kernel of the programmatic-AI idea among Silicon Valley hackers. Cf. Google's [[Ed Parsons]] at [[Open Geodag 20241107100937^aiunstructdata]] last week, where he mentioned using AI to turn unstructured (geo)data into structured data. Page found via [[Frank Meeuwsen]] https://frankmeeuwsen.com/2024/11/11/vertragen-en-verdiepen.html
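A minimal sketch of that extraction pattern, assuming the current OpenAI Python client rather than the GPT-3 completions endpoint Webb used; the model name and field names are placeholders of mine, not his.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_fields(page_text: str) -> dict:
    """Ask the model to pull structured fields out of raw page text."""
    prompt = (
        "Extract the following fields from the web page text below and "
        "reply with JSON only: title, author, date, summary.\n\n"
        f"{page_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in; Webb used GPT-3
        messages=[{"role": "user", "content": prompt}],
    )
    # Will raise if the model wraps the JSON in prose; acceptable for a sketch.
    return json.loads(response.choices[0].message.content)
```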
-
- Feb 2023
-
arxiv.org
-
- Generate instructions via LLM
- on GPT-3
- with good experiment data
-
- Sep 2022
-
www.zylstra.org
- Aug 2022
-
escapingflatland.substack.com
-
https://web.archive.org/web/20220810205211/https://escapingflatland.substack.com/p/gpt-3
Blogged a few first associations at https://www.zylstra.org/blog/2022/08/communicating-with-gpt-3/ . Prompt design for narrative research may be a useful experience here. 'Interviewing' GPT-3: a Luhmann-style conversation with a system? Can we ditch our notes for GPT-3? GPT-3 as an interface to the internet. Fascinating essay, need to explore.
-
- Oct 2020
-
www.youtube.com
-
I forget, a few weeks back (I forget if I'm even getting the acronym right or wrong), but somebody brought up this GPT-3 or some sort of AI library, and there was an interesting article that started floating around that was entirely written by this, uh, algorithm. I don't know if others have read it, but it's kind of interesting, because it was an AI trying to convince you, in its own writing, that it will never be capable of, say, taking over the world or something like that. I'll try and find it and share it if anyone's interested.
-
- Aug 2020
-
slatestarcodex.com
-
A machine learning researcher writes me in response to yesterday’s post, saying: I still think GPT-2 is a brute-force statistical pattern matcher which blends up the internet and gives you back a slightly unappetizing slurry of it when asked.
What a machine learning researcher wrote to Scott Alexander.
-
But this should be a wake-up call to people who think AGI is impossible, or totally unrelated to current work, or couldn’t happen by accident. In the context of performing their expected tasks, AIs already pick up other abilities that nobody expected them to learn. Sometimes they will pick up abilities they seemingly shouldn’t have been able to learn, like English-to-French translation without any French texts in their training corpus. Sometimes they will use those abilities unexpectedly in the course of doing other things. All that stuff you hear about “AIs can only do one thing” or “AIs only learn what you program them to learn” or “Nobody has any idea what an AGI would even look like” are now obsolete.
Scott Alexander claims that the results shown by GPT-2 render statements like "AIs can only do one thing", "AIs only learn what you program them to learn", and "nobody has any idea what an AGI would even look like" obsolete.
-
Wittgenstein writes: “The limits of my language mean the limits of my world”. Maybe he was trying to make a restrictive statement, one about how we can’t know the world beyond our language. But the reverse is also true; language and the world have the same boundaries. Learn language really well, and you understand reality. God is One, and His Name is One, and God is One with His Name. “Become good at predicting language” sounds like the same sort of innocent task as “become good at Go” or “become good at Starcraft”. But learning about language involves learning about reality, and prediction is the golden key. “Become good at predicting language” turns out to be a blank check, a license to learn every pattern it can.
Because language is an isomorphic mapping of the world, learning to predict language means learning to predict patterns that occur in the world.
-
Imagine you prompted the model with “What is one plus one?” I actually don’t know how it would do on this problem. I’m guessing it would answer “two”, just because the question probably appeared a bunch of times in its training data. Now imagine you prompted it with “What is four thousand and eight plus two thousand and six?” or some other long problem that probably didn’t occur exactly in its training data. I predict it would fail, because this model can’t count past five without making mistakes. But I imagine a very similar program, given a thousand times more training data and computational resources, would succeed. It would notice a pattern in sentences including the word “plus” or otherwise describing sums of numbers, it would figure out that pattern, and it would end up able to do simple math. I don’t think this is too much of a stretch given that GPT-2 learned to count to five and acronymize words and so on.
This is also borne out in my own tests. Easy calculations, the likes of which the model must have seen or could easily learn, it does well on; more exotic ones, not so much.
What is interesting is that what predicts whether GPT-3 can do a calculation is not the difficulty of the calculation, but the likelihood that it occurred in its training data.
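A sketch of that probe with the current OpenAI client. Today's models will likely answer both correctly; the point is the probing pattern, and the model name is a stand-in.

```python
from openai import OpenAI

client = OpenAI()

# A sum the model has surely seen versus one it almost certainly has not:
# if frequency in training data is what matters, the first should be far
# more reliable than the second.
probes = {
    "common": "What is one plus one?",
    "rare": "What is four thousand and eight plus two thousand and six?",
}

for label, question in probes.items():
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in model
        messages=[{"role": "user", "content": question}],
    )
    print(label, "->", reply.choices[0].message.content)
```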
-
Again, GPT-2 isn’t good at summarizing. It’s just surprising it can do it at all; it was never designed to learn this skill. All it was designed to do was predict what words came after other words. But there were some naturally-occurring examples of summaries in the training set, so in order to predict what words would come after the words tl;dr, it had to learn what a summary was and how to write one.
Whatever occurs naturally in GPT-2/3's training data, it will learn how to do, whether that is summarization, translation to French, etc.
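The tl;dr trick itself is easy to reproduce with a completion-style model; a sketch, where the model name is a stand-in for any completion-capable model.

```python
from openai import OpenAI

client = OpenAI()

article = "..."  # any long passage of text

# GPT-2 picked up summarization because "tl;dr" in its training data was
# reliably followed by a summary; prompting the same way triggers the skill.
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # stand-in completion model
    prompt=article + "\n\ntl;dr:",
    max_tokens=60,
)
print(response.choices[0].text.strip())
```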
-
A very careless plagiarist takes someone else’s work and copies it verbatim: “The mitochondria is the powerhouse of the cell”. A more careful plagiarist takes the work and changes a few words around: “The mitochondria is the energy dynamo of the cell”. A plagiarist who is more careful still changes the entire sentence structure: “In cells, mitochondria are the energy dynamos”. The most careful plagiarists change everything except the underlying concept, which they grasp at so deep a level that they can put it in whatever words they want – at which point it is no longer called plagiarism.
When you plagiarize a piece of text and you change everything about it except the underlying concept — it is no longer plagiarism.
-
-
deponysum.com
-
It might be instructive to think about what it would take to create a program which has a model of eighth grade science sufficient to understand and answer questions about hundreds of different things like “growth is driven by cell division”, and “What can magnets be used for” that wasn’t NLP led. It would be a nightmare of many different (probably handcrafted) models. Speaking somewhat loosely, language allows for intellectual capacities to be greatly compressed. From this point of view, it shouldn’t be surprising that some of the first signs of really broad capacity- common sense reasoning, wide ranging problem solving etc., have been found in language based programs- words and their relationships are just a vastly more efficient way of representing knowledge than the alternatives.
DePonySum asks us to consider what it would take to program something able to answer a wide range of eighth-grade science questions (e.g. "What can magnets be used for?") without leading with NLP. The answer: you would need a whole slew of different, probably handcrafted, models.
Language, they say, is a way to greatly compress intellectual capacities.
It is then no surprise that common-sense reasoning and wide-ranging problem solving first show up in language-based programs: words and their relationships are a vastly more efficient way of representing knowledge than the alternatives.
-
-
julian.digital
-
With a strong enough NLP engine behind the command line interface, the possibilities become endless:
- Add that New York Times article to your Pocket queue or send it directly to your Kindle to read it later
- Re-assign Jira tickets directly from Superhuman or send them to your to-do list
- Pay invoices or send money to a friend
Julian Lehr offers an interesting idea. If you can process emails directly, without needing to open them, and if you can do so through a text-based user interface powered by an NLP engine, you've got something very powerful on your hands.
This is especially interesting because, with the advent of GPT-3, this is actually getting closer to reality.
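A minimal sketch of such a command line: the NLP engine turns a free-form command into a structured intent that a dispatcher could route. The action names, fields, and model are hypothetical, not Lehr's.

```python
import json
from openai import OpenAI

client = OpenAI()

def parse_command(command: str) -> dict:
    """Turn a free-form command into a structured intent (hypothetical schema)."""
    prompt = (
        "Turn the command below into JSON with keys 'action' (one of "
        "save_to_pocket, send_to_kindle, reassign_ticket, pay_invoice) "
        "and 'arguments' (an object). Reply with JSON only.\n\n"
        f"Command: {command}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in model
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)

print(parse_command("Send that NYT article to my Kindle"))
```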
-