Traditionally, a major challenge for building language models was figuring out the most useful way of representing different words—especially because the meanings of many words depend heavily on context. The next-word prediction approach allows researchers to sidestep this thorny theoretical puzzle by turning it into an empirical problem. It turns out that if we provide enough data and computing power, language models end up learning a lot about how human language works simply by figuring out how to best predict the next word. The downside is that we wind up with systems whose inner workings we don’t fully understand.

Tim Lee was on staff at Ars from 2017 to 2021. He recently launched a new newsletter, Understanding AI. It explores how AI works and how it's changing our world. You can subscribe to his newsletter here.

Sean Trott is an Assistant Professor at University of California, San Diego, where he conducts research on language understanding in humans and large language models. He writes about these topics, and others, in his newsletter The Counterfactual.
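To make the idea of next-word prediction concrete, here is a minimal sketch, not part of the original article, that asks a small language model which words it thinks are most likely to come next after a short prompt. It assumes the Hugging Face transformers library and uses the publicly available GPT-2 model purely as an illustrative stand-in for a large language model; the prompt and the choice to show the top five candidates are arbitrary.

```python
# Illustrative sketch of next-word prediction (assumes torch and transformers
# are installed; GPT-2 stands in for a large language model).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The cat sat on the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # logits has shape (batch=1, sequence_length, vocabulary_size)
    logits = model(**inputs).logits

# The scores at the final position rank every token in the vocabulary as a
# candidate for the next word; softmax turns them into probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>8}  p = {prob.item():.3f}")
```

Running this prints plausible continuations such as "floor" or "bed" along with the probability the model assigns to each, which is the empirical, data-driven behavior the paragraph above describes.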
Final annotation: One question that came to mind while reading this article: will developers end up giving AI a distinct personality and voice for speaking with ChatGPT users? And what else do AI developers plan to program AI to do?