- Aug 2023
-
Local file
-
Some may not realize it yet, but the shift in technology represented by ChatGPT is just another small evolution in a long chain of predictive text work within the realms of information theory and corpus linguistics.
Claude Shannon's work, along with Warren Weaver's introduction to The Mathematical Theory of Communication (1949), shows some of the predictive structure of written communication. This is perhaps better underlined for the non-mathematician in John R. Pierce's book An Introduction to Information Theory: Symbols, Signals and Noise (1961), which discusses how a basic analysis of written English reveals that "e" is the most frequent letter and lets one predict which letters are more likely to follow others. These mathematical structures have interesting consequences: crossword puzzles are only possible because of the redundancy of the English language, and a writer can drop the editor's notation "TK" (usually meaning facts or data "to come") into a draft to flag missing information prior to publication, because the letter combination T followed by K is statistically so rare in English that its appearances in long documents are almost assuredly spots that need to be double-checked for data or accuracy.
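As a rough illustration of the kind of analysis Pierce describes, here is a minimal sketch in Python. The sample sentence is a tiny stand-in for a real corpus, so the exact numbers don't matter; on any sizable body of English text, "e" tops the single-letter counts and the pair "tk" barely registers.

```python
import re
from collections import Counter

# Small inline sample; Pierce's point holds much more strongly on a large English corpus.
text = ("The mathematical theory of communication treats messages as sequences "
        "of symbols; the semantic aspects are irrelevant to the engineering problem.")

words = re.findall(r"[a-z]+", text.lower())

# Single-letter frequencies: on any sizable sample of English, "e" comes out on top.
letter_counts = Counter("".join(words))
print(letter_counts.most_common(5))

# Letter-pair (bigram) frequencies within words: common pairs like "th" abound,
# while "tk" is essentially absent, which is why editors' "TK" is so easy to find.
bigram_counts = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
print(bigram_counts["th"], bigram_counts["tk"])
```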
Cell phone manufacturers took advantage of the lower levels of this mathematical predictability to create T9 predictive text in early mobile phones. Similar functionality is still used in current phones to help speed up our texting. The difference between then and now is that almost everyone takes the predictive magic for granted.
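To make the T9 idea concrete, here is a minimal sketch: the keypad mapping is the standard one, but the word list and frequency counts are invented for illustration rather than taken from any real T9 dictionary. A digit sequence maps back to candidate words, ranked by how common they are.

```python
from collections import Counter

# Standard phone keypad mapping from digits to letters.
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {letter: digit for digit, letters in KEYS.items() for letter in letters}

def to_digits(word):
    """Convert a word to the digit sequence you would press on a T9 keypad."""
    return "".join(LETTER_TO_DIGIT[c] for c in word.lower())

# Hypothetical corpus word counts; a real T9 engine ships a frequency dictionary.
word_counts = Counter({"good": 900, "home": 700, "gone": 500, "hoof": 10})

def t9_candidates(digits):
    """Return dictionary words matching the keypresses, most frequent first."""
    matches = [w for w in word_counts if to_digits(w) == digits]
    return sorted(matches, key=word_counts.get, reverse=True)

print(t9_candidates("4663"))  # ['good', 'home', 'gone', 'hoof'] -- all share the keys 4-6-6-3
</code>
```

The same keypresses often match several words, which is exactly where frequency-based prediction, and the occasional hilariously wrong guess, comes from.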
As anyone with "fat fingers" can attest, your phone doesn't always type out exactly what you mean, which can result in autocorrect mistakes (see: DYAC (Damn You AutoCorrect)) of varying levels of frustration or hilarity. This means that when texting, one needs to carefully double-check their work before sending a text or social media post, or risk sending a message to Grand Master Flash instead of Grandma.
The evolution in technology driven by larger amounts of storage, faster processing speeds, and more text to study means that we've gone beyond predicting just a word or two ahead of what you intend to type; now we're predicting whole sentences and even paragraphs which make sense within a context. ChatGPT means that one can generate whole sections of text which will likely make some sense.
Sadly, as we know from our T9 experience, this massive jump in predictability doesn't mean that ChatGPT or other predictive artificial intelligence tools are "magically" correct! In fact, quite often they're wrong or will predict nonsense, a phenomenon known as AI hallucination. Just as with T9, we need to take even more time and effort, not only to spell-check the outputs from the machine, but also to check them for appropriateness of style as well as factual substance!
The bigger near-term problem is one of human understanding and human communication. While the machine may appear to magically communicate (often on our behalf if we're publishing its words under our names), is it relaying actual meaning? Does the person reading these words understand what was meant to be communicated? Do the words create knowledge? Insight?
We need to recall that Claude Shannon specifically carved semantics and meaning out of the picture in the second paragraph of his seminal paper:
Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.
So far ChatGPT seems to be accomplishing magic by solving a small part of an engineering problem: exploring the adjacent possible. It is far from solving the human semantic problem, much less the un-adjacent possibilities (potentially representing wisdom or insight), and we need to take care to be aware of that unsolved portion of the problem. Generative AIs are also just choosing weighted probabilities and spitting out something likely to seem plausible, but they're not optimizing for which of many possible continuations is the "best" or the "correct" one. For that, we still need our humanity and our faculties for decision making.
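A minimal sketch of that "choosing weighted probabilities" point, with next-word probabilities invented purely for illustration (not taken from any real model), shows how sampling yields plausible rather than verified text:

```python
import random

# Hypothetical next-word probabilities after some prompt; the words and weights
# are made up for illustration and do not come from any actual language model.
next_word_probs = {"Canberra": 0.55, "Sydney": 0.30, "Melbourne": 0.10, "Vienna": 0.05}

# A generative model samples from this weighted distribution rather than checking
# facts, so a plausible-but-wrong continuation comes out some fraction of the time.
words, weights = zip(*next_word_probs.items())
print(random.choices(words, weights=weights, k=5))
```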
Shannon, Claude E. "A Mathematical Theory of Communication." Bell System Technical Journal, 1948.
Shannon, Claude E., and Warren Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1949.
Pierce, John Robinson. An Introduction to Information Theory: Symbols, Signals and Noise. 2nd, revised ed. Dover Books on Mathematics. 1961. Reprint, Mineola, N.Y.: Dover Publications, Inc., 1980.
Shannon, Claude Elwood. “The Bandwagon.” IEEE Transactions on Information Theory 2, no. 1 (March 1956): 3. https://doi.org/10.1109/TIT.1956.1056774.
We may also need to explore "The Bandwagon," an early effect which Shannon noticed and commented upon. Everyone seems to be piling onto the AI bandwagon right now...
-
- Oct 2022
-
stevenberlinjohnson.com
-
I would put creativity into three buckets. If we define creativity as coming up with something novel or new for a purpose, then I think what AI systems are quite good at the moment is interpolation and extrapolation.
Demis Hassabis, the co-founder of DeepMind, classifies creativity in three ways: interpolation, extrapolation, and "true invention". He defines the first two traditionally, but gives a vaguer description of the third. What exactly is "true invention"?
How can one invent without any catalyst at all? How can one invent outside of a problem's solution space, outside of the adjacent possible? Does this truly exist? Or is it precluded by definition?
-
- Apr 2022
-
-
three steps required to solve the all-important correspondence problem. Step one, according to Shenkar: specify one’s own problem and identify an analogous problem that has been solved successfully. Step two: rigorously analyze why the solution is successful. Jobs and his engineers at Apple’s headquarters in Cupertino, California, immediately got to work deconstructing the marvels they’d seen at the Xerox facility. Soon they were on to the third and most challenging step: identify how one’s own circumstances differ, then figure out how to adapt the original solution to the new setting.
Oded Shenkar's three-step process for effective problem solving using imitation:
- Step 1. Specify your problem and identify an analogous problem that has been successfully solved.
- Step 2. Analyze why that solution was successful.
- Step 3. Identify how your problem and circumstances differ from the example problem and figure out how to best and most appropriately adapt the original solution to the new context.
The last step may be the most difficult.
The IndieWeb broadly uses the idea of imitation to work on and solve a variety of different web design problems. By focusing on imitation they dramatically decrease the work and effort involved in building a website. Creating new, innovative solutions even in their space has been much harder, but there too they imitate others by breaking problems down into their smallest constituent parts and getting those pieces working first.
Link this to the idea of "leading by example".
Link to "reinventing the wheel" -- the difficulty of innovation can be more clearly seen in the process of people reinventing the wheel for themselves when they might have simply imitated a more refined idea. Searching the state space of potential solutions can be an arduous task.
Link to "paving cow paths", which is a part of formalizing or crystallizing pre-tested solutions.
-