7 Matching Annotations
  1. Oct 2020
    1. To have, but maybe not to read. Like Stephen Hawking’s “A Brief History of Time,” “Capital in the Twenty-First Century” seems to have been an “event” book that many buyers didn’t stick with; an analysis of Kindle highlights suggested that the typical reader got through only around 26 of its 700 pages. Still, Piketty was undaunted.

      Interesting use of digital highlights--determining how "read" a particular book is.

  2. Nov 2019
    1. From this perspective, GPT-2 says less about artificial intelligence and more about how human intelligence is constantly looking for, and accepting of, stereotypical narrative genres, and how our mind always wants to make sense of any text it encounters, no matter how odd. Reflecting on that process can be the source of helpful self-awareness—about our past and present views and inclinations—and also, some significant enjoyment as our minds spin stories well beyond the thrown-together words on a page or screen.

      And it's not just happening with text; it also happens with speech, as I've written before in "Complexity isn’t a Vice: 10 Word Answers and Doubletalk in Election 2016." In fact, in that case, looking at transcripts actually helps to reveal that the emperor had no clothes, because so much is missing from the speech that the text doesn't have enough space to fill in the gaps the way the live speech did.

    2. The most interesting examples have been the weird ones (cf. HI7), where the language model has been trained on narrower, more colorful sets of texts, and then sparked with creative prompts. Archaeologist Shawn Graham, who is working on a book I’d like to preorder right now, An Enchantment of Digital Archaeology: Raising the Dead with Agent Based Models, Archaeogaming, and Artificial Intelligence, fed GPT-2 the works of the English Egyptologist Flinders Petrie (1853-1942) and then resurrected him at the command line for a conversation about his work. Robin Sloan had similar good fun this summer with a focus on fantasy quests, and helpfully documented how he did it.

      Circle back around and read this when it comes out.

      Similarly, these other references should be an interesting read as well.

    3. For those not familiar with GPT-2, it is, according to its creators OpenAI (a socially conscious artificial intelligence lab overseen by a nonprofit entity), “a large-scale unsupervised language model which generates coherent paragraphs of text.” Think of it as a computer that has consumed so much text that it’s very good at figuring out which words are likely to follow other words, and when strung together, these words create fairly coherent sentences and paragraphs that are plausible continuations of any initial (or “seed”) text.

      This isn't a very difficult problem, and its underpinnings are well laid out by John R. Pierce in An Introduction to Information Theory: Symbols, Signals and Noise. In it he has a lot of interesting tidbits about language and structure from an engineering perspective, including the reason why crossword puzzles work.
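      As a rough illustration of the "which words are likely to follow other words" idea, here is a minimal sketch of a word-pair (bigram) model in Python. It is far simpler than GPT-2, which is a large neural network, and is not taken from OpenAI or from Pierce's book; the function names and the toy corpus are made up for this example.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def generate(followers, seed, length=10, rng=None):
    """Extend a seed word, sampling each next word by observed frequency."""
    rng = rng or random.Random()
    output = [seed]
    for _ in range(length):
        options = followers.get(output[-1])
        if not options:
            break  # the last word was never seen with a follower
        choices, weights = zip(*options.items())
        output.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(output)


# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(generate(model, "the", length=8, rng=random.Random(0)))
```

      Trained on a much larger corpus, and with longer word histories, the same counting approach produces text that looks increasingly like plausible English, which is the intuition behind the "seed text" continuation described in the quote.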

      close reading, distant reading, corpus linguistics

  3. Sep 2019
    1. He is now intending to collaborate with Bourne on a series of articles about the find. “Having these annotations might allow us to identify further books that have been annotated by Milton,” he said. “This is evidence of how digital technology and the opening up of libraries [could] transform our knowledge of this period.”
  4. Apr 2019
    1. Digital sociology needs more big theory as well as testable theory.

      I can't help but think here about the application of digital technology to large bodies of literature in the creation of the field of corpus linguistics.

      If traditional sociology means anything, then a digital incarnation of it should create physical and trackable means that can potentially be more easily studied as a result. In the same way that Mark Dredze has been able to look at Twitter data to analyze public health trends like influenza, we should be able to more easily quantify sociological phenomena in aggregate by looking at larger and richer data sets of online interactions.

      There's also likely some value in studying the quantities of digital exhaust that companies like Google, Amazon, Facebook, etc. are using for surveillance capitalism.