17 Matching Annotations
  1. Last 7 days
    1. For those not familiar with GPT-2, it is, according to its creators OpenAI (a socially conscious artificial intelligence lab overseen by a nonprofit entity), “a large-scale unsupervised language model which generates coherent paragraphs of text.” Think of it as a computer that has consumed so much text that it’s very good at figuring out which words are likely to follow other words, and when strung together, these words create fairly coherent sentences and paragraphs that are plausible continuations of any initial (or “seed”) text.

      This isn't a very difficult problem, and its underpinnings are well laid out by John R. Pierce in An Introduction to Information Theory: Symbols, Signals and Noise. In it, he offers many interesting tidbits about language and structure from an engineering perspective, including the reason why crossword puzzles work.

      close reading, distant reading, corpus linguistics

  2. Jun 2018
    1. The first part introduces what Marjorie Perloff calls “differential reading,” which positions close and distant reading practices as both subjective and objective methodologies.

      Is New Historicism close or distant reading? The latter, right? But it is nonetheless deeply human, perhaps more so than the "close reading" criticized as privileging text over lived reality.

  3. Oct 2017
    1. While one could manually “count” references across a novel or oeuvre, or attempt to estimate relative occurrence, a text analysis tool like Voyant can more easily provide textual evidence necessary to support an essay’s claim, or, if the evidence proves the writer “wrong,” help the writer re-evaluate her argument accordingly.

      Is it just a tool of efficiency, or a way of noticing unrecognized patterns through a different means of analysis? Both, IMO.
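
The "counting" described above can be sketched in a few lines. This is only a toy illustration of what a tool like Voyant automates (the corpus and motif words here are invented), not Voyant's actual implementation:

```python
from collections import Counter
import re

def term_counts(text, terms):
    """Count occurrences of each search term in a text (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {t: counts[t.lower()] for t in terms}

# A stand-in "novel" split into chapters (invented for illustration).
corpus = {
    "ch1": "The sea was calm. The sea called to him, and the whale waited.",
    "ch2": "A whale, a whale! The men saw the whale and forgot the sea.",
}

# Tally per-chapter occurrences of a few motif words -- the kind of
# textual evidence one might cite in support of an essay's claim.
evidence = {name: term_counts(text, ["sea", "whale"]) for name, text in corpus.items()}
print(evidence)
```

Even this crude tally shows how the relative weight of a motif can shift across a text, which is exactly the kind of pattern a manual count might miss or misjudge.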

  4. Jun 2017
    1. Don’t we have to actually read the books, before saying what the patterns discovered in them mean?

      Yes, of course. But it's ironic that this three-post tirade begins with a rather distant reading of the MLA program.

    2. But does the data point inescapably in that direction?

      In the above performance of close reading, is the evidence more "inescapable"? Isn't it always in the fullness of the argumentation, no matter where the data comes from?

  5. Aug 2016
    1. Page 8

      Jockers talking about the old approach in the 1990s to anecdotal evidence:

      … in the 1990s, gathering literary evidence meant reading books, noting "things" (a phallic symbol here, a bibliographical reference there, a stylistic flourish, an allusion, and so on) and then interpreting: making sense and arguments out of those observations. Today, in the age of digital libraries and large-scale book-digitization projects, the nature of the "evidence" available to us has changed, radically. Which is not to say that we should no longer read books looking for, or noting, random "things," but rather to emphasize that massive digital corpora offer us unprecedented access to the literary record and invite, even demand, a new type of evidence gathering and meaning making. The literary scholar of the twenty-first century can no longer be content with anecdotal evidence, with random "things" gathered from a few, even "representative," texts. We must strive to understand the things we find interesting in the context of everything else, including a mass of possibly "uninteresting" texts.

    2. Pages 7 and 8

      Jockers is talking here about Ian Watt’s method in Rise of the Novel

      What are we to do with the other three to five thousand works of fiction published in the eighteenth century? What of the works that Watt did not observe and account for with his methodology, and how are we to now account for works not penned by Defoe, by Richardson, or by Fielding? Might other novelists tell a different story? Can we, in good conscience, even believe that Defoe, Richardson, and Fielding are representative writers? Watt’s sampling was not random; it was quite the opposite. But perhaps we only need to believe that these three (male) authors are representative of the trend towards "realism" that flourished in the nineteenth century. Accepting this premise makes Watt’s magnificent synthesis into no more than a self-fulfilling project, a project in which the books are stacked in advance. No matter what we think of the sample, we must question whether in fact realism really did flourish. Even before that, we really ought to define what it means "to flourish" in the first place. Flourishing certainly seems to be the sort of thing that could, and ought to, be measured. Watt had no yardstick against which to make such a measurement. He had only a few hundred texts that he had read. Today things are different. The larger literary record can no longer be ignored: it is here, and much of it is now accessible.

    3. Jockers, Matthew L. 2013. Macroanalysis: Digital Methods and Literary History. Topics in the Digital Humanities. Urbana, IL: University of Illinois Press.

  6. Jul 2016
    1. Page 16

      One benefit of traditional hermeneutical practices such as close reading is that the trained reader need not install anything, run any software, wrestle with settings, or wait for results. The experienced reader can just enjoy iteratively reading, thinking, and rereading. Similarly the reader of another person's interpretation, if the book being interpreted is at hand, can just pick it up, follow the references, and recapitulate the reading. To be as effective as close reading, analytical methods have to be significantly easier to apply and understand. They have to be like reading, or, better yet, a part of reading. Those invested in the use of digital analytics need to think differently about what is shown and what is hidden: the rhetorical presentation of analytics matters. Further, literary readers of interpretive works want to learn about the interpretation. Much of the literature in journals devoted to humanities computing suffers from being mostly about the computing; it is hard to find scholarship that is addressed to literary scholars and is based in computing practices.

    2. Page 6

      Retrieval methods designed for small databases decline rapidly in effectiveness as collections grow...

      This is an interesting point that is missed in the distant reading controversies: it's all very well to say that you prefer close reading, but close reading doesn't scale--or rather, the methodologies used to decide what to close read were developed when big data didn't exist. How do you combine that with being able to read everything? I.e., you close read Dickens because he's what survived the 19th C as being worth reading. But now, if we could recover everything from the 19th C, how do you justify methodologically not looking more widely?

  7. www.informatik.uni-leipzig.de
    1. On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges

      Jänicke, S., G. Franzini, M. F. Cheema, and G. Scheuermann. n.d. “On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges.”

  8. Dec 2015
    1. “distant reading”: understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data.

      Nothing against this, but it's not the game I'm in.

      Question is, though, can the same tool be used to do both distant reading and close reading?

  9. Nov 2013
    1. In a Literary Lab project on 18th-century novels, English students study a database of nearly 2,000 early books to tease out when “romances,” “tales” and “histories” first emerged as novels, and what the different terms signified.

      This may be a reference to the Eighteenth Century Collections Online-Text Creation Partnership (ECCO-TCP) project, which transcribed and marked up in XML ~2,200 eighteenth-century books from the Eighteenth Century Collections Online database (ECCO). The ECCO-TCP corpus is in the public domain and available for anyone to use: http://www.textcreationpartnership.org/tcp-ecco/