Hypothesis

3 Matching Annotations

Jun 2019
www.themacroscope.org

Basic Text Mining: Word Clouds, their Limitations, and Moving Beyond

1. AlexBala 05 Jun 2019
  
  in Public
  
  Yet with such a visualization the main downside becomes clear: we lose context. Who are the protagonists? Who are the villains? As adjectives are separated from other concepts, we lose the ability to derive meaning. For example, a politician speaks of “taxes” frequently: but from a word cloud, it is difficult to learn whether they are positive or negative references. With these shortcomings in mind, however, historians can find utility in word clouds.
  
  I had this idea in mind the entire time I read this section on word clouds. I think this is a really important consideration, and one that clearly shows the limitations of word clouds.
  
  #DH8900
2. AlexBala 03 Jun 2019
  
  in Public
  
  It also represents the inversion of the traditional historical process: rather than looking at documents that we think may be important to our project and pre-existing thesis, we are looking at documents more generally to see what they might be about. With Big Data, it is sometimes important to let the sources speak to you, rather than looking at them with pre-conceptions of what you might find.
  
  I found this statement striking because it tells historians how to make sense of the data they are working with.
  
  #DH8900
3. AlexBala 03 Jun 2019
  
  in Public
  
  In brief, they are generated through the following process. First, a computer program takes a text and counts how frequent each word is. In many cases, it will normalize the text to some degree, or at least give the user options: if “racing” appears 80 times and “Racing” appears 5 times, you may want it to register as a total of 85 times to that term.
  
  I really like the way this is all broken down. I found this page very clear and user-friendly for someone with limited knowledge of digital tools.
  
  #DH8900
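
The counting step described in the passage quoted in annotation 3 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the method used by any particular word cloud tool: the function name `word_frequencies`, the `normalize_case` option, and the simple letters-and-apostrophes tokenizer are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def word_frequencies(text, normalize_case=True):
    """Count how often each word appears in `text`.

    `normalize_case` is a hypothetical option for this sketch: when True,
    "Racing" and "racing" are counted as the same term.
    """
    if normalize_case:
        text = text.lower()
    # Assumed tokenizer: runs of letters and apostrophes count as words.
    words = re.findall(r"[A-Za-z']+", text)
    return Counter(words)


# Reproduce the example from the quoted passage:
# "racing" appears 80 times and "Racing" appears 5 times.
sample = "racing " * 80 + "Racing " * 5

merged = word_frequencies(sample)
separate = word_frequencies(sample, normalize_case=False)

print(merged["racing"])                          # 85
print(separate["racing"], separate["Racing"])    # 80 5
```

With normalization on, the two spellings register as a single term with a total of 85; with it off, they stay separate, which is the choice the quoted passage says a tool may leave to the user.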

Tags

#DH8900

Annotators

AlexBala

URL

themacroscope.org/
