30 Matching Annotations
  1. Jul 2019
    1. we are looking at documents more generally to see what they might be about. With Big Data, it is sometimes important to let the sources speak to you, rather than looking at them with pre-conceptions of what you might find.

      This represents a reversal of our traditional methodology as historians. However, these tools can easily lead us down an endless rabbit hole. Where do we draw the line?

    1. Yet we do need to realize that these tools shape our research: they can occasionally occlude context, or mislead u

      A very important point. Unlike the methodology of "close reading," using these tools to textually analyze large-volume corpora can easily lead us to misinterpret the data to suit our question.

    1. The first tool that we sometimes use is called ‘Outwit Hub,’ available at http://www.outwit.com/.

      I found this tool very useful, as the scraper looks directly at the HTML code to scrape what you need. Because I was unfamiliar with HTML, it took some time to learn how best to use it, but with a little trial and error Outwit Hub is very accessible for the layperson. Unfortunately, this scraper is no longer free, and the subscription cost is quite high for how often you may use it.

      Data Miner, which is available for free in the Google Chrome store, may not be as powerful as Outwit, yet it performs basic scraping functions quite well. The tool is also easy to use, and with a little patience anyone can learn to use it to automatically grab multiple web pages from a single site.

    2. Using APIs is not overly difficult (the Programming Historian explains how to use the Zotero reference and research environment API)

      Zotero does make using APIs easier. #DH8900

    3. The Programming Historian has a number of lessons explaining how to do this in Python. Here, we describe a way of using three different pieces of software to achieve the same thing. While we would certainly recommend that learning how to do this using Python should be on your to-do list, it certainly is easier, working with students or colleagues, to point to useful browser based tools that can do this.

      There is a lot of value in using Python to obtain your data. However, the learning curve and a lack of comfort with Python were a deterrent to me and to other grad students considering this method. #DH8900
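For anyone weighing the Python route against a browser-based scraper, here is a minimal sketch of what "looking directly at the HTML" can mean in code. It uses only Python's standard library and is not taken from the Programming Historian lessons; the `LinkCollector` class, the `extract_links` helper, and the sample HTML are all hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in a string of HTML."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


# Hypothetical page fragment standing in for a real archive index
sample = (
    '<p><a href="/letters/1.html">Letter 1</a> and '
    '<a href="/letters/2.html">Letter 2</a></p>'
)
print(extract_links(sample))
```

The sketch only pulls link targets out of HTML that has already been saved; downloading the pages themselves is a separate step that a tool such as Data Miner or wget would handle.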

    1. For historians, however, computational history became associated with demographic, population, and economic histories. For a time in the 1970s, it looked like history might move wholesale into quantitative histories, with the widespread application of math and statistics to the understanding of the past

      Why is this a bad thing? Sometimes I feel that we historians stray too far from statistical evidence!

    1. As historians, we often find ourselves writing historiographies

      Digital tools could provide the element of "distant reading" to provide clarification on specific language and methodologies used over a historiography's temporal framework.

    2. [11]

      The work by Higher and Deeper exemplifies the attributes of DH. I suggest watching the video listed in the notes.

    3. OCR

      For historians who rely upon close reading methodologies, advancements in OCR technology would be extremely beneficial in creating large corpora of digitized documents for textual analysis.

    4. work

      This communal approach to historical scholarship is most encouraging.

    5. The work helps confirm the hypothesis of others, which sees a “civilizing process” as a:

      Confirmation!

    1. Students and instructors should also keep a keen eye on digitalhumanitiesnow.org

      This is a great resource. While the discussions on programming are out of my purview, there are many examples of how to use these digital tools to enhance historical research.

    2. Stack Overflow: http://stackoverflow.com/

      This is another good source of information. However, an intimate knowledge of programming languages is required to understand the posts.

    1. open-source program wget,

      I found that wget was difficult to load and run on a Windows platform. While it takes up more space on your computer's hard drive, I find copying the website page into Zotero more accessible for those who aren't comfortable using command prompts. #DH8900

    1. As history becomes digitized in ever-increasing scales, historians without the ability to research both micro- and macroscopically may be in danger of becoming mired in evidence or lost in the noise.

      The tireless and unrecognized work of those who transcribe these sources for digitization should be heralded. Compared to European and Australian efforts to maximize digital content, the current lack of U.S. funding for digitization hampers our digital excavations.

    2. A historian’s macroscope offers a complementary, but very different, path to knowledge. It allows you to begin with the complex and winnow it down until a narrative emerges from the cacophony of evidence.

      This is an interesting way to approach the act of "doing history." While combing through hundreds of letters from the Superintendent of the Freedmen's Bureau in Memphis, TN, I would have liked to employ this digital methodology. #DH8900

  2. Jun 2019
    1. The difference between explicit and implicit networks is not a hard binary; it can be difficult or impossible to determine (or not a reasonable question) whether the network you are analyzing is one that has its roots in some physical, objective system of connections. It is still important to keep these in mind, as metrics that might imply historical agency in some cases (e.g. a community connected by letters they wrote) do not imply agency in others (e.g. a community connected by appearing in the same list of references).

      Is this a continuation of Benedict Anderson's work on imagined communities into digital form?

    2. Taking the most recent example, this means you can draw edges from books to authors, but not between authors or between books.

      What about topic modeling within a network?
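The book-to-author constraint in the highlighted passage can be sketched in a few lines of plain Python. The data and the function names `bipartite_edges` and `project_authors` are hypothetical. The second function shows one common way to get author-to-author ties out of such a network: counting shared books rather than drawing those edges directly. A document-to-topic network could presumably be handled the same way, though that is my assumption and not something the text states.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: each book maps to the authors who wrote it
wrote = {
    "Book A": ["Smith", "Jones"],
    "Book B": ["Jones", "Lee"],
    "Book C": ["Smith"],
}


def bipartite_edges(book_authors):
    """Edges run only from a book to one of its authors."""
    return [
        (book, author)
        for book, authors in book_authors.items()
        for author in authors
    ]


def project_authors(book_authors):
    """Count the books each pair of authors has in common."""
    shared = defaultdict(int)
    for authors in book_authors.values():
        # sorted() keeps each pair in one consistent order
        for a, b in combinations(sorted(set(authors)), 2):
            shared[(a, b)] += 1
    return dict(shared)


print(bipartite_edges(wrote))
print(project_authors(wrote))
```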

    3. The directionality of an edge can have huge repercussions to a network’s structure, and so algorithms made for undirected networks to find local communities or node importance might produce very unlikely results on a directed network. Be careful that you only use algorithms made specifically for directed networks when analyzing them.

      Can this be overcome digitally? It seems like "close reading" is a necessary methodology in these instances.
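A small sketch of why direction matters, using made-up petition data and a hypothetical `degree_views` helper. It counts each node's ties three ways: incoming, outgoing, and with direction ignored. It does not settle whether close reading is still needed; it only shows how much the picture changes when direction is dropped.

```python
from collections import Counter


def degree_views(edges):
    """Return (in-degree, out-degree, undirected degree) counters."""
    in_deg, out_deg, undirected = Counter(), Counter(), Counter()
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
        # Ignoring direction, both endpoints simply gain one tie
        undirected[source] += 1
        undirected[target] += 1
    return in_deg, out_deg, undirected


# Hypothetical data: (sender, recipient) of petitions and one reply
petitions = [
    ("Ann", "Official"),
    ("Ben", "Official"),
    ("Cat", "Official"),
    ("Official", "Ann"),
]

in_deg, out_deg, undirected = degree_views(petitions)
print("received:", dict(in_deg))
print("sent:", dict(out_deg))
print("direction ignored:", dict(undirected))
```

In this toy case the official looks like the most connected node when direction is ignored, yet sends only one message, which is the kind of distinction an undirected measure hides.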

    4. directed networks.

      Power dynamics!

    5. d triadic closure, can

      Funny, this is the basis of LinkedIn, which should prove to be years of fun for future historians!
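Triadic closure can be sketched as a search for "open" triads: two people who share a contact but are not yet connected themselves. The sketch below assumes an undirected network; the names, the `ties` data, and the functions `build_neighbors` and `open_triads` are hypothetical. It is not a description of how LinkedIn actually generates suggestions.

```python
from collections import defaultdict
from itertools import combinations


def build_neighbors(edges):
    """Map each node to the set of nodes it is tied to (undirected)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def open_triads(edges):
    """Find unlinked pairs that share at least one contact.

    Returns a dict mapping each such pair to the shared contacts.
    """
    neighbors = build_neighbors(edges)
    candidates = defaultdict(set)
    for middle, contacts in neighbors.items():
        for a, b in combinations(sorted(contacts), 2):
            if b not in neighbors[a]:
                candidates[(a, b)].add(middle)
    return dict(candidates)


# Hypothetical acquaintance ties
ties = [("Ann", "Ben"), ("Ben", "Cat"), ("Cat", "Dan"), ("Ann", "Cat")]
print(open_triads(ties))
```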

    6. The smallest unit of meaningful analysis in a network is a dyad, or a pair of nodes connected by an edge between them (figure 6.4). Without any dyads, there would be no network, only a series of disconnected isolates. Dyads are, unsurprisingly, generally discussed in terms of the nature of the relationship between two nodes. The dyad of two medieval manuscripts may be strong or weak depending on the similarity of their content, for example. Study of dyads in large networks most often revolve around one of two concepts: reciprocity, whether a connection in one direction is reciprocated in the other direction, or assortativity, whether similar nodes tend to have edges between them.

      I'm glad to see my past scientific background is useful in digital history. :-)!
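Reciprocity, as described in the highlighted passage, can be sketched as the share of directed edges whose reverse edge also exists. This is one simple definition among several in use; the letter data and the `reciprocity` function are hypothetical, and assortativity is not covered here.

```python
def reciprocity(edges):
    """Share of directed edges whose reverse edge is also present."""
    # Use a set so duplicate edges count once; skip self-loops
    edge_set = {(a, b) for a, b in edges if a != b}
    if not edge_set:
        return 0.0
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)


# Hypothetical correspondence: (sender, recipient)
letters = [
    ("Ann", "Ben"),
    ("Ben", "Ann"),
    ("Ann", "Cat"),
    ("Dan", "Ann"),
]
print(reciprocity(letters))
```

Here two of the four letters (Ann to Ben and Ben to Ann) answer one another, so half of the edges are reciprocated.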

    7. edges (figure 6.2),

      Why "edges"? This is a confusing term to describe linear relationships.

    1. Ben Fry, Visualizing Data: Exploring and Explaining Data with the Processing Environment (Sebastopol, CA: O’Reilly Media, 2007).

      This should be an interesting read.

    2. There is a tendency when using graphs to become smitten with one’s own data.

      Of course, it becomes an evidentiary defense.

    3. The entities being connected can be articles, people, social groups, political parties, archaeological artefacts, stories, and cities; citations, friendships, people, affiliations, locations, keywords, and ship’s routes can connect them. The results of a network study can be used as an illustration, a research aid, evidence, a narrative, a classification scheme, and a tool for navigation or understanding.

      It is easy to see the boundless value of these applications, yet I also realize their potential to become an endless maze.

    4. Trade networks

      Agree, but the maps they produce are hard to follow.

    5. The authors connected nearly a hundred 15th century Florentine elite families via nine types of relations, including family ties, economic partnerships, patronage relationships, and friendships.

      I may be getting ahead of this chapter, but I am very interested in their methodology and how much digital information was available to them. Otherwise, this project must have been very arduous before a network analysis could be run.

    6. As much as networks reveal communities, they also obscure more complex connections that exist outside of the immediate data being analyzed.

      I'm wondering if this level of network analysis could be useful to U.S. labor history to demonstrate the emergence and eventual collapse of a working-class consciousness. Also, how would we choose which labor organizations to use for our data set? Would including several different industries muddle our results?

    7. We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.

      This concept has been a major theme throughout this text. Our findings are only as credible as our data. In other words, network analysis can be easily manipulated to show relationships that are not statistically significant.