289 Matching Annotations
  1. Aug 2019
    1. After digitizing and processing eleven years worth of registrations lists, I found that of the around 2000 unique names (out of a total of around 5000), the person with the longest history of participation was the librarian Jane Williamson (none of the names in red below are academics).

      The image below this sentence does not appear.

    2. This makes for a messy looking dataviz and that is precisely my point.

      Again, the image above this text does not appear.

    3. Click to enlarge

      This visualization is difficult to read, as much of the text runs over itself.

    4. Polatnick, also a participant in women’s liberation, wrote specifically to recover the women who had been left out of histories because they “fought” for women’s liberation “in other contexts.”

      Part of women's liberation and women's movement exclusively vs. part of black liberation movement, etc. Apparent desire for women's liberation to be an "exclusive" history among earlier historians.

    5. Using the BYU Corpus Interface for Google Books I scraped the metadata for any references to “Redstockings Manifesto” or “A Historical and Critical Essay for Black Women”  to create the visualization below.

      I am unable to see said visualization, despite reloading the page multiple times. When I attempted to "go to image location" I received a "404 page not found" error.

    6. The Redstockings Manifesto, a widely circulated articulation of the principles that would become known as radical feminism,  suggested to me, following other work I’ve done on feminist manifestos, to look at the piece attributed to Robinson,  “A Historical and Critical Essay for Black Women” as a  manifesto would, I argue, function as one.   If in 1970 the two pieces are of equal influence, at least as measured by inclusion in these anthologies, what happens over time, which after all is what historians are interested in?

      I've seen the Redstockings Manifesto referenced before, but I've never heard of "A Historical and Critical Essay for Black Women," so my purely anecdotal assumption would be that the Redstockings Manifesto has been more influential in the long term.

    7. Of the 435 names, the common network reduces to this.

      Image does not appear.

    1. The modern perception of private correspondence was one that simply did not exist in the early modern period. Instead, epistolary conventions implicated multiple parties in the composition, transmission, and reception of letters.

      Letters not private as we would define it, but what we would call at least semi-public.

    1. Carolee Schneemann’s correspondence, once explored for its content rather than for connections between authors, pointed to female circles around “proto-feminist” art movements in the late 1950s and early 1960s, as well the US feminist artist movement of the 1970s and 1980s.

      Digital analysis revealed connections not obvious from reading the letters or even from examining to whom the letters were written.

    2. Analysis of the periodicals indicated that references to manifestos made up a relatively small portion of their overall content and that appearances of the word “manifesto” decreased over time, especially after 1979.[46] Subsequent analysis was therefore restricted to manifestos included in Deepwell’s anthology that were composed before 1980.

      Documents did not contain what was expected, limiting availability of usable sources.

    1. That is, we recognize the institutionalization of public data as it becomes known, read, and put in context, but in comparison, how private data is secured, (almost assuredly) sold to data brokers and possibly eventually leaked is obscured.

      I'm really not sure what the author is trying to say here. Are they implying that it isn't widely known that private data is collected and sold by Google, Facebook, Twitter, YouTube, etc.? "Google is watching" has been a common internet joke since the 2010s. I very much question how "obscured" this is.

    1. View an instance of Ink loaded with Voltaire's correspondence. It combines mapped letters with a timeline stacked bar chart and a relationship viewer.

      Unfortunately, this link does not seem to work anymore. The project on Algarotti himself sounds interesting, though. I am always interested in projects that emphasize the interconnectedness of Eastern Europe with the rest of the world, whether other parts of Europe or elsewhere.

  2. Jul 2019
    1. Extensive excavations carried out by the National Museum of Ireland between 1962 and 1981 revealed a wealth of evidence for the post-917 settlement. The single most important result of these excavations was the information they provided about town layout in the tenth and eleventh centuries. A series of fenced plots or tenements was unearthed and could be traced over a dozen successive building levels. Apart from this, there was also important evidence for house building, and for a succession of waterfronts from between the tenth and thirteenth centuries.
    2. The Vikings settled in Dublin from 841 AD onwards. During their reign Dublin became the most important town in Ireland as well as a hub for the western Viking expansion and trade. It is in fact one of the best known Viking settlements.
    1. Therefore, we must shift the focus from debates over their appropriateness or utility of them to discussion of how our research practices require rethinking in light of them.

      Effects on methods, historical questions and possibilities.

    2. One line centres the human relationships and experiences that are mediated by these environments: individuals who are exposed to a wider audience when analogue archives move online, the people who created digitised materials and the infrastructure that facilitates access, and finally, persons who are obfuscated, misrepresented, or entirely absent in digital archival environments.

      Digital mediator.

    3. However, this very abundance may occlude the issue of absences.

      In a world inundated with information, it is key to understand that some things are still forgotten.

    4. Along with these benefits, however, digitisation has wrought concomitant problems. In a consumer-driven academic environment, funding for digitisation may be tied not only to concerns about access and preservation, but also to the need to increase visibility to ensure viability
    5. , I offer three questions researchers should consider before consulting materials in a digital archive. Have the individuals whose work appears in these materials consented to this? Whose labour was used and how is it acknowledged? What absences must be attended to among an abundance of materials? Finally, I suggest that researchers should draw on the existing body of scholarship about these issues by librarians and archivists.
    1. The case of Mario Gonzalez which has tested an individual’s right to have elements of this past removed from easy reach of the public raises ethical questions about the rights of the ‘forgotten’ individuals from the past now being ‘discovered’ within digital archives.

      Key to understanding how digitization complicates our understanding of archives as locations or institutions of power.

    1. we are looking at documents more generally to see what they might be about. With Big Data, it is sometimes important to let the sources speak to you, rather than looking at them with pre-conceptions of what you might find.

      This represents a reversal of our traditional methodology as historians. However, these tools can easily lead us down an endless rabbit hole. Where do we draw the line?

    2. While this is fraught with issues – words change meaning over time, different terms are used to describe similar concepts, and we still face the issues outlined above – we can arguably still learn something from this.
    3. It also represents the inversion of the traditional historical process: rather than looking at documents that we think may be important to our project and pre-existing thesis, we are looking at documents more generally to see what they might be about. With Big Data, it is sometimes important to let the sources speak to you, rather than looking at them with pre-conceptions of what you might find.
    1. Yet we do need to realize that these tools shape our research: they can occasionally occlude context, or mislead us

      A very important point. Unlike "close reading," using these tools to textually analyze large-volume corpora can easily lead us to misinterpret the data to suit our question.

    2. A key rule to remember is that there is no ‘right’ or ‘wrong’ way to do these forms of analysis: they are tools and for most historians, the real lifting will come once you have the results. Yet we do need to realize that these tools shape our research: they can occasionally occlude context, or mislead us. These questions are at the forefront of this chapter.
    1. The first tool that we sometimes use is called ‘Outwit Hub,’ available at http://www.outwit.com/.

      I found this tool very useful, as the scraper looks directly at the HTML code to scrape what you need. Being unfamiliar with HTML, it took me some time to get used to it, but with a little trial and error Outwit Hub is very accessible for the layperson. Unfortunately, this scraper is no longer free, and the subscription cost is quite high for how often you may use it.

      Data Miner, which is available for free in the Google Chrome store, may not be as powerful as Outwit, yet it performs basic scraping functions quite well. The tool is also easy to use, and with a little patience anyone can learn to use it to automatically grab multiple web pages from a single site.

    2. Using APIs is not overly difficult (the Programming Historian explains how to use the Zotero reference and research environment API)

      Zotero does make using APIs easier. #DH8900

    3. The Programming Historian has a number of lessons explaining how to do this in Python. Here, we describe a way of using three different pieces of software to achieve the same thing. While we would certainly recommend that learning how to do this using Python should be on your to-do list, it certainly is easier, working with students or colleagues, to point to useful browser based tools that can do this.

      There is a lot of value in using Python to obtain your data; however, the learning curve and comfort level of using Python was a deterrent to me and other grad students considering this method. #DH8900
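
      As a rough illustration of what "looking directly at the HTML code" means, here is a minimal sketch using only Python's standard library. The sample page, the base URL, and the names `LinkCollector` and `extract_links` are made up for this example; this is not how Outwit Hub or Data Miner are implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the target and visible text of every <a href="..."> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # list of (absolute_url, link_text) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page address.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Return every (url, text) pair found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page standing in for a real archive index.
SAMPLE_PAGE = """
<html><body>
  <h1>Hypothetical archive index</h1>
  <ul>
    <li><a href="/letters/1.html">Letter 1</a></li>
    <li><a href="letters/2.html">Letter <em>2</em></a></li>
    <li><a name="top">No target here</a></li>
  </ul>
</body></html>
"""

links = extract_links(SAMPLE_PAGE, "http://example.org/archive/")
for url, text in links:
    print(url, "|", text)
```

      The sketch works on a string so it can be tried offline; in practice the HTML would come from a saved page or a download step.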

    1. For historians, however, computational history became associated with demographic, population, and economic histories. For a time in the 1970s, it looked like history might move wholesale into quantitative histories, with the widespread application of math and statistics to the understanding of the past

      Why is this a bad thing? Sometimes I feel that we historians stray too much away from statistical evidence!

    1. As historians, we often find ourselves writing historiographies

      Digital tools could provide the element of "distant reading" to provide clarification on specific language and methodologies used over a historiography's temporal framework.

    2. [11]

      The work by Higher and Deeper exemplifies the attributes of DH. I suggest watching the video listed in the notes.

    3. OCR

      For historians that rely upon close reading methodologies, advancements in OCR technology would be extremely beneficial in creating large corpora of digitized documents for textual analysis.

    4. work

      This communal approach to historical scholarship is most encouraging.

    5. The work helps confirm the hypothesis of others, which sees a “civilizing process” as a:

      Confirmation!

    1. Students and instructors should also keep a keen eye on digitalhumanitiesnow.org

      This is a great resource. While the discussions on programming are out of my purview, there are many examples of how to use these digital tools to enhance historical research.

    2. Stack Overflow: http://stackoverflow.com/

      This is another good source of information. However, an intimate knowledge of programming languages is required to understand the posts.

    1. open-source program wget,

      I found that wget was difficult to install and run on Windows. While it takes up more space on your computer's hard drive, I find copying the web page into Zotero to be more accessible to those who aren't comfortable using the command prompt. #DH8900

    1. At least two runic inscriptions carved into the marble walls of the Hagia Sophia may have been engraved by members of the Varangian Guard.
    2. A half-century later, the Vikings would be recruited to defend Constantinople instead of attacking it. When Byzantine Emperor Basil II faced an internal uprising in 987, Vladimir the Great gave him 6,000 Viking mercenaries known as Varangians to differentiate the native Scandinavians from the Rus who by the middle of the 10th century had assimilated with the native Slavs and lost their distinct identity. Impressed by the ferocity with which the Vikings battled the rebels, the emperor established the elite Varangian Guard to protect Constantinople and serve as his personal bodyguards. With no local ties or family connections that could divide their loyalties and an inability to speak the local language, the Varangians proved far less corruptible than Basil’s Greek guards.
    3. It is not known when the Rus first reached Constantinople, but it was before 839 when Rus representatives arrived at the Frankish court as part of a Byzantine diplomatic mission.
    4. The epic voyages of the Vikings to the British Isles, Iceland, North America and points west tend to obscure the fact that the Scandinavian warriors also ventured far to the east across Europe and parts of Asia. While the Danes and Norwegians sailed west, Swedish fighters and traders traveled in the opposite direction, enticed initially by the high-quality silver coins minted by the Abbasid Caliphate that sprawled across the Middle East.
    1. Throughout the 8th and 9th century, Vikings began traveling south from Scandinavia to raid the monasteries and towns of what is today France. By 911, they were so present, and ferocious, that the French king was forced to cede part of northern France to them. Some Vikings settled there permanently, eventually becoming known as the Normans—Norse men—of Normandy. Later, the same Viking spirit saw them traveling throughout the continent, on expeditions to the United Kingdom and southern Italy.
    2. These hulking skeletons are believed to have been the descendants of Vikings who colonized northern France and, later, southern Italy and Sicily.
    1. A visualization is a decision you make based on what you want your audience to learn. That said, there are a great many wrong visualizations.
    2. they represent a unique way of presenting visualizations that can be extremely compelling and effective. They embody the idea that simple visualizations can be more powerful than complex ones, and that multiple individual visualizations can often be more easily understood than one incredibly dense visualization.
    3. Maps are not necessarily always the most appropriate visualizations for the job, but when they are used well, they can be extremely informative.
    4. While few people can label every U.S. state or European country on a map accurately, we know the shape of the world enough that we can take some liberties with geographic visualizations that we cannot take with others. Cartograms are maps that distort the basic spatial reference system of latitude and longitude in order to represent some statistical value.
    5. These visualizations are good for directly comparing absolute values to one another, when geographic region size is not particularly relevant.
    6. Choropleth maps should be used for ratios and rates rather than absolute values, otherwise larger areas may be disproportionately colored darker due merely to the fact that there is more room for people to live.
    7. As content is added to a map, it may gain a layer or layers of information visualization. One of the most common geographic visualizations is the choropleth, where bounded regions are coloured and shaded to represent some statistical variable
    8. Our decisions on how to encode our data and which data to present deeply influence the understanding readers take away from a visualization.
    9. Rectangles are sized proportionally to the amount of money received per category in 2013, and coloured by the percentage that amount had changed since the previous fiscal year.

      It is a visually appealing map but I don't know what half the funds are used for due to the blocks being so small.
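
      The choropleth advice quoted in this group (map ratios and rates, not absolute values) can be sketched with a small worked example. The two regions, their figures, and the class breaks are invented for illustration.

```python
from bisect import bisect_right

# Hypothetical regions: raw event counts and resident populations.
REGIONS = {
    "Large county": {"events": 500, "population": 250_000},
    "Small town": {"events": 120, "population": 20_000},
}

# Hypothetical class breaks (events per 1,000 residents) for three shades.
BREAKS = [2.5, 5.0]


def rate_per_1000(events, population):
    """Convert an absolute count into a rate per 1,000 residents."""
    return 1000 * events / population


def shade_class(value, breaks):
    """Return 0 for the lightest shade, len(breaks) for the darkest."""
    return bisect_right(breaks, value)


rates = {
    name: rate_per_1000(r["events"], r["population"])
    for name, r in REGIONS.items()
}
shades = {name: shade_class(rate, BREAKS) for name, rate in rates.items()}

# Shading by raw count and shading by rate pick out different regions.
darkest_by_count = max(REGIONS, key=lambda name: REGIONS[name]["events"])
darkest_by_rate = max(rates, key=rates.get)

print("Darkest by raw count:", darkest_by_count)
print("Darkest by rate:", darkest_by_rate)
print("Shade classes:", shades)
```

      With these made-up figures the larger region has more events in total but the lower rate, which is the distortion the quoted passage warns about.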

    1. visualizations still have an important role to play in translating complex data relationships into easily digestible units.
    2. You may begin your research with a dataset and some preconceptions of what it means and what it implies, but without a well-formed thesis to be argued. The exploratory visualization allows you to notice trends or outliers that you may not have noticed otherwise, and those trends or outliers may be worth explaining or discussing in further detail.
    3. When first obtaining or creating a dataset, visualizations can be a valuable aid in understanding exactly what data are available and how they interconnect. In fact, even before a dataset is complete, visualizations can be used to recognize errors in the data collection process.
    4. Notice how in the chart in figure 5.3, it can easily be noticed that whomever entered the data on book publication dates accidentally typed “1909” rather than “1990” for one of the books.

      Like last week when "Metadata" was the most frequent word in the Colored Conference documents.

    5. Visualization is a method of deforming, compressing, or otherwise manipulating data in order to see it in new and enlightening ways.

      definition

    6. You may begin your research with a dataset and some preconceptions of what it means and what it implies, but without a well-formed thesis to be argued. The exploratory visualization allows you to notice trends or outliers that you may not have noticed otherwise, and those trends or outliers may be worth explaining or discussing in further detail.
    7. The use of visualizations to show the distribution of words or topics in a document is an effective way of getting a sense for the location and frequency of your query in a corpus, and it represents only one of the many uses of information visualization.
    8. Visualization is a method of deforming, compressing, or otherwise manipulating data in order to see it in new and enlightening ways.
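
      The "1909" versus "1990" typo quoted in this group is the kind of error a quick exploratory chart exposes. Below is a minimal sketch with invented publication years; the text histogram stands in for a real chart, and the tolerance of 10 is an arbitrary assumption, not a recommended threshold.

```python
from collections import Counter
from statistics import median

# Hypothetical publication years, one of them mistyped.
PUBLICATION_YEARS = [1988, 1990, 1991, 1989, 1909, 1992, 1990]


def decade_histogram(years):
    """Return one text bar per decade, e.g. '1990s ####'."""
    counts = Counter(year // 10 * 10 for year in years)
    return [f"{decade}s {'#' * counts[decade]}" for decade in sorted(counts)]


def suspect_years(years, tolerance=10):
    """Flag years far from the median, measured in median absolute deviations."""
    centre = median(years)
    # Fall back to 1 so identical years do not give a zero spread.
    spread = median(abs(year - centre) for year in years) or 1
    return [year for year in years if abs(year - centre) > tolerance * spread]


for bar in decade_histogram(PUBLICATION_YEARS):
    print(bar)

suspects = suspect_years(PUBLICATION_YEARS)
print("Worth double-checking:", suspects)
```

      A flagged year is only a prompt to go back to the source; a genuinely old book would be flagged too.
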
    1. There are two main command-line interfaces, or ‘shells,’ that many digital historians use. On OS X or many Linux installations, the shell is known as bash, or the ‘Bourne-again shell.’
  3. www.themacroscope.org
    1. This is a generative approach: big data for the humanities is not only about justifying a story about the past, but generating new stories, new perspectives, given our new vantage points and tools.
    2. an approach to big data for the historian (we argue) needs to be a public approach
    3. This volume represents our view of what some of the most useful of these developing approaches are, how to use them, what to be wary of, and the kinds of questions and new perspectives our macroscope opens up.
    1. A macroscope is a bit like a microscope or a telescope, but instead of allowing you to see things that are small or far away, the macroscope makes it easier to grasp the incredibly large. It does so through a process of compression
  4. www.themacroscope.org
    1. AntConc is an invaluable way to carry out some forms of textual analysis on data sets. While it does not scale to the largest datasets terribly well, if you have somewhere in the ballpark of 500 or even 1,000 newspaper-length articles you should be able to crunch data and receive tangible results.
    1. Topic modeling allows us to quantify and visualize this pattern, a pattern not immediately visible to a human reader.
    2. The question remains, how does a reader (computer or human) recognize and conceptualize the recurrent themes that run through nearly 10,000 entries?
    3. Short, content-driven entries that usually touch upon a limited number of topics appear to produce remarkably cohesive and accurate topics. In some cases (especially in the case of the EMOTION topic), MALLET did a better job of grouping words than a human reader.
    1. We cannot rely only on the computer-driven groups to use in analyzing texts.  The next step is to look at the texts that contain repeating word patterns and conduct a close reading to see what we can learn about the topic. Plotting the topic over time enables us to locate trends in how important the topic was to the author, or when we compare them with other authors, we can investigate differences in the ways that two authors valued these topics or the different ways that they expressed themselves.
    1. The ‘model’ in a topic model is the idea of how texts get written: authors compose texts by selecting words from a distribution of words (or ‘bag of words’ or ‘bucket of words’) that describe various thematic topics.
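
      The "bag of words" story quoted above can be played forward as a toy simulation. The topics, their word probabilities, and the function name `generate_document` are all invented for this sketch; fitting a topic model to real texts works in the opposite direction, inferring topics from the words.

```python
import random

# Hypothetical topics: each maps words to the probability of drawing them.
TOPICS = {
    "farming": {"harvest": 0.5, "cattle": 0.3, "weather": 0.2},
    "emotion": {"grief": 0.4, "joy": 0.4, "weather": 0.2},
}


def generate_document(topic_mix, n_words, rng):
    """Write a 'document' by repeatedly picking a topic, then a word from it."""
    topic_names = list(topic_mix)
    topic_weights = list(topic_mix.values())
    words = []
    for _ in range(n_words):
        # Step 1: choose which topic this word slot belongs to.
        topic = rng.choices(topic_names, weights=topic_weights)[0]
        # Step 2: draw a word from that topic's bag of words.
        vocabulary = TOPICS[topic]
        word = rng.choices(list(vocabulary), weights=list(vocabulary.values()))[0]
        words.append(word)
    return words


rng = random.Random(42)  # fixed seed so repeated runs give the same text
diary_entry = generate_document({"farming": 0.8, "emotion": 0.2}, 12, rng)
print(" ".join(diary_entry))
```

      Note that "weather" belongs to both bags, so seeing the word alone does not reveal which topic produced it.
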
    1. Back to my spreadsheet I went where I discovered that variations in the title of the essay and in the author credits obfuscated the connections, as well as those for the other two essays by black women, Frances M. Beal and Maryanne Weathers.

      Shows the importance of double-checking data and the hazards of working with others' data.

    2. Altmetrics

      Non-traditional bibliometrics, seen as alternatives or complements to more traditional citation metrics. Can include people, journals, books, data sets, presentations, videos, source code repositories, web pages, etc.

    1. The most important aspect of choosing an appropriate graphic variable is to know the nature of your data variables.
    1. One popular service is colorbrewer (http://colorbrewer2.org/), which allows you to create a color scheme that fits whatever set of parameters you may need.
    1. Adobe Photoshop and Illustrator, as well as the free Inkscape and Gimp, are all good tools for creating legends.

      Tools to use.

    1. Formal networks are mathematical instantiations of the idea that entities and connections between them exist in consort. They embody the idea that connectivity is key in understanding how the world works, both at an individual and a global scale.
    1. Historians will want to note when their networks are explicit / physically instantiated, and when they are implicit / derived. An explicit network could be created from letters between correspondents, or roads that physically exist between cities. A derived network might be that of the subjectively-defined similarity between museum artefacts or the bibliographic coupling network connecting articles together if they reference similar sources.

      Explicit & Natural vs. Implicit & Derived

    2. A directed edge is one that is part of an asymmetrical relationship, and an easy way of thinking about them is by imagining arrow tips on the edges.
    3. Transitivity is the concept that when A is connected to B and C, B and C will also be connected. Some networks, like those between friends, feature a high degree of transitivity; others do not.

      Transitivity definition.

    4. Assortativity, also called homophily, is the measure of how much like attracts like among dyads in a network. On the web, for instance, websites (nodes) tend to link (via edges) to topically similar sites. When dyads connect assortatively, a network is considered assortatively mixed. Networks can also experience disassortative mixing, for example when people from isolated communities with strong family ties seek dissimilarity in sexual partners.

      Assortative vs Disassortative mixing.

    5. Aggregate all of the data into one giant network representing the entire span of time, whether it is a day, a year, or a century. The network is static. Slowly build the network over time, creating snapshots that include the present moment and all of the past. Each successive snapshot includes more and more data, and represents each moment of time as an aggregate of everything that led up to it. The network continues to grow over time. Create a sliding window of time, e.g. a week or five years, and analyze successive snapshots over that period. Each snapshot then only contains data from that time window. The network changes drastically in form over time.

      Ways to implement networks.

    6. Edges

      Edges definition

    7. Nodes

      Node definition

    8. Entities are called nodes (figure 6.1) and the relationships between them are called edges (figure 6.2), and everything about a network pivots on these two building blocks.

      Nodes & Edges explanations

    9. Despite their name, networks are dichotomous, in that they only give you two sorts of things to work with: entities and relationships. Entities are called nodes (figure 6.1) and the relationships between them are called edges (figure 6.2), and everything about a network pivots on these two building blocks.
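
      Two ideas quoted in this group, transitivity and the three ways of handling time (one aggregate network, cumulative snapshots, a sliding window), can be combined in one small sketch. The dated letters and every function name are hypothetical; edges are treated as undirected for simplicity.

```python
from itertools import combinations

# Hypothetical dated letters: (year, sender, recipient).
LETTERS = [
    (1640, "A", "B"),
    (1641, "A", "C"),
    (1642, "B", "C"),
    (1645, "A", "D"),
    (1646, "D", "E"),
]


def transitivity(edges):
    """Share of connected triples (A-B and A-C) closed by a B-C edge."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    triples = closed = 0
    for linked in neighbours.values():
        for b, c in combinations(sorted(linked), 2):
            triples += 1
            if c in neighbours[b]:
                closed += 1
    return closed / triples if triples else 0.0


def aggregate(dated_edges):
    """One static network covering the whole span of time."""
    return [(a, b) for _, a, b in dated_edges]


def cumulative(dated_edges, years):
    """Snapshots that include each year and everything before it."""
    return {y: [(a, b) for t, a, b in dated_edges if t <= y] for y in years}


def sliding_window(dated_edges, years, width):
    """Snapshots that include only the `width` years ending at each year."""
    return {
        y: [(a, b) for t, a, b in dated_edges if y - width < t <= y]
        for y in years
    }


whole = transitivity(aggregate(LETTERS))
growing = {y: transitivity(e) for y, e in cumulative(LETTERS, [1642, 1646]).items()}
windowed = {y: transitivity(e) for y, e in sliding_window(LETTERS, [1642, 1646], 3).items()}

print("aggregate:", whole)
print("cumulative:", growing)
print("sliding window:", windowed)
```

      The same letters give different transitivity values depending on how time is sliced, which is why the choice among the three approaches matters.
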
    1. Chapter 23. Sighvatr Sturluson lived at Hjarðarholt for several winters. Then he bought Sauðafell, moved there in the Nautfellis winter, and lived there. He became a great chieftain and was popular with his men. Between him and Kolbeinn Tumason there was the greatest friendship, together with ties by marriage. Kolbeinn then held the most power in the north of the country, and had all the goðorð (chieftaincies) west of Öxnadalsheiði as far as the Ávellinga-goðorð; but Þorsteinn Ívarsson gave Snorri Sturluson the share of the Ávellinga-goðorð that he owned. The Mel-men owned their share of the goðorð. North of Öxnadalsheiði, Ögmundr sneis and Hallr Kleppjárnsson held goðorð. Þorvaldr, son of Guðmundr dýri, handed over to Sigurðr Ormsson the goðorð that he had held. Sigurðr gave those goðorð to Tumi, son of Sighvatr, and in this way Sighvatr later came into possession of them.
    1. Ultimately, the purpose of this map is to encourage and aid new readings of the sagas. As Franco Moretti writes: “Placing a literary phenomenon in its specific place — mapping it — is not the conclusion of geographical work; it’s the beginning. After which begins in fact the most challenging part of the whole enterprise: one looks at the map, and thinks” (Atlas of the European Novel 1800-1900
    2. Exploring at first hand how the events that the Íslendingasögur describe are mapped onto and around the landscape and commemorated in place-names was a compelling approach to this remarkable body of literature. No less illuminating was witnessing for myself how modern Icelanders continue to engage with their local and national saga heritage.
    1. We, however, do believe that this all needs to be both contextualized in terms of the ethical implications and the field’s historical development, and nuanced with respects to the knowledge claims that can be made.
  5. Jun 2019
    1. This article will demonstrate how the mathematical tools employed by network scientists offer valuable ways of understanding the development of underground religious communities in the sixteenth century, as well as providing different approaches for historians and literary scholars working in archives.

      need for proposal

    2. The vast majority of network analysis deals with incomplete networks in the real world, and any statistical treatment of biases has to make assumptions about the distribution of missing links or nodes.

      Important to note absences in research: in this case, many letters were most likely lost or discarded due to a "lack" of importance.

    3. Letters were the method by which people sought patronage, garnered favor, and engineered their social mobility;

      Historian's Macroscope - Reciprocity in letters illustrates a social network.

    1. A good “sanity check” is to see if the algorithm you’re using puts the people or entities you know should be grouped together in the same community, and leaves the ones that do not belong out. If it works on the parts of the data you know, you can be more certain it works on the parts of the data you do not.

      Helpful advice

    2. Network analysis aids in finding these unusually connective entities en masse and with great speed, leaving the historian with more time to explore the meaning behind this connection.

      Again, network analysis can act as a tool in identifying the question rather than the answer.

    3. If a historical network exhibits a long-tail distribution, with very prominent hub nodes, the structure itself is not particularly noteworthy. What is worthwhile is figuring out which nodes made it to the top, and why. Why do all roads lead to Rome? How did Mersenne and Hartlib develop such widespread correspondence networks in early modern Europe? The answers to these questions can cut to the heart of the circumstances of a historical period, and their formalization in networks can help guide us toward an answer.

      Network analysis can produce fruitful specific questions as well as illuminate broader trends.

    4. Social networks, article citation networks, airline travel networks, and many others feature a significantly more skewed degree distribution. A few hub nodes have huge numbers of edges, a handful more have a decent amount of edges, but significantly fewer, and most nodes are connected by very few edges.

      I found this to be true in my work for Dr. Johnson on citation patterns in chemical journals

    5. These global metrics are most useful when measured in comparison to other networks; early modern and present day social networks both exhibit scale-free properties, but the useful information is in how those properties differ from one another.

      In other words, don't draw your conclusions solely from the network of a single dataset but from the comparison between two networks. This is also a good way to determine whether you have two datasets that can be effectively put into conversation with each other. If networks that historically seem ripe for comparison are radically different, then one of your corpora may be incomplete, skewed, or otherwise aberrant. On the other hand, if they are identical, your datasets aren't capturing the features you wish to distinguish.

    6. perform math incorrectly if they are included

      A serious concern!

    7. When two articles reference a common third work in their bibliographies, they get an edge drawn between them; if those two articles share 10 references between them, the edge gets a stronger weight of 10. A bibliographic coupling network, then, connects articles that reference similar material, and provides a snapshot of a community of practice that emerges from the decisions authors make about whom to cite.

      I read some research on this when working for Dr. Johnson my first year. My reading focused on scholarly networks between chemical journals, specifically which journals cited which other journals and the degree to which that relationship was unidirectional (seeking validation, legitimation through a more prestigious journal) or reciprocal (trading on each other's established prestige for mutual benefit). Very interesting way of determining the degree and source of a journal's influence within the field and how that influence shifts over time.

    8. The directionality of an edge can have huge repercussions to a network’s structure, and so algorithms made for undirected networks to find local communities or node importance might produce very unlikely results on a directed network. Be careful that you only use algorithms made specifically for directed networks when analyzing them.

      Also important to note that visualizations can't demonstrate the degree of directionality an edge has. There is no good way to visualize a relationship that is highly directional as opposed to one that is mildly directional. Since weight and directionality can be, and often are, two independent variables, a heavily weighted arrow implies no more directionality than a lightly weighted arrow. This may, however, be a point of confusion for those reading a network diagram unless it is made clear in the text.

    9. Transitivity is the concept that when A is connected to B and C, B and C will also be connected. Some networks, like those between friends, feature a high degree of transitivity; others do not.

      Networks with a single hierarchical power structure (unipolar) would thus tend to have less transitivity, while those where power is more diffuse (multipolar) would have more transitivity.
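One rough way to test the intuition in this note is to compute transitivity as the share of connected triples (A tied to both B and C) in which B and C are also tied. The sketch below assumes an undirected network given as a list of edge pairs; the two toy networks and the function name `transitivity` are hypothetical.

```python
from itertools import combinations

def transitivity(edges):
    """Fraction of connected triples (B - A - C) in which B and C are also linked."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    triples = closed = 0
    for nbrs in neighbors.values():
        # Every pair of neighbours of a node forms one triple centred on that node.
        for u, v in combinations(sorted(nbrs), 2):
            triples += 1
            if v in neighbors[u]:
                closed += 1
    return closed / triples if triples else 0.0

# Hypothetical toy networks: a friendship cluster and a strict hierarchy (a star).
friends = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
hierarchy = [("boss", "w1"), ("boss", "w2"), ("boss", "w3")]

print(transitivity(friends), transitivity(hierarchy))
```

In this toy case the star-shaped hierarchy has no closed triples at all, which fits the note's suggestion, though a real dataset would need checking rather than assuming.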

    10. vertices, actors, agents, or points

      Standardized terminology is sorely lacking among historians. Good to know digital history is no exception.

    11. The strength of weak ties has been hypothesized as a driving force behind the flourishing of science in 17th century Europe.[3] Political exile, religious diaspora, and the habit of young scholars to travel extensively, combined with a relatively inexpensive and fast postal system, created an environment where every local community had weak ties extending widely across political, religious, and intellectual boundaries. This put each community, and every individual, at higher risk for encountering just the right serendipitous idea or bit of data they needed to set them on their way. Weak ties are what make the small community part of the global network.

      history example

    12. Careful readers will have noted that the definition of a weak tie is curiously similar to that of a bridge. This dichotomy, the weakness of a connection alongside the importance of a bridge, has profound effects on network dynamics.

      interesting to note

    13. a hub is a node without which the path between its neighbours would be much larger, and a bridge is an edge which connects two otherwise unconnected communities.

      reference definition
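The bridge half of this definition can be checked mechanically on a small network: an edge is a bridge if removing it leaves its two endpoints with no other path between them. The sketch below takes that graph-theoretic reading, which is an assumption about what the chapter means by "otherwise unconnected", and uses a brute-force search that only suits small datasets. The function name and example edges are hypothetical.

```python
from collections import deque

def find_bridges(edges):
    """Return the edges whose removal disconnects their two endpoints."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    bridges = []
    for a, b in edges:
        # Breadth-first search from a, ignoring the edge a-b itself.
        seen = {a}
        queue = deque([a])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if {node, nxt} == {a, b}:
                    continue
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if b not in seen:
            bridges.append((a, b))
    return bridges

# Two hypothetical tight-knit communities joined by a single tie (C-D).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]
print(find_bridges(edges))
```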

    14. The difference between explicit and implicit networks is not a hard binary; it can be difficult or impossible to determine (or not a reasonable question) whether the network you are analyzing is one that has its roots in some physical, objective system of connections. It is still important to keep these in mind, as metrics that might imply historical agency in some cases (e.g. a community connected by letters they wrote) do not imply agency in others (e.g. a community connected by appearing in the same list of references).

      Is this a continuation of Benedict Anderson's work on imagined communities into digital form?

    15. Taking the most recent example, this means you can draw edges from books to authors, but not between authors or between books.

      What about topic modeling within a network?

    16. The directionality of an edge can have huge repercussions to a network’s structure, and so algorithms made for undirected networks to find local communities or node importance might produce very unlikely results on a directed network. Be careful that you only use algorithms made specifically for directed networks when analyzing them.

      Can this be overcome digitally? It seems like "close reading" is a necessary methodology in these instances.

    17. directed networks.

      Power dynamics!

    18. d triadic closure, can

      Funny, this is the basis of LinkedIn. Which should prove to be years of fun for future historians!

    19. The smallest unit of meaningful analysis in a network is a dyad, or a pair of nodes connected by an edge between them (figure 6.4). Without any dyads, there would be no network, only a series of disconnected isolates. Dyads are, unsurprisingly, generally discussed in terms of the nature of the relationship between two nodes. The dyad of two medieval manuscripts may be strong or weak depending on the similarity of their content, for example. Study of dyads in large networks most often revolve around one of two concepts: reciprocity, whether a connection in one direction is reciprocated in the other direction, or assortativity, whether similar nodes tend to have edges between them.

      I'm glad to see my past scientific background is useful in digital history. :-)!

    20. edges (figure 6.2),

      Why "edges"? This is a confusing term to describe linear relationships.

    21. The further back in history one goes, the less the globe looks like a small world network. This is because travel and distance constraints prevented short connections between disparate areas.

      technology (for communication) can affect what a historic network looks like

    22. A historian may wish to see the evolution of transitivity across a social network to find the relative importance of introductions in forming social bonds.

      final project: a useful way to look at Burroughs's letters of introduction to Morton

    23. These global metrics are most useful when measured in comparison to other networks; early modern and present day social networks both exhibit scale-free properties, but the useful information is in how those properties differ from one another.

      important to note

    24. As opposed to bibliographic coupling networks, co-citation networks connect articles not by the choices their authors make, but by the choices future authors make about them.

      so this pairing is less about content and more about use, e.g. opposing articles about a particular topic are frequently cited together
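To make the contrast with bibliographic coupling concrete, a co-citation count can be sketched as follows: two earlier works gain one unit of weight each time a later article cites both. The article names and the helper `cocitation` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical later articles and the earlier works each one cites.
later_bibliographies = {
    "later_x": {"old_a", "old_b"},
    "later_y": {"old_a", "old_b", "old_c"},
}

def cocitation(bibs):
    """Count, for each pair of cited works, how many later articles cite both."""
    weights = Counter()
    for cited in bibs.values():
        for pair in combinations(sorted(cited), 2):
            weights[pair] += 1
    return weights

weights = cocitation(later_bibliographies)
print(weights)
```

Note that the edge between `old_a` and `old_b` says nothing about whether the two works agree, only that later authors used them together.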

    25. In an evolving network of correspondents, if Alice writes to Bob, and Bob to Carol, we can ask what the likelihood is that Alice will eventually write to Carol (thus again closing the triangle). This tendency, called triadic closure, can help measure the importance of introductions and knowing the right people in a letter-writing community.

      final project: example of Morton, Burroughs, and Maclure
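A minimal sketch of how triadic closure might be measured in a dated correspondence list, assuming each letter is recorded as a `(year, sender, recipient)` tuple. The letters below are invented for illustration, and the rule used (a triangle "opens" once A has written to B and B to C, and "closes" if A first writes to C afterwards) is one possible reading of the quoted passage, not the chapter's own procedure.

```python
def triadic_closure_rate(letters):
    """Share of open triads (A->B, B->C) that A later closes by writing to C."""
    first = {}
    for year, sender, recipient in sorted(letters):
        first.setdefault((sender, recipient), year)  # year of first contact
    opened = closed = 0
    for (a, b), t_ab in first.items():
        for (b2, c), t_bc in first.items():
            if b2 != b or c == a:
                continue
            t_open = max(t_ab, t_bc)  # moment both legs of the triad exist
            t_ac = first.get((a, c))
            if t_ac is not None and t_ac <= t_open:
                continue  # A already wrote to C, so there is nothing to close
            opened += 1
            if t_ac is not None:
                closed += 1
    return closed / opened if opened else 0.0

# Hypothetical letters: (year, sender, recipient).
letters = [
    (1820, "Alice", "Bob"),
    (1821, "Bob", "Carol"),
    (1822, "Bob", "Dave"),
    (1823, "Alice", "Carol"),
]
print(triadic_closure_rate(letters))
```

Here Alice eventually writes to Carol but never to Dave, so one of the two open triads closes.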

    26. The smallest unit of meaningful analysis in a network is a dyad, or a pair of nodes connected by an edge between them (figure 6.4). Without any dyads, there would be no network, only a series of disconnected isolates. Dyads are, unsurprisingly, generally discussed in terms of the nature of the relationship between two nodes. The dyad of two medieval manuscripts may be strong or weak depending on the similarity of their content, for example. Study of dyads in large networks most often revolve around one of two concepts: reciprocity, whether a connection in one direction is reciprocated in the other direction, or assortativity, whether similar nodes tend to have edges between them.

      reference definitions
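Of the two dyad-level concepts in this passage, reciprocity is the simpler to compute. A rough sketch, assuming a directed network stored as `(source, target)` pairs; the example edges and function name are hypothetical.

```python
def reciprocity(directed_edges):
    """Fraction of directed edges whose reverse edge is also present."""
    edges = {(a, b) for a, b in directed_edges if a != b}  # drop self-loops
    if not edges:
        return 0.0
    mutual = sum(1 for a, b in edges if (b, a) in edges)
    return mutual / len(edges)

# Hypothetical correspondence: A and B write to each other, A writes to C unanswered.
edges = [("A", "B"), ("B", "A"), ("A", "C")]
print(reciprocity(edges))
```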

    27. Create a sliding window of time, e.g. a week or five years, and analyze successive snapshots over that period. Each snapshot then only contains data from that time window. The network changes drastically in form over time.

      reference for final project: this might be a useful way to look at the Morton letters (to examine those by year)
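The sliding-window idea can be sketched as follows, again assuming letters stored as `(year, sender, recipient)` tuples. The window width, step, dates, and the function name `sliding_snapshots` are all illustrative choices.

```python
def sliding_snapshots(letters, start, end, width, step=1):
    """Return [(window_start, edge_set)] for each window [t, t + width) within start..end."""
    snapshots = []
    t = start
    while t + width <= end + 1:
        # Keep only the letters dated inside the current window.
        edges = {(s, r) for year, s, r in letters if t <= year < t + width}
        snapshots.append((t, edges))
        t += step
    return snapshots

# Hypothetical letters: (year, sender, recipient).
letters = [
    (1820, "A", "B"),
    (1821, "B", "C"),
    (1823, "A", "C"),
    (1824, "C", "D"),
]
snapshots = sliding_snapshots(letters, start=1820, end=1824, width=2)
for window_start, edges in snapshots:
    print(window_start, sorted(edges))
```

Each snapshot can then be passed to whatever metric is of interest, so that the metric itself becomes a time series.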

    28. f individual attributes

      customization features for your fixed points of study (other aspects of data to include)

    29. Edges connect nodes

      reference: nodes are fixed points and edges connect them

    30. Entities are called nodes (figure 6.1) and the relationships between them are called edges (figure 6.2), and everything about a network pivots on these two building blocks.

      definitions!

    31. Despite their name, networks are dichotomous, in that they only give you two sorts of things to work with: entities and relationships

      important to note

    32. Historians can draw a number of inferences from small world networks, including the time it might have taken for materials to circulate within and across communities, and the relative importance of individual actors in shaping the past.

      Could a material culture analysis of a commodity that was traded yield a network?

    33. When two articles reference a common third work in their bibliographies, they get an edge drawn between then; if those two articles share 10 references between them, the edge gets a stronger weight of 10. A bibliographic coupling network, then, connects articles that reference similar material, and provides a snapshot of a community of practice that emerges from the decisions authors make about whom to cite.

      Helps visualize scholarly relationships.

    34. Because citations go to previous work rather than contemporary similar research, it winds up being difficult for algorithms to find communities of practice in the network.

      Hard to track this form of network.

    35. when it is invoked appropriately, it should be treated with a healthy respect to the many years of research that have gone into building its mathematical and conceptual framework.

      Credit, understanding the labor, problem, analysis, and answers.

    36. Hubs and bridges help connect the local with the global because they are individual metrics that are defined by how they interact with the rest of the network. On its own, a single marriage between two families might seem unremarkable, but if these are royal families marrying into some until-then disconnected foreign power, or two people marrying across faiths in a deeply religious community, one simple bridge takes on new meaning. Network analysis aids in finding these unusually connective entities en masse and with great speed, leaving the historian with more time to explore the meaning behind this connection.

      another good example of why this would be important for historians

    37. Scale free networks usually have a few very central hubs, which themselves tend to have high betweenness centralities. That is, they sit on the shortest path between many pairs of nodes. These nodes are vulnerable points in the network; without them, information takes longer to spread or travel takes quite a bit longer. A hub in a network of books connected by how similar their content is to one another would have a different meaning entirely: the most central node would likely be an encyclopaedia, because it covers such a wide range of subjects. The meaning of these terms always change based on the dataset at hand.

      useful caution: as with everything in history, context is crucial

    38. a hub is a node without which the path between its neighbours would be much larger, and a bridge is an edge which connects two otherwise unconnected communities.

      think social movements

    39. A small world network is one in which the shortest path between any two random nodes grows proportionally to the logarithm of the number of nodes in the network. This means that even if the network has many, many nodes, the average shortest path between them is quite small. These networks also have a high global clustering coefficient, which is the average local clustering coefficient across all nodes.

      how to use this term properly
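Both quantities named in this definition can be computed directly on a small network. The sketch below assumes an undirected network and follows the passage in taking the global clustering coefficient to be the mean of the local coefficients. The function names and the toy edges are hypothetical.

```python
from collections import deque
from itertools import combinations

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def average_shortest_path(adj):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two neighbours score 0."""
    scores = []
    for nbrs in adj.values():
        possible = len(nbrs) * (len(nbrs) - 1) // 2
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        scores.append(linked / possible if possible else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical network: a triangle (A, B, C) with one extra node D attached to C.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
print(average_shortest_path(adj), average_clustering(adj))
```

A single network this small cannot show logarithmic growth. Testing the small-world claim would mean tracking the average path length as the network gets larger.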

    1. Graphs have a tendency of making a data set look sophisticated and important, without having solved the problem of enlightening the viewer.[13]

      Graphs as a form of legitimization rather than presentation, a cohesive and comprehensive picture of something which is anything but.

    2. creating structural holes and becoming the link between communities

      How would network analysis identify whether or not the Medici themselves created these holes or they were created by some other factor? Was this agency attributed by the network analysis itself or a secondary form of historical research? Can agency be identified in a network analysis at all?

    3. As much as networks reveal communities, they also obscure more complex connections that exist outside of the immediate data being analyzed.

      The limitations of distant reading. You may see everything within the corpus, but nothing else. These complex connections are only visible when each is viewed individually within its own specific context, which is why digital history must always be in communication with other methods of historical analysis to capture the full richness of what is being studied.

    4. We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.

      Like other forms of digital historical analysis, the conclusions drawn from network analysis apply only to the corpus studied, not the historical moment itself, and thus reflect first and foremost trends in the corpus itself. Selecting a balanced corpus and defending it as representative is essential before these trends can be tied to historical arguments. This is not unique to digital history, although the burden of proof may often be higher for digital historians, given their broad scope and the greater perceived legitimacy of mathematical and statistical methods. All historians must be equally fastidious in defending their source selection as representative of their historical moment before extending their arguments any further than as a representation of their own source base.

    5. There is a tendency when using graphs to become smitten with one’s own data.

      Of course, it becomes an evidentiary defense.

    6. The entities being connected can be articles, people, social groups, political parties, archaeological artefacts, stories, and cities; citations, friendships, people, affiliations, locations, keywords, and ship’s routes can connect them. The results of a network study can be used as an illustration, a research aid, evidence, a narrative, a classification scheme, and a tool for navigation or understanding.

      It is easy to see the boundless value of these applications, yet I also realize their potential to become an endless maze.

    7. Trade networks

      Agree, but the maps they produce are hard to follow.

    8. The authors connected nearly a hundred 15th century Florentine elite families via nine types of relations, including family ties, economic partnerships, patronage relationships, and friendships.

      I may be getting ahead of this chapter, but I am very interested in their methodology and how much digital information was available to them. Otherwise, this project had to be very arduous before a network analysis could have been run.

    9. As much as networks reveal communities, they also obscure more complex connections that exist outside of the immediate data being analyzed.

      I'm wondering if this level of network analysis could be useful to U.S. labor history to demonstrate the emergence and eventual collapse of a working-class consciousness? Also, how would we choose which labor organizations to use for our data set? Would including several different industries muddle our results?

    10. The following chapter, beyond teaching the basics of what networks are and how to use them, will also cover some of the many situations where networks are completely inappropriate solutions to a problem.

      remember sometimes a dataviz is not what you actually need

    11. The entities being connected can be articles, people, social groups, political parties, archaeological artefacts, stories, and cities; citations, friendships, people, affiliations, locations, keywords, and ship’s routes can connect them. The results of a network study can be used as an illustration, a research aid, evidence, a narrative, a classification scheme, and a tool for navigation or understanding.

      variety of use for this method/result

    12. In this case, networks were the subject of study rather than used as evidence, in an effort to see the effects of political change on power structures.

      important to note network analysis is not always just the result.

    13. Studies of this sort pave the way for more exploratory network analyses; if the analysis corroborates the consensus, then it is more likely to be trustworthy in situations where there is not yet a consensus.

      hypothesis before network analysis! not the other way around!

    14. Their study looks in 280 letters written by Cicero; the network generated was not that of whom Cicero corresponded with, but of information generated from reading the letters themselves.

      interesting idea. the body of data is the content of letters not just the sending information

    15. Network approaches can be particularly useful at disentangling the balance of power, either in a single period or over time. A network, however, is only as useful as its data are relevant or complete. We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.

      need to know what the source base is and how it could be limited before making claims about the data results. can't just blindly accept a dataviz/ network analysis

    16. A citation analysis by White and McCann looking at an eighteenth-century chemistry controversy took into account the hierarchical structure of scientific specialties.[2] The authors began with an assumption that if two authors both contributed to a field, the less prominent author would always get cited alongside the more prominent author, while the more prominent author would frequently be cited alone. One scientist is linked to another if they tend to be subordinate to (only cited alongside of) that other author.

      need some contextual or background knowledge when looking at the network analysis of citations to make sense of this hierarchy

    17. The results of a network study can be used as an illustration, a research aid, evidence, a narrative, a classification scheme, and a tool for navigation or understanding.

      which can make the findings clearer

    18. We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.

      & note these absences. As The Historian's Macroscope stated earlier, digital historians should always specify the limits of their data analysis.

    19. Following the entailogram over time reveals conflict and eventual resolution.

      Networks can be used to show the growth of a specific topic and some of the key figures involved in its conception.

    20. Lineage studies with networks are not limited to those of kinship. Sigrist and Widmer[12] used a thousand eighteenth century botanists, tracing a network of between masters and disciples, to show how botany both grew autonomous from medical training and more territorial in character over a period of 130 years.

      Different look at lineage and relationships over time!

    21. dynasty.

      What was the data set?

    22. For research on organizations, network analysis can provide insight on large-scale community structure that would normally take years of careful study to understand. As much as networks reveal communities, they also obscure more complex connections that exist outside of the immediate data being analyzed.

      Making connections can obscure information.

    23. We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.

      Key note.

    1. What we really want here of course is a visualization that combines all the things, but I’ve resisted creating one for now. The complex historical questions of who gets counted when we count in histories of women’s liberation exists because data reduces people’s lived experiences to columns on a spreadsheet.

      ethical considerations of digital history

    2. I took some important titles, starting in 1975 running up to 1981, digitized the acknowledgements, and pulled the names by hand into a spreadsheet. Of the 435 names, the common network reduces to this.

      method

    3. Turning to a print expression of the movement that is far more grass roots than mass market anthologies, I decided to look for Robinson in the many periodicals that developed out of women’s liberation. Reveal Digital has provided me with uncorrected OCR, machine-readable corpora consisting of over 6000 issues of periodicals from the Left. Here I searched all variations of Robinson’s name. I visualized this metadata and then for comparison repeated the process with the names of the other two black women who overlapped anthologies to visualize the spread of their writing.

      method

    4. Back to my spreadsheet I went where I discovered that variations in the title of the essay and in the author credits obfuscated the connections, as well as those for the other two essays by black women, Frances M. Beal and Maryanne Weathers. I share this not to reveal my own sloppy data, but to highlight the difficulties of doing this kind of visualization.

      visualization helped reveal issues/flaw in data set

    5. visualization below.

      is this referring to the network plot further below or is it missing?

    6. Using the BYU Corpus Interface for Google Books I scraped the metadata for any references to “Redstockings Manifesto” or “A Historical and Critical Essay for Black Women”  to create the visualization below.

      describing method and software- important

    7. Scraping from online catalogues and then digitizing when I had to, I took the table of contents, separated titles and authors, and put them into a spreadsheet that I then pulled into Palladio where I explored the relationships between both authors and essays as they overlapped.

      outlining method

    8. The complex historical questions of who gets counted when we count in histories of women’s liberation exists because data reduces people’s lived experiences to columns on a spreadsheet.  

      KEY

    9. This makes for a messy looking dataviz and that is precisely my point. If I had not known my sources so well, I would have drawn erroneous conclusions. In addition to know your data, an important point in working digitally, I want to also ponder what the messiness of my data means for the marginalized. It seems strange to me that there is such variation in their names (although I will note that this happens to white women as well). However the consequences for writing history of marginalized women are more disastrous because the numbers are already so low.

      Exploring the data and pinpointing limitations for digital and statistical analysis provides insight and fuel for crafting historical questions. This article makes that concept clear.

    1. chartjunk

      love the name and how true it is to what this actually means

    2. Most visualization software do not automatically create legends, and so they become a neglected afterthought.

      legends are necessary for deciphering visualizations; it's a shame the software often does not include a legend automatically

    3. If choosing the data to go into a visualization is the first step, picking a general form the second, and selecting appropriate visual encoding the third, the final step for putting together an effective information visualization is in following proper aesthetic design principles.

      general template

    4. Keep in mind that most projectors in classrooms still do not have as high a resolution as a piece of printed paper, so creating a printout for students or attendees of a lecture may be more effective than projecting your visualization on a screen.

      Important in classrooms.

    1. Your audience should influence your choice of color palette, as readers will always come to a visualization with preconceived notions of what your graphic variables imply.

      audience important to design decisions, cannot assume familiarity

    2. About a tenth of all men and a hundredth of all women have some form of color blindness. There are many varieties of color blindness; some people see completely in monochrome, others have difficulty distinguishing between red and green, or between blue and green, or other combinations besides. To compensate, visualizations may encode the same data in multiple variables.

      Data visualizations are useful for simplifying data to communicate meaning, and reflect the importance of using multiple texts and visuals in teaching. To help students with varying skill levels and abilities, making visuals that bear in mind ability and disability is essential.

    1. These three variables should be used to represent different variable types. Except in one circumstance, discussed below, hue should only ever be used to represent nominal, qualitative data. People are not well-equipped to understand the quantitative difference between e.g. red and green. In a bar chart showing the average salary of faculty from different departments, hue can be used to differentiate the departments. Saturation and value, on the other hand, can be used to represent quantitative data. On a map, saturation might represent population density; in a scatterplot, saturation of the individual data points might represent somebody’s age or wealth. The one time hue may be used to represent quantitative values is when you have binary diverging data. For example, a map may show increasingly saturated blues for states that lean more Democratic, and increasingly saturated reds for states that lean more Republican. Besides this special case of two opposing colors, it is best to avoid using hue to represent quantitative data.

      important to note
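The diverging case described in the passage (two opposing hues, with saturation carrying the quantity) can be sketched with Python's standard `colorsys` module. The function name `diverging_color` and the convention that `lean` runs from -1 to 1 are assumptions made for this example.

```python
import colorsys

def diverging_color(lean):
    """Map lean in [-1, 1] to a hex color: negative is blue, positive is red.

    Hue only distinguishes the two sides; saturation encodes the strength.
    """
    lean = max(-1.0, min(1.0, lean))      # clamp out-of-range input
    hue = 0.0 if lean >= 0 else 2 / 3     # red or blue on the HSV color wheel
    r, g, b = colorsys.hsv_to_rgb(hue, abs(lean), 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

for value in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(value, diverging_color(value))
```

A value of zero comes out white, so evenly split regions read as neutral rather than as a third category.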

    2. The nature of each of these data types will dictate which graphic variables may be used to visually represent them. The following section discusses several possible graphic variables, and how they relate to the various scales of measure.

      guidelines: certain types of data for certain visualizations

    3. The art of visual encoding is in the ability to match data variables and graphic variables appropriately. Graphic variables include the color, shape, or position of objects in the visualization, whereas data variables include what is attempting to be visualized (e.g. temperature, height, age, country name, etc.)

      visual encoding

    1. There is no right visualization. A visualization is a decision you make based on what you want your audience to learn. That said, there are a great many wrong visualizations. Using a scatterplot to show average rainfall by country is a wrong decision; using a bar chart is a better one. Ultimately, your choice of which type of visualization to use is determined by how many variables you are using, whether they are qualitative or quantitative, how you are trying to compare them, and how you would like to present them. Creating an effective visualization begins by choosing from one of the many appropriate types for the task at hand, and discarding inappropriate types as necessary.

      guidelines

    2. It is important to remember that stylistic choices can deeply influence the message taken from a visualization. Horizontal and radial trees can represent the same information, but the former emphasizes change over time, whereas the latter emphases the centrality of the highest rung on the hierarchy. Both are equally valid, but they send very different messages to the reader.

      medium matters

    3. Whereas the previous types of visualizations dealt with data that were some combination of categorical, quantitative, and geographic, some data are inherently relational, and do not lend themselves to these sorts of visualizations. Hierarchical and nested data are a variety of network data, but they are a common enough variety that many visualizations have been designed with them in mind specifically. Examples of this type of data include family lineages, organizational hierarchies, computer subdirectories, and the evolutionary branching of species.

      hierarchical data visualizations for showing relational values

    4. In the humanities, map visualizations will often need to be of historical or imagined spaces. While there are many convenient pipelines to create custom data overlays of maps, creating new maps entirely can be a gruelling process with few easy tools to support it. It is never as simple as taking a picture of an old map and scanning it into the computer; the aspiring cartographer will need to painstakingly match points on an old scanned map to their modern latitude and longitude, or to create new map tiles entirely.

      good point. physical space and locations of towns or landmarks do not go unchanged over time. look at the sand creek example. the old map had the creek in a completely different location as nature had shifted the path of the creek over several decades.

    5. Keep in mind that often, even if you plan on representing geographic information, the best visualizations may not be on a map. In this case, unless you are trying to show that the higher density of populous areas is in the Eastern U.S., you may be better served by a bar chart, with bar heights representative of population size. That is, the latitude and longitude of the cities is not particularly important in conveying the information we are trying to get across.

      it seems like there are always alternative ways to visualize data

    6. Even these seemingly straightforward representations are loaded with significant choices, as laying two-dimensional coordinates onto a 3D world means making complicated choices around what map projection to use.

      maps still need interpretation and cannot be accepted as absolute

    7. Statistical charts are likely those that will be most familiar to any audience. When visualizing for communication purposes, it is important to keep in mind which types of visualizations your audience will find legible. Sometimes the most appropriate visualization for the job is the one that is most easily understood, rather than the one that most accurately portrays the data at hand. This is particularly true when representing many abstract variables at once: it is possible to create a visualization with color, size, angle, position, and shape all representing different aspects of the data, but it may become so complex as to be illegible.

      audience is very important when choosing a medium

    8. Our taxonomy is influenced by visualizing.org, a website dedicated to cataloguing interesting visualizations, but we take examples from many other sources as well

      reference site

    9. Often, because of change blindness, dynamic visualizations may be confusing and less informative than sequential static visualizations. Interactive visualizations have the potential to overload an audience, especially if the controls are varied and unintuitive. The key is striking a balance between clarity and flexibility.

      one model does not fit every audience. need to consider the audience when constructing a visualization!

    10. interactive visualizations allow the user to manipulate the graphical variables themselves in real-time

      interactive visualization definition. also sounds interesting with real time manipulation

    11. dynamic visualizations are short animations which show change, either over time or across some other variable

      dynamic visualization definition

    12. Static visualizations are those which do not move and cannot be manipulated

      static visualization definition

    13. A truly “objective” visualization, where the data speak for themselves, is impossible.

      need text and context! a visualization needs some explanation

    14. An information visualization differs from a scientific visualization in the data it aims to represent, and in how that representation is instantiated. Scientific visualizations maintain a specific spatial reference system, whereas information visualizations do not.

      !! important to note

    15. information visualization is the mapping of abstract data to graphic variables in order to make a visual representation.

      information visualization definition

    16. Microsoft Excel has a built-in sparkline feature for just such a purpose.

      Excel is beautiful.

    17. “Cartography is as much an art as it is a science.” While many of these choices are outside the scope of our book, they are significant.

      Cross section of art, history, and statistics in communicating meaning.

    18. Sometimes the most appropriate visualization for the job is the one that is most easily understood, rather than the one that most accurately portrays the data at hand.

      Primary function is to communicate meaning over precision?

    1. In a public world that values quantification so highly, visualizations may lend an air of legitimacy to a piece of research which it may or may not deserve.

      means that, like topic models without context or key explanatory features, visualizations can be misleading.

    2. The right visualization can replace pages of text with a single graph and still convey the same amount of information.

      really?

    3. Uses of information visualization generally fall into two categories: exploration and communication.

      important reference: 2 types

    4. This approach to distant reading– that is, seeing where in a text the object of inquiry is densest– has since become so common as to no longer feel like a visualization. Amazon’s Kindle has a search function called X-Ray (figure 5.2) which allows the reader to search for a series of words, and see the frequency with which those words appear in a text over the course of its pages.

      AntConc also has a feature somewhat like this: a bar graph of frequency throughout a document or documents. I wonder how different the results are, or if AntConc is more limited.

    5. Visualizations can also lie, confuse, or otherwise misrepresent if used poorly.

      just like topic models!

    6. Visualization is a method of deforming, compressing, or otherwise manipulating data in order to see it in new and enlightening ways

      visualization definition

    7. It is also common for visualizations to be used to catch the eye of readers or peer reviewers, to make research more noticeable, memorable, or publishable. In a public world that values quantification so highly, visualizations may lend an air of legitimacy to a piece of research which it may or may not deserve. We will not comment on the ethical implications of such visualizations, but we do note that such visualizations are increasingly common and seem to play a role in successfully analyzing data, proving your case for peer review, or help make your work accessible to a general public. Whether the ends justify the means is a decision we leave to our readers.

      Draws attention to marketing and business in history.

    8. When first obtaining or creating a dataset, visualizations can be a valuable aid in understanding exactly what data are available and how they interconnect. In fact, even before a dataset is complete, visualizations can be used to recognize errors in the data collection process.

      Crafting the visualization helps with making sense of the data; process and product are both key to understanding.
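
The passage in item 4 above describes seeing where in a text the object of inquiry is densest. A minimal Python sketch of that idea follows. It is an illustration only, not how Kindle's X-Ray or AntConc actually work internally; the function names `term_density` and `text_sparkline`, the equal-sized segments, and the simple tokenizer are all assumptions made for this example.

```python
import re


def term_density(text, terms, n_segments=10):
    """Count how often any of `terms` occurs in each of `n_segments`
    consecutive, roughly equal-sized slices of the text (hypothetical helper)."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    targets = {t.lower() for t in terms}
    counts = [0] * n_segments
    if not words:
        return counts
    for i, word in enumerate(words):
        if word in targets:
            # Map the word's position to the index of the segment it falls in.
            counts[i * n_segments // len(words)] += 1
    return counts


def text_sparkline(counts):
    """Render the counts as one character per segment, scaled to the peak."""
    bars = " ▁▂▃▄▅▆▇█"
    peak = max(counts, default=0)
    if peak == 0:
        return " " * len(counts)
    top = len(bars) - 1
    # Round up so that any non-zero count is drawn as a visible bar.
    return "".join(bars[(c * top + peak - 1) // peak] for c in counts)


if __name__ == "__main__":
    sample = "whale sea ship sea sea ship whale whale ship whale whale whale"
    density = term_density(sample, ["whale"], n_segments=4)
    print(density)
    print(text_sparkline(density))
```

Running the same function over two texts, or with two sets of terms, gives sequences that can be compared side by side, which is one way to check whether different tools report similar patterns.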

    1. The downside to this is that there have been many analyses and visualizations that have used the tools and metaphors of network analysis without any real appreciation of the dangers and limitations.

      Acknowledging and factoring in limitations are key to analysis!