- Jun 2019
-
www.themacroscope.org
-
A good “sanity check” is to see if the algorithm you’re using puts the people or entities you know should be grouped together in the same community, and leaves the ones that do not belong out. If it works on the parts of the data you know, you can be more certain it works on the parts of the data you do not.
Helpful advice
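A minimal sketch of what this sanity check could look like in practice, assuming the algorithm's output is available as a mapping from node to community label. The names, labels, and the helper `check_known_group` are hypothetical and not from the text.

```python
def check_known_group(labels, members, outsiders):
    """Check a known group against community labels.

    Returns (grouped, intruders):
      grouped   - True if all known members share one community label
      intruders - known outsiders that landed in a members' community
    """
    member_labels = {labels[m] for m in members}
    grouped = len(member_labels) == 1
    intruders = [o for o in outsiders if labels[o] in member_labels]
    return grouped, intruders


# Hypothetical output of some community-detection run: node -> community id.
labels = {"Alice": 0, "Bob": 0, "Carol": 0, "Dave": 1, "Erin": 1}

grouped, intruders = check_known_group(
    labels, members=["Alice", "Bob", "Carol"], outsiders=["Dave", "Erin"]
)
print(grouped, intruders)
```

If `grouped` is False or `intruders` is non-empty for groups I am sure about, that is a reason to distrust the algorithm's results on the rest of the data.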
-
Network analysis aids in finding these unusually connective entities en masse and with great speed, leaving the historian with more time to explore the meaning behind this connection.
Again, network analysis can act as a tool in identifying the question rather than the answer.
-
If a historical network exhibits a long-tail distribution, with very prominent hub nodes, the structure itself is not particularly noteworthy. What is worthwhile is figuring out which nodes made it to the top, and why. Why do all roads lead to Rome? How did Mersenne and Hartlib develop such widespread correspondence networks in early modern Europe? The answers to these questions can cut to the heart of the circumstances of a historical period, and their formalization in networks can help guide us toward an answer.
Network analysis can produce fruitful specific questions as well as illuminate broader trends.
-
Social networks, article citation networks, airline travel networks, and many others feature a significantly more skewed degree distribution. A few hub nodes have huge numbers of edges, a handful more have a decent amount of edges, but significantly fewer, and most nodes are connected by very few edges.
I found this to be true in my work for Dr. Johnson on citation patterns in chemical journals.
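To see the skew described above, one can tabulate how many nodes have each degree. This is a rough sketch on a made-up toy edge list (one hub plus a few lightly connected nodes); the data and the function name `degree_distribution` are assumptions for illustration only.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of nodes with that degree (undirected edges)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


# Hypothetical toy network: one hub linked to five nodes, plus one extra edge.
edges = [("hub", n) for n in "abcde"] + [("a", "b")]
dist = degree_distribution(edges)

for k in sorted(dist):
    print(f"degree {k}: {dist[k]} node(s)")
```

Even in this tiny example most nodes sit at the low-degree end while a single node holds most of the edges.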
-
These global metrics are most useful when measured in comparison to other networks; early modern and present day social networks both exhibit scale-free properties, but the useful information is in how those properties differ from one another.
In other words, don't draw your conclusions solely from the network of a single dataset but from the comparison between two networks. This is also a good way to determine whether you have two datasets which can be effectively put into conversation with each other. If the networks which historically seem ripe for comparison are radically different, then one of your corpora may be incomplete, skewed, or otherwise aberrant. On the other hand, if they are identical, your datasets aren't capturing the features you wish to distinguish.
-
perform math incorrectly if they are included
A serious concern!
-
When two articles reference a common third work in their bibliographies, they get an edge drawn between them; if those two articles share 10 references between them, the edge gets a stronger weight of 10. A bibliographic coupling network, then, connects articles that reference similar material, and provides a snapshot of a community of practice that emerges from the decisions authors make about whom to cite.
I read some research on this when working for Dr. Johnson my first year. My reading focused on scholarly networks between chemical journals, specifically which journals cited which other journals and the degree to which that relationship was unidirectional (seeking validation or legitimation through a more prestigious journal) or reciprocal (trading on each other's established prestige for mutual benefit). A very interesting way of determining the degree and source of a journal's influence within the field and how that influence shifts over time.
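The edge-weighting rule in the quoted passage can be sketched in a few lines: for every pair of articles, the weight is the number of references their bibliographies share. The article and reference names below are invented placeholders, and `bibliographic_coupling` is a hypothetical helper.

```python
from itertools import combinations


def bibliographic_coupling(bibliographies):
    """Return {(article_a, article_b): shared reference count}, skipping pairs with none."""
    weights = {}
    for a, b in combinations(sorted(bibliographies), 2):
        shared = len(set(bibliographies[a]) & set(bibliographies[b]))
        if shared:
            weights[(a, b)] = shared
    return weights


# Hypothetical bibliographies: article -> list of works it cites.
bibliographies = {
    "article_1": ["ref_A", "ref_B", "ref_C"],
    "article_2": ["ref_B", "ref_C", "ref_D"],
    "article_3": ["ref_D"],
}
weights = bibliographic_coupling(bibliographies)
print(weights)
```

Here `article_1` and `article_2` share two references and so get an edge of weight 2, while `article_1` and `article_3` share none and get no edge.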
-
The directionality of an edge can have huge repercussions to a network’s structure, and so algorithms made for undirected networks to find local communities or node importance might produce very unlikely results on a directed network. Be careful that you only use algorithms made specifically for directed networks when analyzing them.
It is also important to note that visualizations can't demonstrate the degree of directionality an edge has. There is no good way to visualize a relationship that is highly directional as opposed to one that is mildly directional. Since weight and directionality can be, and often are, two independent variables, a heavily weighted arrow implies no more directionality than a lightly weighted arrow. This may be a point of confusion, however, for those reading a network diagram, unless it is made clear in the text.
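To illustrate how weight and directionality can vary independently, here is a small sketch that separates the two for a pair of nodes. The asymmetry score used here (difference over total) is my own assumption for illustration, not a measure taken from the text, and the counts are invented.

```python
def weight_and_asymmetry(a_to_b, b_to_a):
    """Return (total weight, asymmetry) for a pair of nodes.

    Asymmetry is an assumed, illustrative score in [-1, 1]:
    0 means fully reciprocal, +1 means everything flows from A to B.
    """
    total = a_to_b + b_to_a
    if total == 0:
        return 0, 0.0
    return total, (a_to_b - b_to_a) / total


# Hypothetical citation counts between two journals.
print(weight_and_asymmetry(90, 10))  # heavy and lopsided
print(weight_and_asymmetry(9, 1))    # light but equally lopsided
print(weight_and_asymmetry(50, 50))  # heavy and fully reciprocal
```

The first two pairs differ tenfold in weight but have the same asymmetry, which is exactly the distinction a diagram with weighted arrows does not show.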
-
Transitivity is the concept that when A is connected to B and C, B and C will also be connected. Some networks, like those between friends, feature a high degree of transitivity; others do not.
Networks with a single hierarchical power structure (unipolar) would thus tend to have less transitivity, while those where power is more diffuse (multipolar) would have more.
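A rough sketch of how transitivity can be measured, assuming the common global definition: the share of connected triples (A linked to both B and C) in which B and C are also linked. The two toy networks are made up to contrast a hub-and-spoke structure with a fully connected group of three.

```python
from itertools import combinations


def transitivity(edges):
    """Fraction of connected triples that are closed (undirected graph)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triples = closed = 0
    for node, nbrs in neighbors.items():
        # Every pair of neighbors of `node` forms a triple centered on it.
        for b, c in combinations(sorted(nbrs), 2):
            triples += 1
            if c in neighbors[b]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical examples: a hub with three spokes vs. three mutual friends.
star = [("hub", "x"), ("hub", "y"), ("hub", "z")]
triangle = [("A", "B"), ("B", "C"), ("A", "C")]

print(transitivity(star))
print(transitivity(triangle))
```

The hub-and-spoke network has no closed triples at all, while the three mutual friends close every one, which matches the unipolar versus multipolar intuition in the note above.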
-
vertices, actors, agents, or points
Standardized terminology is sorely lacking among historians. Good to know digital history is no exception.
-
Graphs have a tendency of making a data set look sophisticated and important, without having solved the problem of enlightening the viewer.[13]
Graphs as a form of legitimization rather than presentation, a cohesive and comprehensive picture of something which is anything but.
-
creating structural holes and becoming the link between communities
How would network analysis identify whether the Medici themselves created these holes or whether they were created by some other factor? Was this agency attributed by the network analysis itself or by a secondary form of historical research? Can agency be identified in a network analysis at all?
-
As much as networks reveal communities, they also obscure more complex connections that exist outside of the immediate data being analyzed.
The limitations of distant reading. You may see everything within the corpus, but nothing else. These complex connections are only visible when each is viewed individually within its own specific context, which is why digital history must always be in communication with other methods of historical analysis to capture the full richness of what is being studied.
-
We need to be extremely careful when analyzing networks not to read power relationships into data that may simply be imbalanced.
Like other forms of digital historical analysis, the conclusions drawn from network analysis apply only to the corpus studied, not the historical moment itself, and thus reflect first and foremost trends in the corpus. Selecting a balanced corpus and defending it as representative is essential before these trends can be tied to historical arguments. This is not unique to digital history, although the burden of proof may often be set higher for digital historians because of the broad scope of their work and the greater perceived legitimacy of its mathematical and statistical methods. All historians must be equally fastidious in defending their source selection as representative of their historical moment before extending their arguments any further than as a representation of their own source base.
-