1,068 Matching Annotations
  1. Sep 2021
  2. Jun 2019
    1. For the same reason, the algorithm can’t take the multiple meanings of words into consideration, so words such as “well” and “like” are often marked as positive, even when they’re used in neutral ways.
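
The limitation described in this passage can be illustrated with a minimal sketch of a word-list sentiment scorer. The tiny lexicon and the `score` function are hypothetical; they stand in for whatever algorithm the quoted article used, which is not specified here.

```python
# Hypothetical mini-lexicon: each word has one fixed score, regardless of context.
LEXICON = {"well": 1, "like": 1, "good": 1, "bad": -1, "terrible": -1}


def score(text: str) -> int:
    """Sum the lexicon scores of every word in the text."""
    words = (w.strip(".,!?;:\"'").lower() for w in text.split())
    return sum(LEXICON.get(w, 0) for w in words)


# A neutral sentence still scores as positive, because "well" and "like"
# are counted without any regard to how they are used.
neutral = "Well, it looks like rain."
print(score(neutral))  # prints 2
```

A scorer of this kind has no way to tell "like" as a comparison from "like" as approval, which is the behaviour the passage describes.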

    1. I find the text's analogy between a microscope and a macroscope interesting. In order for historians to see the bigger picture of an object, they have to break down the complexities and set them aside. It shows how much information can be discovered by looking at things from a large perspective.

    1. I first took a statistically significant sample of issues from my collection. I then helped design a program to overlay a grid onto each image of a newspaper page.

      A similar approach to laying down grids is used when designing posters/ads. The grid is one of the most crucial tools used by graphic designers.
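
As a rough illustration of overlaying a grid on a page image, the sketch below maps a point on the page to the grid cell that contains it. The function name, the parameters, and the choice of a uniform grid are assumptions made for this example, not details taken from the quoted article.

```python
def grid_cell(x: float, y: float, width: float, height: float,
              rows: int, cols: int) -> tuple[int, int]:
    """Return the (row, col) of the uniform grid cell containing point (x, y).

    The page is assumed to span 0 <= x < width and 0 <= y < height,
    with the origin at the top-left corner.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the page")
    col = int(x * cols / width)
    row = int(y * rows / height)
    return row, col


# A 100 x 200 page divided into 4 rows and 2 columns.
print(grid_cell(50, 50, 100, 200, rows=4, cols=2))  # prints (1, 1)
```

Once every item on a page has a cell, the amount of space given to each kind of content can be tallied cell by cell.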

    2. National metropolises such as San Francisco, Baltimore, or Boston were surprisingly muted in the Houston Daily Post relative to their populations, especially in comparison to the midwestern cities of Chicago, St. Louis, and Kansas City

      I don't think this was a great idea, only because information shouldn't be kept but rather spread to similar growing cities.

    3. The second period, 1894-1901, marked Houston’s final years as a minor commercial city before its ascension to the energy capital of the United States during the twentieth century.

      This is interesting because this was right around the time of the war, a time when energy is most needed.

  3. May 2019
    1. Run through its various steps so that you end up with a json file of results

      Not sure what this means. A JSON file of results?
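
A JSON file is a plain-text file that stores structured data as nested lists and key/value pairs. The sketch below writes a small set of results to a JSON file and reads it back; the field names and values are invented for illustration and are not the output of the tutorial being annotated.

```python
import json
import os
import tempfile

# Hypothetical results: these field names and values are made up for the example.
results = [
    {"title": "Example page", "year": 1894, "matches": 3},
    {"title": "Another page", "year": 1901, "matches": 0},
]

# Write the results to a JSON file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "results.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

# Read the same file back into Python objects.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded[0]["title"])  # prints Example page
```

Opening `results.json` in a text editor would show the same records as readable text, which is why the format is a common way to pass results between tools.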

    1. Crowdsourcing is becoming more widespread, and thus, it is important to understand exactly how, and if, it works. It is a viable and cost-effective strategy only if the task is well facilitated, and the institution or project leaders are able to build up a cohort of willing volunteers.

      This summer a colleague of mine and I are working on creating digital brushes for industrial design students. We have begun to develop some brushes for specific platforms, but the idea of creating a platform to crowdsource brushes is very unique. From a business standpoint this allows for a wider variety of brushes and expertise, as designers would be able to post their brushes on the platform and potentially sell them. The platform would take a royalty on each brush set sold, similar to how Shopify and Amazon run. Relating back to the course, this allows for the "true" goal to be achieved (in my case the ideal brush sets).

    2. In short, while Bentham’s manuscripts comprise material of potentially great significance for a wide range of disciplines, much of the collection – far from being even adequately studied – is virtually unknown.

      Referring back to one of my previous annotations, T-level knowledge is key in today's workforce. To see that Bentham's manuscripts cover a wide range of disciplines is reassuring for our future approach to industrialization.

    3. Crowdsourcing is an increasingly popular and attractive option for archivists, librarians, scientists, and scholarly editors working with large collections in need of tagging, annotating, editing, or transcribing.

      This is interesting because a similar method of crowdsourcing was used when IBM developed AI software to "argue with humans". This method is similar to some of the readings from last week, where historians try to crowdsource their information to come closer to the truth.

    1. A book I wrote was recently published. It is on making and selling of satirical prints in Britain – mostly London – during the late eighteenth and early nineteenth centuries. It has been on my mind, across my desk, and in my Dropbox for a long time. Indeed the postdoctoral fellowship that started the research was my first proper foray into ‘digital’ history. As the fellowship application stated in 2012, my plan was to:

      I understand this from a historian's perspective, because sometimes historians fill in the blanks of missing information with their predictions and theories about the context behind the artifact, message, piece of writing, or other source. The information presented is not always interpreted in a single way; there are multiple meanings and understandings behind it. That is why Digital History is built to include these details that might solve the issue of blank information in the future.

    2. For those who made satirical prints, these contacts included suppliers of raw materials, individuals and businesses who could undertake out work, and groups whose trustworthiness could be guaranteed. Relationships within these networks were established and maintained by direct, indirect, environmental, and community ties.

      I find this interesting because it is similar to what we find today online with ecommerce sites. Shopify has basically integrated all of the contacts/help needed for distribution and supply of the products you order.

    1. logged into


    2. THIS TEXT


    3. To access our virtual computer, the DH Box, you will need to use Carleton's VPN service.

      I had difficulty installing the VPN in Chrome. Had to consult IT for help; they said to use Firefox.

    4. Each week you will upload a fail log to your GitHub account. You will include a link to your 3 most important Hypothes.is annotations and reflect on how those 3 annotations were meaningful to your learning. In the first week you will also reflect on your worries for the course, any technical issues you encountered, etc.

      More clear on what to write in the fail logs. How do I reconfigure the text sizing etc. for my published part? Do the # symbols represent the titles?

    5. THIS TEXT


    6. THIS TEXT

      First annotation

    7. THIS TEXT and leave an annotation!


    8. Remember then: make your annotations in our HIST3814o group!

      Is this how to post to the group? Add a tag?

    1. .  I try to publish open access as frequently as possible and share that work online.

      It's clear that digital history is a passion of hers. Here she shows how open data can actually work.

    1. The commodification of ideas as currency in academia means that writing is often concealed until publication,

      I really appreciate the way the author has articulated this. This statement comprehensively encompasses the issue with data or notes that are not open while also explaining the need for open data particularly in appreciating and enriching the writing process.

    2. The commodification of ideas as currency in academia means that writing is often concealed until publication, leaving the interim versions in the struggle towards a publishable version unseen.  These processes often leave the academic writer isolated. Writing in public counters this.

      This is worrying yet hopeful to read. In design, we are always taught to test and show our initial work; feedback is crucial to developing a good product that doesn't turn obsolete in a few months. At first no one wants to share their designs, as we worry our concepts might get stolen, but a bad stolen idea is still a bad idea. It is even more crucial that writing becomes open source soon so that the validity of information can be checked more precisely. It is good to read that things are changing though.

    1. it’s our data, we collected it, and if somebody else wants the data, they should collect it themselves.

      I think this is a huge sentiment that historians and scholars in all fields should and will begin to overcome for the benefit of all by adding convenience to research, increasing efficiency, and encouraging engagement in the community.

    2. As data management plans become mandatory components of research proposals, maybe we should start looking out for what historians will be doing with their notes and research data? It’d be a potential to really kickstart historical research, speed up some research, increase efficiency (time for me to duck), and help decrease PhD completion times. Not a magic bullet, but … maybe 10% of one?

      Efficiency is important, as we are always competing. Digitization could allow Canada to compete internationally in research!

    3. As data management plans become mandatory components of research proposals, maybe we should start looking out for what historians will be doing with their notes and research data? It’d be a potential to really kickstart historical research, speed up some research, increase efficiency (time for me to duck), and help decrease PhD completion times. Not a magic bullet, but … maybe 10% of one?

      I found managing research notes and information was crucial when researching for my 4th-year capstone project. The initial phase of research was group based (I was in the water purification group) and all our information was presented in one file to keep track of any repetitive information being provided to our overall research. This shows that Milligan is open to the idea of "openness".

    4. I know I’m looking forward to depositing my data. Maybe other historians will cite it, I’ll put it on my annual review – or maybe even my CV – and we can slowly start having a professional shift here in Canada.

      This definitely shows his viewpoint, which aligns with mine. If we are able to deposit sources, and create a more efficient and helpful research platform, it will help the majority of people out.

    1. Sharing is something that tends to make scholars, qua scholars, happy; presumably it’s why we are in the business of writing, speaking, and teaching in the first place.

      While the idea of an "open notebook" seems scary, it certainly furthers academic exchange. Especially as a student, the opportunity to learn from other scholars' notes, methods, or failures would greatly contribute to the learning and teaching process.

    2. Hyperlinks would not have solved the other weakness of Phillips’s notebook: its inability to track, at a fine-grained level, changes to a page or to his thinking over time. Digital notebooks, however, could overcome this challenge as well. The solution here is version control, a technology familiar to the open-source software world and embedded (behind the scenes) in many of the tools historians already use. Microsoft Word’s “track changes” feature is essentially a version of version control, a way of seeing precisely how a text has been modified at a particular moment of time. Wikipedia’s “history” pages provide a more powerful version of the same feature. And as Konrad Lawson has shown in a recent Profhacker series on Github, programs like Git provide the most powerful version control systems of all, allowing their users exceedingly fine-grained views of when and how files were changed.

      I found this entire paragraph interesting because it takes into account the method of checking, in this case "version control". Version control reminds me of micromanagement; the devil's in the detail. I think that is where the problem lies: because historians have so much data to synthesize, tracing back the sources for that information can be very difficult. By simplifying interfaces to help others help you (similar to what Microsoft Word has done), this task of micromanaging your sources no longer exists, allowing historians to do what they do best, which is tell our history.

    3. These notifications could range from one that says, essentially, “if it isn’t in the notebook others can assume that you haven’t done it,” to more limited notifications that say clearly “others cannot assume that if it isn’t in the notebook you haven’t done it.”

      I think that this is a good idea because it leaves some discretion to the researcher as to what they want to be published. On the other hand however, I can see how this may not be the best idea, because as readers using the source/research, it would be hard to understand what is and isn't included.

    4. The truth is that we often don’t realize the value of what we have until someone else sees it

      I really like this quote. It's a good reminder that a lot of the time we undermine our own work, when in reality it is actually good and useful for others. We don't tend to see the hard work and time put into our own research until others point it out.

    5. Open Notebook

      It is clear that McDaniel is supportive of a really wide form of openness through the use of open notebooks.

    6. all content is shared immediately or “without significant delay.”

      While I think this is a cool idea and could be great for research, I'm wondering as a student how useful it could really be to us. Many profs ask for sources to be scholarly and peer reviewed. I can see this working well for those wanting to conduct their own original research but not so much for a student trying to write a paper.

    1. otherwise change how they do their work?

      This is an important insight that is not really expanded on throughout this article, as it exceeds its scope. However, I think that increasing "openness" could certainly change the way historians do their work whether it be more careful, or simply making research easier. The work of the historian will look very different.

    2. you are only a click away from scans of many of the declassified primary sources Suri used to develop his argument.

      This is a very new and important development in research today. Not only does it allow the reader to better understand the sources behind a historian's argument, but also to view the actual source being referenced from their own point of view.

    3. This kind of double checking doesn’t happen that often largely because it is so time consuming

      Digitization could help reduce the spread of fake information. This has become more important in recent years.

    4. So, the question is, when it takes 15 seconds instead of 15 hours to fact check a source do we think historians will start to write differently, or otherwise change how they do their work?

      It is clear that Owens is in favour of "openness". His idea of linking footnotes is a great way to get easy access to the sources to check the legitimacy of the content.

    5. How many people would retrace a historian's footsteps through archives scattered around the world to double check each citation?

      I assume this question is a rhetorical one, although I don't believe that it should be with the technology we have today. With AI being able to understand discrepancies in hardware and software, easy validation should be at our fingertips by now. It is almost crucial to be able to do this with so much information on the internet.

    6. every professional and amateur historian will be able to end their papers with. “You can find the documents cited in this paper @ Zotero Commons.”

      I think it is pretty clear that Owens is in favour of an idea of openness. His idea of having linked footnotes is very interesting and different from the other articles; I think this may have to do with his experience as a librarian.

    7. You might think the linked citations I just mentioned are something that will never happen

      Obviously this article was written in 2008, and I think it can be said that in 2019 there has been a greater move towards this kind of thing.

    1. git push origin experiment

      This should be git push -u origin experiment as in the video

    2. fork to make a copy of someone else's repo. clone to copy a repo online onto your own computer.

      So the difference between these two is that fork makes a copy of a repo onto your github page, while clone copies a repo onto your computer, right?

    3. link outwards to four websites that are relevant to your piece.

      I tried the two methods of inserting an image: inline style and reference style. But the output of both seemed to be the same. I wonder what I have done wrong here?

    4. Markdown syntax

      I tried some Markdown syntax in Dillinger. Seems like some groups of symbols can achieve the same thing; for example, '-' and '*' both create bullet list format, and '>' and enclosing '*' both make italic font style. Are those groups of symbols used interchangeably? Or do they actually have unique meanings?

    5. Click on the file name, and it will download to your computer where you can open it with a text editor

      Make sure to have downloaded a text editor beforehand.

    6. Carleton students: If you are on campus or are logged into Carleton's systems via a VPN, go to the sign up page at Other folks: Go to the DH Box sign up page at http://dhbox.org/signup. These are two separate installations of DH Box; whichever one you start with, continue to use

      I was wondering why I was not able to log in; it is because there are two different sites.

    7. See how easy that was? Don't worry about submitting this.... yet.

      It was actually easy, not too frustrating at all!

    8. we merge with $ git merge experiment.

      Don't forget to use the actual name of your branch if you didn't use the name experiment. I just spent 10 minutes messing around before realizing I named my branch testing, not experiment!

    9. Open your readme.md file again

      Was a little confused about how to do this, as I thought the nano command was only for creating and not for editing!

    10. Let's take a look inside that new file you created.

      I created the file, but my MacBook will not allow me to open it once it is downloaded.

    11. Grab at least two Creative Commons images

      FYI: I had an issue with this. It didn't work when I copied the link straight from the address bar. It worked when I right-clicked and copied the web address.

    12. because copying and pasting preserves a whole lot of extra gunk that messes up your materials

      Ctrl+Shift+V gets rid of all the gunk 9/10 times. It pastes what you've copied without the formatting from where you copied it!

    13. the right side shows you what your text will look like if you converted the text to HTML.

      Is HTML a markdown language?

    1. And some focused thinking about the ways we communicate with those publics is in order, I would suggest, because many of our fields are facing crises that we cannot solve on our own

      Things in our world today require multiple sets of skills and knowledge. To start your own engineering consultancy you must be an engineer, but you also must know how to run a business. Because of this, many fields are getting more and more specialized within specific sectors and need an acquired set of skills. Open sourcing people's work is a form of skill sharing that allows us to solve problems that we never could have solved on our own.

    2. everything in their educations to that point had prepared them for interrogating and unpacking, demystifying and subverting, all of the most important critical acts of reading against the grain, but too little emphasis had been placed on the acts of paying attention, of listening, of reading with rather than reading against.

      This quote reveals a lot about education and expectations from students. Throughout my high school years, all readings were so focused on symbolism and pointing out all of these parts of books and articles that in the grand scheme, didn't make a difference to the point of the article. When I arrived for my first year of university it was a shift I had to make, because reading was no longer so focused on these minute details but rather more about understanding the content that was being presented.

    3. Someone particularly visible makes a publicly disparaging remark about what a student is going to do with that art-history degree; commentators reinforce the sense that humanities majors are worth less than pre-professional degrees with the presumption of clearly defined career paths

      While I'm not a humanities major, I feel like similar sentiments are felt for Poli Sci. But Carleton in terms of Poli Sci has become innovative for this. Carleton offers a master's in Political Management which focuses on providing Poli Sci majors the opportunity to study and practice politics in the face of real life experiences. The program is trying to fill this gap and create a program that can offer students a "clearer" career path.

    1. For now, I am stuck in the middle, and all I want is for this project to be called done.

      This is definitely frustrating, when you think it will go a certain way, and you're excited, to it not being anything like the way you planned and it is no longer fun. It is certainly relatable in many aspects, but I think it is important to see this coming from someone who is an academic higher up in the field. It is a strong reminder to push through, even when things may get tough.

    1. # H1

      Hey Class, remember that it is important to leave a space after the # in order for the command to work.
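
The point about the space can be made concrete with a small check. The sketch below assumes a CommonMark-style rule (one to six `#` characters followed by a space, a tab, or the end of the line); the `heading_level` function is a hypothetical helper, and some Markdown renderers are more lenient than this.

```python
import re

# Up to three leading spaces, one to six '#' characters, then whitespace or
# end of line. This follows the CommonMark-style rule assumed above.
HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)")


def heading_level(line: str) -> int:
    """Return the heading level (1 to 6), or 0 if the line is not a heading."""
    match = HEADING.match(line)
    return len(match.group(1)) if match else 0


print(heading_level("# H1"))  # prints 1
print(heading_level("#H1"))   # prints 0, because the space is missing
```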

    1. how to work with GitHub to foster collaboration

      This is great! This will help make international research more feasible. It is great to find new ways to communicate and interact online when conducting projects. It will save time, money, and resources.

    1. Open Data

      As a Poli Sci major I've used some of the data that was mentioned above, and I do agree that Canada is making a great effort to make more data open. Canada's Access to Information has also become a really great tool if the information you get isn't totally redacted!

    1. By measuring trends, ideas, and institutions against each other over time, scholars will be able to take on a much larger body of texts than they normally do. For example, in applying Paper Machines to a hand-curated text corpus of large numbers of bureaucratic texts on global land reform from the twentieth century, it has been possible to trace the conversations in British history from the local stories at their points of origin forward, leaping from micro-historical research in British archives into a longue-durée synthesis of policy trends on a worldwide scale.

      The increase in big data allows historians to have a better look at trends and patterns rather than individual moments or events.

    2. Big data tend to drive the social sciences towards larger and larger problems, which in history are largely those of world events and institutional development over longer and longer periods of time. Projects about the long history of climate change, the consequences of the slave trade, or the varieties and fates of western property law make use of computational techniques, in ways that simultaneously pioneer new frontiers of data manipulation and make historical questions relevant to modern concerns.

      This is significant as it changes the nature of questions historians may pose or seek to answer

    3. Over the last decade, the emergence of the digital humanities as a field has meant that a range of tools are within the grasp of anyone, scholar or citizen, who wants to try their hand at making sense of long stretches of time.

      I think that it is important that history is becoming digitized, as the whole world is heading in that direction, from business to government. It is crucial that history follows suit.

  4. www.themacroscope.org
    1. Historians must be open to the digital turn, thanks to the astounding growth of digital sources and an increasing technical ability to process them on a mass scale

      Discussing the digital world and history in the same thought has never really occurred to me. This is an interesting and important statement made by the author, as it points to the importance of technology for the historian and the historian's work. While the digital world brings great opportunities for vast and expansive research, it also brings challenges for historians whose sources are now drastically different than before.

    2. the more recent shift from “humanities computing” to the “digital humanities.”

      Interested to see what this shift means and also the how and why

    1. As history becomes digitized in ever-increasing scales, historians without the ability to research both micro- and macroscopically may be in danger of becoming mired in evidence or lost in the noise.

      I find this interesting because, similar to the design process for developing products, it is important to look at the small goals (the detailed features in the products) and the large-scale goals of the product long after it is launched. It is only when both are taken into account that you can really come up with something legitimate.

    2. Microhistory involves the rigorous and in-depth study of a single story or moment in history, whereas macrohistory susses out long-term trends and eddies,

      I find this interesting because science in today's world is like microhistory, while science fiction in today's world is like macrohistory, where we can see how far we can go in the future with the little we have now.

    3. good historians, like good detectives, test their merit through expansion: the ability to extract complex knowledge from the smallest crumbs of evidence that history has left behind. By tracing the trail of these breadcrumbs, a historian might weave together a narrative of the past.

      I think this is amazing and hope we can do some of this in the course. It is like solving a complicated real life puzzle.

    4. Often, macroscopes produce textual abstractions or data visualizations in lieu of direct images.[1]

      Not sure what they mean by "produce textual abstractions"

    1. The Digital Humanities—and by inclusion, Digital History—cannot be a playground for the privileged. Letting it become so will undo decades of important work done in the humanities to listen for and amplify the voices of those who are too often ignored. The instrument of the digital historian, a macroscope, is just as able to obscure the context of violence as it is to highlight that violence.

      Not too sure what they mean by privileged, is there a way to measure this?

    2. By not explicitly pointing out tools and approaches that embrace feminist values and diverse outlooks, we risk perpetuating incongruities, barriers, and biases in DH research

      I completely agree with this, as you literally double the number of people researching one topic.

    3. We wrote it in the open, inviting the world to contribute their edits, ideas, and advice for our final draft. We engaged with our readers, and were heartened and encouraged when the Macroscope began appearing on syllabi; when students started leaving comments, we were overjoyed.

      This is great to read as user feedback can contribute a lot to user research when building or writing something catered towards a specific user.

    4. I find this line interesting because once you remove the tangible product it immediately loses its ability to become obsolete. This allows us to make changes, yes, but from the perspective of valid information, this gives opportunity for biased or wrong information to be published illegitimately.

    1. Historians rarely use phrases like ‘abstraction’ and ‘data models’, but these are things we do and make all the time in our research, just in less formal ways and in formats that are less easy to process as data, to run algorithms against, to visualise, to tabulate, and to reproduce.

      Really interesting to think about. Both of these readings by Baker highlight the fact that things seen as relevant only to the tech world do overlap with other fields like history. I think this demonstrates that if we maximize our use of technology we can achieve greater results and understandings of what we wish to study.

    1. . A blogging platform is a good way of doing that. FYI, here’s mine: Electric Archaeology; I’ve tried another variation on the model here.

      I am a bit confused about the blog post thing. Are we supposed to create a blog post every week? And where/how?

    1. I was curious to figure out if I could train the computer to write an Indiana Jones script.

      This seems like it will be difficult to learn to do. I am worried about coding.

    1. The project thus leaves a legacy to future researchers, to enable them to point their macroscope toward the trials, to make sense of that exhaustive dataset of 127 million words.

      This is so interesting because I find that when reading scholarly journals or even just learning about topics that involve big data I never think about how this data is actually collected.

    2. Data Mining with Criminal Intent

      I would like to explore more about the "Data Mining with Criminal Intent" project and how it relates to the criminal system today. Has it changed over time?

    3. Tackling a dataset of this size, however, requires specialized tools. Once digitized, it was made available to the public through keyword searches. Big data methodologies, however, offered new opportunities to make sense of this very old historical material

      It was really interesting to see how the preservation of history has modernized along with humanity, giving everyone access to historical information and making access to the foundations of history a right and privilege for all.

  5. Jun 2018
    1. The grid system

      I never knew such a concept could be implemented literally almost everywhere we look. Awesome.

    2. I would have missed a huge amount of content that proved foundational to how a newspaper produced space for its readers

      I think it was important that he didn't limit the data and took a chance on using a non-traditional method. Ultimately it worked in his favour, although it is always risky!

    3. computing allows us to access and make sense of otherwise incomprehensibly vast amounts of information

      Just learning how important this is now!! We can take big sets of data, analyze them much more quickly, and categorize/organize them to make sense to us.

    4. I wrote a computer program to track how frequently a newspaper printed specific geographic place-names to re-create how it produced space

      Sort of like what we did in RegExr, where we used a formula to track all the (,) in our piece.
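
The idea of tracking how often a text prints specific place-names can be sketched in a few lines. This is a minimal illustration, not the author's program: the list of places, the sample sentence, and the `count_places` function are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical list of place-names to look for.
PLACES = ["Chicago", "St. Louis", "Kansas City", "Boston"]


def count_places(text: str) -> Counter:
    """Count how often each place-name in PLACES appears in the text."""
    counts = Counter()
    for place in PLACES:
        # re.escape keeps the '.' in "St. Louis" from acting as a wildcard.
        pattern = r"\b" + re.escape(place) + r"\b"
        counts[place] = len(re.findall(pattern, text))
    return counts


sample = "Cattle shipped to Chicago and Kansas City. Chicago prices rose."
print(count_places(sample).most_common(2))
# prints [('Chicago', 2), ('Kansas City', 1)]
```

Run over many issues of a newspaper, counts like these would show which cities it mentioned most often.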

    5. distant reading

      Can anyone explain what this means?

    6. newspapers were cheap and widely available

      So much of a change from now!! Newspapers today are somewhat outdated and often hard to come by. In addition, they aren't cheap!!! It's easier to read them online for free than it would be to go to the store and purchase one.

    7. Switching from one perspective to the other demonstrates just how much had changed over half a century.

      It is very important to show both perspectives in research to eliminate as much bias as possible. In this case they are showing the amount of change from both views.

    8. Houston Daily Post

      The Houston Post was a newspaper that originated in Texas!

    9. imbuing different neighborhoods with different meanings

      I think this is very true. Different neighbourhoods all over Ottawa have very different meanings, and look very different because of it. It gives them history and uniqueness!

  6. May 2018
    1. Crowdsourcing

      New term for me. I'm sure the reading will explain, but I googled it and came up with "enlisting services of a large number of people".

    2. team has successfully promoted its project

      So much work and money was put into this project!

    3. Transcribe Bentham has certainly made an impact on the academic community and libraries and archives profession;

      Could be used for our reading questions!

    4. the cost of printing and postage for which was around £360

      omg!!!! Not sure when this paper was written, but postage has become a lot more expensive in the past 2 years. Imagine the cost now!

    5. source material is a huge collection of complex manuscripts

      this seems like it is more beneficial with the huge collection of data that it is able to gather, i feel like this advancement was and could be very helpful

    6. The Bentham Project was founded in 1958, and since then 20,000 folios have been transcribed and twenty-nine volumes have been published

      On first reading this, I thought that far more should have been published since 1958. On rereading it I noticed the word "volumes" and realized this is quite a big difference from what I had originally thought.

    7. A project like Galaxy Zoo, for example, has successfully built up a community of more than 200,000 users who have classified over 100 million galaxies

      Truly amazing!! I'm shocked at the large data build-up.

    1. 90% of the data in the world today has been created in the last two years alone.

      I think that this is very interesting. The use of technology has been a growing phenomenon in the past few years; however, I would never have guessed that 90% has been in the last two years.

    1. But when checking sources becomes as simple as clicking a link what do we think will turn up everyone else’s footnotes?

      I think transparency is crucial when it comes to conducting solid research, when you pull from sources that can be checked it only strengthens the argument.

    2. you are only a click away from scans of many of the declassified primary sources Suri used to develop his argument. This gives the reader a radically transparent view into the source material supporting the case Suri argues.

      This is a game changer because with such easy access to a historian's sources it almost forces them to be more diligent in ensuring that the research they have done and the content used to back their claims are credible.

    3. radical transparency

      I like the author's use of the word radical. I think few people realize that history making is a political act, and that transparency/open access is ~democratizing~ power in an interesting way.

    1. Let’s get more data!

      woot woot!!! the more the merrier

    2. In general, you’d be right: most open data releases tend to do with scientific, technical, statistical, or other applications

      Before reading this I did not even know what open data was!

    1. These crises, I must acknowledge, are not life-threatening, not world-historical, not approaching the kind or degree of the highly volatile political situation we face both at home and in the world, living as we do at a moment

      I find this so innocent. The research is not one of the heavy topics that we are used to hearing about, but rather an important topic that isn't life-threatening.

    1. which events mark watershed moments in their history, and which are merely part of a larger pattern.

      It's amazing that the ability exists to conduct such a large scale evaluation of history in order to better understand these watershed moments. I've read about the tendency that we have in the 21st century to view the current political climate and state of the world as getting worse. In reality, many studies suggest that the situation and climate are actually improving over time, but just that it is easy to have a "grass is greener" attitude toward time periods you have not lived in. A lot of this is of course relative, but I am interested in seeing how big data might be able to track progress over time in this way.

    2. the emergence of the digital humanities as a field has meant that a range of tools are within the grasp of anyone, scholar or citizen, who wants to try their hand at making sense of long stretches of time

      It seems that in many fields, the digital age is bridging the gap between the abilities of a scholar and the abilities of a citizen to engage with different types of knowledge. On one hand this is very exciting because it suggests new opportunities and a greater sense of equality, but there are of course many challenges that come with this too. It seems that anybody can be an "expert" today, which is on one hand good for society, but can also challenge the authority of formally educated experts.

    3. societies were feeling overwhelmed about their abilities to synthesise the past and peer into the future

      It seems that the field of digital humanities serves as a way to work with and around the digital age rather than trying to combat it as many might consider when the digital age poses many challenges and can lead to information overload.

    1. rather I needed a proxy for difference

      Reading this really helped me to understand just how digital tools can be used to examine big data in a way that doesn't require the human researcher to do more than is possible. By setting up a proxy for difference, the researcher is able to compare this proxy against all of his prints in an efficient manner.

    1. I was accused of suppressing the digital, of providing a bad example that played into old habits and prejudices.

      While I think that ultimately what a researcher chooses to publish is their own decision, I would have to side with the student (though not necessarily his method) who opted for including the soft data. It seems highly important in digital history to show your data and progress in order to present the whole picture of how your ideas came to fruition and ultimately resulted in your argument. While presenting a precise argument is paramount, it would likely be more helpful for readers to show exactly how you arrived at that conclusion.

    1. long-term trends and eddies

      Being able to examine long-term trends by using a macroscope would be extremely useful in applying historical trends and data to the present day.

    2. A historian’s macroscope offers a complementary, but very different, path to knowledge. It allows you to begin with the complex and winnow it down until a narrative emerges from the cacophony of evidence.

      This macro approach seems to counter the tendency in academia to focus on a highly specific and micro topic. I am interested in seeing how this different approach will be explored in an academic setting.

    3. By tracing the trail of these breadcrumbs, a historian might weave together a narrative of the past.

      This breadcrumb analogy really interests me. Using this approach would allow for taking hard historical facts and turning them into more holistic and easily digestible narratives.

    1. I am interested in your progress.

      This is a good sentiment since not everyone learns at the same pace

    2. You push yourself until you get to the point where you are stumped

      I can see this being both frustrating and stimulating. Pushing myself, getting stumped and being okay with it.

    1. there is no recipe I can give you that will enable you to ‘do’ digital history

      This makes the class so interesting to me. Knowing that there's no exact way to 'do' digital history. It's all about how I meaningfully interpret and analyze the data.

  7. Sep 2017
    1. She can endow them with mental power by not frittering away her own powers of mind in foolish reading or careless methods of study. By her own self-respecting conduct she helps to give them the reverence for self which will insure their acting wisely.

      According to this text, the most important goal of female life is childbearing (passing on good genetics) and childrearing (passing on good behaviours). Female worth is equated with motherhood, both biological and as a practice.

  8. Aug 2017
    1. Museum in Shawville one step nearer reality With cooperation of OFY

      The story of this museum might be a good story to explore. You could crosslink annotations to other editions of the Equity, to videos, images, audio...

    1. Colonial Newspaper Database OR the Shawville Equity folder.

      or whatever other dataset you've put together (tweets, whatever).

      NB: has to be arranged as one file per document within your input directory
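      One way the "one file per document" layout could be produced, as a rough sketch only. The function name, the document ids, and the folder name `input` are hypothetical; the tool itself does not prescribe them.

```python
from pathlib import Path
import tempfile

def write_one_file_per_document(documents, input_dir):
    """Write each (doc_id, text) pair to its own .txt file inside input_dir."""
    out = Path(input_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for doc_id, text in documents:
        path = out / f"{doc_id}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths

# Hypothetical documents standing in for tweets, newspaper issues, etc.
docs = [("doc_001", "Text of the first document."),
        ("doc_002", "Text of the second document.")]

with tempfile.TemporaryDirectory() as tmp:
    written = write_one_file_per_document(docs, Path(tmp) / "input")
    names = sorted(p.name for p in written)
    print(names)
```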

    2. Double-click on the file you downloaded in step 1

      the tool has changed somewhat, and I'm getting an error on the dmg that he's made. I can make available an earlier version of this if folks are having trouble

    3. Assuming your files are in equityfolder,

      DHBox is not recognising the command?

    4. and learn the basics of R within R

      How do you use RStudio with the DHBox? How do you integrate the two of them once you have RStudio on your computer?

    5. shit+cmnd+rightarrow, on Windows shitf+crtl+rightarrow

      Only noting a typo in case Dr. Graham is looking for these. The content is clear.

    6. select 'coordinates'.

      It won't let me select anything under 'places', could this be why my map doesn't show up?

    7. the RezoViz tool to create a graph where people, places, and organizations that appear in the same documents (and across documents) are connected (you can find 'rezoviz' under the cogwheel icon at the top right of the panel)

      I don't see the gear symbol. I looked up RezoViz, and I can't tell if it's a separate Voyant tool or if I should be seeing it when I copy the text into the Voyant text box on the first page or in the results?

    1. quick visualizations using RAW

      I don't understand how RAW works or how you are supposed to use it?

    2. Back to wrangling

      Always reassuring to know that even the pros struggle sometimes! I have a feeling there will be a lot of "wrangling" for all of us during this week's exercises! :)

    1. http://bridge.library.wisc.edu/hw1a-Rcoding-Jockers.html

      This link is not working any more. I could not find the code on http://www.matthewjockers.net/

    2. This file was put together by Matt Jockers.

      Would it be possible for someone to take an issue of the Equity and do this for ourselves?

    1. name of a traitor like Paul Revere from those of two hundred and fifty four other men, using nothing but a list of memberships

      I would like to know how they came up with the list of organizations that are/were considered terrorist groups, was it through the author's employment at the Royal Security Administration? Or is that the whole "hush hush" part of this article. Just got me thinking! Perhaps this is an instance where someone could be hurt from this metadata.

    2. Once again, I remind you that I know nothing of Mr Revere, or his conversations, or his habits or beliefs, his writings (if he has any) or his personal life. All I know is this bit of metadata, based on membership in some organizations.

      It's quite amazing what you can derive from simple metadata

    3. people who seem to bridge various groups in ways that might perhaps be relevant to national security.

      As I was discussing in my reply, these commonalities are most intriguing to me.

    4. Here’s what that looks like.

      Visuals are so much easier to read than tables, in my opinion. I am enjoying learning about the tools that make this possible.

    5. Surely this is but a small encroachment on the freedom of the Crown’s subjects.

      I would agree that this is a small encroachment. Having recordings of meetings would be a much larger privacy issue, but the data the author is talking about sounds like something that would be fitting for the public record.

    6. I will show how we can use this “metadata” to find key persons involved in terrorist groups operating within the Colonies at the present time.

      Interesting use of metadata. I imagine it will offer a unique and entertaining glimpse into that time period.

    7. So, there you have it. From a table of membership in different groups we have gotten a picture of a kind of social network between individuals, a sense of the degree of connection between organizations, and some strong hints of who the key players are in this world

      The use of this kind of statistical analysis really helps viewers who are unfamiliar with this historical landscape build a better understanding of the social connections between individuals.

    8. Instead of seeing how (and which) people are linked by their shared membership in organizations, we see which organizations are linked through the people that belong to them both.

      This is an interesting way to think outside the box. Instead of focussing on the people, focus on the institutions they are associated with, which might lead to some of their human associations. This is essentially history policing work.
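      The flip between "people linked by organizations" and "organizations linked by people" can be sketched with a small membership table. This is only an illustration: the three people, three groups, and the 0/1 values below are invented, and plain Python lists stand in for whatever matrix tools the article used. Multiplying the table by its transpose gives person-to-person links; multiplying the transpose by the table gives group-to-group links.

```python
def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]

# Hypothetical membership table: rows are people, columns are groups
people = ["person_a", "person_b", "person_c"]
groups = ["group_1", "group_2", "group_3"]
membership = [
    [1, 1, 0],  # person_a belongs to group_1 and group_2
    [1, 0, 0],  # person_b belongs to group_1
    [0, 1, 1],  # person_c belongs to group_2 and group_3
]

# Entry [i][j] = number of groups person i and person j share
person_by_person = matmul(membership, transpose(membership))

# Entry [i][j] = number of people who belong to both group i and group j
group_by_group = matmul(transpose(membership), membership)

print(person_by_person)
print(group_by_group)
```

      The diagonal of each result just counts how many groups a person is in (or how many members a group has); the off-diagonal entries are the links.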

    9. “information acquired does not include the content of any communications”

      On the question of 'who is hurt by this', the obvious answer is those being followed in this manner. Though the content of communications is not possessed by the government, those who are being followed are still at the mercy of this kind of surveillance. This of course brings us to the questions that exist in our post-Snowden world. Though this big data is not always used in this kind of context, it is still possible. Imagine this scenario. Say we had no communications from Paul Revere and his only association with the American Revolution was that he knew some of the key players and attended some of the same clubs. Based on this data and this reading we would make the assumption that he was a key player, perhaps unnamed. (This is maybe a poor example, as Paul Revere's involvement in the revolution is so well documented.) As historians we would be granting credit to a person who may not have had any. This is perhaps a danger of analysing big data in this way.

    1. Had the story been told in simple chronological order, it would have been bland, perhaps even boring. What gave Harvey’s show power was his narrative technique.

      Such a clever approach especially for a radio show. It reminds me of the way TV shows cut to a commercial right at pivotal moments of suspense in the show. It hooks the viewers, just like how the key elements are held back until the very end of the radio program.

    1. Function words and other frequent words carry little meaning in modelled topics

      Is there a rule of thumb for this?

    2. models language instead of topics

      I think this is one of the issues with bookworm. Though I love it with a passion and intensity it does only model language. Language can sometimes reflect topic (See Martha Ballard's diary) but that is not always true.

    3. Three topics modelled on 64,000 song lyrics: baby like come oh yeah let know gonna m go never get one na re hey love ll wanna man get like baby know let go ll got gonna love back girl feel away want oh gotta time take hey na que de y like la m get el te re tu en mi ang yo un ya sa es

      Not sure I'm fully understanding this reading but this made me laugh!

    1. OCR is, at base, a process by which a computer program scans these images and attempts to identify alpha-numeric symbols (letters and numbers) so they can be translated into electronic text.

      I've played with Google's Cloud Platform OCR. It can be found here

    2. While such basic searches can, indeed, find stray information scattered in unlikely places, they become increasingly less useful as datasets continue

      I think we've all tried to find a needle in a haystack once or twice on Google Search. Learning how to properly navigate archives is such an important skill!

    3. which have not received nearly as much attention from historians as the political disputes between the United States and Mexico during this period

      This is a good example of how analyzing large amounts of data can give us insight we have not had before.

    4. This decision allows the interface to maintain a high level of response to the user’s queries and questions,

      It is important to keep this in mind when using massive amounts of data to an interface.

    5. since these are eras that were targeted by the initial phases of the Chronicling America project, and therefore are most likely to be overrepresented in that dataset

      This is a very important piece of information to know about the data. I wonder what would have happened if they did not know this.

    6. “off the shelf” interface widgets

      I wonder if the team considered creating their own widget and would it have been better for the project's goal of being used by outside sources or worse.

    7. We chose not to ignore them because that seemed to artificially increase the level of noise in the corpus, and we wanted to represent as refined—and thus as accurate—a sense of the quality of the corpus as possible

      This shows the integrity a digital historian needs to have. It would be easy in large data sets to cut corners.

    8. Stanford’s computer science department)

      I find it so interesting that a large number of people involved in the project are computer scientists.

    9. NDNP’s Chronicling America project

      These are the kinds of things we need to start seeing. A universal standard of digitizing will be incredibly useful in moving forward. However would this cause isolation of data that can't match the standard? Would some data be left out?

    10. but at base they attempt to find—and often quantify—meaningful language patterns spread across large bodies of text

      I know that a goal of linguistics is to find all the quantifiable similarities across languages. Using data mining techniques seems like an incredibly useful tool for the field

    11. The age of abundance

      The age of abundance. It makes me question what is going to happen in this new age as all these new techniques come out. Myself, I still find the most reliable sources are printed ones: books and journals. As we progress further and further into this age of technology in history, will there be many who cling to the older methods, or will historians and academics see the potentially paradigm-shifting benefits of technology? Will we leave something behind in moving to technology?

    12. Between these two models—the quantitative and qualitative—we hoped to fulfill the project’s central mission.

      Good problem solving to separate the mission into two models.

    13. If, for example, a search for a particular term yields 4,000,000 results, even those search results produce a dataset far too large for any single scholar to analyze in a meaningful way using traditional methods

      Think of using Google, but with the first two pages of results, traditionally the most relevant links, gone.

    14. word breaks (such as “pre-diction” for “prediction”)

      I remember in a previous module the word breaks drove me crazy! If they were used so often in old newspapers I can definitely see a challenge for OCR... searching for hits of "Lincoln" might not be so accurate if half of the hits were missed because they were typed as "Lin-coln".
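      A rough sketch of how broken words like "Lin-coln" might be rejoined before searching. It is not what the project itself did; the function name and the tiny vocabulary are assumptions made up for this example. A hyphenated pair is only joined when the joined form appears in a known word list, so genuine hyphenated words are left alone.

```python
import re

def rejoin_hyphenated(text, vocabulary):
    """Join 'pre-diction' style breaks when the joined word is in vocabulary."""
    def replace(match):
        joined = match.group(1) + match.group(2)
        # Keep the original text when the joined form is not a known word
        return joined if joined.lower() in vocabulary else match.group(0)
    # Two word fragments separated by a hyphen and optional whitespace or newline
    return re.sub(r"\b(\w+)-\s*(\w+)\b", replace, text)

# Hypothetical vocabulary; a real one would come from a dictionary or the corpus
known_words = {"prediction", "lincoln"}
cleaned = rejoin_hyphenated("a pre-diction about Lin-coln and a well-known man", known_words)
print(cleaned)
```

      The trade-off is that anything missing from the word list stays broken, so the list matters as much as the pattern.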

    15. optical character recognition (OCR). OCR is, at base, a process by which a computer program scans these images and attempts to identify alpha-numeric symbols (letters and numbers) so they can be translated into electronic text. So, for example, in doing an OCR scan of an image of the word “the,” an effective OCR program should be able to recognize the individual “t” “h” and “e” letters, and then save those as “the” in text form. Various versions of this process have been around since the late 1920s, although the technology has improved drastically in recent years. Today most OCR systems achieve a high-level of recognition accuracy when used on printed texts and calibrated correctly for specific fonts

      We have encountered issues with OCR in our previous module though! If the paper was blurred, smudged, wrinkled, had complex fonts or was faded OCR can make errors such as changing "rn" to "m"

    16. it would be difficult for a researcher to know whether a search that produced a small number of search results would indicate few discussions of Lincoln from that era or simply that few relevant resources we

      I didn't think of this problem but it certainly leads to the conclusion that no matter how large the data set is, the researcher has to understand the nature of the data to be able to accurately analyze data mining results such as these.

    17. in order to enable users of digitized historical newspapers to make more informed choices about what sort of research questions could, indeed, be answered by the available sources.

      and I see that this is indeed what they did! Preliminary data mining to create useful questions that can be answered by the data mining proper.

    18. represented rural or urban spaces, and whether there was enough quantity and quality of the data from both regions to undertake a meaningful comparison

      very important to actually understand the nature and content of your data before questions can be posed. I can imagine this is quite difficult; with a quarter of a million documents you somehow need to recognize some patterns BEFORE you even begin data mining and spatial analysis...there must be some sort of preliminary tools to scan the data before beginning the true work?

    19. 4,000,000 results

      I have created e-commerce websites in the past and you have to increase SEO by tagging products with specific words. If you have unrelated tags then search engines such as Google will filter you out making you less visible to people searching. It is important to understand what tags are specific to your product/article.

    1. Follow the installation instructions. Start Open Refine by double clicking on its icon.

      I'm having trouble getting my downloaded "openrefine.zip" file to install... did anyone else run into this issue?

    1. Keyness reveals that "women" is a statistical significant negative key word, which means male authors used it less frequently than female authors.

      This is very interesting. Definitely would explain a lot of the research though.

    2. Note that the process outlined relies on a back and forth between machine reading of the texts and close readings of the individual items

      Showcases that it has to be a team effort between computers and historians.

    3. Keyness reveals that "women" is a statistical significant negative key word, which means male authors used it less frequently than female authors.

      not exactly a surprise! but useful to see that data analysis confirms it
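      For anyone wondering how a "negative key word" could be computed, here is a sketch using the log-likelihood measure that is common in corpus linguistics. The counts are invented, the function name is hypothetical, and making the score negative when the word is under-used is a convention assumed here. The tool used in the reading may calculate keyness differently.

```python
import math

def signed_log_likelihood(target_count, target_total, reference_count, reference_total):
    """Log-likelihood keyness of a word; negative when the target corpus under-uses it."""
    combined = target_count + reference_count
    total = target_total + reference_total
    # Expected counts if the word were used at the same rate in both corpora
    expected_target = target_total * combined / total
    expected_reference = reference_total * combined / total
    ll = 0.0
    if target_count > 0:
        ll += target_count * math.log(target_count / expected_target)
    if reference_count > 0:
        ll += reference_count * math.log(reference_count / expected_reference)
    ll *= 2
    under_used = target_count / target_total < reference_count / reference_total
    return -ll if under_used else ll

# Hypothetical counts: a word appears 10 times in 1,000 words of the target corpus
# and 50 times in 1,000 words of the reference corpus
score = signed_log_likelihood(10, 1000, 50, 1000)
print(round(score, 2))
```

      A large negative score would mean the target authors used the word far less often than the comparison authors, relative to the size of each corpus.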

    4. I'd also note where "women" is not present at very high frequency.  What are those items talking about then in a volume about woman suffrage?  

      definitely an interesting question! Maybe it is the nature of language in the time period that the noun "women" wouldn't be mentioned frequently because legal/flowery/expressive language was used throughout? Just a guess

    5. This is extremely valuable for the “small” words that the human brains tends to slide over when reading [these words are often call “stop words” in machine reading because the programming ignores them before analyzing a corpus].

      I know I am guilty of only hunting for the main key words/points in a text & often miss the smaller details the first few times I read a text/data. However, by using a technique like corpus linguistics, you are guaranteed to be made aware of everything from the start.

    6. Creating a corpus is generally the most daunting obstacle that confronts the historian. 

      Is this a popular tool? If not, could the time it takes to prepare to use it be why it is not used? Who else could this software benefit other than historians?

    7. The texts comprise what is called the “corpus.” Computer-aided corpus linguistics looks for mathematical relationships between words in a body of texts.

      In Historian's Craft we had to use Voyant to do this with a primary source our professor uploaded. Although it frequently crashed and most members of my class had issues with it, I found this tool very interesting, as it demonstrates various patterns in written sources. How could this tool be used in historical research? Who uses this tool? For what purpose?

    8. comparing patterns and shifts over time and space

      This makes it harder than simply being conscious about what we do, why we do it, how we do it. If we begin with a distant reading based on these papers, how can we possibly know what shifts will take place in order to focus on the right words/thoughts over time? Perhaps going into this process before knowing what we are looking for is useful. We might note significant usages/ shifts of words or topics over time and then have an idea that we should be looking for them more deeply.

    9. close reading.

      Just as we discussed previously, the use of "outside" context and close reading in addition to this digital work is essential for good scholarship.

    10. Linux)

      As a personal note, I appreciate seeing Linux included because all too often it is forgotten or ridiculed by others.

    1. I thought it fitting to select the 1916-1918 diaries of a British soldier named Robert Lindsay Mackay [4] to be the text that I would first test this method on

      I listened to his final product, and it is really striking to hear the words of this soldier through sonification. It really brings his words to life and created an emotional response in me. I think that sonification is a very significant way in which to allow for greater interpretation of texts, especially historical ones.

    2. one difference with this proposed sonification method would be that the listener could be able to hear not only the broad trends, but also the outliers in word usage. This could potentially aid in guiding closer reading of a text.

      I like that sonification allows for both distant and close readings of texts. Many tools, such as text analysis tools, seem to focus on distant readings which are only possible through technology, but being able to actually hear the words which stick out from the norm could greatly impact the way in which the text is analysed.

    3. Second, with all of the data combined, we could be able to hear how different words are used in relation to each other in the text, and how these relations change over time.

      I think the ability to turn data into sound is really fascinating, as it creates a way in which the data can almost come to life and truly influence the way it's perceived. I think that being able to hear relationships between words in a text can really shape the way in which the data is understood.

    4. In a word cloud, the linear narrative of a text is reduced to a static representation that, some argue, perhaps obscures more than it reveals.

      I can see how word clouds can limit the way in which data is understood. Relying solely on text analysis could potentially harm one's understanding of their data. I think that using multiple types of analysis would allow for a greater understanding of a text and/or data, which would benefit those within the discipline. Collaborative work seems to generally produce more rounded interpretations, which consider the data through multiple lenses.

    5. There is value in reshaping textual data into a form that we can approach with a new set of eyes (or ears), in order to analyze the data and ultimately find new patterns or meanings within it.

      I completely agree that reshaping data allows for one to gain a new understanding of it. Sound is such a powerful sense, that it seems foolish not to explore the ways in which our hearing can interpret data rather than our sight.

    1. While simply having such a large volume of information online in digital form for researchers is valuable, the usual restriction to a web-based ‘search’ form interface often renders it of limited use and approachability

      I am confused about what this means. Does it mean that websites often restrict some of the information, or that not everyone is able to access all of the information?

    2. However, for researchers of twentieth- and twenty-first century history the opposite problem is also increasingly common.

      I find that often it can still be very hard to find information on certain topics. Most times when I have been writing papers for other classes, I find it very hard to find useful information or specifics that would help back up an argument.

    3. accompanying slides on the Quantifying Kissinger website.

      Broken links for me. (Not unexpected, content keeps changing) As an aside, this is a good example of rot in digital sources and perhaps a need for digital historians to take into account version controls in the raw information we process and also the idea that having a more reliable digital source of history from a point in time means backing it up along with information that is linked to it.

    1. Experts in the area have argued that the most powerful visualizations are static images with clear legends and a clear point,

      I think it's interesting that there is any sort of consensus at all about the "most powerful visualizations." How on earth would one measure that? Do they judge it by emotional response of readers? Or understanding? Or further applications of data? Are visualizations about communication or understanding, or a bit of both?

    2. In fact, even before a dataset is complete, visualizations can be used to recognize errors in the data collection process.

      This is cool! I've only really been thinking of what these programs can do after they process the data, but never really thought that they can help you before they even get started.

    3. The use of visualizations to show the distribution of words or topics in a document is an effective way of getting a sense for the location and frequency of your query in a corpus, and it represents only one of the many uses of information visualization

      I think this is one of the most interesting types of visualization. It is incredible what simply the use of words can show a historian.

    4. any visualization we create is imbued with the narrative and purpose we give it

      It's so important to keep this in mind: who are we helping/hurting in the way we arrange our data? Sometimes it's easy to forget that a picture has an underlying narrative as much as a text. Data, in whatever form, is ammunition; we need to aim carefully.

    5. not convey the sheer magnitude of difference between earlier and later years

      This point is a significant one. Scale is vital in graphs and visualizations as it determines how others will see your data. If you pick a particular scale so that all the data can be seen easily, that may diminish the point you are trying to make with that visual. There are also issues with this chart. For example, what about the years BCE mentioned in dissertation titles? Are they included? Does 500 cover both 500 BCE as well as 500 CE? These questions just show that sometimes visualizations cannot entirely replace text, because sometimes visuals raise questions in addition to answering them.

    6. Exploratory visualizations like this one form a key part of the research process when analyzing large datasets.

      These types of visualizations are important because they allow researchers to simplify their research. Instead of searching through all of the data someone can run different simulations and visualizations in order to get the data points you need. These visualizations are also an important part of open research. Other researchers can use the visualizations as a starting point for original research.

    7. visualizations can be used to get a quick understanding of the structure of data being entered, right in the spreadsheet. The below visualization, of salaries at a university, makes it trivial to spot which department’s faculty have the highest salaries, and how those salaries are distributed. It utilizes basic functions in recent versions of Microsoft Excel.

      These visualizations may be simple but they are important and easy to understand. They can be jumping off points for future analysis but also can aid in clarifying and simplifying the raw data, making it easier to determine what kind of analysis a researcher might do.

    1. patterns of usage

      Again, the importance of context

    2. Using the metadata from ASP

      I would like to see the data, at least samples of it to get the structure of what went into the visualization tool. (Maybe it is linked from somewhere here.) This goes back to the positive of open notebook DH.

    3. data visualization

      I recommend the short blog at this link; it led to http://app.rawgraphs.io/, which looks like a neat tool for visualization. It's used in the diagrams above.

    1. This data was scraped and converted into a table with a document for each row, and a column for every available metadata property.

      Great open notebook sample for a DH project.
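      To picture the "one row per document, one column per metadata property" layout, here is a small sketch. The records and property names are invented and the function name is hypothetical; it only shows the shape of such a table, not the project's actual scraper.

```python
def to_table(records):
    """Turn a list of metadata dicts into (columns, rows), one row per document."""
    # Every property seen on any document becomes a column
    columns = sorted({key for record in records for key in record})
    # Missing properties are left as empty cells
    rows = [[record.get(column, "") for column in columns] for record in records]
    return columns, rows

# Hypothetical scraped metadata for two documents
records = [
    {"title": "A", "year": "1900"},
    {"title": "B", "author": "X"},
]
columns, rows = to_table(records)
print(columns)
for row in rows:
    print(row)
```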