  1. Jul 2017
    1. Occupations of Transcribe Bentham user survey respondents

      It makes sense that the largest percentage of people are those in academia and students. It would be very difficult to make Bentham sexy to anyone not already interested in history.

    2. Retaining users was just as integral to the project’s success as recruiting them in the first place. It was therefore important to design a user-friendly interface which facilitated communication in order to keep users coming back to the site, and to develop a sense of community cohesion

      Having a constant stream of activity and communication is key for a community like this. Ensuring that the interface is usable by more people would also help. It is interesting to read how they continued to gain collaborators and contributors as the project went on.

    3. distinguishing between a "crowd" and a "community". Contributions made by a crowd, which Haythornthwaite describes as "lightweight peer production", tend to be anonymous, sporadic, and straightforward, whereas the engagement of a community, or "heavyweight peer production", is far more involved. A community of volunteers engaged in the latter requires, Haythornthwaite suggests, qualitative recognition, feedback, and a peer support system.

      This is somewhat like what we're doing in class: we are contributing towards a collaboration, although the work itself is still our own.

    4. Time and money will need to be spent on interacting with volunteers, maintaining and developing the transcription interface in response to volunteer needs, continual promotion of the project, and checking and offering feedback on submitted work.

      Crowdsourced projects face challenges like these that projects run by corporations likely wouldn't run into as easily.

    5. Giving experienced and motivated volunteers moderator status may be one way in which crowdsourcing projects could improve community cohesiveness, and is something we would like to explore in the future.

      A good solution to my last question.

    6. volunteers appeared to prefer starting transcripts from scratch, and to work alone

      This sort of defeats the intention of open collaboration projects. How could participants engage with each other more? Peer revision?

    7. minimal evidence of interaction between Transcribe Bentham users.

      This also strikes me as odd. I would have thought that collaboration is especially important for deciphering Bentham's handwriting -- some people may be more familiar with it than others.

    8. any contribution to Transcribe Bentham is beneficial to the project;

      Hypothetically, what if someone tried to sabotage the project?

    9. survey respondents had no problem with this, which speaks to the mutual trust and respect between a project and its volunteers which is vital for success.

      Again, this is wonderful that people don't pride themselves on their work, but instead seek to make a contribution to the larger picture. This shows that they are truly participating because it is meaningful to them.

    10. it was thus surprising that Transcribe Bentham volunteers regarded "competition" and "recognition" to be of such low importance.

      I think it's great that volunteers found intrinsic meaning within the project, and simply didn't treat the process as a means to an end.

    11. appears to have had little impact in driving traffic directly to the site, despite staff using them on a routine basis for publicity, communicating with volunteers, and issuing notifications.[

      This surprises me. I would have thought that this is an effective way to spread the word and recruit participants.

    12. Google Adwords was a failure for us as a recruitment strategy. Our advert was displayed 648,995 times, resulting in 452 clicks, but sent no visitors to the Transcription Desk

      Why is this so? Would this be the result of accidentally clicking on an advert?

    13. "Team-building" features like these have been found to be useful in stimulating participation by other projects like Solar Stormwatch and Old Weather:

      This answers my previous question regarding whether other projects try to create this online sense of community.

    14. Each registered participant was given a social profile

      This illustrates the notion of trying to create a community. I wonder if most or all open source/crowdsourced projects do this.

    15. Retaining users was just as integral to the project’s success as recruiting them in the first place.

      I wonder what the turnover rate is for some of these projects. Maybe some participants would want to contribute, say, a single fact, whereas others would want to oversee a larger aspect of the project.

    16. This is a shame, as any contribution to Transcribe Bentham is beneficial to the project;

      This hardly seems true if the volunteer quoted could not even read Bentham's handwriting!

    17. advertisements in academic journals

      I wonder if, nowadays, crowdsourced/open projects will make use of social media to spread the word and recruit historians. This is a free alternative to spending money on advertising in academic journals.

    18. Ninety-seven per cent of survey respondents had been educated to at least undergraduate level, and almost a quarter achieved a doctorate.

      I wonder about the types of undergraduates represented in this number! It was much higher than expected... I thought that many truly amateur laypeople would attempt to volunteer their time.

    19. distinguishing between a "crowd" and a "community". Contributions made by a crowd, which Haythornthwaite describes as "lightweight peer production", tend to be anonymous, sporadic, and straightforward, whereas the engagement of a community, or "heavyweight peer production", is far more involved.

      I'm sure that different subjects would attract different numbers of people. Could a project with crowd support be of the same quality as one with community support?

    20. A community of volunteers engaged in the latter requires, Haythornthwaite suggests, qualitative recognition, feedback, and a peer support system.

      I like this distinction between the types of volunteers possible for a project! Which is preferable depends on the nature of the work, but the structured nature of a "community of volunteers" addresses issues of shoddy workmanship or inaccuracies.

    21. The Bowring edition omitted a number of works published in Bentham’s lifetime

      Is there a particular reason why some works were omitted, or was it just poor workmanship?

    22. the edition will run to around seventy volumes

      I wonder how long this process will take...

    23. Galaxy Zoo

      This sounds so neat! I will definitely be checking this out in my spare time.

    24. task is well facilitated, and the institution or project leaders are able to build up a cohort of willing volunteers. P

      I agree... I think that the planning and execution of crowdsourcing need a coherent strategy to be viable. Not only willing volunteers but clear instructions, so that the volunteers are producing useful and accurate work! I think that unorganized crowdsourcing projects could get very sloppy and end up taking more time to sort and correct the results. Enthusiasm is not enough!

    25. Crowdsourcing aims to raise the profile of academic research, by allowing volunteers to play a part in its generation and dissemination.

      I think this phrase "raise the profile of academic research" can be interpreted two ways...the first is that crowdsourcing could allow the general public to get involved in academic research in new and exciting ways that bring further attention to the research in question (media, etc.), but it could also mean raising the "quality" of academic research and I'm not sure if crowdsourcing always does this. I think that crowdsourcing can also lead to issues of lack of accountability for accuracy and sourcing, tainting results.

    26. We attracted an anonymous crowd of one-time or irregular volunteers, along with a smaller cohort of mutually supportive and loyal transcribers.

      To counter my previous annotation, I suppose the very nature of crowd-sourcing relies on the differing backgrounds and knowledge bases. You choose to use a crowdsourcing initiative with this in mind, fully aware that not all the participants come from the same background.

    27. Transcribing the difficult handwriting, idiosyncratic style, and dense and challenging ideas of an eighteenth and nineteenth-century philosopher is more complex, esoteric, and of less immediate appeal than contributing to a genealogical or community collection.

      Previously in the article it said the volunteers needed no special skills or relevant previous knowledge. If it's complex, is it really the greatest idea to let those previously mentioned volunteers deal with it?

    28. made searchable

      This would lessen the workload of those trying to find something in Bentham's work immensely. A searchable repository is so convenient and would make it accessible to so many other people.

    29. require no specialist training or background knowledge in order to participate

      This makes me question whether the quality would remain consistent across all the transcriptions.

    30. create a freely-available and searchable digital Bentham Papers repository

      This would be such a helpful feature. Bentham is so widely known and I have discussed him in so many different courses that I'm sure this database would be a huge help for those in school.

    31. will replace the poorly-edited, inadequate and incomplete eleven-volume edition published between 1838 and 1843

      Great to see progress. Our standards and expectations have changed over time and it is important to have work accessible that reflects that and can cater to modern needs.

    32. which is based in large part on transcripts of the vast collection – around 60,000 folios

      For large collections it makes sense to enlist many people to help. The project would be completed much faster with, presumably, just as much accuracy. The only issue I could see arising is giving credit, or matching recognition to the distribution of work.

    33. Galaxy Zoo, for example, has successfully built up a community of more than 200,000 users who have classified over 100 million galaxies

      For future reference.

    34. be accomplished more quickly and more cheaply by outsourcing them to enthusiastic members of the public who volunteer their time and effort for free

      It all sounds like a great idea, as long as speed and cost don't reduce quality.

    35. Crowdsourcing

      Crowdsourcing increases the number of eyes on a project and the minds working on it, which can bring different ideas and perspectives together.

    36. we also tried to build a dedicated user community to enable sustained participation by, for example, implementing a qualitative and quantitative feedback and reward system.

      I'm interested to see where this goes. I think both ideas will help create this dedicated user community they are looking for. Personally, I tend to be more involved when my work is recognized and appreciated.

    37. Transcribing the difficult handwriting, idiosyncratic style, and dense and challenging ideas of an eighteenth and nineteenth-century philosopher is more complex, esoteric, and of less immediate appeal than contributing to a genealogical or community collection. 7

      But could a volunteer who might not have much background or experience in the subject be able to comprehend and analyze the material correctly?

    38. build up a cohort of willing volunteers

      I can understand how this might be challenging at times

    39. a task usually performed by skilled researchers, via the web to members of the public who require no specialist training or background knowledge in order to participate. The project team developed the "Transcription Desk", a website, tool and interface to facilitate web-based transcription and encoding of common features of the manuscripts in Text Encoding Initiative-compliant XML. Transcripts submitted by volunteers are subsequently uploaded to UCL’s digital repository, linked to the relevant manuscript image and made searchable, while the transcripts will also eventually form the basis of printed editions of Bentham’s works.[4

      Answering my previous question! I find this so interesting. In The Historian's Craft we had a guest lecture by an archivist who explained the process of cataloguing and creating collections. Although I do not remember the specific terms he used, it was a long process for one person to create what many of us use for research. This not only saves time but gets others involved in the process. I wonder how often crowdsourcing is used? Does anyone know of other projects I could look at?

    40. This material has significant implications for our understanding of utilitarian thought, the history of sexual morality, atheism, and agnosticism. Bentham’s writings on his panopticon prison scheme still require transcription, as do large swathes of important material on civil, penal, and constitutional law, on economics, and on legal and political philosophy.

      This is so interesting. To be involved in this work would be fascinating. How does one get involved in crowdsourcing? Postings? Is it as simple as being picked to do the job, or is some experience needed?

    1. . Descriptive markup describes what the elements in a document mean, but not how they look, and CSS is intended to let the designer specify the rendering separately from the XML, so that meaning and appearance do not become conflated or confused. JavaScript JavaScript is a client-side programming language for manipulating, among other things, the appearance of web pages in the browser. Client-side means that JavaScript runs in the user’s browser, so that, for example, the user can change what is rendered in the

      Oh my God. I literally never understood what JavaScript was until I read this. I'm not sure yet how I'd use it, but I guess like everything else in this course I'll just have to play around with it one day.

    2. be used without reference to the Web or the Internet. For example, one could write XSLT to genera

      Aha! No wonder XML seems familiar to me.

    3. nd mark up in your documents is dictated by your research agenda, it is important to conduct your document analysis and develop your schema with your g

      This is a good point. I just see myself getting halfway into a text then realizing that my markup won't provide the results I need. Maximize planning, minimize hours wasted, etc.

    4. an index of place names for the document, or cause a map to appear when the reader mouses over a place name while reading, or make it possible to search for the string “London” when it refers to the place, b

      This blew. My. Freakin. Mind. The possibilities are endless - literally, because XML definitions can be anything, and then transformed into any kind of display.
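
      A minimal sketch of what that kind of markup can look like (the element and attribute are TEI-style, but treat the specifics as illustrative):

        <p>The ship left <placeName ref="#london">London</placeName> in May.</p>

      Because the tag records what "London" is rather than how it looks, later processing can pull every placeName into an index, attach map coordinates, or restrict a search to the city rather than to every "London" in the text.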

    5. ning: a list is still a list, no matter how it is presented. In XML the person who creates the document

      Hence the TEI, so that we have some standards to build a communal collection of data that can be used by scholars. I wonder how widespread TEI is? Are there other common XML schemes used by scholars in the humanities? What about in other disciplines?

    6. The markup used in digital humanities projects is descriptive, which means that it describes what a textual subcomponent is. Descriptive markup differs from presentational markup, which describes what text looks like. For example, presentational markup might say that a sequence of words is rendered in italics, without any explanation of whether that’s because they’re a book title, a foreign phrase, something intended to be emphasized, etc. Descriptive markup also differs from procedural markup, which describes what to do with text (e.g., an instruction to a word processor to switch fonts in a particular place).

      This was illuminating for me. As a member of the Neopets/Proboards/Geocities generation, I learned basic presentational markup from a relatively young age. I had literally never considered the possibility of descriptive markup until this class. I think I love it so much because it combines my interest in textual analysis with the comforting tags I spent so many years perfecting in my messages and webpages. The tag-based structure of XML is almost second-nature to me.
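
      To make the contrast concrete, a small hedged sketch (the descriptive tag is TEI-style; the specifics are illustrative):

        Presentational: <i>On the Origin of Species</i>
        Descriptive:    <title level="m">On the Origin of Species</title>

      A stylesheet can still render the descriptive version in italics, but the reason for the italics (it is a book title) now travels with the text instead of being lost.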

    7. XML is a formal model designed to represent an ordered hierarchy, and to the extent that human documents are logically ordered and hierarchical, they can be formalized and represented easily as XML documents.

      I had never considered this before, but it makes a lot of sense. Most texts have underlying structure; making it visible with XML helps make us conscious, as researchers, of the structures underlying our texts. It could also help us understand better the connections between different texts - the Macroscope readings we did in the first week discussed doing so with Old Bailey records.

    1. Ta da!

      Are we supposed to have this much? Is it for a purpose? Are we using all of these files for something in the future, or are we just downloading them for an exercise?

    2. odge it in your repository.

      I am still confused about how to do this.

    3. associate a cell phone number with your account

      No cell service in the farm yard, waiting on a verification code forever! Gunna take a walk down the lane way. Rural vs Urban problems.

    4. For instance, $ twarc search canada150 > search.json will search Twitter for posts using the canada150 hashtag.

      Has anyone gotten a "401 Client Error: Authorization Required" for the URL when they run this command?
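
      If it helps: a 401 from the Twitter API usually means the credentials haven't been supplied or aren't valid yet. In the version of twarc we installed you can (I believe) store your keys first with the configure subcommand, then retry the search:

        $ twarc configure
        $ twarc search canada150 > search.json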

    5. ure is done by year and month a quicker route might be to just run the original wget command ten times, changing the folder each time

      I have tried changing the years in the 1880s and it keeps coming up with "not found" errors. Does anyone else have this issue? I did get 1883 and 1884.

    6. We will install a command that can convert the json to csv format like so: $ sudo npm install json2csv --save -g. Full details about the command are here.

      Hi folks - something wonky has happened to this utility (the json2csv program). See this post in our slack space: https://hist3814o.slack.com/files/dr.graham/F6DEUCQUF/twarc_json_to_csv which will walk you through what to do.

      or if you're feeling adventurous, you could try installing this https://github.com/jehiah/json2csv/releases/download/v.1.2.0/json2csv-1.2.0.linux-amd64.go1.8.tar.gz to your dhbox. Follow the pattern you used when you installed pandoc, using similar commands. Usage, once it's installed: https://github.com/jehiah/json2csv .

    7. Hey - you've scrolled all the way down here. Here's the wget command to nicely download the Equity txt files but don't run it just yet:

      If you're wondering where you can find the extra options (like -A), you can type wget --help and it will list all the options wget supports

    8. $ mkdir equity

      If you want to go back to your home directory without having to type cd .. several times, cd ~ will take you to your home directory in one step

    9. cd equity

      If you are feeling lazy you can also type cd eq*, which takes you to the first directory (aka folder) that matches the letters eq. The * just tells the command cd to find the first directory matching the letters before the *. This is super helpful if you have long folder names that you don't want to type; in the exercise, when it tells you to type "cd wget-activehistory" you could just type "cd wget*"

    10. $ sudo pip install twarc

      For some reason this only worked for me when I removed sudo from the command

    11. json2csv -i search.json -o out.csv

      I'm getting stuck at this step. It responds with "throw err; SyntaxError:/home/claremaier/search.json: Unexpected token {

    12. Ta da!

      I ran this, and then saw I was getting a lot of files, so I cancelled the run. I did not want to pollute DHBox (my output.txt was 340 MB), so I refined my search using Shawville as the city. I kept getting huge downloads, regardless of the time period. I also used a couple of other towns. "Why am I getting all of this for New Germany, Nova Scotia?" I thought. I got better results when I deleted the working files before running $ ./ canadiana.txt. I added this to the top of the program:

        # clean up from the previous session
        rm results.txt
        rm cleanlist.txt
        rm urlstograb.txt

    13. $ sudo pip install twarc

      type $ pip install twarc (take out the sudo) to make this work.

    14. 755

      Why does this make the file okay to run?
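
      In case the "why" helps: the three digits in 755 are octal permission sets for the file's owner, its group, and everyone else. 7 means read + write + execute and 5 means read + execute, so 755 marks the file as executable (the x bits) while keeping it writable only by you. A quick way to check (the filename here is just a placeholder):

        $ chmod 755 myscript.sh
        $ ls -l myscript.sh    # the permissions column should now read -rwxr-xr-x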

    15. Add your command to your history file, and lodge it in your repository.

      How do you do this? I have tried going back to last week's exercises and I am still just as lost.

    16. Lodge a copy of this record in your repository.

      Does this mean to save the Excel document of the downloaded war graves into GitHub, or to somehow export the nano file from DHBox to GitHub...?

    17. sudo apt-get install pdftk

      Won't let me download. It says: "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"
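
      The error message itself suggests the fix: refresh the package lists first and then retry the install, along the lines of:

        $ sudo apt-get update
        $ sudo apt-get install pdftk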

    18. notes in your repository.

      Is this "commit" comes in? Should we making comments in the DH Box and and uploading it to Git Hub, or should we be making comments in the Git Hub when we post this?

    19. Skip ahead to step 2, since your DHBox already has wget installed.

      Step 2 is really the only step you need, don't worry about trying to mirror an entire website.

    20. Download your results.

      For this part, I'm only using the Epigraphic Database Heidelberg, because when I use the Commonwealth War Graves Commission the page just refreshes. I'm assuming that's because it can't find anyone under my last name. But as for "Download your results", I'm not quite sure how to do that?

    21. Using the nano text editor in your DHBox

      I'm not exactly sure how to use this function of the DHBox. Is there anyone who could give me some help?
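
      In case it helps, a minimal sketch of the nano basics (the filename is just an example); the shortcuts are also listed along the bottom of the nano screen, where ^ means the Ctrl key:

        $ nano notes-module2.md    # open (or create) a file for editing
        Ctrl+O, then Enter         # write out (save) the file
        Ctrl+X                     # exit nano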

    22. 9

      typo. should be an 8

    23. run the program:

      the ./ is important! don't forget it.
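
      As I understand it, the reason the ./ matters is that the shell only looks for programs in the directories listed in your PATH, and the current directory usually isn't one of them, so you have to point at the script explicitly (the script name here is hypothetical):

        $ myscript.sh       # "command not found": the shell doesn't search the current directory
        $ ./myscript.sh     # works: ./ means "the one right here"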

    24. permissions

      chmod = 'change mode'. https://en.wikipedia.org/wiki/Chmod

    25. save the changes.

      you might want to write out on paper what you think each step in this program does. Google any strange words you see in there - these are smaller commands/programs already in DHBox that we've pieced together to do our bidding!

    26. For future reference

      for when you get around to installing tesseract on your own personal computer

    27. Download the output.txt file with the DHBox filemanager

      ie, to your own machine via DHBox's filemanager.

    28. first file

      when you burst a single file into individual pages, many files result.

    29. burst it into individual pages.

      the 'burst' command is part of pdftk (pdf tool kit, geddit?) so that's why you needed to install that.
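
      For reference, the call looks something like this (the filename is a placeholder); burst writes one PDF per page, plus a doc_data.txt summary, into the current directory:

        $ pdftk report.pdf burst
        $ ls
        doc_data.txt  pg_0001.pdf  pg_0002.pdf  pg_0003.pdf  ...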

    30. So what we're going to do is modify the command so that we only grab a subset of the files. Given that each filename contains within it the date of the issue, download only the .txt files for a particular decade. The command below is modified so that wget, as it searches through each subdirectory, only grabs the ones from the 1880s - do you see the crucial bit that does that?

      Please: this is the part that I want you to do. This is because of memory issues lately discovered. Otherwise I'd get you to grab everything. But for now, pick a decade...

    1. digital ruins of GeoCities

      I doubt such a broad comparison will ever be made and instead propose we examine a more constructive thought: namely, how can we adequately communicate and preserve relevant source material in the internet era? I agree it is important to respect social media's place in the annals of history; however, it is also important for us to productively articulate and archive it. Non-profits like the Internet Archive have gone to great lengths to accumulate and archive popular public records that will assist future generations in depicting our contemporary social landscape. Large amounts of data are only as valuable as they can be navigated and interpreted. If we fail to address these truths we'll make big data more of a burden than a blessing for future historians to come.

    2. Thirty-eight million pages and millions of images were about to get lost forever as Yahoo! did not facilitate user export. Ian Milligan is working with web archives and he examines the digital ruins of GeoCities.

      I love this metaphor and Milligan expands on it brilliantly in the interview. It's good practice to contextualize digital history as a new form of history, rather than getting overwhelmed by how very big big data can be. It's not a one-to-one comparison, obviously - the methodologies of digital archaeology are obviously different from the methodologies of a physical dig site - but it helps emphasize the value of the data that was lost at the demise of Geocities. If we historians cry over the library of Alexandria, we should cry over Geocities all the more.

    1. Who does your data impact? (How?) What data might be worth trying to create? (Why?) How can you develop a plan to work with your data? How will you advocate for and about this data?

      Wonderful and concise list of questions. I can't believe I've never asked some of these (developing a plan, advocating for data) questions before. They aren't simple questions, but just considering them opens up a lot of potential avenues for growth and new research.

    2. I know that mess is often used as a word that has bad associations – people may tell you that something that’s messy has to be cleaned up before it’s worth anything. And I want to push back hard against that assumption

      This is something I want to keep in mind as a scholar. Reading over some of my old papers (yes - I am both a nerd and a narcissist) I'm shocked at how simplistic and reductive my conclusions used to be. I still have a long way to go, and part of my growth over these past three years at university has been learning to love the mess.

    3. Working with data can get messy really quickly. And that mess isn’t just something to be cleaned up – it’s people’s lives. People’s lives can be genuinely complicated in so many different ways, and figuring out how to handle that mess is a vital part of working with data. When people don’t consider the mess; or when they try to shove it out of the way so that it doesn’t complicate their analysis, then there’s a good chance that they’re not representing the reality that people are living.

      This perfectly captures one of my concerns with working with data. It sounds curmudgeonly, but sometimes hard data can feel static or stuffy - not as amenable to fluid interpretation as traditional sources. Of course, through this class, I've come to understand that many digital historians, like Morgan, appreciate the "mess" of digital data and handle data accordingly.

    1. Re-read your page and highlight / colour the following things:

      When we are looking to fill in this information, are we allowed to look in the rest of the leaflet for it? Or are we only allowed to look at this one page?

    2. Any claims, assertions or arguments made Now that you have highlighted these, you are going to put proper code around them.

      Small formatting error here!

    3. ""

      I think the double quotation marks "" are a typo; it should be a single ".

    4. If you still don't see your text

      "This page contains the following errors:

      error on line 2 at column 1: Extra content at the end of the document Below is a rendering of the page up to the first error."

      This is what I'm getting. I know the file and browser are fine because the 8 shows up. Did anyone else get this? I'm just not sure what I did wrong; I've checked everything so far.
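
      In case this helps anyone hitting the same message: an XML file is only well-formed if everything is wrapped in a single root element, and "Extra content at the end of the document" usually means something (a second top-level tag, or even a stray character) appears after that root element's closing tag. A minimal sketch, with invented tag names:

        <!-- well-formed: one root element wraps everything -->
        <leaflet>
          <page n="8">...text and tags for the page...</page>
        </leaflet>
        <!-- anything placed after </leaflet> would trigger the "Extra content" error -->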

    5. What makes you believe this site is a trustworthy provider of historical texts?

      That there are actual scanned documents.

    6. What makes you believe this site is NOT a trustworthy provider of historical texts?

      The layout of the website as well. Usually it's not as flashy.

    7. replace the number one (1) with the page number you are transcribing

      I'll use the page number from the viewer (example: page 19) instead of the page number from the document (page 32 or 33) in this example.

    8. download that repository as a zip file

      In GitHub, when I viewed subfolder "tei-hist3907" I had to click on repository module3-wranglingdata to download it all as a zip. (Not sure if there is a slicker way to just get only the subfolder of the repository)

    9. What makes you believe this site is NOT a trustworthy provider of historical texts?

      It doesn't seem to come from a scholarly source. I could be wrong.

    1. If we do this for a variety of TLDs and GeoCities neighbourhoods, what patterns emerge? Could we use this as part of a finding aid to learn about a neighbourhood by ‘distantly reading’ the images?

      I want to see this done on screengrabs of Geocities sites by different 1990s fandoms or interest groups. I bet the websites of stamp collectors looked really different from websites of punk fans and Riot Grrls!

    1. . It is merely one way of formalising those processes that should be part and parcel of any analysis—namely making darn sure I know the precise individual or place a word is referring to, and the possible implication that this has for the source at large, even if it is only mentioned in passing.

      Better digital methods make for better scholarship, period. There are analog methods of recording this information, of course, but this class is all about harnessing the new opportunities afforded by good digital practice.

    2. at least three very important (argument-shaping) articles were only identified during routine tagging of my database backlog—a number I expect will rise as I endeavour to encode my remaining 400 transcriptions.

      Wow! Data management may not be sexy, but apparently it's useful. I think this article (and Beals' other articles on her workflow) are super useful to anyone doing history in a digital age - not just digital historians. She outlines clearly how she made digital tools work for her rather than against her.

    1. Acknowledging digitized historical texts as new editions is an important step, I would argue, to developing media-specific approaches to the digital that more effectively exploit its affordances; more responsibly represent the material, social, and economic circumstances of its production; and more carefully delineate with its limitations.

      This is a very succinct argument for looking at digital texts as editions. It covers many of the topics we've been discussing in class regarding the availability and reliability of digital sources. I think the most important part to remember here is that the circumstances under which a digital text is created can let a researcher know certain limitations of that text. It's important to know, for example, if an OCR-based search of the text will be effective given the quality of digitization; or if related texts are missing due to inequalities in funding.

    2. Over their three rounds of funding, then, Penn State sought to digitize newspapers from as many counties as possible, meaning they prioritized breadth of geographic coverage over digitizing the “most influential” newspapers in the state, which might have produced a corpus skewed in another way: toward Philadelphia over more rural areas in the state.

      Without any consensus on how to decide between digitizing "influential" and inclusive materials (or even how to define "influential" and "inclusive"), digital archives are going to be skewed for a long time yet. For example, Penn State may have decided to go for geographic coverage, but another state university might focus on one or two cities. Even raw numbers of newspapers digitized, then, cannot express the state of the digital archive.

    3. $393,650

      DIGITIZATION IS EXPENSIVE!!! This is an incredible number to keep in mind. No wonder so few newspapers are digitized, and fewer still are digitized well.

    4. The details gleaned from these files, however, are only one part of a full bibliographic account, which should also concern itself with the institutional, financial, social, and governmental structures that lead one historical textual object to be digitized, while another is not. In Ian Milligan’s study of newspapers cited in Canadian dissertations, he demonstrates quantitatively that overall citations of newspapers have increased in “the post-database period,” but also that those citations draw ever more disproportionately from those papers which have been digitized over those which have not

      This is exactly the point I was trying to make in my blog posts for module 2! I think it's so important to consider why and how texts get digitized, and to consider the ideological impact that digitization can have on our projects.

    5. I am not particularly bothered by the fact that OCR is an “automatic” process, while composing type is a “human” process. To maintain such a dichotomy we must both overestimate the autonomy of human compositors in print shops and underestimate the role of computer scientists in OCR. Both movable type and optical character recognition, along with a host of textual technologies in between, attempt to automate laborious aspects of textual production. Indeed, we can only speak of editions as such, whether printed or digital, within an industrialized framework.

      This makes a lot of sense to me. I've never thought about textual reproduction "within an industrialized framework" before, but it has heavy implications for text-based scholarship. One question that it raises in my mind: why do we care so much if machines are transcribing texts if we never really cared about the people who typeset the originals back in the 18th, 19th, 20th century? If human error is different from machine error (and I believe it is), why don't we talk more about the conditions that produced human errors in text? Cordell implies that asking questions about human textual production helps us better understand the implications of mechanical production.

    6. in which the digital archive can be only a transparent window into the “actual,” material objects of study

      This is an interesting point: what potential are we missing when we want only perfect OCR, and ignore the variety of things that can be done with digital sources?

    7. we require more robust methods for describing digital artifacts bibliographically:

      I've always had trouble, even as an undergrad student, citing digital books. When do I cite the original publisher, and where do I acknowledge the digital publisher? Should I date it to the day the website was uploaded, or the date of the print release? Should I note how the source was digitized (scan, transcription, other)? This is definitely a conversation that needs to be had.

    8. What is this text—this digital artifact I access in 2016?

      Hmm, interesting. It's easy to look at a poorly OCR'd text and dismiss it - just say "that's wrong, throw it out." But that isn't failing productively, that's just giving up. By asking these sorts of questions - what is an OCR'd document, really, and what can it be used for - we can wrangle with important questions about big data and digital methodology even if, character for character, the OCR "failed."

    1. It’s not just the simplicity of that single search box, it’s our faith that search will just work.

      This is very true: we put all our faith in it working, and it is a bit of magic every time it does.

    2. A seam-free service

      As with most things in life, I don't think there will ever be a completely seam-free way to do this; there is always going to be one aspect that will not be what someone wants. Maybe striving for more seamless, as opposed to seam-free, is more realistic?

    3. web-scale discovery services

      I don't think I understand what this means

    4. ‘Here was magic!’.

      This reminds me of a quote I cannot seem to get my hands on right now about how every generation has its own type of magic- and this was theirs!

    5. A seam-free service is one that maximises ease-of-use.

      Hmm, but at what cost? Google has ease-of-use down pat, but I have to wonder what kind of concessions you make as a user (without necessarily realizing) in terms of the kind of information you're receiving.

    6. People already struggling for visibility and recognition within our cultural record might be lost amidst the overwhelming numbers of the safe and the sanctioned.

      Excellent point. While we champion digital history and public history as a way to make minorities visible, we have to still be aware of who is doing the work, coding the search formulas and such. I recall a conversation about the importance of including women programmers on projects to address online bullying and harassment because they bring a different perspective and work on the problem from a different angle.

    7. When did the ‘Great War’ become the ‘First World War’? < http://dhistory.org/querypic/43/>

      Never knew that it was referenced as that. Shows how important the research of history is.

    8. If we are developing resources to support the creation of new knowledge we cannot simply black box our tech and trade on trust.

      Never heard of Trove, but some of the professors I have had all say Wikipedia is horrible. Prior to university, I used to always use it for high school projects and so on. Weird to think that something I trusted for education nearly my whole life is now considered unreliable. I probably learned a good portion of knowledge that ended up being false.

    9. Of course we all want to make life as easy as possible for the people who use our services. The question is how the pursuit of a Google-like experience constrains our options and assumptions.

      I remember reading a while ago that Google keeps track of your information, for example your location, so whenever you search something, the results are based on what you would need. For example, if I just went to Google and typed in online shopping, it will all be in Canadian dollars and direct me to either an American or a Canadian website rather than to an Australian one. This really made me think about how this kind of search engine is a lot more complex than I thought. Even all the advertisements are based on what I would like, or something that I have previously searched.

    10. In the library world, seamless discovery is commonly associated with what are variously called ‘next-generation catalogues’, ‘web-scale discovery services’ or ‘discovery layers’.1

      I specifically rely on the internet for this purpose. Going through libraries to search for certain information takes so much time. For example, if you have the PDF, all you need to do is press Ctrl-F and you have the ability to search a whole book. This is so useful in a massive textbook when you want to use something as a reference that you read previously. I really only go to the library because I like the environment to study in and it helps me focus. Realistically, though, almost everything I do there can be done at home.

    11. Technology promises instant access to information — a future beyond silos.

      This definitely holds true: as long as you have a good internet connection, you can access anything online within seconds.

    12. Information superhighway’?

      Never heard of this term before.

    13. Google does it exist

      Even if content does exist, does it matter if it can not be found?

    1. My tweet led to a flurry of activity amongst scholars, and even now, the transcription has begun. Indeed, I made an android-only game out of it.

      This is so interesting! Games like this really encourage interest especially in people who may not have heard of these types of things in any other way before!

    2. Indeed, I made an android-only game out of it

      I think this actually opens up a really interesting concept! I remember how, growing up, I enjoyed learning about mythology through games like Age of Mythology! Not only did it spark an interest in Greek mythology and history, the game also held some pretty interesting historical facts as well. I think turning historical study into games can have a huge positive impact on how kids might become interested in and learn about history.

    3. When we topic model Martha Ballard's diary, did she give this to us?

      Reminds me of the Diary of Anne Frank; it touched so many people, yet was never intended for others to read.

    4. Indeed, I made an android-only game out of it.

      I think this is cool. It's a good way to mobilize volunteer work, and I think the transcriptions that result would be as accurate as any (as long as they are checked for quality, which is another issue discussed well in this article about the Bentham Project) What do the rest of you think of "gamifying" historical contribution? Does it improve the amount or quality of sources being digitized? Does it impact how scholars should look at the resulting source?

    1. National Institute of Health (NIH) has been a longstanding champion for creating open access

      This seems to be mirrored in the 'open research' idea. Scientists find the idea useful, effective and that it adds value to their field, and this idea is found also to be useful in other fields.

    2. database subscription seldom includes the most recent, current material and publishers purposefully have an embargo of one or two years to withhold the most current information so libraries still have a need to subscribe directly with the journals

      This is something which is important to keep in mind if you run straight to JSTOR for your academic research. JSTOR is one of these databases and tends to be three to four years out of date, something I only learned last year.

    3. “For Elsevier it is very hard to purchase specific journals—either you buy everything or you buy nothing,” says Vincent Lariviere, a professor at Université de Montréal. Lariviere finds that his university uses 20 percent of the journals they subscribe to and 80 percent are never downloaded.

      This seems like a double-edged sword to me - on the one hand, obviously there's a financial incentive to create more and more journals that universities and libraries will then be obliged to buy according to this system, but on the other hand, the creation of more journals means more opportunity for academics to get published...which I guess leads to a whole other conversation about the premium placed on academic publishing as a form of accreditation.

    4. Elsevier says you can publish in open access, but in reality it means paying twice for the papers.

      Is this extra revenue passed on or is it pure profit for the journal publisher? Could we argue that online journals also cost money because they are considered valuable and Western culture typically associates value with high costs?

    5. The most important journals will always look pretty much like they do today because it is actually a really hard job.

      I'd be interested in knowing more about the work that goes into a journal production. I don't think we students appreciate this enough.

    1. death of mainframes

      We still use a mainframe where I work for a healthy amount of the transaction processing and data storage we do. As governments and companies "move to the cloud" workplaces are moving back to accessing more centralized computers, often in huge data centres run by large companies such as Microsoft. Sort of more mainframe than ever.

    2. enforce your monopoly

      Monopoly or creator's rights? If I was talented enough to produce music or write a book I would want to get royalties from the work, not have copies of my work downloaded for free (unless I chose to do that). I like music and books, to get new music and books musicians and authors need to make a living to produce new works. Royalties are part of that living and worth protecting.

    3. Second, there's a bigger problem with "owner controls": what about people who use computers, but don't own them?

      As with my previous comment, when working on a government computer, I'm careful with what I say in personal emails about my job or the people I work with.

    4. the division between property rights and human rights.

      Property rights as in, while you're "here" you have certain rights, but human rights are always yours (or should be, at least). No matter where you are or who you are, you should always have your human rights.

    5. If that's not clear, think of it this way: a "war on general-purpose computing" is what happens when the control freaks in government and industry demand the ability to remotely control your computers

      I work in government and I guess the human equivalent is an ATIP request. At any point, you may be asked to provide information on someone or something. It stands for Access to Information and Privacy. The access to privacy throws me off a bit, but I suppose using government servers kind of takes away your privacy.

    6. Moreover, since the bootloader determines which OS launches, you don't get to control the software in your machine.

      Why is this a positive aspect of having a TPM?

    7. Or your optic nerve, your cochlea, the stumps of your legs.

      Are we talking about robots here? or maybe artificial, mechanical joints or nerves implanted as a kind of transplant during a hip replacement or something?

    8. But there's a problem. We don't know how to make a computer that can run all the programs we can compile except for whichever one pisses off a regulator, or disrupts a business model, or abets a criminal.

      Science will never FULLY take over the social roles of humans because of the important nuances that make individuals non-programmable and not 100% decipherable or predictable by computers.

    9. If we don't start now, it'll be too late.

      At first, I wondered if the author was being melodramatic, but I see his point and hadn't thought about the types of situations he presents. Like him though, I don't have any solutions. I'd like to say people wouldn't do those things, but "we are only human".

    10. Digital Rights Managment. [Defective by design]

      I'm not sure I quite understand what DRM and TPM are. Perhaps someone could summarise or suggest a helpful video? Thanks

    1. What would be worse, however, would be to abandon our past rather than learn from it.

      ~~FAILING PRODUCTIVELY~~

    2. Data management is an important part of any research project and should always, if possible, be done at the start of the project. This allows for consistency, repeatability, and reuse of your material in the future

      Yeah... still learning this. I'm grateful that the blogosphere has allowed for scholars in similar fields to share these kinds of thoughts. It's not something I've ever been taught in class - until now, of course - it's something that every scholar seems to have to learn the hard way. Collaborative online learning spaces that allow for posts like this can make this process less necessary, or at least less painful.

    3. It could also include complicated formatting and typographical information in a way that could be used to recreate the original presentation of the text but be easily disregarded when irrelevant

      This is fascinating! I love the idea of XML containing "layers" that can be added or removed as necessary, letting us manipulate data on-the-fly. It kind of reminds me of the iterative nature of github. Flexibility is a digital historian's best friend.

    4. determining the relevant keywords for any given work was a highly subjective process, as was the creation of new keywords to describe new, or at least newly noticed, themes and topics.  This again added a layer of inconsistency to my database. 

      This is the case with most long projects, I'd assume. I think a level of uncertainty in tagging is almost inevitable. Projects are going to evolve, and human understanding will always be adding new layers of understanding to previously-categorized texts. Beals gives some good examples of how digital methods can cut down on some of the mess.

    5. However, my naive lack of documented search parameters, and indeed the incompleteness of my transcriptions made their reuse in other contexts dubious.

      This sounds like... everything I've ever done. I can't tell you how many times I've pulled up an old file full of citations and notes and gone, "what?" It's interesting to note that not all digital history practices have to have some grand, open-access-world-changing goal. Best practices are useful for the individual scholar, too.

    1. Wget operates on the following general basis:

      I love these side notes. The breakdown helps me understand the basis of the commands.
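
      As a hedged illustration of that general basis (the URL is a placeholder, and these are just commonly used flags rather than the exact command from the exercise):

        $ wget -r -np -w 2 --limit-rate=20k http://example.org/collection/
        # -r            recurse through linked pages and files
        # -np           "no parent": never climb above the starting directory
        # -w 2          wait two seconds between requests, to be polite to the server
        # --limit-rate  cap the download speed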

    1. I’ve spent a lot of my career talking about what digital methodology can do to advance scholarly arguments

      This is a very important line. It sums up the article and his argument very well. People need to start using this technology to make advancements instead of using it for things we already know.

    2. It’s gotten cited in books, journal articles, conference presentations, grant applications, government reports, white papers, and, of course, other blogs. 

      It's very cool how the research he did was able to spread so fast, and into so many different types of academic work.

    3. I ended up arguing that it was precisely this fragmentary, mundane, and overlooked content that explained the dominance of regional geography over national geography

      I find this is often the case - even in daily life. In trying to set some colleagues up for a task this week, I wrote out what I thought were clear instructions. Soon found out that there were numerous details which I understood, but were not set out and were causing confusion and extra work to sort out. Useful, but unnecessary eye-opener.

    4. It’s time to start talking in the present tense.

      I feel like I need to make a sparkly animated .gif that says "FAILING PRODUCTIVELY" so I can link it every time a reading suggests doing just that. Seriously, though, this is so important for us students of digital history. We need to DO it before we can theorize about it. We need to know the limitations AND the abilities of our tools before we start planning what we can (or can't, or should, or shouldn't) build.

    5. By publishing it in the Journal of American History, with all of the limitations of a traditional print journal, I was trying to reach a different audience from the one who read my blog post on topic modeling and Martha Ballard. I wanted to show a broader swath of historians that digital history was more than simply using technology for the sake of technology.

      What kind of audience are we reaching with our blogs? I'm consciously writing my blog posts to my classmates, though I know Dr. Graham shares some of our stuff on Twitter. Should I be writing for a wider audience? How will I write about digital history after this class? Can I use these methodologies in other classes, and if so, how should I introduce them? We should all be considering these questions.

    1. a detail that might interestingly complicate a scholar’s assumptions about moral consciousness

      This compelled me to think about the times when I would be shocked by the words I had to end up using in order to find material. Those instances made me reflect on my own knowledge, and the endless frustration that accompanies searching for works via text search.

    2. Like everyone else, we begin with Google.

      My high school teachers always said not to use Wikipedia for research - but to use it to broadly understand a topic before arming yourself with key words and ideas for in-depth research.

    3. Scientists who try to model the print record over a significant time span often make assumptions about continuity that humanists would recognize as confining.[15] On this topic, and many others, a rare opportunity is emerging for a genuinely productive exchange between scientific methodology and humanistic theory.

      I wonder if this interdisciplinary concept of digital humanities will ever come to eclipse the current and more popular humanities and social science disciplines.

    4. Using algorithms for discovery raises an interesting but unfamiliar set of philosophical questions.

      The enormous potential for interdisciplinary research here is a little scary.

    5. I don’t mean to imply a causal connection between these changes (for one thing, there are many other topics in the model; these three don’t constitute a closed system). The illustration is only meant to show how topic modeling can generate suggestive leads

      I think that this is the best way to use search engines. They are a great way to find and follow potentially interesting and important leads. However, in terms of analyzing the content in a more qualitative fashion, more in depth study will no doubt be required.

    6. But some strategies are also able to reveal evidence that challenges prior assumptions.

      I think that as digital search tools and open notes become more prevalent in the historical discipline, the prior assumptions and confirmation biases currently in place will continue to be challenged as a broader range of opinions and points of view are brought to the fore. This is an excellent example of why learning and practicing digital history is so important!

    7. Full-text search made that kind of topic ridiculously easy to explore. If you could associate a theme with a set of verbal tics, you could suddenly turn up dozens of citations not mentioned in existing scholarship and discover something that was easy to call "a discourse."

      This point makes me think that one of the most useful ways to use a simple search engine is to discover or support a narrative. If you can connect a series of events or ideas across years and in different sources using simple words as a search tool, it is much easier to propose a narrative as a result of your research.

    8. The scholarly consequences of search practices are difficult to assess, since scholars tend to suppress description of their own discovery process in published work.[2]

      This point would make an excellent argument in McDaniel's article concerning the concept of open notes!

    9. It’s a name for a large family of algorithms that humanists have been using for several decades to test hypotheses and sort documents by relevance to their hypothesis.

      This isn't a new idea! Humanists have been using similar methods for decades. It is just a matter of adapting to the new digital medium, and is not necessarily re-writing the guidebook.

    10. In practice, a full-text search is often a Boolean fishing expedition for a set of documents that may or may not exist.

      I find this very interesting. The fact that the basic 'search' method is literally just looking for a true or false answer to the data you inputted in the search bar really breaks down the process and demonstrates how simple it is in practice. Not only this, but it also shows why it's important to be specific in what you are trying to search for.

    11. Our guesses about search terms may well project contemporary associations and occlude unfamiliar patterns of thought.

      The phrasing of this really drove home the dangers of full-text search for me. I tend to project quite a bit onto the texts I read; it's a habit of close reading, and something I know to be aware of and counteract. I haven't ever applied such a careful corrective measure to search terms.

    12. Instead, the algorithm has to sort them according to some measure of relevance. Relevance metrics are often mathematically complex; researchers don’t generally know which metric they’re using

      Again, digital tech seems to blind us to confirmation bias, because if it's shiny and new and exciting, how can it be biased? Do we even know enough about some of these search algorithms to determine their internal biases?

    13. The search terms I have chosen encode a tacit hypothesis about the literary significance of a symbol, and I feel my hypothesis is confirmed when I get enough hits.

      It's important to be aware of confirmation bias in all forms - digital (encoded into the algorithm itself) and human (in the ways we use these algorithms). There's bias in regular scholarship too, but this is a good point about how "big data" can be particularly convincing, tempting us into being less discerning as scholars.

    14. ‘‘search’’ is a deceptively modest name for a complex technology that has come to play an evidentiary role in scholarship.

      I can't believe I'd never even considered this before. This has massive implications not only for "digital history" like we do in this class, but for ALL my academic work. I remember being told in first year that MacOdrum's Summon Search is an unreliable and inefficient search (it searches too many things with too few options for parameters). By second year, I abandoned that advice and Summon Searched without caring what I missed. I wound up spending a lot of time looking at articles that weren't quite what I needed. Given that digital history relies even more heavily on the results from such search technology, this kind of laziness could be fatal to a digital history project.

    1. A case can be made that this was the single largest repository of social historical resources ever generated, on the public facing World Wide Web (unlike Facebook).

      Put this way, its loss is truly horrific. However, as historians, I think the ever-present caution about ethics is important too. It would be fascinating information (for anthropologists as well), but should we really access it?

    1. When used in conjunction with traditional close reading of the diary and other forms of text mining (for instance, charting Ballard’s social network), topic modeling offers a new and valuable way of interpreting the source material.

      I think it's important to note that the information gained through text mining needs to be compared to that found through a close reading, not simply used on its own. Yes, text mining is a valuable way to analyze large quantities of data, but without combining it with close readings, it's always possible to miss something. Topic modeling is a good way to help understand the data gained through text mining, as it offers a new way of interpreting it.

    2. But MALLET is completely unconcerned with the meaning of a word (which is fortunate, given the difficulty of teaching a computer that, in this text, discoarst actually means discoursed). Instead, the program is only concerned with how the words are used in the text, and specifically what words tend to be used similarly.

      This is actually super helpful, especially when it comes to analyzing historical documents with outdated words or spellings as the program could otherwise miss out on important information that the researchers wouldn't necessarily find. Having a program be able to analyze such a large collection of writing is really amazing and allows for greater possibilities when it comes to historical research. I am really excited by the doors this avenue of work can open in the field.
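
      For anyone curious what "having a program analyze such a large collection of writing" looks like in practice, here is a minimal sketch of a MALLET command-line run. The directory and file names (diary_pages/, diary.mallet, diary_keys.txt, diary_composition.txt) are hypothetical stand-ins, not taken from the Ballard project, and the number of topics is arbitrary.

      $ # import a folder of plain-text pages into MALLET's internal format
      $ bin/mallet import-dir --input diary_pages/ --output diary.mallet --keep-sequence --remove-stopwords
      $ # train a topic model and write the top words for each topic to a text file
      $ bin/mallet train-topics --input diary.mallet --num-topics 20 --output-topic-keys diary_keys.txt --output-doc-topics diary_composition.txt

      The output in diary_keys.txt is just lists of co-occurring words; deciding what those word clusters mean is still the historian's job, which is the point the passage above is making.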

    1. I think there’s some neat work to be done here, and I’m always happy to chat with people if they ever want to play with historical data. These aren’t my areas of expertise, but playing with data is.

      I like how honest Milligan is. Truth be told, none of this is my area of expertise either, so it's reassuring to know that we can't all be well-versed in everything! I do appreciate how Milligan explains his research; it makes for an enjoyable read. I am not one to discuss politics (frankly, political vocabulary is not my strong suit), but in 'Political Open Data' I was able to digest the information with ease.

    2. We’re relying on the data as it was submitted, so it’s not going to be perfect.

      This can be a huge problem when working with databases, as there can be so many different words used to describe the same thing. It is something I have struggled with before in my program when trying to organize this kind of imperfect, user-submitted data.
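
      One quick, low-tech way to see how messy submitted data is before analyzing it, sketched here with a hypothetical submissions.csv whose third column holds the field in question, is simply to count the distinct values:

      $ cut -d, -f3 submissions.csv | sort | uniq -c | sort -rn | head -20

      Variant spellings and near-duplicates usually jump out of a list like this, which makes it easier to decide what needs to be standardized before drawing any conclusions.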

    3. even manipulate the data that a country generates.

      This seems like it could end up poorly, but I do agree the public should have access to read and view it. Allowing full open access seems like a Wikipedia-page-style disaster of misinformation waiting to happen.

    1. For over twenty years, I have written about feminist cultural forms of activism.

      Clearly Moravec has found her calling and is well versed in the subject of women's studies.

    2. I am also interested in the methodological implications of doing history digitally

      Moravec appears to be covering a lot of ground in her research with linguistics, digital humanities and feminism.

    1. Are we allowed to aggregate some kinds of data but not others?

      Interesting question I had not considered before. If data is aggregated poorly or presented poorly, is the researcher at fault (for example, faulty medical studies leading to harmful reactions to medicines)?

    1. . Spaces matter. cd.. is a nonsense to the computer: there is no program called cd.. But there is something called cd, and a location ..

      Oh, the number of times I have made that mistake...
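
      For anyone else who keeps making it, the difference is a single space:

      $ cd..    # the shell looks for a program literally named "cd.." and reports it as not found
      $ cd ..   # correct: the program cd, with the location .. (the parent directory) as its argument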

    1. the recent attention paid to an Excel error made by two economists

      I mentioned it in Trevor Owens' article, but this is why providing a URL/URI as a source can be problematic. I wanted to know more about this but sadly was greeted by a 404 "page not found" error :/

    2. we are usually willing to share sources when we are finished with them

      This plays well into our course, where every final assignment will be a collection of everyone's work and annotations: the work of a community, not a single person. Whereas in most other classes our work is individual, like essays or tests, here we are a class developing projects together while developing our individual understanding of digital history.

    1. As Director of University of Michigan Press I’m afraid to say that everything you say in this post, Sheila, is true. We’ve struggled over the last few years to bring innovative digital projects into the mainstream press workflow, and you’ve been caught in the middle.

      I have to say, seeing this comment is one of the most interesting parts of this article for me! There really seems to be a contrast here between what Michelle Moravec suggests regarding writing in public as a means of countering isolation in academic writing and Sheila Brennan's experience of increased isolation from the academic community as a result of creating work publicly.

  2. www.trevorowens.org
    1. His research and writing has been featured in:

      Owens is open with his research and writings by providing various links to his featured work. Very beneficial. Highlighting this to take a look at these in the future.

    1. Seeing the annotations has provided deeper insight into the readings, it is interesting to see what others in this class are thinking.

      This is exactly what I was hoping would happen!

    1. type $ git commit.

      I type this and it says that git cannot auto-detect my email address...I can't write any more commands now

    2. Notepad++

      This is an amazing text editor :D

    3. Then change directory into it

      This is a bit awkwardly worded. It took me a second to understand what the instruction was.

    4. EXERCISE 2: Getting familiar with DHBox

      Do I hit enter after each command line? There's always an error message when I get to the pandoc -v part. Plus, I just tried to start from the beginning and now it won't let me log in in the command line.

    5. Make a new branch from your second last commit (don't use < or >).

      I tried using git checkout -b branchname <commit> without the < >, adding my own unique branch name, and got errors. Anyone else get this?
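
      A sketch of one way this step can go, with a hypothetical branch name and a made-up commit hash (yours will differ):

      $ git log --oneline              # list recent commits; copy the short hash of your second-last one, e.g. a1b2c3d
      $ git checkout -b experiment a1b2c3d

      Typing the literal < and > characters instead of replacing the placeholder can also produce shell errors such as the "syntax error near unexpected token" message reported a little further down.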

    6. cd..

      This command is listed as not found and I'm stuck in my repository from exercises 1-5.

    7. The response: Switched to a new branch 'experiment'

      NOPE! The response I got was -bash: syntax error near unexpected token `newline'. I don't know what that means. I've tried it twice.

    8. Helpfully, the Git error message tells you exactly what to do: type $ git config --global user.email "you@example.com" and then type $ git config --global user.name "Your Name". Now try making your first commit.

      Here is where I hit a wall. I am no longer getting the $ on new lines to type commands. I now have > on new lines. Attempting to fix now.
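
      For what it's worth, the > prompt usually means the shell thinks the line isn't finished yet, most often because of an unclosed quotation mark. A sketch of one way out, reusing the commands from the passage above:

      $ # press Ctrl+C first to abandon the half-finished line and get the $ prompt back
      $ git config --global user.email "you@example.com"
      $ git config --global user.name "Your Name"
      $ git commit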

    9. What you will do is create a new branch of your project from that point. You can think of a branch as like the branch of a tree, or perhaps better, a branch of a river that eventually merges back to the source

      What I really wish Git had was an actual visual of all the individual files in the main branch and all other branches, showing when and where they merge. That, I think, would help me see the progress of the project.
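
      Git does have a rough text-based version of this built in; it won't show individual files, but it does draw the branches and where they merge. A minimal example:

      $ git log --oneline --graph --all --decorate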

    1. What this work did was help me balance the scales, help me move between close and far, macro and micro, print and print trade, give me confidence to make strong statements.

      I like that he touches on the value of using both micro and macro to create the strongest statement possible.

    2. History writing is concise, precise, and selective: not telling your reader everything you know is central to how we present interpretations of the past

      To me this is a major problem for digital history: it could be represented incorrectly, whether by mistake or on purpose. Large data sets being broken down into easily understood snippets can lead to biases.

    1. the importance that his undergraduate philosophy major has had for his career as a neuroscientist —

      I find that working towards a minor in history has already benefited my major in Computer Science. Not only has it helped with reasoning, but it has also helped with communication skills (a skill that I feel is sometimes missing from my field).

    2. But can you imagine a ‘public literary criticism’?”

      This is very fascinating to me

    3. The study of literature, history, art, philosophy, and other forms of culture has been justly lauded by those whose business it is to teach those fields as a key means of providing students with a rich set of interpretive, critical, and ethical skills with which they can engage the world around them.

      I think this is very true. Oftentimes arts students are overlooked because all we do is "write essays" and read, but at the end of the day it is a skill set. Being able to communicate your thoughts in an effective manner is not something every major seeks to teach its students.

    1. Here is a bigger version of the 1808 map without stationers.

      As an archaeologist, I find it very interesting to create visual displays from data sets on a mass scale, which could be useful for predicting where artifacts or ruins might be uncovered.

    1. This is the goal of the macroscope: to highlight immediately what often requires careful thought and calculation, sometimes more than is possible for a single person

      I see a problem arising where data (which always has the risk of being misinterpreted or misrepresented) can be used incorrectly on a much larger scale.

    2. mired in evidence or lost in the noise

      Creating historians who are skilled in both macro and micro history is very useful and seems to me like the future of the discipline.

    1. The value of our work is too wrapped up in the scarcity of sources themselves, rather than just the narratives that we weave with them.

      This is something that has always struck me as odd: a lot of important history, or the things people deem important, is based on very few sources, which makes the rarity of it all part of the appeal for readers and other researchers; it adds a challenge. Would we lose this effect if too much information (if there is such a thing) were made available?

    1. I have contributed to more than 30 digital projects

      Clearly a person who is open to digital humanities.

    1. permanent URI’s on their cache of scanned source materia

      We have to be careful assuming URIs are permanent, as people remove or take things down for various reasons. When something is hosted, it costs money to keep it accessible, and over time it might be removed to save on cost!

    1. Shawville Equity

      I am sure this will be an exciting read... being from the Ottawa region it will be fun studying something close to home.

    1. What is pernicious about this French vs. Python or Japanese vs. Ruby conversation is that it is based on a false equivalency hinging upon the slipperiness of a shared word: language

      Personally, I don't agree with considering Python or Ruby "languages" per se, as they are what I consider code. I have some knowledge of Python, albeit very minimal, but I still would not consider it a language in the same sense as Japanese.