35 Matching Annotations
  1. Dec 2022
  2. Nov 2022
    1. Preserving web content never really left my mind ever since taking screenshots of old sites and putting them in my personal museum. The Internet Archive’s Wayback Machine is a wonderful tool that currently stores 748 billion webpage snapshots over time, including dozens of my own webdesign attempts, dating back to 2001. But that data is not in our hands. Should it? It should. Ruben says: archive it if you care about it: The only way to be sure you can read, listen to, or watch stuff you care about is to archive it. Read a tutorial about yt-dlp for videos. Download webcomics. Archive podcast episodes.

      Should people have their own webarchive? A long list of pros and cons comes to mind. For several purposes a 3rd party archive is key, for others having things locally is good enough. For other situations having an off-site location is of interest. Is this less a question of webarchiving and more a question of how wide the scope should be of one's own 3-2-1 back-up choices? I find myself more frequently thinking about the processes at e.g. the National Archive in The Hague, where a lot comes down to knowing what you will not keep.

  3. Mar 2022
  4. Feb 2022
  5. Jan 2022
    1. The databases include multiple copies of some titles. But they will never provide all the copies of, say, “The Wealth of Nations” and the early responses it provoked.

      The exact same could be said of the early web, which hasn't been evenly archived and isn't easily searchable, so responses to early blog articles may not be easily found amidst a mountain of noise.

    2. commercial enterprises like Alexander Street Press, which offers libraries beautifully produced collections of everything from Harper’s Weekly to the letters and diaries of American immigrants.
    3. Early English Books Online offers a hundred thousand titles printed between 1475 and 1700. Massive tomes in Latin and the little pamphlets that poured off the presses during the Puritan revolution—schoolbooks, Jacobean tragedies with prompters’ notes, and political pamphlets by Puritan regicides—are all available to anyone in a major library.
    4. Chadwyck-Healey and Gale, which sell their collections to libraries and universities for substantial fees.
  6. Dec 2021
    1. Now we update as needed and make good use of the Internet Archive WayBack Machine for legacy or potentially unstable URLs.

      Stanford runs their own archive instance (https://swap.stanford.edu/). Why shouldn't the LOC, too?

  7. Nov 2021
    1. "The Guide to Social Science Data Preparation and Archiving is aimed at those engaged in the cycle of research, from applying for a research grant, through the data collection phase, and ultimately to preparation of the data for deposit in a public archive: " from tweet

  8. Oct 2021
  9. Aug 2021
  10. Jul 2021
    1. It's great to enhance the Internet Archive, but you can bet I'm keeping my local copy too.

      Like the parent comment by derefr, my actual, non-hypothetical practice is saving to the Wayback Machine. Right now I'm probably saving things at a rate of half a dozen a day. For those who are paranoid and/or need offline availability, there's Zotero https://www.zotero.org. Zotero uses Gildas's SingleFile for taking snapshots of web pages, not PDF. As it turns out, Zotero is pretty useful for stowing and tracking any PDFs that you need to file away, too, for documents that are originally produced in that format. But there's no need to (clumsily) shoehorn webpages into that paradigm.

      If you do the print-to-PDF workflow outlined earlier in the thread, you'll realize it doesn't scale well, requiring too much manual intervention and discipline (including taking care to make sure it's filed correctly; hopefully you remember the ad hoc system you thought up last time you saved something), that it's destructive, and that it ultimately gives you an opaque blob. SingleFile-powered Zotero mostly solves all of this, and it does it in a way that's accessible in one or two clicks, depending on your setup. If you ever actually need a PDF, you can of course go back to your saved copy and produce a PDF on-demand, but it doesn't follow that you should archive the original source material in that format.

      My only reservation is that there is no inverse to the SingleFile mangling function, AFAIK. For archival reasons, it would be nice to be able to perfectly reconstruct the original, pre-mangled resources, perhaps by storing some metadata in the file that details the exact transformations that are applied.

    1. Jonathan Zittrain in The Rotting Internet Is a Collective Hallucination - The Atlantic (07/08/2021 22:10:42)

    1. Ebooks don’t have those limitations, both because of how readily new editions can be created and how simple it is to push “updates” to existing editions after the fact. Consider the experience of Philip Howard, who sat down to read a printed edition of War and Peace in 2010. Halfway through reading the brick-size tome, he purchased a 99-cent electronic edition for his Nook e-reader: As I was reading, I came across this sentence: “It was as if a light had been Nookd in a carved and painted lantern …” Thinking this was simply a glitch in the software, I ignored the intrusive word and continued reading. Some pages later I encountered the rogue word again. With my third encounter I decided to retrieve my hard cover book and find the original (well, the translated) text. For the sentence above I discovered this genuine translation: “It was as if a light had been kindled in a carved and painted lantern …” A search of this Nook version of the book confirmed it: Every instance of the word kindle had been replaced by nook, in perhaps an attempt to alter a previously made Kindle version of the book for Nook use. Here are some screenshots I took at the time: It is only a matter of time before the retroactive malleability of these forms of publishing becomes a new area of pressure and regulation for content censorship. If a book contains a passage that someone believes to be defamatory, the aggrieved person can sue over it—and receive monetary damages if they’re right. Rarely is the book’s existence itself called into question, if only because of the difficulty of putting the cat back into the bag after publishing.

      This story of find and replace has chilling future potential. What if a dictatorial government doesn't like your content? It can be all too easy to remove the digital versions and replace them whole hog with "approved" ones.

      Where does democracy live in such a world? Consider similar instances when the Trump administration forced the disappearance of government websites and data.

  11. Apr 2021
    1. A tool targeted at journalists that appears to be a silo-based app for backing up/archiving articles on the web as well as providing analytics, newsletter/email functionalities, and other options.

  12. Mar 2021
    1. I love this idea. I have a fairly extensive personal commonplace book and collect and archive tons of material, but really should delve more deeply into the topic. I'd be particularly interested in the taxonomies portions you've outlined.

  13. Jan 2021
    1. Twitter threads gave illness a name and a face, grounding the dread in particular bodies and disparate — if often overlapping — experiences. They placed these experiences in history, creating an archive of disease, fear, rage, and hope that will persist even as these feelings — and some of these people — have passed.

      Archives are only worth their weight in water if interested parties can find what they're looking for. When artifacts aren't gathered and curated into public-facing unities or collections, then history elides them until further notice. These threads are still floating in the sprawl of the Twitterverse, placed into history and drowned out by an ocean of pure, frantic noise. What this piece makes evident to me is the need for restoration: that they need to be resurfaced, preserved, made visible again.

  14. Nov 2020
    1. Controlled tagging for groups

      Hi Jon, I would need for our group both the ability of controlled tagging and the exporting feature - the latter would be a replacement of the (nonexisting?) feature of archiving a note instead of deleting. Thanks a lot in advance!

  15. Aug 2020
  16. Jun 2020
    1. Good intro to the Zettelkasten note management method.


      Further reference:

      The Zettelkasten Method - Lesswrong 2 »Link«. on how to create "physical" Zettelkasten notes.

  17. Apr 2020
    1. However, as stated by Pourret [18], a majority of the journals in geochemistry also have a green colour according to the SHERPA/RoMEO grading system, indicating that preprint (and the peer-reviewed postprint version) articles submitted to these journals can be freely shared on a preprint server, without compromising authors’ abilities to publish in parallel in those journals. Moreover, Pourret et al. [17] highlighted that the majority of journals in geochemistry allow authors to share preprints of their articles (47/56; 84%).
      • Most journals in geochemistry allow green-mode archiving (Green OA), that is, archiving research documents, data, and preprint versions of papers in non-profit repositories (for example a campus repository).

      • In 2020, this fact is still not widely known among lecturers/researchers. They tend to accept being controlled by journals in the publication process, without any wish to argue for keeping ownership of their papers (to retain copyrights).

  18. Dec 2019
  19. Nov 2019
  20. Aug 2018
    1. Venetians

      The Venetian Republic, which lasted a millennium, is famous for its archival practice. It documented virtually everything.

  21. Jul 2018
  22. Sep 2017
  23. www.softwareheritage.org
    1. This is interesting. Could it become something like LOCKSS / CLOCKSS for software? I like that you can check if your own code is already in their archive.

      It's a French initiative, and was founded by https://en.wikipedia.org/wiki/French_Institute_for_Research_in_Computer_Science_and_Automation. I don't know what their long-term sustainability model is going to be.

  24. Jul 2017
  25. Apr 2017
    1. What technology does the archive use? The archiving system fetches links using an enhanced version of wget, with a little extra intelligence about fetching dependencies. Every crawled page gets stored in a single directory, and the links rewritten to point to the local copy.

      Simple explanation
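
      The link-rewriting step in the quoted passage can be sketched roughly as follows. This is only an illustration under stated assumptions, not the archive's actual code: the names `local_name` and `rewrite_links`, the hash-based file naming, and the regex that only handles double-quoted `href`/`src` attributes are all hypothetical simplifications.

```python
import hashlib
import posixpath
import re
from urllib.parse import urljoin, urlsplit

# Assumption: only double-quoted href/src attributes are handled.
ATTR_RE = re.compile(r'\b(href|src)="([^"]*)"')


def local_name(url: str) -> str:
    """Map an absolute URL to a flat file name inside one archive directory.

    Hypothetical naming scheme: a short hash of the URL plus its extension.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    ext = posixpath.splitext(urlsplit(url).path)[1] or ".html"
    return digest + ext


def rewrite_links(html: str, page_url: str, fetched: set[str]) -> str:
    """Point links at local copies when the target was fetched; leave others alone."""

    def replace(match: re.Match) -> str:
        attr, value = match.group(1), match.group(2)
        # Resolve relative links against the page they appeared on.
        absolute = urljoin(page_url, value)
        if absolute in fetched:
            return f'{attr}="{local_name(absolute)}"'
        return match.group(0)

    return ATTR_RE.sub(replace, html)


if __name__ == "__main__":
    page = '<img src="logo.png"><a href="/about">About</a><a href="https://other.example/">x</a>'
    fetched = {"https://example.com/logo.png", "https://example.com/about"}
    print(rewrite_links(page, "https://example.com/index.html", fetched))
```

      Links to pages that were not crawled keep their original URL, so the saved copy still points outward where no local file exists.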

  26. Sep 2014
    1. The cacophony of the crowd erases the past and affirms the present. It started with search and now it's accelerated with the now web. I don't know where it leads but I almost want a remember button — like the like or favorite. Something that registers something as a memory — as a salient fact that I for one can draw out of the stream at a later time.