30 Matching Annotations
  1. Jul 2024
    1. How about some self-help?!

      This passive "we are consumers" crap is exactly the problem...

      I bought the print book for 22 euros, cut off the spine with a circular saw, and ran the 208 pages through my ADF scanner (Brother ADS-3000N, 150 EUR used). Without preparation, that is maybe half an hour of work. Then the scans still need rotating, cropping, and leveling before they go through tesseract. Tesseract needs a fast CPU.
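
      A minimal sketch of the tesseract step described above, assuming the scans have already been rotated, cropped, and leveled and sit in one folder as PNG files. The helper names, the folder layout, and the `deu` language code are assumptions for illustration; only the `tesseract <image> <outputbase> -l <lang> hocr` call is standard CLI usage.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_cmd(image: Path, out_dir: Path, lang: str = "deu") -> list[str]:
    """Build the tesseract call that writes <out_dir>/<image stem>.hocr."""
    out_base = out_dir / image.stem  # tesseract appends ".hocr" itself
    return ["tesseract", str(image), str(out_base), "-l", lang, "hocr"]


def ocr_pages(scan_dir: Path, out_dir: Path, lang: str = "deu") -> list[Path]:
    """Run tesseract over every PNG in scan_dir, in sorted (page) order."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image in sorted(scan_dir.glob("*.png")):
        # check=True stops the batch on the first failing page
        subprocess.run(tesseract_cmd(image, out_dir, lang), check=True)
        written.append(out_dir / f"{image.stem}.hocr")
    return written
```

      Calling `ocr_pages(Path("scans"), Path("hocr"))` would leave one hOCR file per page. The pages are processed one after another here; since each page is independent, running several tesseract processes in parallel is the obvious way to use a fast multi-core CPU.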

      Right now I am proofreading the hOCR files from tesseract. Later I will make a PDF out of them and upload it to annas-archive.org via libgen.rs. One problem less.
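
      Proofreading hOCR output can be narrowed down by looking at the per-word confidence that tesseract records. The following is a rough sketch, not the workflow used here: it assumes each word is a `span` with class `ocrx_word` whose `title` attribute carries an `x_wconf` value, as in tesseract's hOCR output. The class name, function name, and the threshold of 80 are illustrative assumptions.

```python
import re
from html.parser import HTMLParser


class LowConfidenceWords(HTMLParser):
    """Collect (word, confidence) pairs below a threshold from hOCR markup."""

    def __init__(self, threshold: int = 80):
        super().__init__()
        self.threshold = threshold
        self.flagged = []
        self._in_word = False
        self._conf = None  # confidence of the ocrx_word span being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "ocrx_word" in (attrs.get("class") or "").split():
            match = re.search(r"x_wconf\s+(\d+)", attrs.get("title") or "")
            self._conf = int(match.group(1)) if match else None
            self._text = []
            self._in_word = True

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Word spans contain no nested spans, so the next </span> ends the word.
        if tag == "span" and self._in_word:
            word = "".join(self._text).strip()
            if word and self._conf is not None and self._conf < self.threshold:
                self.flagged.append((word, self._conf))
            self._in_word = False


def low_confidence_words(hocr: str, threshold: int = 80):
    """Return the words whose x_wconf is below threshold, in document order."""
    parser = LowConfidenceWords(threshold)
    parser.feed(hocr)
    parser.close()
    return parser.flagged
```

      Feeding each page's hOCR text to `low_confidence_words` gives a short list of words to check first. A high confidence value does not guarantee a correct word, so this only prioritizes the proofreading and does not replace it.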

      I uploaded the hOCR files to https://github.com/milahu/enteignung. Maybe someone wants to help with the proofreading, then it will go 1 or 2 days faster.

      Man oh man... as an "IT insider" I am so bored by the normies who got stuck 20 years ago when it comes to IT and have no clue about linux, git, python, torproject, monero, ... but the main thing is to talk crap on telegram >:(

  2. Mar 2024
    1. ChatGPT Vision: The Best Way to Transform Your Paper Notes Into Digital Text

      Upload a photo to ChatGPT and ask it to transcribe the photo into text. Better than OCR? It infers meaning from the surrounding context, even though individual words may be wrong.

  3. Nov 2023
    1. https://docdrop.org/

      Can be used to run optical character recognition (OCR) on .pdf documents and return documents with selectable, machine-readable text.

  4. Sep 2023
  5. Jan 2023
  6. Oct 2022
    1. Worried about paper cards being lost or destroyed

      I am loving using paper index cards. I am, however, worried that something could happen to the cards and I could lose years of work. I did not have this worry when my notes were all online. Are there any apps that you are using to make a digital copy of the notes? Ideally, I would love to have a digital mirror, but I am not willing to do 2x the work.

      u/LBHO https://www.reddit.com/r/antinet/comments/y77414/worried_about_paper_cards_being_lost_or_destroyed/

      As a firm believer in the programming principle of DRY (Don't Repeat Yourself), I can appreciate the desire not to do the work twice.

      Note card loss and destruction is definitely a thing folks have worried about. The easiest thing may be to spend a minute or two every day making quick photo backups of your cards as you create them. Then if things are lost, you'll have a backup, and OCR (optical character recognition) software can likely pull your notes from the photos so you can recreate them if necessary. I've outlined some details I've used in the past. Incidentally, opening a photo in Google Docs will automatically do a pretty reasonable OCR on it.

      I know some have written about bringing old notes into their (new) zettelkasten practice, and the general advice here has been to pull in old material only as needed or as interest dictates, to ease the cognitive load of thinking you need to do everything at once. If you did lose everything and had to restore from backup, I suspect this would probably be the best advice for proceeding as well.

      Historically, many have worried about loss, but the only actual example of loss I've run across is that of Hans Blumenberg, whose zettelkasten from the early 1940s was lost during the war. He continued apace in another dating from 1947, accumulating over 30,000 cards at the rate of about 1.5 per day over 50-some-odd years.

  7. Sep 2022
  8. Aug 2022
  9. Jun 2022
    1. COCO-Text: Dataset for Text Detection and Recognition

      • 63K images
      • 145K text instances
      • Feature labels: machine printed / handwritten, legible / illegible, English / non-English script

      See also the COCO-Text V2 site.

  10. Feb 2022
    1. Free All-in-one PDF tools A reliable, intuitive and productive PDF Software

  11. Dec 2021
  12. Nov 2021
  13. Jul 2021
    1. T.LUCRETICARI

      Not going to be the prettiest version, but at least somewhat OCR'd for annotating!

    2. Titi Lucreti Cari De Rerum Natura Libri Sex, With a Translation and Notes, Volume 1, Edited by H. A. J. Munro (Lucretius)

      Testing out the OCR functionality of docdrop.org.

      I'm noticing that the pdf fingerprint of this text somehow matches that of other texts as there are a lot of non-related annotations on this page.

      Is docdrop doing something squirrelly with the fingerprint @dwhly?
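
      One possible explanation can be checked by hand. As far as I understand it (an assumption, not something this page confirms), Hypothesis identifies a PDF by the fingerprint that PDF.js reports, which comes from the first entry of the `/ID` array in the PDF trailer when one is present, with an MD5 hash of the first 1024 bytes as a fallback. If a tool wrote the same `/ID` into every file it produced, unrelated documents would share annotations. The sketch below is a simplified approximation with a hypothetical function name: it only finds hex-string IDs stored in plain text and ignores IDs written as literal strings.

```python
import hashlib
import re

# Matches the first hex string of an /ID array, e.g. /ID [<ab12...> <cd34...>]
ID_PATTERN = re.compile(rb"/ID\s*\[\s*<([0-9A-Fa-f]+)>")


def pdf_fingerprint(data: bytes) -> str:
    """Approximate a PDF fingerprint from raw file bytes."""
    matches = ID_PATTERN.findall(data)
    if matches:
        # With incremental updates the last trailer is the current one.
        return matches[-1].decode("ascii").lower()
    # Fallback assumed from PDF.js: hash of the first 1024 bytes.
    return hashlib.md5(data[:1024]).hexdigest()
```

      Running this over two PDFs that show each other's annotations, for example with `pdf_fingerprint(Path("a.pdf").read_bytes())`, and getting the same value for both would point at a shared `/ID` rather than at the annotation service.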

  14. Feb 2021
  15. Jan 2021
    1. Apart from a basic segmenter taken from OCRopus, a trainable line extractor is in the process of being implemented. Full trainability of layout analysis is of utmost importance to a truly universal OCR system, as text layout and its semantics vary widely across time and space, e.g. hand-crafted methods for printed Latin text are unlikely to work reliably on Arabic text or manuscripts with extensive interlinear annotation.

      Work-in-progress implementation of line segmentation in kraken.

  16. Oct 2020
  17. Jul 2020
  18. Apr 2020
    1. Adobe AcrobatPro.

      gImageReader is an excellent open source alternative. It runs both on Windows and Linux, and it provides a simple (yet powerful) frontend GUI to Google's robust open source OCR engine, Tesseract.

      I think an open source tool such as this is a better fit for the open annotation ecosystem that Hypothesis promotes, which is based on libre software and standards, than a proprietary (and expensive) tool such as Adobe AcrobatPro.

  19. Apr 2019
  20. Sep 2015
  21. Aug 2015