84 Matching Annotations
  1. Last 7 days
    1. This is how Git really works. It has its own diff engine(s) (and accepts plugins), and has basically a full understanding of how changes are made and how to merge them

      Should be no problem. See copy-on-write file systems for implementations.

    2. addressing of the commits

      Why not? The commit can include a link to the original content. This would still create new addresses for each modification of a document.

    3. mutability in content addressing systems

      What's the basic problem here? A document should be addressable under the same identifier despite being changed over time. Depending on your notion of "document" this could be impossible (see https://doi.org/10.1002/meet.2008.14504503143)

    4. the address is some query

      In this case the address is delegated to another authority to look up the current file content.

    5. The system cannot always resolve merge conflicts

      Only a problem if edits take place independently at different places.

    6. Each file has a useful content address

      So the content address changes only if the file content changes. This is no problem if files are never changed.
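      The point can be illustrated with a minimal content-addressing sketch (assuming SHA-256 as the hash function; Git itself hashes a type header plus content):

```python
import hashlib

def content_address(data: bytes) -> str:
    """Derive an address purely from the content itself."""
    return hashlib.sha256(data).hexdigest()

a = content_address(b"hello")
b = content_address(b"hello")
c = content_address(b"hello!")

assert a == b  # identical content, identical address
assert a != c  # any change in content yields a new address
```

      As long as a file never changes, its address is stable; any edit necessarily produces a new address.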

    7. meta-data

      A name that links to a file is just an example of meta-data.

    8. after Jesse’s comments

      What did he say?

  2. Nov 2018
    1. Wikidata has still a lower normative burden of rules and policies than Wikipedia

      Hopefully this will not change too fast!

    2. For the second, we could try to detect inconsistencies, either by inspecting samples of the class hierarchy

      Yes, that's what I do when doing quality work on the taxonomy (with the tool wdtaxonomy)

    3. these are not the only reason behind the trends observed. An examination of the peaks

      See http://wikicite.org/statistics.html for similar phases and events, better analysed through history than through calculations.

    4. Wikidata quality assessment

      I guess the average values are spoiled by outliers. A look at the distribution would be interesting instead of average and median values only.
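      The effect of outliers on the average can be seen in a toy example (the numbers are made up for illustration):

```python
import statistics

# Hypothetical quality scores: mostly similar values plus two extreme outliers.
scores = [10, 11, 12, 11, 10, 12, 11, 500, 800]

mean = statistics.mean(scores)
median = statistics.median(scores)

print(mean)    # pulled far upwards by the two outliers
print(median)  # barely affected by them
```

      A plot of the distribution would reveal the outliers directly, which the two summary values alone cannot.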

    5. Ontology depth

      I'm curious about the depth distribution to compare my findings on other classification systems, including Wikipedia categories: https://arxiv.org/abs/cs/0604036

    6. Possible relations between Items

      This only includes properties of data-type item?! It should be made clearer, because the majority of Wikidata properties have other data types.

    7. sed in the present analysis

      I'd add the number of classes with connected Wikipedia articles because these provide definition and context

    8. We used the following keywords:‘ontology metrics’, ‘ontology evaluation framework’, and ‘ontology evaluation’. From the results,we selected only papers including primarily structural metrics.

      A similar study on metrics and evaluation of classifications, taxonomies, thesauri and other knowledge organization systems would be interesting!

    9. and as such not connected to editing activities

      Other aspects of Wikidata ontology quality can also be connected to editing activities, but this cannot be measured from the Wikidata data dumps. It would need to take into account external data, such as applications that make use of Wikidata (Histropedia, Wikipedia instances...)

    10. Creating KGs is not trivial.

      This applies to universal KGs in particular. Domain-specific KGs can have any level of complexity - can they still be called knowledge graphs then?

    11. A KG typically spans across several domains and is built on top of a conceptual schema, or ontology, which defines what types of entities (classes) are allowed in the graph, alongside the types of properties they can have

      Wikidata differs from a typical KG as it is not built on top of classes (entity types). Any item (entity) can be connected by any property. Wikidata's only strict "classes" in the sense of KG classes are its data types (item, lemma, monolingual string...).

  3. Sep 2018
  4. Aug 2018
    1. Electronic Recording Machine Accounting (ERMA)

      It also introduced bank account numbers, see https://en.wikipedia.org/wiki/Electronic_Recording_Machine,_Accounting

    2. Venetians

      The Venetian Republic, which lasted a millennium, is famous for its archival practice. It documented virtually everything.

    1. It will improve the lives of all of us

      Like access to drinking water and healthcare?!

    2. wants to build AIs that will take on substantial responsibility in the real world in the future

      AIs cannot take on responsibility; only humans, as ethical beings, can!

    1. If I have a book that has understanding for me, a spiritual advisor who has a conscience for me, a physician who judges my diet for me, and so on, then I need not trouble myself. I have no need to think, if only I can pay; others will take over the tiresome business for me.

      Kant on artificial intelligence

  5. May 2017
    1. we should be critical with over-emphasising metadata as a different category, and accompanying it with myriads of specifically designed technologies and tools as we often see in digital humanities

      and in libraries

    2. metadata is not made of a different substance, and does not possess a different essence, as compared to plain data. They both are made of the same stuff, so to speak. But some data, which happens to represent other data rather than other kinds of things in the world, may play a meta role in some circumstances.

      Thanks for this paragraph!

    3. we must conclude that metadata should be treated as any other data. We can use the same tools, languages, approaches and techniques to deal with metadata as we do for plain data. We don’t need specifically designed repositories to hold metadata; good old databases should be enough. We don’t need metadata standards; data standards should work. We don’t need metadata schemas; data schemas should do


    4. we should conclude that we were wrong in asserting that all fruit is juicy

      Note that the Liskov substitution principle does not apply to things like fruits and bananas but to our concepts of these things. The world is not made of categories and sub-categories; rather, humans structure and describe the world this way.

    5. represents

      There may be a subtle difference between "is about" and "represents", but never mind

    6. type 1 metadata should not be called “metadata” at all, but data structure, data schema, data model, or something like that, depending on your preferences and your particular field of expertise

      In my PhD thesis I summarized it as "methods, technologies, standards, and languages exist to structure and describe data" and identified general patterns in these ways to structure and describe data.

    7. we can strip data of its type 2 metadata

      Nope. If we remove type 2 metadata, e.g. "LastUpdatedTime", from a piece of data, there is still a time when the piece of data was updated. The information about this time is just not stored (= made explicit). The same applies to type 1 metadata, with explicit schemas, data types etc. and implicit assumptions about the structure of data.

    8. we cannot enter data in a database if we haven’t decided on what structure it will have first

      Well, we can, but in that case the data lacks metadata for proper understanding or processing. For instance, we could write and store a name as a string, e.g. "Luxemburg, Rosa", implicitly assuming the schema "surname, given name". Such a schema, once documented, would be metadata.
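      A small sketch of that example (the schema field names are my illustration, not from the article):

```python
# A name stored as a plain string, implicitly assuming "surname, given name".
record = "Luxemburg, Rosa"

# Documenting the implicit schema turns the assumption into metadata:
schema = ("surname", "given_name")
parsed = dict(zip(schema, (part.strip() for part in record.split(","))))

print(parsed)  # {'surname': 'Luxemburg', 'given_name': 'Rosa'}
```

      Without the documented schema, nothing in the stored string says which part is the surname.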

    9. such as the field names and types in a database schema

      that is the original use of the term in computer science, as introduced by Philip Bagley (1968)

    10. important concept

      the concept is important in other domains as well, but not under the name "metadata"

    11. in digital humanities

      and in libraries as well

  6. Mar 2017
    1. Wikipedia, with its open access editing, open data mining, discussion pages, version control, and encyclopedic scope is another model for bibliographic community building

      article should mention WikiCite and Wikidata

    2. There are still relatively few tools that provide a simple interface, without forcing users to wrangle with command line utilities and XML or JSoN serializations. As Kolkman suggests, there may be a deficit in education, outreach, and/or community-building, which inhibits the adoption of the most promising vocabulary management tools

      Rules for bibliographic data are often hidden in tools that make use of the data. Each tool imposes its own rules on the data. Separating rules from tools (e.g. with schema files) can help, but schemas are rarely applied to bibliographic data because they would uncover a lot of data as invalid/broken.
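      A toy sketch of what separating a rule from the tools could look like (the pattern and the records are invented for illustration):

```python
import re

# A minimal, tool-independent "schema" for one bibliographic field:
# ISBN-13 as a bare 13-digit string starting with 978 or 979.
ISBN13 = re.compile(r"97[89]\d{10}")

records = ["9783161484100", "3-16-148410-X", ""]
invalid = [r for r in records if not ISBN13.fullmatch(r)]

print(invalid)  # the documented rule uncovers the non-conforming records
```

      Once the rule lives outside any one tool, every tool can check against the same schema, and the broken records become visible.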

    3. Given that bibliographic vocabularies used outside traditional centralized distribution are, by nature, highly interdependent, they stand to benefit greatly from semantic versioning

      This would require a definition of "functionality" of bibliographic vocabularies.

    4. knowledge organization systems

      Is KOS used as more general concept or as synonym to "vocabularies"?

    5. foundation for creating and maintaining Web-based vocabularies

      There are some general guidelines, but I doubt that mixing rules for three types of vocabularies is helpful. Otherwise we end up with general rules about any kind of data.

    6. Bibliographic data fall into three general categories

      Ok, the article is not about vocabularies but about bibliographic vocabularies as defined here. In my words:

      • individual bibliographic records
      • bibliographic data formats and models
      • terminologies and other controlled vocabularies
    7. maintenance and repair studies

      Libraries have long been part of the infrastructure run by public services. It would be interesting to compare this with the maintenance of other infrastructures. There seems to be a general trend to prefer building new infrastructure instead of maintaining the existing one.

    8. vocabularies

      This term can be understood in many different ways. The article would have done better to give a definition of "vocabularies".

  7. Apr 2016
  8. Mar 2016
    1. which, unlike other iconic forms, is subject to specific rules.

      Rules, notations, patterns, grammars...

    2. diagrammatic representations in knowledge organization?

      And how do these representations relate to chapter 3.9 (pp. 151-158) of my dissertation (http://aboutdata.org/)?

    3. in order to develop new forms of knowledge organization.

      Transferring existing forms into diagrams should already be enough for that! Unfortunately the article remains somewhat vague here, especially since the Semantic Web and Conceptual Graphs come from computer science rather than from LIS.

    4. In traditional knowledge organization systems such as classifications and thesauri, three semantic relations can be expressed with natural language: equivalence, association, and hierarchy

      An overview of existing visualizations of knowledge organization systems (tree diagrams, among others) would be helpful here.

    5. sentential representation systems in the history of modern logic has obscured several important facts about diagrammatic systems

      In both cases these are formalisms, i.e. diagrams can be treated as equivalent to formal languages.

    6. Evidenzsuggestion

      Where does the term come from? Tobias Kraft (2014, p. 98) cites this article as its source, although the term is used only once here.

  9. Dec 2015
    1. yep

    2. Diane Hillmann

      sorry for not having her on my active radar yet. She does not seem to have spoken at SWIB yet (?)

    3. So linked data has got good marketing and a critical mass, in an environment where decision-makers want to do something but don’t know what to do.


    4. of vocabulary consensus and implementation. That’s not a problem that linked data solves.

      linked data does not "solve it" but at least it "addresses it".

    5. We primarily need common identifiers in use between systems

      Almost all I do with Linked Data is primarily the introduction and enforcement of identifiers and the creation of domain models. RDF is only secondary, to express the outcome of this relevant work. Sure, it could also be done with other technology, but if it is done, RDF is a natural choice.

    6. who can’t always even give us a reliable list in any format at all of what titles with what coverage dates are included in our license

      What do we pay them thousands of $€ for?! Why do we still buy this crap?

    7. to figure out when a record from one system represents the same platform/package/license as in another system

      again, it's all about identifiers and sameness. Linked Data does not solve it but encourages it much more than other technologies!

    8. Well, it’s a hard problem of domain modeling, harder than it might seem at first glance.

      Isn't library science supposed to solve these problems? What do we actually have library science for?

    9. Well, because the actual world we are modeling is complicated and constantly changing over the decades, it’s unclear how to formally specify this stuff, especially when it’s changing all the time

      that's why catalog records should regularly be updated and revised (which they are not)

    10. most of the records in our local catalog also haven’t been updated since they were added years, decades ago.

      If data is not updated, it is not usage data but archived material

    11. Google Books actually has a pretty good API,

      Even good APIs and good data do not help if libraries are incapable of making use of them. See https://danelleorange.wordpress.com/2015/12/02/in-which-your-blogger-loses-it-about-the-library-field/

    12. Why do we get from there instead of OCLC? Convenience? Price?  No easy way to figure out how to bulk download all records for a given purchased ebook package from OCLC? Why don’t the vendors cooperate with OCLC enough to have OCLC numbers on their records

      I fear that the acquisition of library holdings or licenses is still unconnected to any use cases for their bibliographic data (if there are any at all)

    13. A major linked data campaign may not be the most efficient, cost effective, or quickest way to solve those problems.

      There may be more efficient ways to solve the problems, but without a campaign it is hard to solve anything. Linked Data gives a reason to actually change something; it can also be used as a door opener!

    14. For the past 7+ years, my primary work has involved integrating services and data from disparate systems, vendors, and sources, in the library environment. I have run into many challenges and barriers to my aspired integrations. They often have to do with difficulties in data interoperability/integration; or in the utility of our data, difficulties in getting what I actually need out of data.

      me too, and probably many people involved in library IT

    15. and is considered a successor to Freebase by some

      Wikidata is not based on RDF internally, but it can be mapped to RDF, which is quite useful: https://query.wikidata.org/

    16. Modeling all our data as individual “triples” makes it easier to merge data from multiple sources

      Wikidata is not based on RDF on purpose, but it is mapped to RDF, which is quite useful: https://query.wikidata.org/

    17. abandoned in 2010

      So the hype ended somewhere between 2010 and 2012. SWIB conference started in 2009.

    18. can actually make the linked data more difficult to deal with practically

      Except that's the core benefit of Linked Data.

    19. while establishing standard shared URI’s for entities (eg `http://example.org/lccn/79128112`) is basically “authority control”.

      That's what a certain kind of librarian (including me) likes about Linked Data: it promotes best practice in cataloguing.

    20. I call the linked data model an “abstract data model“,

      I'd call it a data structuring language (comparable with other languages such as XML and JSON). Each data structuring language implies its own data model - in the case of RDF this model is based on graphs.
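      The difference can be sketched by expressing the same fact in both styles (the field names and URIs below are invented for illustration):

```python
import json

# JSON implies a tree-shaped data model:
as_json = json.dumps({"name": "Rosa Luxemburg", "born": 1871})

# RDF implies a graph model of (subject, predicate, object) triples:
as_triples = [
    ("http://example.org/person/1", "http://example.org/name", "Rosa Luxemburg"),
    ("http://example.org/person/1", "http://example.org/born", 1871),
]

# Merging graph data from another source is plain concatenation of triples;
# merging two JSON trees needs structure-specific rules.
merged = as_triples + [
    ("http://example.org/person/1", "http://example.org/died", 1919),
]
print(len(merged))  # 3
```

      The triple form trades the convenience of nesting for a model in which data from different sources merges trivially.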

    21. even before there were computers

      It started with writing in structures such as lists, tables, cards, forms etc.

    22. of how widespread or successful linked data has been in the wider IT world

      it hasn't.

  10. Oct 2015
    1. division criteria


    2. The interchangeability of the terms "vocabulary" and "controlled vocabulary"

      "controlled vocabulary" might be a superclass of "knowledge organization system". All KOS are controlled vocabularies, aren't they?

    3. prototype of the designed knowledge base

      Where is this knowledge base?!

    4. Knowledge Organization System (levels of knowledge organization)

      The image can only be understood with the following explanation.

    5. currently represented via the technology of Linked Open Data (LOD)

      Well, LOD may be the most prominent technology and for good reason but it's just one possible technology among others.

    6. with the objective of making its storing and access possible

      KOS may also be used for other purposes, such as a learning aid!

    7. its definition varies from author to author

      Wikipedia should give an overview, but it is also biased toward its article authors. The current article https://en.wikipedia.org/wiki/Knowledge_Organization_Systems is very brief, and the article https://en.wikipedia.org/wiki/Conceptual_system is mixed with articles about KOS in some languages. See https://www.wikidata.org/wiki/Q6423319 and https://www.wikidata.org/wiki/Q3622126 in Wikidata.

    8. DF13P01OVV013 "Knowledge Base for the Subject Area of Knowledge Organization" project
  11. Sep 2015
    1. people now think hypertext means the web

      See Nelson's reply to Tim Berners Lee about crediting him in Weaving the Web.

    2. In 1960 I had a vision of a world-wide system of electronic publishing, anarchic and populist, where anyone could publish anything and anyone could read it.  (So far, sounds like the web.)  But my approach is about literary depth-- including side-by-side intercomparison, annotation, and a unique copyright proposal.  I now call this "deep electronic literature" instead of "hypertext," since people now think hypertext means the web.