- Jul 2024
-
-
Today, data is abundant, but for the most part, unusable. Seventy percent of a data scientist’s job is just cleansing data. The modern software architecture encourages data to be hoarded, accessible only through proprietary APIs. And, even with proprietary APIs, the market for data integrations is expected to grow to a trillion dollars by the end of the decade. When humanity is spending the GDP of Indonesia just so that the data in System X can work with the data in System Y, the field of software engineering has failed us. So much data - data that could be used by new startups and nonprofits that couldn’t exist today - goes unused because it’s so difficult to access.
-
- Feb 2024
-
en.wikipedia.org
-
Conflict-free Replicated Data Type (CRDT) https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
-
- Jan 2023
-
citejournal.org
-
Big tech has benefited from an educational dynamic that consistently underfunds public education but demands increased technology to prepare the workers of the future, providing low-cost solutions in exchange for data and the potential for future product loyalty
This is a pattern most of us are familiar with. The best example I know is Apple's launch of the iPad in LA schools without saying, or knowing, how it would be used. Apple has a long history of testing its products out on users. Google habitually does the same, offering products for "free" in exchange for data and an expanded user base for its products.
-
- Jun 2021
-
loc.gov
-
This doesn't seem entirely trustworthy or useful.
The native name seems incorrect or missing for some languages, such as German and Hebrew, compared with https://gist.github.com/piraveen/fafd0d984b2236e809d03a0e306c8a4d
-
-
stackoverflow.com
-
The US Library of Congress has been designated the official registration authority by the ISO and they publish the entire, official, up-to-date list as a trivial to parse text file for free.
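To show what "trivial to parse" could look like, here is a minimal sketch. It assumes the list is a pipe-delimited text file with five fields per line (bibliographic alpha-3 code, terminologic alpha-3 code, alpha-2 code, English name, French name). The `LanguageCode` type, the `parse_language_codes` helper, and the two sample rows are illustrative, not taken from the annotated page; check the layout against the file actually published before relying on it.

```python
from typing import NamedTuple


class LanguageCode(NamedTuple):
    # Field order is an assumption about the published file layout.
    alpha3_bibliographic: str
    alpha3_terminologic: str  # may be empty
    alpha2: str               # may be empty
    english_name: str
    french_name: str


def parse_language_codes(text: str) -> list[LanguageCode]:
    """Parse pipe-delimited rows, skipping blank or malformed lines."""
    records = []
    for line in text.splitlines():
        # Drop a possible UTF-8 byte-order mark and surrounding whitespace.
        line = line.lstrip("\ufeff").strip()
        if not line:
            continue
        fields = line.split("|")
        if len(fields) != 5:
            continue  # not in the assumed five-field layout
        records.append(LanguageCode(*fields))
    return records


# Illustrative sample rows in the assumed layout (not the full list).
SAMPLE = "ger|deu|de|German|allemand\nheb||he|Hebrew|hébreu\n"

codes = parse_language_codes(SAMPLE)
# Index by two-letter code, ignoring entries that have none.
by_alpha2 = {c.alpha2: c for c in codes if c.alpha2}
```

If the layout assumption holds, each row carries only English and French names, so native names would have to come from another source.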
-
-
gist.github.com
-
gist.github.com
-
www.w3.org
-
Because ISO code lists were not always free and because they change over time, a key idea was to create a permanent, stable registry for all of the subtags valid in a language tag.
Why was it not free???
-
- Apr 2021
-
arxiv.org
-
Yang, K.-C., Pierri, F., Hui, P.-M., Axelrod, D., Torres-Lugo, C., Bryden, J., & Menczer, F. (2020). The COVID-19 Infodemic: Twitter versus Facebook. ArXiv:2012.09353 [Cs]. http://arxiv.org/abs/2012.09353
-
- Mar 2021
-
github.com
-
The repository also contains the datasets used in our experiments, in JSON format. These are in the data folder.
-
- Feb 2021
-
trailblazer.to
-
What this means is: I better refrain from writing a new book and we rather focus on more and better docs.
I'm glad. I didn't like that the book (which is essentially a form of documentation/tutorial) was proprietary.
I think it's better for documentation and tutorials to be community-driven, free content.
-
- Jan 2021
-
getbible.net
-
-
github.com
-
-
docs.api.bible
-
-
github.com
-
-
labs.bible.org
-
- Nov 2020
-
openlibrary.org
-
The ultimate goal of the Open Library is to make all the published works of humankind available to everyone in the world. While large in scope and ambition, this goal is within our grasp.
-
- Oct 2020
-
thispersondoesnotexist.com
-
-
- Apr 2020
-
lnakamur.files.wordpress.com
-
User subjects and data objects are treated as programmable matter, which is to say extractable matter.
yes.
-