1,236 Matching Annotations

Feb 2025
media.dltj.org media.dltj.org

Video: Don't use spans or code for shortcuts and keystrokes by Kevin Powell, annotated

1
1. peter_murray 17 Feb 2025
  
  in Public
  
  🔗 Links
  
  The code from this video
  
  Relative colors
  
  MDN on kdb (includes samp examples too)
  
  :has() support table
  
  HTML & CSS tip of the week
  
  accessible technology
Visit annotations in context

Tags

accessible technology

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20250217T122424-keKj7tfOc2w-dont-use-spans-code-shortcuts-keystrokes/index.html
www.usmint.gov www.usmint.gov

2024 Annual Report

1
1. peter_murray 10 Feb 2025
  
  in Public
  
  FY 2024 unit costs increased for all circulating denominations compared to last year. The penny’s unit cost increased20.2 percent, the nickel’s unit cost increased by 19.4 percent, the dime’s unit cost increased by 8.7 percent,and the quarter-dollar’s unit cost increased by 26.2 percent. The unit cost for pennies (3.69 cents) and nickels(13.78 cents) remained above face value for the 19th consecutive fiscal year
  
  Unit costs for pennies and nickels is more than their face value
  
  central bank
Visit annotations in context

Tags

central bank

Annotators

peter_murray

URL

usmint.gov/content/dam/usmint/reports/2024-annual-report.pdf
www.lbjlibrary.org www.lbjlibrary.org

The LBJ the Nation Seldom Saw

1
1. peter_murray 09 Feb 2025
  
  in Public
  
  He talked about the difference between constructive action and obstructive action: “Any jackass can kick a barn down. But it takes a carpenter to build one.”
  
  The difference between constructive action and destructive action
  
  politics
Visit annotations in context

Tags

politics

Annotators

peter_murray

URL

lbjlibrary.org/life-and-legacy/the-lbj-the-nation-seldom-saw
media.dltj.org media.dltj.org

Video: The runaway debris threat to space technology | FT Tech by Financial Times Tech, annotated

3
1. peter_murray 07 Feb 2025
  
  in Public
  
  While collision avoidance alerts have become an almost weekly occurrence at ESA mission control, the world’s largest satellite operator, the SpaceX-owned Starlink, reported more than 49,000 risk mitigation manoeuvres in the six months to the end of May 2024. Its fleet has grown to a total of almost 7,000 craft in orbit.
  
  Starlink reported 49k risk-avoidance maneuvers in 6 months in 2024
  
  low earth orbit internet infrastructure
2. peter_murray 07 Feb 2025
  
  in Public
  
  On average, around a dozen “fragmentations”, break-ups caused by collisions or wear and tear, have been recorded every year over the past 20 years, and observations suggest they’re on the rise, amid booming demand for satellite services.
  
  Around a dozen orbital break-ups per year
  
  low earth orbit
3. peter_murray 07 Feb 2025
  
  in Public
  
  When a defunct Russian craft collided with an Iridium communications satellite in 2009, thousands of fragments were left behind, threatening more craft and more collisions, which would create even more debris. This cascading space junk scenario is the Kessler syndrome, named after a NASA scientist who foresaw the possibility back in the late 1970s.
  
  Origin of the phrase "Kessler Syndrome"
  
  low earth orbit
Visit annotations in context

Tags

internet infrastructure

low earth orbit

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20250206T191727-_jJc5LLmHsI-runaway-debris-threat-space-technology-ft-tech/index.html
media.dltj.org media.dltj.org

Video: No, We Did Not Ask For This... Now What? by Rev Ed Trevors, annotated

2
1. peter_murray 07 Feb 2025
  
  in Public
  
  So like I said, the prime minister, I can't remember his name of Quebec, he said, we're going to take this moment and we're going to invest. We're going to invest in new ways forward, new trade relationships. We're going to invest our money in making new economic friends. I love that mentality. I'm not going to be beholden to the whims of this person. I'm not going to be beholden to the whims of this guy. I'm going to choose a new way forward. The status quo, the status quo, it's just too volatile.
  
  Canada looks for its own path in response to Trump
  
  Trump2
2. peter_murray 07 Feb 2025
  
  in Public
  
  So one of the biggest things that ever happened in Christianity obviously was the death of Jesus Christ. As far as the church is concerned, everything is rolling along. You can think about the stories, the disciples, they're sitting there, they're listening, they're picking up on everything Jesus says and they're arguing with each other about which one of them is going to get to see it at the right of the left and they're doing all the things that humans do, but Jesus is right there. And Jesus is teaching and Jesus is forgiving and Jesus is being merciful and Jesus is being a good friend and he's sharing with them these universal truths, these eternal truths. He's sharing with people who in their day and time would not have been considered worthy to receive such teaching and yet here Jesus is, he's teaching this stuff. And then one night he gets arrested and the next morning he's beaten, he's whipped, he's crucified and he's dead within hours. He's dead. They take him down off the cross, off the implement of his demise and they put him in a tomb. Roll the rock across and it's over. Everything that they've been experiencing it's over. They have no idea, they're terrified. They think they're probably next that the religious leaders will send for them next, that they will, they'll be recipients of a similar fate. So they huddle away, they hide. They're not going outside. Again, everything they thought would happen. It's thrown up in the air. Now we know how the story goes. We know how the story goes. And the truth is the story goes and everything they knew was thrown up in the air because he comes back. And so despite the fact that here they went through this moment where their lives sort of came to this halt, the direction of their life came to a screeching halt. Then all of a sudden they get themselves, it takes two or three days and you can imagine as I would do anyway, I figure out another way. What does this mean for my life going forward? Well, it means that I'm going to have to live in a cave the whole time because I'm terrified that the religious leaders are going to hang me up or hang my family up and I don't want that or I'm going to be arrested. I'm going to be prosecuted and be persecuted for my faith in this guy. So you sort of start wrapping your head around how do I survive and then all of a sudden dude comes back. Mmm, it throws it all into turmoil again. And I've got to learn how to live with this new reality.
  
  The real-time death and resurrection story from the disciples' perspective
  
  religion
Visit annotations in context

Tags

Trump2

religion

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20250205T173128-PnVRRx0TBBk-no-not-ask-now-what/index.html
crsreports.congress.gov crsreports.congress.gov

OMB Releases OPEN Government Data Act Guidance

3
1. peter_murray 05 Feb 2025
  
  in Public
  
  Further, M-25-05 clarifies that the definition is intentionally broad: “a databaseprocured through a contract may be a data asset subject to the requirements of this guidance, even if thecontents of the database are owned by a private party.”
  
  Covered data includes procured databases owned by a private company
2. peter_murray 05 Feb 2025
  
  in Public
  
  The law expands on the requirement in theFreedom of Information Act (FOIA, 5 U.S.C. §552) for agencies to make electronic copies of previouslyreleased records more broadly available for public inspection.
  
  Builds on the FOIA
  
  Freedom of Information Act
3. peter_murray 05 Feb 2025
  
  in Public
  
  Open, Public, Electronic, and NecessaryGovernment Data Act (OPEN Government Data Act, P.L. 115-435, Title II),
  
  From the Foundations for Evidence-Based Policymaking Act
  
  Foundations for Evidence-Based Policymaking Act of 2018
  
  open government
Visit annotations in context

Tags

Freedom of Information Act

open government

Annotators

peter_murray

URL

crsreports.congress.gov/product/pdf/IN/IN12502
arstechnica.com arstechnica.com

It seems the FAA office overseeing SpaceX’s Starship probe still has some bite

1
1. peter_murray 03 Feb 2025
  
  in Public
  
  The good news is there were no injuries or reports of significant damage from the wreckage that fell over the Turks and Caicos. "The FAA confirmed one report of minor damage to a vehicle located in South Caicos," an FAA spokesperson told Ars on Friday. "To date, there are no other reports of damage." It's not clear if the vehicle owner in South Caicos will file a claim against SpaceX for the damage. It would be the first time someone makes such a claim related to an accident with a commercial rocket overseen by the FAA.
  
  First damage from a commercial space launch in US
  
  low earth orbit
Visit annotations in context

Tags

low earth orbit

Annotators

peter_murray

URL

arstechnica.com/space/2025/02/it-seems-the-faa-office-overseeing-spacexs-starship-probe-still-has-some-bite/
www.scientificamerican.com www.scientificamerican.com

How Many Planets Are in the Solar System?

2
1. peter_murray 02 Feb 2025
  
  in Public
  
  Such slipperiness is a universal theme in the natural sciences: definitions are a human conceit, not a natural occurrence. Look at colors; we have definitions for what the wavelength cutoff is for orange versus red, but that boundary was a choice, not something observed. Blue light and red light are two flavors of the same thing, just with different wavelengths. And all the colors in between literally lie along a spectrum; one blends smoothly into the next with no sharp steps to discern between them.It’s not hard to come up with many more examples: sex, gender, political affiliation, species, and more. All of these exist on a spectrum. The differences are obvious between examples at opposite ends of a spectrum, tempting you to put them in a binary category, but when you compare any two samples close together on that spectrum, the differences are far harder to tease out. So where do you draw the line?Humans like putting things in clear-cut categories, but in general, nature isn’t so picky. Acknowledging that can make life a lot easier and help us understand the universe—and ourselves—better.
  
  “Definitions are a human conceit, not a natural occurrence”
  
  philosophy
2. peter_murray 02 Feb 2025
  
  in Public
  
  Eventually, in 2006, the IAU settled on a three-part definition: a planet orbits the sun, is sufficiently massive to form itself into a round shape by gravity and has “cleared the neighborhood around its orbit,” which means it’s the most gravitationally dominant body there. (This term did initially cause some confusion because it could be misinterpreted to mean that the planet can sweep its orbit completely clean of any and all other bodies, which is impossible.) If a body fulfills the first two conditions but not the third, it’s called a “dwarf” planet.
  
  IAU’s definitions of “planet” and “dwarf planet”
  
  Later: “Objects need to be at least 400 km or so in size to round out via self-gravity”
  
  space exploration
Visit annotations in context

Tags

space exploration

philosophy

Annotators

peter_murray

URL

scientificamerican.com/article/how-many-planets-are-in-the-solar-system/
Jan 2025
stratechery.com stratechery.com

DeepSeek FAQ

3
1. peter_murray 27 Jan 2025
  
  in Public
  
  Distillation is a means of extracting understanding from another model; you can send inputs to the teacher model and record the outputs, and use that to train the student model. This is how you get models like GPT-4 Turbo from GPT-4. Distillation is easier for a company to do on its own models, because they have full access, but you can still do distillation in a somewhat more unwieldy way via API, or even, if you get creative, via chat clients.
  
  Distillation
  
  Using the outputs of a "teacher model" to train a "student model".
  
  building LLMs
2. peter_murray 27 Jan 2025
  
  in Public
  
  DeepSeekMLA was an even bigger breakthrough. One of the biggest limitations on inference is the sheer amount of memory required: you both need to load the model into memory and also load the entire context window. Context windows are particularly expensive in terms of memory, as every token requires both a key and corresponding value; DeepSeekMLA, or multi-head latent attention, makes it possible to compress the key-value store, dramatically decreasing memory usage during inference.
  
  Multi-head Latent Attention
  
  Compress the key-value store of tokens, which decreases memory usage during inferencing.
  
  building LLMs
3. peter_murray 27 Jan 2025
  
  in Public
  
  The “MoE” in DeepSeekMoE refers to “mixture of experts”. Some models, like GPT-3.5, activate the entire model during both training and inference; it turns out, however, that not every part of the model is necessary for the topic at hand. MoE splits the model into multiple “experts” and only activates the ones that are necessary; GPT-4 was a MoE model that was believed to have 16 experts with approximately 110 billion parameters each. DeepSeekMoE, as implemented in V2, introduced important innovations on this concept, including differentiating between more finely-grained specialized experts, and shared experts with more generalized capabilities. Critically, DeepSeekMoE also introduced new approaches to load-balancing and routing during training; traditionally MoE increased communications overhead in training in exchange for efficient inference, but DeepSeek’s approach made training more efficient as well.
  
  Mixture-of-Experts
  
  Split LLM models into components with specialized knowledge, then activate only the modules that are required to address a prompt.
  
  building LLMs
Visit annotations in context

Tags

building LLMs

Annotators

peter_murray

URL

stratechery.com/2025/deepseek-faq/
media.dltj.org media.dltj.org

Video: MAGA And The Danger Of Empathy by Rev Ed Trevors, annotated

2
1. peter_murray 27 Jan 2025
  
  in Public
  
  point us towards ideology, like Christian nationalism, point us towards understandings and values that are held by by maga. Empathy is the worst for those folks because empathy. Well, empathy gets in the way of hatred. Empathy gets in the way of blame. Empathy gets in the way of of our desire to destroy another or a group of people. And for what it's worth, Christian nationalism is all about all about the dangers of the other, all about pointing out the how dangerous this group is or how dangerous that group is and how you must be careful of this group.
  
  Empathy is antithetical to christian nationalism
  
  nationalism
2. peter_murray 27 Jan 2025
  
  in Public
  
  Empathy is in fact essential for the Christian life. Whether it's a new concept or an old concept, the idea that we learn, we hear one another stories, hearing one another stories is essential to connecting. It's essential for building community. It's essential for building trust. It's essential for coming to a place of understanding and ultimately it's essential to love.
  
  Empathy is essential for the Christian life
  
  spirituality
Visit annotations in context

Tags

spirituality

nationalism

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20250127T092431-7Ni0xnQq_bQ-maga-danger-empathy/index.html
www.wnycstudios.org www.wnycstudios.org

Wars Are Won By Stories | On the Media | WNYC Studios

5
1. peter_murray 26 Jan 2025
  
  in Public
  
  Whispering was a subspecialty of propaganda. You had to be trained in it specially. Not just anybody could do it. Now, you might think that spreading rumors means talking as loudly and widely as possible, but that's not true. The coordination of loose lips had to be as tight as the coordination of special forces. I'll tell you how it worked. The Allies put together rumors at something called the rumor factory. The head of this section had the enviable title master whisperer. The whispers would go out through strategic networks. In a given region, a chief whisperer would organize the whispers, give them to agents. They would give them to sub-agents. Mostly sub-agents were ordinary civilians. You could be a reliable sub-agent in a propaganda network and not even know it.
  
  "Whispering" technique of propaganda
2. peter_murray 26 Jan 2025
  
  in Public
  
  Now talk about Adele Kibre. Elise Graham: Kibre had, without knowing it, been training all of her life to be a spy. She had a PhD in classics from the University of Chicago. Because women couldn't really go into the professoriate in these years, she became a professional archive hunter, hopping from archive to archive across Europe, earning money by taking photographs of rare texts for scholars back home in the States. By the time she was recruited by the OSS, she had been doing this for almost a decade. Kibre became the most productive document acquisitions agent working for the Allies. She was working undercover in supposedly neutral Sweden. Sweden could continue to be neutral as long as no spies operated in Sweden, so Kibre had to work completely undercover. Furthermore, Sweden tilted in the direction of the Axis. The Swedish police had trained with the Gestapo so this was actually still a very dangerous place to be a spy. She acquired and sent home on microfilm not just industry directories and trade magazines and railway schedules and German newspapers and atlases and maps and technical journals, but stuff that only went to vetted subscribers who were sympathetic to the Nazis. A massive number of documents that went all over the world on behalf of the Allies, including into the library at a little place in New Mexico called Los Alamos.
  
  Exploring archives, and knowing what you are looking for
3. peter_murray 26 Jan 2025
  
  in Public
  
  All the work of those professors and librarians would have been nothing if Sherman Kent hadn't been their spokesman. What he was trying to persuade the military of was that most of what an intelligence agency needs to know can come from public sources. In the right hands, paper can be more effective than bombs. It could tell the right reader what factory should be bombed to stop the production of ball bearings. It's more useful to stop the production of ball bearings than to stop the production of fighter planes because ball bearings are used to create fighter planes. How do you know what factory? By comparing minute fluctuations in railroad rates and then you find its address by looking at a street directory. It was really adventurous and imaginative reading in the New York Public Library and the Library of Congress that allowed the Allies to come to these insights. You could see why intelligence analysis proved so powerful as a weapon of espionage why it became the basis of modern spycraft. However, none of that is actionable unless you can convince the military that what you're pulling out of a street directory is actually useful. That was Sherman Kent's job, or at least part of his job. He would fight with the generals on behalf of his analysts and he would say, "Listen, what these guys are coming up with is actually useful."
  
  Research, and selling/explaining the research
4. peter_murray 26 Jan 2025
  
  in Public
  
  Rumor has it that to this day, the CIA does recruiting at the annual American Library Association conference.
  
  CIA recruits at the American Library Association conference
  
  There was at least a resolution passed by ALA Council to prohibit CIA recruiting: Librarians Protest Against CIA Presence at ALA Annual Conference in D.C., Submit Resolution | Library Journal
5. peter_murray 26 Jan 2025
  
  in Public
  
  In early 1945, a fellow named Henry DeWolf Smyth was called into an office in Washington and asked if he would write this book that was about a new kind of weapon that the US was developing. The guy who had called him into his office, Vannevar Bush, knew that by the end of the year, the US was going to drop an atomic bomb that had the potential to end the war, but also that as soon as it was dropped, everybody was going to want to know what is this weapon, how was it made, and so forth. Smyth accepted the assignment. It was published by Princeton University Press about a week after the bomb was dropped. It explained how the US made the bomb, but it told a very specific kind of story, the Oppenheimer story that you see in the movies, where a group of shaggy-haired physicists figured out how to split the atom and fission, and all of this stuff. The thing is, the physics of building an atomic bomb is, in some respects, the least important part. More important, if you actually want to make the thing explode, is the chemistry, the metallurgy, the engineering that were left out of the story. The book was published the way it was so that it would satisfy people's curiosity but not give other countries the information that you actually need to build a bomb. It was a misinformation campaign, the very last one of the war, and the most successful because it still utterly dominates the way that we think about how the bomb was built and how the war was won.
  
  A writer was embedded in the Manhattan Project to make a book about the atomic bomb
Visit annotations in context

Annotators

peter_murray

URL

wnycstudios.org/podcasts/otm/articles/wars-are-won-by-stories
media.dltj.org media.dltj.org

Video: The True and Tragic Tale of Ballerina Farm and Candice Miller by Siobhan Brier Aguilar, annotated

1
1. peter_murray 17 Jan 2025
  
  in Public
  
  Commentary on Meet the queen of the ‘trad wives’ (and her eight children) and How an Instagram-Perfect Life in the Hamptons Ended in Tragedy - The New York Times
  
  journalism
Visit annotations in context

Tags

journalism

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20250117T085458--CoG6QGAHfQ-true-tragic-tale-ballerina-farm-candice-miller/index.html
radiolab.org radiolab.org

Curiosity Killed the Adage - Transcript

3
1. peter_murray 13 Jan 2025
  
  in Public
  
  MARIA PAZ: They're not flying, they're not weightless. They're not in zero G, but instead, up there in the Space Station ... MICHELLE THALLER: The reason you can put your pen right beside you and it'll just float when you let go of it, the pen and you are falling towards the Earth at exactly the same rate. MARIA PAZ: What? MICHELLE THALLER: They're falling. ANNIE: They're falling? MICHELLE THALLER: Yes! Every second of every day they're up there, their whole space containment, their capsule, their space station, everything's falling. They're freely falling towards the Earth. ANNIE: Oh my God! MICHELLE THALLER: I mean, if you've ever been on, like, a really great roller coaster that drops, that kind of thing, I mean, that is what they feel. They feel like they're falling. MARIA PAZ: Ugh, that's nauseating! MICHELLE THALLER: Oh, yeah. Some people get very sick. [laughs]
  
  Astronauts in orbit don't feel weightless, they feel constant falling
  
  low earth orbit
2. peter_murray 13 Jan 2025
  
  in Public
  
  SCOTT WEIDENSAUL: And when they migrate to Africa, from the moment they leave their breeding grounds in central Europe all the way south to Africa, through the entirety of the winter in Africa and all the way back on their spring migration, they never touch ground. ANNIE: These birds lift up off the ground, and don't come down again for 10 months of the year! LATIF: Ten months of the year? ANNIE: Yes. It flies—it flies for 10 straight months. SCOTT WEIDENSAUL: They only come to the ground for the shortest period of time that they possibly can manage. They have stretched the thread connecting them to the ground absolutely to the breaking point.
  
  The Common Swift flies for 10 straight months
  
  See Common Swift - eBird and The Common Swift Is the New Record Holder for Longest Uninterrupted Flight | Audubon.
  
  animal behavior
3. peter_murray 13 Jan 2025
  
  in Public
  
  Curiosity Killed the Adage, Radiolab Release date: Dec 20, 2024
  
  Episode Description
  
  The early bird gets the worm. What goes around, comes around. It’s always darkest just before dawn. We carry these little nuggets of wisdom—these adages—with us, deep in our psyche. But recently we started wondering: are they true? Like, objectively, scientifically, provably true?
  
  So we picked a few and set out to fact check them. We talked to psychologists, neuroscientists, runners, a real estate agent, skateboarders, an ornithologist, a sociologist and an astrophysicist, among others, and we learned that these seemingly simple, clear-cut statements about us and our world, contain whole universes of beautiful, vexing complexity and deeper, stranger bits of wisdom than we ever imagined.
Visit annotations in context

Tags

low earth orbit

animal behavior

Annotators

peter_murray

URL

radiolab.org/podcast/curiosity-killed-the-adage/transcript
www.nytimes.com www.nytimes.com

At the Intersection of A.I. and Spirituality

5
1. peter_murray 06 Jan 2025
  
  in Public
  
  On a recent afternoon at his synagogue, Rabbi Hayon recalled taking a picture of his bookshelf and asking his A.I. assistant which of the books he had not quoted in his recent sermons. Before A.I., he would have pulled down the titles themselves, taking the time to read through their indexes, carefully checking them against his own work.“I was a little sad to miss that part of the process that is so fruitful and so joyful and rich and enlightening, that gives fuel to the life of the Spirit,” Rabbi Hayon said. “Using A.I. does get you to an answer quicker, but you’ve certainly lost something along the way.”
  
  LLMs taking the joy out of the search for information
  
  LLMs for information retrieval
2. peter_murray 06 Jan 2025
  
  in Public
  
  For centuries, new technologies have changed the ways people worship, from the radio in the 1920s to television sets in the 1950s and the internet in the 1990s. Some proponents of A.I. in religious spaces have gone back even further, comparing A.I.’s potential — and fears of it — to the invention of the printing press in the 15th century.
  
  Religions use new technologies
  
  The first major book printed by Guttenburg on his printing press was, of course, the Bible. Having biblical texts widely available in vernacular languages was one of the causes of the Reformation.
  
  See also The Divided Dial: Episode 2 - From Pulpit to Politics | On the Media | WNYC Studios.
  
  LLMs for religion
3. peter_murray 06 Jan 2025
  
  in Public
  
  Critics of A.I. use by religious leaders have pointed to the issue of hallucinations — times when chatbots make stuff up. While harmless in certain situations, faith-based A.I. tools that fabricate religious scripture present a serious problem. In Rabbi Bot’s sermon, for instance, the A.I. invented a quote from the Jewish philosopher Maimonides that would have passed as authentic to the casual listener.
  
  LLM Confabulation of Religious Ideas
  
  LLM confabulation
4. peter_murray 06 Jan 2025
  
  in Public
  
  To assist his research, Rabbi Hayon regularly uses a custom chatbot trained on 20 years of his own writings. But he has never used A.I. to write portions of sermons.“Our job is not just to put pretty sentences together,” Rabbi Hayon said. “It’s to hopefully write something that’s lyrical and moving and articulate, but also responds to the uniquely human hungers and pains and losses that we’re aware of because we are in human communities with other people.” He added, “It can’t be automated.”
  
  Sermon-writing as an art form
  
  There is a creative process in religious writing. It takes the humanity of being human to have the insight to write something that is moving.
5. peter_murray 06 Jan 2025
  
  in Public
  
  the presence of A.I. in faith-based spaces, he said, poses a larger question: Can God speak through A.I.?
  
  "Can God speak through A.I.?"
  
  To the extent that "God speaking" is a mystical form of communication, can one find meaning in the non-deterministic output of an algorithm?
Visit annotations in context

Tags

LLMs for religion

LLM confabulation

LLMs for information retrieval

Annotators

peter_murray

URL

nytimes.com/2025/01/03/technology/ai-religious-leaders.html
Dec 2024
www.wnycstudios.org www.wnycstudios.org

Why Men And Boys Are Struggling | On the Media | WNYC Studios

2
1. peter_murray 02 Dec 2024
  
  in Public
  
  The Bureau for Labor Statistics actually has a measure of jobs that require physical strength. The number of jobs that require any kind of serious physical strength has now dropped to below 10%. It's not that there are none, but it used to be closer to 30%.
  
  Number of jobs requiring physical strength decreasing
  
  employment
2. peter_murray 02 Dec 2024
  
  in Public
  
  The way the education system is currently structured does build in something of an advantage for girls and women because the prefrontal cortex of girls develops earlier than boys, largely because that's triggered by puberty, which occurs earlier in girls than boys. The prefrontal cortex is an interesting part of the brain because it's the bit that helps you turn in your chemistry homework on time. It's the bit that is sometimes called the CEO of the brain. It's about non-cognitive skills, organizational skills, et cetera. To turn your chemistry homework in on time, you have to take your chemistry homework home. You have to remember to take it back in. You have to remember that you have a chemistry class to go to. It requires a whole bunch of skills that are not really about smarts. It's not true that girls are smarter than boys, or the other way around. There's no evidence for a gap in terms of that. Interestingly, SAT and ACT, these standardized tests, there's really no gender gap there, but in GPA there's a huge gap because GPA rewards turning in your homework on time.
  
  No gender gap in standardized tests, but gap in GPA
  
  Education rewards high executive functioning skills, and those develop earlier in women.
  
  education
Visit annotations in context

Tags

education

employment

Annotators

peter_murray

URL

wnycstudios.org/podcasts/otm/articles/why-men-and-boys-are-struggling
media.dltj.org media.dltj.org

Video: The Mind Bomb by Randahl Fink, annotated

13
1. peter_murray 02 Dec 2024
  
  in Public
  
  Finally, I will say, "Fear not, because we have options". here are four counter measures that we can all take today.
  
  Proposed countermeasures
  
  Report bots
  
  Avoid anonymity online...be a real person
  
  Build a strong social network presence
  
  Choose resilient social media
2. peter_murray 02 Dec 2024
  
  in Public
  
  In fourth generation warfare, seen here on the right, the direct control of the bots is replaced with AI. Using AI we can define artificial personalities with artificial social media behaviour, that will make it increasingly difficult to detect the bots.
  
  "Artificial Intelligent Personalities", forth generation warfare
  
  LLM generated inauthentic content
3. peter_murray 02 Dec 2024
  
  in Public
  
  third generation mass information warfare, which I define as using remote agents equipped with advanced automated systems for controlling thousands of social media accounts, to both spread disinformation but also to carry out behavioral attacks on social media platforms.
  
  Bot farms—automation, third generation mass information warfare
4. peter_murray 02 Dec 2024
  
  in Public
  
  This is what enables second generation mass information warfare, which I define as using remote agents (with one or more social media accounts) to spread disinformation online.
  
  Social media disinformation, second generation mass information warfare
  
  I think he is talking about Section 230 of the U.S. [[Communications Decency Act]], which absolves platform providers of liability for users' when the platform meet a low bar of requirements.
5. peter_murray 02 Dec 2024
  
  in Public
  
  Mass media has become so popular, we pay for the receivers… to receive the propaganda which is meant to deceive us. A good example of that is Der Volksempfänger, which René showed a little earlier. That was a cheap, simple radio receiver, and it was the brainchild of Joseph Goebbels — the Reich Minister of Propaganda in Nazi Germany. It was presented in 1933 and it was designed to meet the price point of one week salary, allowing them to sell 12.5 million units before the war, so propaganda could flow freely to the ears of the masses.
  
  Nazi Germany "People's Receiver"
  
  "Volksempfänger." Wikipedia, Wikimedia Foundation, 28 Nov. 2024, en.wikipedia.org/wiki/Volksempf%C3%A4nger. Accessed 2 Dec. 2024.
  
  Symbolic of what he calls, "First Generation Mass Information Warfare".
  
  broadcast media
6. peter_murray 02 Dec 2024
  
  in Public
  
  If you want to convince people that something fake is real, use a credibility enhancer, and that's what the agents did. They took the visual identity of a French news network called Euronews, which is based in Lyon, and they put their logo and all their graphics and everything, their fonts — they used all of that from Euronews to deceive people into thinking this was indeed real.
  
  Credibility Enhancer, defined
  
  Appropriating the style of a credible source—say, a news organization—to promote the disinformation.
7. peter_murray 02 Dec 2024
  
  in Public
  
  Social media is the very reason truth is threatened — why our ability to see the world clearly is threatened. Any military expert will tell you that on a real battlefield NATO would be unbeatable to an increasingly weakened Russia. But in this unmoderated vulnerable social media space, our truth is an easy target. And that brings us to how we are under attack.
  
  Social media as a weapon against truth
  
  social media
8. peter_murray 02 Dec 2024
  
  in Public
  
  Every narrative in Putin's regime is a product of propaganda. From the from the school books which show a very different, distorted view on World War II, to every daily news episode on Russian State TV, the purpose of every piece of information is to forge a distorted view on reality in which Putin is the only solution. This is why the truth covered by Western media media and shared every day through social media is a threat to Vladimir Putin. And that is why he has launched an international war on truth.
  
  The "Mind Bomb" defined
  
  The firehose-of-bullshit problem...it can't all be combatted. Later:
  
  But remember Olga Skabeyeva's words: There is no truth only interpretation. Her job and the job of every one of her colleagues is to replace truth with whatever narrative Putin desires.
  
  political disinformation
9. peter_murray 02 Dec 2024
  
  in Public
  
  In Russia, Putin's regime has created a system called SORM, which is installed at each and every internet service provider; and it allows them to surveil what information each and every citizen spreads through the internet.
  
  Russia's "System for Operative Investigative Activities"
  
  "SORM." Wikipedia, Wikimedia Foundation, 1 Dec. 2024, en.wikipedia.org/wiki/SORM. Accessed 2 Dec. 2024.
  
  internet surveillance
10. peter_murray 02 Dec 2024
  
  in Public
  
  Information is anything we know and can communicate. So now you know what information is, because I just communicated it. Misinformation is the subset of information which is false. So if by accident I tell you something which is wrong, then it's misinformation. But if tell you something wrong because I want to deceive you, trick you into doing something for me, then that's disinformation — that's the deliberate part.
  
  Information versus Misinformation versus Disinformation
  
  See also: Misinformation, Disinformation, and Malinformation from Leaked Documents Outline DHS’s Plans to Police Disinformation (The Intercept), 31-Oct-2022
  
  disinformation
11. peter_murray 02 Dec 2024
  
  in Public
  
  But in reality, these social media platforms are now a key instrument in our enemy's attack on democracy. What you see as constructive platforms for interaction, others see as a digital weapon of mass destruction — a tool for polluting the minds of millions. So in this presentation I will tell you why we are under attack, how we are under attack, and what we can do about it.
  
  Using Constructionists' view of social media platforms as weapons
12. peter_murray 02 Dec 2024
  
  in Public
  
  Deep Sec, Vienna, 21-Nov-2024 — https://deepsec.net/speaker.html#PSLOT734
  
  Abstract
  
  We are in the middle of the most dangerous information war in the history of mankind, and the survival of democracy depends on what we do next. In this presentation from DeepSec 2024 in Vienna, Austria, Randahl Fink reveals how Russia wages an international war on truth, and shows how we can all be part of the resistance.
  
  societal transformation
13. peter_murray 02 Dec 2024
  
  in Public
  
  Seymour Papert has called constructionist learning
  
  Constructionist Learning
  
  "Constructionism (Learning Theory)." Wikipedia, Wikimedia Foundation, 5 Feb. 2024, en.wikipedia.org/wiki/Constructionism_(learning_theory). Accessed 2 Dec. 2024.
Visit annotations in context

Tags

societal transformation

internet surveillance

broadcast media

disinformation

political disinformation

social media

LLM generated inauthentic content

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20241201T170857-wnLviKf13lc-mind-bomb/index.html
media.dltj.org media.dltj.org

Video: DEF CON 32 - Why are you still using my server for your internet access - Thomas Boejstrup Johansen by DEFCON Conference, annotated

2
1. peter_murray 01 Dec 2024
  
  in Public
  
  So back in 1996, Netscape decided that there must be an easier function when you're adding a machine to a network to automatically config the machine. So they come up with this WPAD function to automatically ask for the machine where the configuration file is stored.
  
  Origin of the Web Proxy Auto-Discovery protocol
  
  Bringing back memories...a long time ago I would use WPAD files to configure library pubic workstations.
  
  network standards
2. peter_murray 01 Dec 2024
  
  in Public
  
  From DEF CON 32, August 8-11, 2024
  
  https://defcon.org/html/defcon-32/dc-32-speakers.html#54469
  
  Abstract
  
  Pawning countries at top level domain by just buying one specific domain name ‘wpad.tld’, come hear about this more the 25+ years old issue and the research from running eight different wpad.tld domains for more than one year that turn into more the 1+ billion DNS request and more then 600+GB of Apache log data with leaked information from the clients.
  
  This is the story about how easy it is to just buying one domain and then many hundreds of thousands of Internet clients will get auto pwned without knowing it and start sending traffic to this man-in-the-middle setup there is bypassing encryption and can change content with the ability to get the clients to download harmful content and execute it.
  
  The talk will explain the technical behind this issue and showcase why and how clients will be trick into this Man-in-the-middle trap.
  
  digital security
Visit annotations in context

Tags

network standards

digital security

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20241201T145817-uwsykPWa5Lc-def-con-32-why-still-using-my-server-internet-access-thomas-boejstrup-johansen/index.html
Sep 2024
storage.courtlistener.com storage.courtlistener.com

gov.uscourts.ca2.60988.306.1.pdf

9
1. peter_murray 05 Sep 2024
  
  in Public
  
  In sum, IA has not met its “burden of proving that the secondary use doesnot compete in the relevant market[s].” Warhol I, 11 F.4th at 49. Its empiricalevidence does not disprove market harm, and Publishers convincingly claim bothpresent and future market harm. Any short-term public benefits of IA’s FreeDigital Library are outweighed not only by harm to Publishers and authors butalso by the long-term detriments society may suffer if IA’s infringing use wereallowed to continue. For these reasons, the fourth fair use factor favors Publisher
  
  Factor 4: "does not disprove market harm, and Publishers convincingly claim both present and future market harm"
2. peter_murray 05 Sep 2024
  
  in Public
  
  In this case, the relevant market for purposes of analyzing the fourth fair usefactor is the market for the Works in general, without regard to format. TheCopyright Act protects authors’ works in whatever format they are produced. Seegenerally 17 U.S.C. § 106 (granting copyright owners the exclusive right toreproduce their works in numerous formats). Even if we consider the printeditions to be the “original” forms of the Works in this case―and we recognizethat the concept of a singular “original” form may be difficult to grasp in the digitalage―the Act recognizes that presentation of the same content in a differentmedium is a derivative work subject to the same legal protections as the purportedoriginal. 17 U.S.C. § 106(2). Here, Publishers obtained from authors the exclusiveright to publish their Works in numerous formats, including print and eBooks, andit is this exclusive right that IA is alleged to have violated via its Free DigitalLibrary. For that reason, the relevant harm―or lack thereof―is to Publishers’markets for the Works in any format.
  
  "Copyright Act protects authors' works in whatever format they are produced"
3. peter_murray 05 Sep 2024
  
  in Public
  
  Here, IA makes unauthorized digital copies of the Works and reveals thosecopies to the public in their entirety. The “amount and substantiality” of thecopying is not necessary to achieve a transformative secondary purpose, butserves to substitute Publishers’ books. 17 U.S.C. § 107(3). For that reason, the thirdfair use factor favors Publishers.
  
  Factor 3: "...reveals those copies to the public in their entirety"
4. peter_murray 05 Sep 2024
  
  in Public
  
  Here, while the nonfiction Works undoubtedly convey factual informationand ideas, they also represent the authors’ original expressions of those facts andideas—and those “subjective descriptions and portraits” reflect “the author’sindividualized expression.” Harper & Rowe, 471 U.S. at 563. Thus, because theWorks in Suit are “of the type that the copyright laws value and seek to protect,”the second fair use factor favors Publishers.
  
  Factor 2: Works...are of the type that copyright laws value and seek to protect
5. peter_murray 05 Sep 2024
  
  in Public
  
  We conclude, contrary to the district court, that IA’s use of the Works is notcommercial in nature. It is undisputed that IA is a nonprofit entity and that itdistributes its digital books for free. Of course, IA must solicit some funds to keepthe lights on: its website includes a link to “Donate” to IA, and it has previouslyreceived grant funding to support its various activities. App’x 6091. But unlikethe defendant in TVEyes, who charged users a fee to use its text-searchabledatabase, IA does not profit directly from its Free Digital Library. 883 F.3d at 175.It offers this service free of charge.
  
  Factor 1: "IA's use of the Works is not commercial in nature"
6. peter_murray 05 Sep 2024
  
  in Public
  
  But this characterization confuses IA’s practiceswith traditional library lending of print books. IA does not perform the traditionalfunctions of a library; it prepares derivatives of Publishers’ Works and deliversthose derivatives to its users in full. That Section 108 allows libraries to make asmall number of copies for preservation and replacement purposes does not meanthat IA can prepare and distribute derivative works en masse and assert that it issimply performing the traditional functions of a library.
  
  "IA does not perform the traditional functions of a library"
7. peter_murray 05 Sep 2024
  
  in Public
  
  We conclude that IA’s use of the Works is not transformative. IA createsdigital copies of the Works and distributes those copies to its users in full, for free.Its digital copies do not provide criticism, commentary, or information about theoriginals. Nor do they “add[] something new, with a further purpose or differentcharacter, altering the [originals] with new expression, meaning or message.”Campbell, 510 U.S. at 579. Instead, IA’s digital books serve the same exact purposeas the originals: making authors’ works available to read. IA’s Free Digital Libraryis meant to―and does―substitute for the original Works.
  
  Factor 1: "IA's use of the Works is not transformative"
8. peter_murray 05 Sep 2024
  
  in Public
  
  IA hosts over 3.2 million digital copies of copyrighted books on its website.Its 5.9 million users effectuate about 70,000 book “borrows” a day―approximately25 million per year. Critically, IA and its users lack permission from copyrightholders to engage in any of these activities. They do not license these materialsfrom publishers, nor do they otherwise compensate authors in connection with thedigitization and distribution of their works.
  
  Lack of license permissions
  
  Nor, of course, do libraries have to seek permission from publishers for any of these activities for printed books...but let's read on.
9. peter_murray 05 Sep 2024
  
  in Public
  
  Hachette Book Group, Inc. v. Internet Archive (23-1260)
  
  Court of Appeals for the Second Circuit
  
  via Recap archive: https://www.courtlistener.com/docket/67801014/hachette-book-group-inc-v-internet-archive/?order_by=desc
  
  This appeal presents the following question: Is it “fair use” for a nonprofit organization to scan copyright-protected print books in their entirety, and distribute those digital copies online, in full, for free, subject to a one-to-one owned-to-loaned ratio between its print copies and the digital copies it makes available at any given time, all without authorization from the copyright-holding publishers or authors? Applying the relevant provisions of the Copyright Act as well as binding Supreme Court and Second Circuit precedent, we conclude the answer is no. We therefore AFFIRM.
  
  Hachette v. Internet Archive
Visit annotations in context

Tags

Hachette v. Internet Archive

Annotators

peter_murray

URL

storage.courtlistener.com/recap/gov.uscourts.ca2.60988/gov.uscourts.ca2.60988.306.1.pdf
Aug 2024
www.msnbc.com www.msnbc.com

Why Is This Happening? The end of libraries as we know them? with Brewster Kahle and Kyle Courtney

1
1. peter_murray 08 Aug 2024
  
  in Public
  
  Chris Hayes speaks with Internet Archive founder Brewster Kahle and Library Futures co-founder Kyle Courtney about why megapublishers are suing to redefine e-books. as legally different from paper books.
  
  Could the future of libraries as we’ve known them be completely different? Our guests this week say so. Megapublishers are suing the Internet Archive, perhaps best known for its Wayback Machine, to redefine e-books as legally different from paper books. A difference in how they are classified would mean sweeping changes for the way libraries operate. Brewster Kahle is a digital librarian at the Internet Archive. Kyle Courtney is a lawyer, librarian, director of copyright and information policy for Harvard Library. He’s the co-founder of Library Futures, which aims to empower the digital future for America’s libraries. They join to discuss what’s animating the lawsuit, information as a public good and the consequences should the publishers ultimately prevail.
  
  July 9, 2024
  
  ebooks
Visit annotations in context

Tags

ebooks

Annotators

peter_murray

URL

msnbc.com/msnbc-podcast/why-is-this-happening/end-libraries-know-brewster-kahle-kyle-courtney-podcast-transcript-rcna160939
Jul 2024
bjs.ojp.gov bjs.ojp.gov

Federal Justice Statistics, 2020

2
1. peter_murray 07 Jul 2024
  
  in Public
  
  Federal Justice Statistics, 2020 — May 2022, revised July 2023 U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics
2. peter_murray 07 Jul 2024
  
  in Public
  
  Total casesadjudicatedConvicted Not convictedTotal Guilty plea Bench/jury trial Total Bench/jury trial DismissedAll offenses 71,126 92.6% 90.9% 1.7% 7.4% 0.3% 7.1%
  
  In fiscal year 2020, there were 71k cases in U.S. district court. Of those, 92.6% were convicted: 90.9% by guilty plea and 1.7% by bench/jury trial. 7.4% were not convicted, 0.3% by bench/jury trial and 7.1% dismissed.
  
  criminal justice
Visit annotations in context

Tags

criminal justice

Annotators

peter_murray

URL

bjs.ojp.gov/content/pub/pdf/fjs20.pdf
radiolab.org radiolab.org

The Alford Plea - Transcript

3
1. peter_murray 07 Jul 2024
  
  in Public
  
  Right. So let's start here: an Alford plea is fundamentally a form of coercion, because it's basically telling a person, "Admit to this crime or else we'll kill you."
  
  Alford Plea as a form of judicial coercion
2. peter_murray 07 Jul 2024
  
  in Public
  
  Johanna Hellgren, who has for years has been researching this Alford plea. JOHANNA HELLGREN: Yeah. PETER: So could we just start—like, where did the Alford plea—where did it even come from? JOHANNA HELLGREN: Yeah, the way it came to be was this guy, Henry Alford, was accused of first-degree murder. PETER: This is, like, the early 1960s. So first degree murder meant he was facing the death penalty. JOHANNA HELLGREN: But he took a plea for, I believe, second-degree murder. PETER: Which meant instead he'd get life in prison. JOHANNA HELLGREN: Yeah. PETER: But when he gets up to enter his plea in front of the judge, Alford says—and I'll quote from the transcript right here, hold on one second, he says, "I just pleaded guilty because they said if I didn't they'd gas me." JOHANNA HELLGREN: "I'm just pleading because I don't want to get the death penalty, but I didn't do it." PETER: And then later he said, "I'm not guilty, but I plead guilty."
  
  Origin of the "Alford Plea"
3. peter_murray 07 Jul 2024
  
  in Public
  
  "The Alford Plea", Radiolab, Jun 28, 2024
  
  In 1995, a tragic fire in Pittsburgh set off a decades-long investigation that sent Greg Brown Jr. to prison. But, after a series of remarkable twists, Brown found himself contemplating a path to freedom that involved a paradoxical plea deal—one that peels back the curtain on the criminal justice system and reveals it doesn’t work the way we think it does.
  
  criminal justice
Visit annotations in context

Tags

criminal justice

Annotators

peter_murray

URL

radiolab.org/podcast/the-alford-plea/transcript
pluralistic.net pluralistic.net

Pluralistic: Kitchensink callithump linkdump (29 Jun 2024) – Pluralistic: Daily links from Cory Doctorow

1
1. peter_murray 06 Jul 2024
  
  in Public
  
  The realization that you have to live in a society with people who are harmed by injustice, even if you personally escape that justice? It's the whole basis for solidarity.
  
  Defining solidarity
Visit annotations in context

Annotators

peter_murray

URL

pluralistic.net/2024/06/29/pasticcio/
media.dltj.org media.dltj.org

"Hachette Book Group, Inc. v. Internet Archive Appeal Oral Argument" uncorrected transcript (United States Court of Appeals for the Second Circuit via Internet Archive)

31
1. peter_murray 06 Jul 2024
  
  in Public
  
  But actually a physical library cannot make a physical copy of a book and circulate it in place of the original copy. It can't do that, right? So if it can't do that, then you're not doing just what a library is doing. So your argument actually depends on the physical, the digital copy being transformative and the physical copy not being transformative. And that's the thing that makes the difference, not the one-to-one ratio and the analogy to physical books.
  
  Is circulating a photocopy of a book to preserve the original like CDL?
2. peter_murray 06 Jul 2024
  
  in Public
  
  The struggle I'm having with your response to these questions is on the one hand you want to say, look, this is transformative because it's efficient and we can get people to read more books faster. They don't have to go to libraries. The efficiency is the value of this to the public. But at the same time, you're saying, but that efficiency has absolutely no impact on whether the publishers can sell the e-books or the hard copies. And it sounds wonderful when you're saying it, but when I step back and listen, I'm having trouble reconciling those two.
  
  Digitized lending efficiency has two sides
3. peter_murray 06 Jul 2024
  
  in Public
  
  that efficiency may or may not have an effect on either the number of copies that get sold or on the market for the overdrive service, which has a variety of different sort of different aspects and benefits over and above CDL. I mean, CDL is largely sort of image-scanned images of pages of paper books because it's the paper book. The overdrive service has a lot of many. You can flow the text. You can do different features and that is one reason why that is one explanation for the data that you see that there is no reduction in demand for overdrive.
  
  Digitized versus Digital books, explicitly
4. peter_murray 06 Jul 2024
  
  in Public
  
  you're reducing the market from the number of people who might want to read... Let's look at even the paper books. They'll pretend like take out the digital market for a second. The number of people who might want to read it ever, down to the number of people who might want to read it simultaneously. And if you put digital books into the mix, it's the same idea, right?
  
  The market becomes only as large as the number of people who simultaneously want to read a work
5. peter_murray 06 Jul 2024
  
  in Public
  
  That IA's brief and amiki try to create the impression that the public interest is on their side. And it is not. The protection of copyright is in the US Constitution and federal law because it creates an incentive for writers and artists to create new works to benefit our broader society. Internet archives, control digital lending is in direct conflict with that basic principle. And as I previously... You don't really think people are going to stop writing books because of the control digital lending to you? Well, I think publishers are going to go down the tubes if they do not have the revenues. I'm not going to publish your books. You think that that's really... I do, Your Honor. There's no question. I mean, and the standard here is not, will this eliminate... No, I understand. ...the... It's just a part. But this question about balancing the incentive to create a work with the larger distribution of it, that is the question to be decided in this case.
  
  "Publishers are going to go down the tubes is they don't have the revenues"
  
  Authors: the publishers are not necessarily your friends here...
6. peter_murray 06 Jul 2024
  
  in Public
  
  In the same way, control digital lending is a contrived construct that was put together at the behest of Internet Archive back in 2018 when they confronted the fact that libraries didn't want to deal with them. Libraries didn't want to give copies of their works to be digitized because they were concerned about copyright arguments. So they got in a room together with various people and contrived this principle of control digital lending to rationalize what they were doing.
  
  CDL was conceived by IA in 2018 because libraries didn't want to give IA digital copies?
  
  WHAT?!?
7. peter_murray 06 Jul 2024
  
  in Public
  
  that very point has been made by this court that you transform a book when you create or work when you create it into a new format. But that is not the type of transformativeness that the first factor looks at. You're converting it to a derivative form, not a transformative form.
  
  Digitized books as a derivative form, not a transformative form
8. peter_murray 06 Jul 2024
  
  in Public
  
  Really in question that if the comparator on the transformative issue were between the digital book and the ebook, that it really wouldn't be transformative because it is just a different version of a digital book. But he's saying, well, we have a right to use our physical copy and this is a transformative use of the physical copy. Is that the right way to think about it? Are you suggesting that actually the book is something more than the physical copy? And so when we think about the CDL version of the book, we should compare it both to the physical copy and to the ebook because those are both versions of the book that the publisher produces. They are not distributing the physical copy you're on. That's the whole point to control digital lending. That is not the right way to think about it. They are taking the physical copy and transforming it into a nude and different format with different capabilities that has a different market.
  
  Comparison between the digital (digitized) book and the ebook
  
  The judge here is attempting to hone in on the question of digitized versus digital book. The libraries are lending the digitized book. Publishers have a market for the digital book. They are similar, but not the same.
9. peter_murray 06 Jul 2024
  
  in Public
  
  You are still distributing the physical copy of the book. And as Your Honor recognized, there's a lot of friction involved with distribution of physical copies that is significantly different than what is capable with digital copies. That's why they are two independent markets with very distinct capabilities and the law and the digital economy button with books as everything else turns on the fact very key principles that the copyright owner owns the right to distribute their works in different formats and to distribute them under the terms for which they deem to be appropriate.
  
  Physical books and ebooks are two different markets
  
  The difference in friction between lending a physical book and an ebook means that these are two separate markets with "very distinct capabilities." CDL usurps publishers' rights in the ebook market.
  
  I could be more sympathetic to this argument if the "ebook" we were talking about was a digital book, not a digitized book.
10. peter_murray 06 Jul 2024
  
  in Public
  
  there is no market in control digital lending. But of course, there's no market in control digital lending. Control digital lending is predicated on infringement and the nonpayment of any fees.
  
  CDL is predicated on infringement and non-payment of fees
11. peter_murray 06 Jul 2024
  
  in Public
  
  this is so utterly transformative and so utterly, it's substantive. I'm sorry, it's utterly derivative is your position. Exactly, it's a derivative work. And there's, and it's doing nothing than repackaging and repurposing the derivative work.
  
  CDL is a transformative, derivative work
  
  Implication being, I think, that new copyright rights come to the library because of this transformation/derivation.
12. peter_murray 06 Jul 2024
  
  in Public
  
  let me focus for one moment on just a little bit of the evidence here on the commerciality of Internet Archive. Unlike most libraries, Internet Archive is owned by an individual largely who has funded this, and he, that Brewster Kale, as you well know. And every, virtually every page of the Internet Archive has a button that says you can buy this on better world books, which is giving an incredible amount of PR and certain revenue that drives between them.
  
  Commerciality of Internet Archive
  
  And the intersection with Better World Books, which makes this a distinct between this and public libraries.
13. peter_murray 06 Jul 2024
  
  in Public
  
  What about the evidence that certainly the district court cited to it, admissions or undisputed evidence in the 56th statement about pitching CDL or pitching, joining the open library project as a way to save money. And are you relying on that at all as a basis to show? Absolutely, Your Honor. It's very rare in a record that you have actual admitted evidence that shows that a party is intending to supplant your market. And that's what we have in this record. The Internet Archive on this appeal tries to dismiss this as rhetorical flourishes. These pitches were made to hundreds of libraries and hundreds of slide decks provided to libraries with this exact same pitch. You don't have to buy it again.
  
  IA's marketing of the Open Libraries program demonstrates their intent to subvert the digital licensing market
14. peter_murray 06 Jul 2024
  
  in Public
  
  ASTM versus public resource decision
  
  American Society for Testing and Materials et al. v. Public.Resource.Org - Wikipedia on copyright of standards used in legal codes.
15. peter_murray 06 Jul 2024
  
  in Public
  
  then your argument actually is that, yeah, okay, there's a statute that talks about fair use, but there's a more specific statute that says when libraries can digitize books and that should control the fair use statute. Correct, your honor. That should control. And that's what was decided by this Court. They said that if you want to change the law, your job is to go to Congress. We're not in the position to change the statute towards you. In terms of what the statute says, the statute says, okay, you could do whatever you want with a physical book, but you can only create a digital copy for archival purposes or other limited purposes, but you're not allowed to distribute it. Correct, your honor. It does not envision in any way the practices of Internet Archive, which is digitizing literally millions of copies of books and making them available around the world to users.
  
  Appellee: fair use doesn't apply, the more specific statue applies
  
  Where Congress has said that digital copies can occur is the only place digital copies can occur.
16. peter_murray 06 Jul 2024
  
  in Public
  
  So what was not recognized in the argument you just heard is that the copyright office and Congress over the last decade has repeatedly been approached to say, you need to think about the digital economy. You need to think about digital works. You need to think about the first sale doctrine and whether that should apply in the digital world. They have consistently rejected the changes to the law, both by the copyright office as well as Congress.
  
  Congress and the Copyright Office have rejected digital economy changes
17. peter_murray 06 Jul 2024
  
  in Public
  
  So if Congress had not codified the first sale doctrine, it didn't have Section 109 that authorizes libraries. It only had to rely on the fair use doctrine. Would it be obvious that you could do whatever you want with a physical book? Like would that, would libraries fall under fair use if we didn't have the first sale doctrine in the statute? Yes, I think it would, Your Honor. I mean, I think the Supreme Court, the physical books, the Supreme Court, is recognized that you have an unlimited right to distribute physical books. It's not simply because it was codified in Section 109.
  
  Physical lending in libraries could rely on fair use if Section 109 didn't authorize libraries
18. peter_murray 06 Jul 2024
  
  in Public
  
  I want to start by reframing and step back to really focus on the practical realities of what Internet Archive is doing and what is before this court. Internet Archive is asking this court to disregard the controlling law of this court as well as the Supreme Court. And what it is seeking is a radical change in the law that if accepted would disable the digital economy. Not just for books, but for movies, for music, for TV and the like.
  
  Appellee' opening position: CDL is a destabilizing act on the digital economy
19. peter_murray 06 Jul 2024
  
  in Public
  
  You have not addressed the National Emergency Library. That's been sort of silent today. So given your statement now, you would agree that the National Emergency Library was a violation of copyright, because it wasn't one-to-one, correct? I would not agree. I mean, you were allowing multiple users to use the same digital copy of a hard book. The National Emergency Library does present different facts and different justifications you're under.
  
  National Emergency Library
  
  Judge, paraphrased: are there other circumstances where libraries would come to the court saying it was legal to break the physical sequestration of loaned items?
20. peter_murray 06 Jul 2024
  
  in Public
  
  With constraints that impact the value of the library's ability to do that that are very much tied to the physical instantiation of the book, right? That's right. You can't rely on that one book to serve the serial needs of people globally because the costs of sending the book would exceed the costs of just getting another book on the other side of the world. I don't think I agree with that, Your Honor, and I don't think the record supports it.
  
  Question on the costs of shipping the physical book for lending versus purchasing a new copy
  
  The cost of a library buying a copy has additional costs, such as putting it into the inventory and shelving the physical copy. Question: how does that compare to the costs between libraries of ILLing a book around?
21. peter_murray 06 Jul 2024
  
  in Public
  
  Like the Wikipedia links where we are, where people are able to... Yeah, but the way that works is like snippet view, right? You can click on it and go to the particular part of the book. But if you want the whole book, you have to do it through CDL. Again, this is not... That's not really part of CDL, the Wikipedia links, right?
  
  Wikipedia reference links are like Google Books snippet view
  
  This seems like a distraction from the core question...this really isn't a part of CDL.
22. peter_murray 06 Jul 2024
  
  in Public
  
  This is exactly what was going- this is exactly what the plaintiff said and what was going on in Sony as well. They said, well, you don't need to tape these movies off the air. We'll rent you a tape. We'll sell you a tape. You can get those benefits this other way just by paying us. You shouldn't be able to use technology yourself with the access you already have to get those benefits. That is the same thing that's going on here.
  
  Using technology to get the benefits that you already have
  
  Someone was already entitled to receive the content, a la the Sony case.
23. peter_murray 06 Jul 2024
  
  in Public
  
  if they already have a physical copy and they want a circulated digital copy now, in the absence of your program, they would have to license an ebook. But once the program is available, they don't need to and they can just digitize or rely on you to digitize the physical book they have, right? This offers them another way of using the access they've already got, the right they already have to lend it to one patron at a time. In exactly the same way that the VCR and Sony allowed the person to access the material later instead of right now
  
  CDL is an application of own-to-loan for physical items
24. peter_murray 06 Jul 2024
  
  in Public
  
  You are right that the license terms that the publishers offer to libraries do not allow them to have electronic materials in their prominent collection, which is these libraries have print materials in their prominent collection. And if they want to use CDL as an alternative to rental, right, the overdrive scenario, they need to buy those books.
  
  Publishers don't have license terms that allow for electronic materials in a library's permanent collection
25. peter_murray 06 Jul 2024
  
  in Public
  
  When they buy those books, they buy the physical copies to lend to their patrons one at a time or through an interlibrary change. They also buy e-books to make those available to their patrons. We're focused here on e-books and impacting e-licensing. I have a hard time reconciling those two, specifically as to e-licensing. Why would libraries ever pay for an e-license if they could have internet archives, scan all the books, hard copies they buy and make them available on an unlimited basis?
  
  Why would libraries buy ebook licenses when they could get the same from CDL?
26. peter_murray 06 Jul 2024
  
  in Public
  
  under factor 4 you say that actually there's one reason there's still be a market for e-books is because e-books are more attractive than digitized versions of physical books. Right? Because they have features and they're more user friendly or whatever. So what that kind of means is what you're saying is that your digital copies are more convenient or more attractive, I guess more convenient than physical books, but less convenient than e-books.
  
  Digitized physical books are different from publisher supplied ebooks
  
  Publishers have an inherently superior product with "born digital" ebooks than what libraries can produce with scanned physical books: reflowing pages, vector illustrations, enhanced table-of-contents and indexes, etc.
  
  trade publishing
27. peter_murray 06 Jul 2024
  
  in Public
  
  So that statute, Section 109, talks about you can lend out the physical copy, but then it also specifically delineates when you can make a copy of it or a digital copy and it limits when you can distribute that. So why wouldn't it conflict with what Congress has specified to say, well, this is really just the same as the physical copy.
  
  Section 109 "First Sale Doctrine"
28. peter_murray 05 Jul 2024
  
  in Public
  
  That's right, and that is why we have been doing this without molestation by the publishers since 2011. But it's your position that you could lend it out during the first five years and that would still be fair use? That would be a different case, Your Honor. And we've very... I think the answer to that is... My only reason is if you're just doing it in your discretion. So the answer to that is we think it would... We think that would be fair use, Your Honor, because we don't think that would have a market effect either. There might... If they could show there was, or if the facts were different, that's why fair use is case by case, and if there were a case presenting those different facts, that might be different.
  
  IA believes it is fair use immediately but has the 5-year embargo to assuage publisher concerns
29. peter_murray 05 Jul 2024
  
  in Public
  
  in the real world, there's a lot more friction in the sort of market for passing a paper book from one person to another. And I'm imagining that that's priced into the price of the paper book. Your premise is that a scanned digital version of that paper book is nothing more. It is tantamount to the same thing as the book. But we know there's a distinct market for those digital books. They're priced separately. So you're taking something from one market and you're inserting it into another market without ever having paid the premium in that new market.
  
  "Friction" of lending physical items
  
  Reducing friction is seen as a benefit of a transformative use?
30. peter_murray 05 Jul 2024
  
  in Public
  
  Hachette Book Group, Inc. v. Internet Archive Appeal Oral Argument Second Circuit (88 min audio)
  
  by United States Court of Appeals for the Second Circuit
  
  copyright Hachette v. Internet Archive
31. peter_murray 05 Jul 2024
  
  in Public
  
  If the forms are considered distinct things in their separate markets for them, why shouldn't the law recognize that converting the paper book into a digital book isn't just the same thing as passing around the paper book?
  
  Question early in the oral argument focuses on the first fair use factor
Visit annotations in context

Tags

copyright

trade publishing

Hachette v. Internet Archive

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20240703T204937-united-states-court-appeals-second-circuit-via-internet-archive--hachette-book-group-inc-v-internet-archive-appeal-oral-argument/index.html
May 2024
media.dltj.org media.dltj.org

Video: Handling Academic Copyright and Artificial Intelligence Research Questions as the Law Develops by CNI Spring Meeting 2024, annotated

10
1. peter_murray 28 May 2024
  
  in Public
  
  Google translate is generative AI
  
  Google Translate as generative AI
  
  natural language translation
2. peter_murray 28 May 2024
  
  in Public
  
  the various contracts or license agreements that publishers require libraries to sign to access research content for our users. And this is generally described as the problem of contractual override.
  
  Contractual override of fair use
  
  ...includes proposed contract language starting in the 46th minute.
3. peter_murray 28 May 2024
  
  in Public
  
  why training artificial intelligence in research context is and should continue to be a fair use
  
  Examination of AI training relative to the four factors of fair use
  
  LLM copyright building LLMs
4. peter_murray 28 May 2024
  
  in Public
  
  is the output copyrightable
  
  Is the output copyrightable?
5. peter_murray 28 May 2024
  
  in Public
  
  let's go to the second question. Does the output
  
  Does the output infringe?
  
  Is it substantially similar in protected expression? What is the liability of the service provider?
6. peter_murray 28 May 2024
  
  in Public
  
  And then Singapore and Japan also have provisions and they basically allow the reproduction necessary for text and data mining.
  
  Singapore and Japan have laws that allow for reproduction necessary for text and data mining
  
  copyright
7. peter_murray 28 May 2024
  
  in Public
  
  So, let's look at the first question.
  
  Does ingestion for training AI constitute infringement?
  
  Later:
  
  supposedly no expression, none of the original expression from the works that were ingested ends up in the model. That the model should just be this, you know, kind of mass of relationships and patterns and algorithms and all that kind of stuff, but no expression itself.
  
  In the U.S., this would be seen as a matter of fair use: search engines, plagiarism software, Google Books. The underlying theory is that we'll ignore the copies made by the computer...the expression coming out isn't the same.
8. peter_murray 28 May 2024
  
  in Public
  
  three different issues that are being implicated by artificial intelligence. And this is true with, you know, all artificial intelligence, not just a generative but particularly generative.
  
  Three issues implicated by Generative AI
  
  Does ingestion for training AI constitute infringement?
  
  Does the output infringe?
  
  Is the output copyrightable?
  
  The answer is different in different jurisdictions.
  
  LLM copyright
9. peter_murray 28 May 2024
  
  in Public
  
  And one way we've seen artificial intelligence used in research practices is in extracting information from copyrighted works. So researchers are using this to categorize or classify relationships in or between sets of data. Now sometimes this is called using analytical AI and it evolves processes that are considered part of text and data mining. So we know that text data mining research methodologies can but they don't necessarily need to rely on artificial intelligence systems to extract this information.
  
  Analytical AI: categorize and contextualize
  
  As distinct from generative AI...gun example in motion pictures follows in the presentation.
  
  analytical AI
10. peter_murray 28 May 2024
  
  in Public
  
  Handling Academic Copyright and Artificial Intelligence Research Questions as the Law Develops
  
  Spring 2024 Member Meeting: CNI website • YouTube
  
  Jonathan Band Copyright Attorney Counsel to the Library Copyright Alliance
  
  Timothy Vollmer Scholarly Communication & Copyright Librarian University of California, Berkeley
  
  The United States Copyright Office and courts in many United States jurisdictions are struggling to address complex copyright issues related to the use of generative artificial intelligence (AI). Meanwhile, academic research using generative AI is proliferating at a fast pace and researchers still require legal guidance on which sources they may use, how they can train AI legally, and whether the reproduction of source material will be considered infringing. The session will include discussion of current perspectives on copyright and generative AI in academic research.
  
  LLM copyright
Visit annotations in context

Tags

analytical AI

building LLMs

natural language translation

LLM copyright

copyright

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240527T173838-GMttBH1oAD4-handling-academic-copyright-artificial-intelligence-research-questions-law-develops/index.html
media.dltj.org media.dltj.org

Video: Linked Data in Production: Moving Beyond Ontologies by CNI Spring Meeting 2024, annotated

10
1. peter_murray 28 May 2024
  
  in Public
  
  the hardest part of doing this has not been the technology, it's not been the APIs, it's not been training engineers. It's been working with the practitioners with the change management doing it this way requires. Because when we do this recontextualization, when we cross the streams between various disciplines where we take the information that someone has spent their career cataloging and throw a 90% out of it to present it to an audience who doesn't care, that really, really irks people who have spent their entire career doing it in a way that they were trained is the best way you could do this. And some of the things that you can do with technology run into conflict with longstanding practices that were put in place before that technology existed.
  
  The hardest part: helping practitioners understand the recontextualization potential, not building the technology
  
  The practitioners are used to dealing with data in their own silos. They are carefully curate that data. They may not take kindly to you taking a portion of it to use in the global data model and discarding the rest.
2. peter_murray 28 May 2024
  
  in Public
  
  The other cool thing that we've added is the knowing what records of change is really valuable, but the other question, really a more scholarly question is, how did it change? How has information changed over time? And so we built our infrastructure in support in Memento, which is one of the protocols behind the net archive, to give our data the equivalent of that way back machine, be able to say, what did that record look like a year ago? What did that record look like two years ago? To give you that audit log, so you can go and look at how that changed over time, and to give ourselves the ability to audit what changed when?
  
  Use of Memento to expose how data has changed
  
  memento
3. peter_murray 28 May 2024
  
  in Public
  
  You need something that will help you show documents that will help you map to those understandings, those contexts that people bring to the information very quickly. And documents are also the way the internet works, right? You want to be able to use the sorts of affordances that the engineers understand. Rest APIs, JSON documents, cache control. Because those are things that engineers know that will let this stuff work fast. It will also let you hire people who know how the internet works and don't have to understand all the complex crazy stuff that we call children, heritage people do, they make it possible.
  
  Linked Data—and web architecture in general—as a hirable engineering skill
  
  If you can hire engineers that don't immediately need to know the intricacies of cultural heritage metadata, you can expand the pool of entry-level people you can hire and then train up on the cultural heritage details as needed.
  
  web architecture
4. peter_murray 28 May 2024
  
  in Public
  
  In a context based on the sort of context that people would understand, we say, this is the record about an artwork because you have a shared idea of what an artwork's information might be. It's probably not deep in the weeds around like Van Gogh had a brother named Theo Van Gogh and Theo Van Gogh was married and that wife turns out that's not part of what we described the artwork, that's part of the ecosystem knowledge around it. But graphs, on the other hand, are really optimized for asking those sorts of questions, for saying, I have a context, maybe I'm interested in siblings of artists and I want to be able to ask that kind of question.
  
  Van Gogh's family — not cataloged in the metadata of a painting but available elsewhere
  
  This is an interesting question that link data enables. A painting would not have specific details about the artist's family, but that data could be linked to an artist's entity URI through another system.
5. peter_murray 27 May 2024
  
  in Public
  
  But when we want to provide access to that data, what we end up doing is recontextualizing that data. We have to change the lens from that around someone who knows the discipline, who knows the form into one that reaches the way that users expect to see that data, which is often a really different shape to different flavor because their needs are different than the needs of catalogers. And so when we do that, we end up displaying records that are aggregations of data coming from multiple different systems because the tool that you would use to catalog the source is different from the tool you'd use to capture digital media, which is different from the tool you'd use to capture collections records, which is different from where you put the audio for the audio guide.
  
  Enabling recontextualization of data
  
  Allow catalogers to work in the environments that suit them best, but then enable the data to move into different contexts.
6. peter_murray 27 May 2024
  
  in Public
  
  There's this research use case where there are people who are not looking for information, but they're looking for questions that haven't been asked, patterns that they haven't seen before. Things in the data that we don't already know and couldn't share with them, but that they could discover with their expertise.
  
  Answering the expert researcher's question
7. peter_murray 27 May 2024
  
  in Public
  
  Digital infrastructure isn't designed to make computers happy because computers aren't happy. We do it to empower people to be more effective in meeting their mission. And so when we think about who those people are, the reason we have the ecosystem we have is because we have many different constituents with many different kinds of needs.
  
  Digital infrastructure to meet constituent needs
8. peter_murray 27 May 2024
  
  in Public
  
  over the past five years, what we've done is said, what if we take those sort of technologies, a single standard in linked art to linked things together, the vocabularies as connective glue, and used it to power our entire discovery ecosystem, including our archival collections, but also our museum collections on top of that. What if we used it also to power the audio guide that we provide to our visitors, or to provide interesting novel experiential interfaces on top of the collections that we have? What if we worked with third parties to use it, both large parties like Google, Arts, and Culture, and small projects like the Spanish Art in the US, which is a project of the cultural office of the embassy of Spain defined Spanish artworks in American museums and bring them together? And so we've worked across projects at all of these scales to pull records together and say, what would happen if you really tried to build this system out? So we end up with a unified API model, a way to access that data that spans across all of these collections and brings them together into a single data model.
  
  What if we built a unified API model across projects at the Getty?
9. peter_murray 27 May 2024
  
  in Public
  
  American Art Collaborative. And this was a 2017 project to take 14 art museum collections together and use these sort of linked data principles that these things were built on to say, could you bridge across 14 institutions? Could you find connections? Could you provide a unified discovery environment for that many institutions at the same time? And what came out of that was a project that is called linked art. Rob Sanderson, who I'm sure many of you know, he's a regular here and a good colleague of mine. He and I worked together to create this data model called linked art based on the data model of the American Art Collaborative, which was the underlying sort of connective data tissue, building on years and years of work in the academic community under C-Dark to say, what if we had a tool that could bridge these things together?
  
  American Art Collaborative
  
  American Art Collaborative description • demonstration site
  
  art museums
10. peter_murray 27 May 2024
  
  in Public
  
  Linked Data in Production: Moving Beyond Ontologies
  
  Spring 2024 Member Meeting: CNI website • YouTube
  
  David Newbury Assistant Director, Software and UX Getty
  
  Over the past six years, Getty has been engaged in a project to transform and unify its complex digital infrastructure for cultural heritage information. One of the project’s core goals was to provide validation of the impact and value of the use of linked data throughout this process. With museum, archival, media, and vocabularies in production and others underway, this sessions shares some of the practical implications (and pitfalls) of this work—particularly as it relates to interoperability, discovery, staffing, stakeholder engagement, and complexity management. The session will also share examples of how other organizations can streamline their own, similar work going forward.
  
  http://getty.edu/art/collection/ http://getty.edu/research/collections/ http://vocab.getty.edu https://www.getty.edu/projects/remodeling-getty-provenance-index/
  
  linked data
Visit annotations in context

Tags

art museums

memento

web architecture

linked data

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240527T175053-ApFnMJR4_lM-linked-data-production-moving-beyond-ontologies/index.html
media.dltj.org media.dltj.org

Video: Navigating Generative AI: Early Findings and Implications for Research, Teaching, and Learning by CNI Spring Meeting 2024, annotated

9
1. peter_murray 27 May 2024
  
  in Public
  
  So I'm not surprised, and perhaps the last question about user privacy.
  
  Question about the privacy of the user interacting with the RAG
  
  It doesn't seem like JSTOR has thought about this thoroughly. In the beta there is kind of an expectation that the service is being validated so what users are doing is being watched closer.
  
  library privacy
2. peter_murray 27 May 2024
  
  in Public
  
  I quite often would like to ask my research assistant to go and explore for me. Part of what they would be looking for is who is mentioned, what reference, what arguments do they mention, what school of thoughts. So this would be a very simple way and it would certainly make their lives easier.
  
  Research thinks JSTOR's RAG would make their research assistant lives easier
3. peter_murray 27 May 2024
  
  in Public
  
  we can see like there's a whole body of things that aren't working well.
  
  General feedback
  
  Users would like to see longer summaries. They talk-to-the-document to ask if it mentions the patron's topic — similar to a Control-F on the document. They also ask about the nature of the discussion on a topic — seemingly coming from an advanced researcher with "something specific in mind." Also ask what methods are used and whether concepts are related.
  
  There are also queries that seem to want to push the boundaries of the LLM RAG.
4. peter_murray 27 May 2024
  
  in Public
  
  So they say, I really love this tool. It is great how it attempts to relate the search query back to the content and the article. I find that I'm spending more time capturing and downloading the AI summaries than the downloading the PDFs.
  
  Feedback: changed user behavior from downloading PDFs to downloading LLM-generated summaries
5. peter_murray 27 May 2024
  
  in Public
  
  And then we have, and this is a work in progress, we have automated mechanisms for assessing several different aspects of the responses.
  
  Assessment mechanisms
  
  In addition to the patron-submitted assessments of LLM output (positive/negative, qualitative comments), there are several automatic processes: toxicity (using a hate speech model), faithfulness (how close the response is to the document content), relevancy (measured similar to how it is done for search), and similarity (making sure the response is complete...similar to faithfulness).
6. peter_murray 27 May 2024
  
  in Public
  
  So how does this work? I wanted to give this picture of what's actually happening behind the scenes, especially with this question and answer. So first, I will say that we're using a combination of OpenAI's GPT 3.5 to do this as well as some open source, smaller open source models to generate the vectors for the semantic search.
  
  JSTOR implements a RAG
  
  RAG == Retrieval Augmented Generation
  
  LLM RAG
7. peter_murray 27 May 2024
  
  in Public
  
  For any instance where, in most cases, the users will be able to trace back from the response to the point in the article where that information was pulled from. So you can see arrow pointing to the highlighted text that is the start of the segment that was used to generate that answer.
  
  Footnotes in the LLM response go to specific passages in the text
  
  This helps the patron understand that the analyzed output can be more trustworthy than a general prompt to an LLM. That there are "guardrails" that "keep the user within the scope of this document".
8. peter_murray 27 May 2024
  
  in Public
  
  And in this on the side, you see we have this new chat box where the user can engage with the content and this very first action. The user doesn't have to do anything. They land on the page and as long as they run a search, we immediately process a prompt that says what in your voice, how is the query you put in?
  
  Initial LLM chat prompt: why did this document come up
  
  Using the patron's keyword search phrase, the first chat shown is the LLM analyzing why this document matched the patron's criteria. Then there are preset prompts for summarizing what the text is about, recommended topics to search, and a prompt to "talk to the document".
  
  LLMs for research
9. peter_murray 27 May 2024
  
  in Public
  
  Navigating Generative Artificial Intelligence: Early Findings and Implications for Research, Teaching, and Learning
  
  Spring 2024 Member Meeting: CNI website • YouTube
  
  Beth LaPensee Senior Product Manager ITHAKA
  
  Kevin Guthrie President ITHAKA
  
  Starting in mid-2023, ITHAKA began investing in and engaging directly with generative artificial intelligence (AI) in two broad areas: a generative AI research tool on the JSTOR platform and a collaborative research project led by Ithaka S+R. These technologies are so crucial to our futures that working directly with them to learn about their impact, both positive and negative, is extremely important.
  
  This presentation will share early findings that illustrate the impact and potential of generative AI-powered research based on what JSTOR users are expecting from the tool, how their behavior is changing, and implications for changes in the nature of their work. The findings will be contextualized with the cross-institutional learning and landscape-level research being conducted by Ithaka S+R. By pairing data on user behavior with insights from faculty and campus leaders, the session will share early signals about how this technology-enabled evolution is beginning to take shape.
  
  https://www.jstor.org/generative-ai-faq
  
  LLMs in education
Visit annotations in context

Tags

LLMs for research

LLM RAG

LLMs in education

library privacy

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240527T172152-SE4zl7Isy5k-navigating-generative-ai-early-findings-implications-research-teaching-learning/index.html
media.dltj.org media.dltj.org

Video: Navigating Generative AI: Early Findings and Implications for Research, Teaching, and Learning by CNI Spring 2024, annotated

2
1. peter_murray 26 May 2024
  
  in Public
  
  you can see across the bottom we have a set of like preset prompts these are highly optimized to generate um responses that we've we've crafted so the first one is what is this text about this is a summary of the entire document
  
  Default prompt: output a summary of the document
2. peter_murray 26 May 2024
  
  in Public
  
  Navigating Generative Artificial Intelligence: Early Findings and Implications for Research, Teaching, and Learning
  
  Spring 2024 Member Meeting: CNI website • YouTube
  
  Beth LaPensee Senior Product Manager ITHAKA
  
  Kevin Guthrie President ITHAKA
  
  Starting in mid-2023, ITHAKA began investing in and engaging directly with generative artificial intelligence (AI) in two broad areas: a generative AI research tool on the JSTOR platform and a collaborative research project led by Ithaka S+R. These technologies are so crucial to our futures that working directly with them to learn about their impact, both positive and negative, is extremely important.
  
  This presentation will share early findings that illustrate the impact and potential of generative AI-powered research based on what JSTOR users are expecting from the tool, how their behavior is changing, and implications for changes in the nature of their work. The findings will be contextualized with the cross-institutional learning and landscape-level research being conducted by Ithaka S+R. By pairing data on user behavior with insights from faculty and campus leaders, the session will share early signals about how this technology-enabled evolution is beginning to take shape.
  
  https://www.jstor.org/generative-ai-faq
  
  LLMs in education
Visit annotations in context

Tags

LLMs in education

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240526T083931-SE4zl7Isy5k-Navigating_Generative_AI-_Early_Findings_and_Implications_for_Research-_Teaching-_and_Learning/index.html
www.arl.org www.arl.org

The ARL/CNI 2035 Scenarios: AI-Influenced Futures in the Research Environment

6
1. peter_murray 25 May 2024
  
  in Public
  
  In approaching this material, suspend your disbelief, avoid choosing apreferred scenario, and embrace the full set of possibilities included in this material.Remember, the future will not be as described in any one scenario but will be made upof components of all four scenarios
  
  I do t know if it is typical of such tools, but he scenarios presented seem written such that one is highly desirable, one is not, and two are somewhat realistic. That might be the nature of the two-axis outline for the scenarios. I almost wish I had read them in a different order to try to limit a best-to-worst bias. (Although, as presented, the fourth is not the worst-case scenario…it is the third.)
2. peter_murray 25 May 2024
  
  in Public
  
  Intellectual property rights and the impact of AI-consumed and -produced content on the rights of others are not mentioned in the brief, which seems like a significant omission.
3. peter_murray 25 May 2024
  
  in Public
  
  AI not used tobenefit society’sbetterment,drivercapitalism
  
  I think this is trying to say that AI is seen as benefiting capitalist-driven entities, not society as a whole. But in any case it is awkwardly worded. .
4. peter_murray 25 May 2024
  
  in Public
  
  Alex was a couple years in now as a codirector of HIF with her partner, MITA. MITA hadbeen her assistant for several years ahead of their promotion, but at the time of thepromotion it was so clear that they were more partners than a boss and assistantrelationship. So, HIF made the decision to opt for the codirector approach and so far ithad been a rousing success.
  
  The digital assistant has been granted a kind of personhood status?
5. peter_murray 25 May 2024
  
  in Public
  
  The most advanced libraries operate almost exclusively on an AI platform
  
  What does this mean? What is an AI platform? “Operate” is a very broad term: content acquisition and licensing, inventory and description, publication and announcement/advertising, discovery and delivery.
6. peter_murray 25 May 2024
  
  in Public
  
  The ARL/CNI 2035 Scenarios: AI-Influenced Futures in the Research Environment. Washington, DC, and West Chester, PA: Association of Research Libraries, Coalition for Networked Information, and Stratus Inc., May 2024. https://doi.org/10.29242/report.aiscenarios2024
  
  LLMs in education
Visit annotations in context

Tags

LLMs in education

Annotators

peter_murray

URL

arl.org/wp-content/uploads/2024/05/ARL-CNI-2035-AI-Scenarios-5May2024.pdf
Apr 2024
www.wheresyoured.at www.wheresyoured.at

The Man Who Killed Google Search

1
1. peter_murray 28 Apr 2024
  
  in Public
  
  These emails — which I encourage you to look up — tell a dramatic story about how Google’s finance and advertising teams, led by Raghavan with the blessing of CEO Sundar Pichai, actively worked to make Google worse to make the company more money. This is what I mean when I talk about the Rot Economy — the illogical, product-destroying mindset that turns the products you love into torturous, frustrating quasi-tools that require you to fight the company’s intentions to get the service you want.
  
  Rot Economy: taking value from the users
  
  Not [[enshitification]]…Value was taken directly from the users to the company. Is there a parallel to recent Boeing actions here?
Visit annotations in context

Annotators

peter_murray

URL

wheresyoured.at/the-men-who-killed-google/
www.technologyreview.com www.technologyreview.com

It’s time to retire the term “user”

3
1. peter_murray 28 Apr 2024
  
  in Public
  
  A grander sense of partnership is in the air now. What were once called AI bots have been assigned lofty titles like “copilot” and “assistant” and “collaborator” to convey a sense of partnership instead of a sense of automation. Large language models have been quick to ditch words like “bot” altogether.
  
  AI entities are now anthropomorphized
2. peter_murray 27 Apr 2024
  
  in Public
  
  Norman, now 88, explained to me that the term “user” proliferated in part because early computer technologists mistakenly assumed that people were kind of like machines. “The user was simply another component,” he said. “We didn’t think of them as a person—we thought of [them] as part of a system.” So early user experience design didn’t seek to make human-computer interactions “user friendly,” per se. The objective was to encourage people to complete tasks quickly and efficiently. People and their computers were just two parts of the larger systems being built by tech companies, which operated by their own rules and in pursuit of their own agendas.
  
  “User” as a component of the bigger system
  
  Thinking about this and any contrast between “user experience design” and “human computer interaction”. And about schema.org constructs embedded in web pages…creating web pages that were meant to be read by both humans and bots.
  
  user experience design
3. peter_murray 27 Apr 2024
  
  in Public
  
  As early as 2008, Norman alighted on this shortcoming and began advocating for replacing “user” with “person” or “human” when designing for people. (The subsequent years have seen an explosion of bots, which has made the issue that much more complicated.) “Psychologists depersonalize the people they study by calling them ‘subjects.’ We depersonalize the people we study by calling them ‘users.’ Both terms are derogatory,” he wrote then. “If we are designing for people, why not call them that?”
  
  “User” as a depersonalized, derogatory term
  
  linguistics
Visit annotations in context

Tags

user experience design

linguistics

Annotators

peter_murray

URL

technologyreview.com/2024/04/19/1090872/ai-users-people-terms/
two-wrongs.com two-wrongs.com

Laws of Software Evolution

1
1. peter_murray 27 Apr 2024
  
  in Public
  
  We often think of software development as a ticket-in-code-out business but this is really only a very small portion of the entire thing. Completely independently of the work done as a programmer, there exists users with different jobs they are trying to perform, and they may or may not find it convenient to slot our software into that job. A manager is not necessarily the right person to evaluate how good a job we are doing because they also exist independently of the user–software–programmer network, and have their own sets of priorities which may or may not align with the rest of the system.
  
  Software development as a conversation
  
  software development
Visit annotations in context

Tags

software development

Annotators

peter_murray

URL

two-wrongs.com/laws-of-software-evolution
www.nytimes.com www.nytimes.com

Transcript: Ezra Klein Interviews Adam Moss

4
1. peter_murray 27 Apr 2024
  
  in Public
  
  I guess her own self-description that it doesn’t actually matter where she stops, that the important thing in the making of the painting is the making and destroying and making and destroying, that that’s actually what the whole thing is about.
  
  Deciding where to stop is a choice onto itself
  
  This session had me in a panic: how do we put descriptive metatata on this? Where would we draw the line between different representations of the work? Which representation becomes the featured one…the one the artist picked as a point on the timeline of creation, the one the describer picked for an aesthetic reason, one one that broke through in the public consciousness?
  
  descriptive metadata
2. peter_murray 27 Apr 2024
  
  in Public
  
  There are, in my view, three stages of making art. One of them is the imagining, and the final one is the shaping. But in between, there is the judging, which is kind of what we’re talking about here, the editing.
  
  Three stages of creation: imagining, judging, shaping
  
  it is the middle one that is often invisible to the point of being lost. That is the role of editing.
3. peter_murray 27 Apr 2024
  
  in Public
  
  Now, many people, when they read, listen to anything, when they take in media, they don’t necessarily even know where it was from.
  
  The lost role of the editor with the decontainerization of digital media
4. peter_murray 27 Apr 2024
  
  in Public
  
  The Work of Art, How Something Comes From Nothing
  
  Publisher link. He talks later in the podcast about how the physical book itself is a work of art..from the texture of the paper (which he thought was too smooth) to the cloth cover (which he pointedly advocated for with the publisher).
Visit annotations in context

Tags

descriptive metadata

Annotators

peter_murray

URL

nytimes.com/2024/04/23/podcasts/transcript-ezra-klein-interviews-adam-moss.html
storage.courtlistener.com storage.courtlistener.com

Internet Archive Reply Br.pdf

12
1. peter_murray 21 Apr 2024
  
  in Public
  
  But the National Emergency Library (NEL) refutes that.
  
  First time the National Emergency Library is mentioned in the brief.
  
  covid
2. peter_murray 21 Apr 2024
  
  in Public
  
  Properly understood, controlled digital lending simply enables modernlibraries to carry out their time-honored missions in the more efficient and effectiveway digital technologies allow.
  
  CDL is an "efficient and effective digital way"
  
  I wonder if the point will be made later that there is a non-zero cost to providing CDL infrastructure. CDL, and the digital infrastructure required to support it, cost to a rough approximation the same as shelving the physical book.
  
  Edited to add: yes it is mentioned later in the brief.
3. peter_murray 21 Apr 2024
  
  in Public
  
  Publishers claim that the scope of IAs lending is too small to calculatemarket harm (Resp.Br. 52-53) is equally unfounded. To start, Publishers ignore thehuge increase in fixed costspurchasing and storing books, building and expandingdigital infrastructure, and morethat would necessarily accompany (and imposelimits on) any expansion of controlled digital lending.
  
  CDL programs have their own costs
  
  ..so the costs of CDL do get mentioned in the brief.
4. peter_murray 21 Apr 2024
  
  in Public
  
  IAs lending serves additional transformative purposes by enabling innovativeinteractions between books and the Internet, such as cite-checking online resourceslike Wikipedia.
  
  Wikipedia cite-checking as transformative CDL purpose
  
  Wikipedia
5. peter_murray 21 Apr 2024
  
  in Public
  
  Controlled digital lending permits libraries to build permanent collections andto archive and lend older books in a form that preserves their original printing.Publishers ebook licenses cannot serve this preservation mission because they arenot photographs of the original editions, and ongoing access depends on Publishersdiscretion and is subject to change without notice.
  
  CDL supports preservation of published material
6. peter_murray 21 Apr 2024
  
  in Public
  
  Publishers erroneously claim that controlled digital lending does notexpand utility because their ebook licenses already provide the same efficiency.
  
  Publishers can offer greater utility
  
  I propose a notion that may not be legally relevant in this case: publishers, who have the digital source file of the publication, can do so much more than a library can do with CDL. The perfect example of this is the X-Ray functionality in Kindle: "allows readers to learn more about a character, topic, event, place, or any other term, simply by pressing and holding on the word or phrase that interests them." To provide equivalent functionality, the library would need to OCR the image, correct it, and then layer on the additional functionality. In the non-fiction arena, the publisher can make interactive graphs and tables that would be difficult to do based on the page images.
7. peter_murray 21 Apr 2024
  
  in Public
  
  In any event, IAs controlled digital lending program serves a differentpurpose than physically lending the books it owns. Although both ultimately enablepatrons to read the content, that does not mean the purposes are the samemost usesof a book involve viewing its content. Controlled digital lending serves a differentpurpose by permitting libraries to lend the books they own to a broader range ofpeople for whom physical lending would be impractical.
  
  CDL enables access beyond physical boundaries
  
  Here the brief says that libraries can use CDL to lend beyond the physical boundaries of its territory. This is one of the scenarios envisioned by the NISO IS-CDL working group's draft recommended practice. The recommended practice doesn't offer a legal opinion; instead, it leaves it up to the risk assessment of each organization.
8. peter_murray 21 Apr 2024
  
  in Public
  
  But what matters is whether theborrowers are entitled to receive the content. Redigi, 910 F.3d at 661 (emphasisadded).
  
  Purchase on content versus purchase of format
  
  This is an interesting argument—that what is purchased is the content of the book, not the physical/digital artifact itself. This seems right to me (again, unencumbered by deep legal knowledge/reasoning).
  
  Side note: in library school I once argued that de-spining the pages of a book was a legitimate way to ensure good digital copies could be made. The professor was horrified at the suggestion.
9. peter_murray 21 Apr 2024
  
  in Public
  
  Publishers mischaracterize BWBs ownership. Resp.Br. 17. As explained(IABr. 23 n.8), BWB is not owned by IA or Brewster Kahle, but by Better WorldLibraries, which has no owner. A-6087-89. BWB and Better World Libraries areoperated by a three-member board. A-6089. IA has no control over either entity.While Kahle has leadership roles in each entity, some overlap in personnel does notundermine the separateness of corporate entities.
  
  Kahle participates in both BWB/BWL and IA
  
  I don't know the legal significance of this, but from a lay-person's view there does seem to be some entanglement.
10. peter_murray 21 Apr 2024
  
  in Public
  
  Second, Publishers present an erroneously cramped view of librariesmissions. Libraries do not acquire and lend books solely to make them physicallyavailable to patrons within a restricted geographic area. Contra Resp.Br. 4, 9.Libraries provide readers more egalitarian access to a wider range of books,overcoming socioeconomic and geographic barriers by sharing resources with otherlibraries through interlibrary loans.
  
  Second of two "critical misconceptions": libraries' mission to provide broad access to information
11. peter_murray 21 Apr 2024
  
  in Public
  
  First, Publishers disregard the key feature of controlled digital lending: thecontrols that ensure borrowing a book digitally adheres to the same owned-to-loanedratio inherent in borrowing a book physically. Publishers repeatedly compare IAslending to inapposite practices that lack this key feature.
  
  First of two "critical misconceptions": CDL controls on owned-to-loaned ratio
  
  CDL isn't the open redistribution of content (a la openly posting with no restrictions or digital resale).
  
  peer to peer filesharing
12. peter_murray 21 Apr 2024
  
  in Public
  
  Hachette Book Group, Inc. v. Internet Archive (23-1260) Court of Appeals for the Second Circuit
  
  REPLY BRIEF, on behalf of Appellant Internet Archive, filed 04/19/2024
  
  https://www.courtlistener.com/docket/67801014/hachette-book-group-inc-v-internet-archive/
  
  Hachette v. Internet Archive fair use controlled digital lending
Visit annotations in context

Tags

controlled digital lending

Hachette v. Internet Archive

peer to peer filesharing

covid

fair use

Wikipedia

Annotators

peter_murray

URL

storage.courtlistener.com/recap/gov.uscourts.ca2.60988/gov.uscourts.ca2.60988.259.0.pdf
media.dltj.org media.dltj.org

"Researchers study ice to explore our planet's past, present and future" uncorrected transcript (WBUR Here and Now)

2
1. peter_murray 13 Apr 2024
  
  in Public
  
  Vaughn says the temperatures along with carbon dioxide levels have naturally fluctuated over earth's history inside lasting between 144,000 years. Well, over the last million years, co two in the atmosphere has never really gone despite its ups and downs never gone above maybe 280 parts per million. Until now. As of January 2024 the amount of heat trapping carbon dioxide is a whopping 422 parts per million. We've had a wonderful party with fossil fuels for a couple of centuries. We've, we have changed the world at a cost that's now only becoming evident.
  
  Ice cores provide a history of carbon dioxide in the atmosphere
  
  climate change
2. peter_murray 13 Apr 2024
  
  in Public
  
  Each ice core is kind of unique and shows you a different climatic window. Vaughn uses water isotopes to determine what the temperature was when each layer of ice was formed. Isotopes are molecules that have the same number of protons and electrons, but a different number of neutrons affecting their mass. For example, water h2o has oxygen that has either a molecular rate of 16 or 18. And so it's a heavy and light water precipitation that falls in warmer temperature tends to be heavier water. He says, but in colder air like at the poles, the snow that falls is generally lighter water by looking at these ratios of ice tops in ice cores. We were able to infer the temperature from when it fell. As snow.
  
  Using ratio of the molecular weight of water to determine temperature
  
  climate change
Visit annotations in context

Tags

climate change

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20240413T172627-WBUR_Here_and_Now--Researchers_study_ice_to_explore_our_planets_past-_present_and_future/index.html
Mar 2024
media.dltj.org media.dltj.org

Video: From Hot Metal to HTML: The Story of Typography by Dylan Beattie - NDC Porto 2023, annotated

1
1. peter_murray 20 Mar 2024
  
  in Public
  
  Posted to YouTube on 12-Mar-2024
  
  Abstract
  
  Arial, Times New Roman, Consolas, Comic Sans... digital typography has turned us all into typesetters. The tools we use, the apps we build, the emails we send: with so much of our lives mediated by technology, something as seemingly innocuous as picking a typeface can end up defining our relationship with the systems we use, and become part of the identity that we project into the world. Typography is a fundamental part of modern information design, with implications for user experience, accessibility, even performance - and when it goes wrong, it can produce some of the most baffling bugs you've ever seen.
  
  Join Dylan Beattie for a journey into the weird and wonderful history of digital typography, from the origins of movable type in 8th century Asia, to the world of e-ink displays and web typography. We'll look at the relationship between technology and typography over the centuries: the Gutenberg Press, Linotype machines, WYSIWYG and the desktop publishing revolution. What was so special about the Apple II? How do you design a pixel font? We'll learn why they're called upper and lower case, we'll talk about why so many developers find CSS counter-intuitive - and we'll find out why so many emails used to end with the letter J.
  
  typography publishing
Visit annotations in context

Tags

typography

publishing

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240320T150336-qbCniw-BcW0-From_Hot_Metal_to_HTML-_The_Story_of_Typography/index.html
media.dltj.org media.dltj.org

Video: Millions of Patient Records at Risk: The Perils of Legacy Protocols by Black Hat Europe 2023, annotated

4
1. peter_murray 17 Mar 2024
  
  in Public
  
  many of those servers are of course honeypots some of them are staging and demo environments of the vendors it's interesting to not that this protocol and technology is also used for animals so many of the records on the internet actually are for cats and dogs and many of the records are exposed via universities or research centers that they just share anonymized data with other research centers
  
  Some DICOM servers are intended to be public
2. peter_murray 17 Mar 2024
  
  in Public
  
  many hospitals started moving their dcom infrastructures to the cloud because it's cheaper it's easier it's faster it's good so they did the shift and they used the Legacy protocol dcom without sufficient security
  
  Protocol intended for closed networks now found on open cloud servers
  
  cloud computing
3. peter_murray 17 Mar 2024
  
  in Public
  
  dcom is the standard that defines how these images should be digitally structured and stored also it defines a network protocol that says how these images can be transferred in a network
  
  DICOM is a file structure and a network protocol
  
  I knew of DICOM as using JPEG2000 for image formats, but I didn't know it was a network protocol, too.
4. peter_murray 17 Mar 2024
  
  in Public
  
  Millions of Patient Records at Risk: The Perils of Legacy Protocols
  
  Sina Yazdanmehr | Senior IT Security Consultant, Aplite GmbH Ibrahim Akkulak | Senior IT Security Consultant, Aplite GmbH Date: Wednesday, December 6, 2023
  
  Abstract
  
  Currently, a concerning situation is unfolding online: a large amount of personal information and medical records belonging to patients is scattered across the internet. Our internet-wide research on DICOM, the decade-old standard protocol for medical imaging, has revealed a distressing fact – Many medical institutions have unintentionally made the private data and medical histories of millions of patients accessible to the vast realm of the internet.
  
  Medical imaging encompasses a range of techniques such as X-Rays, CT scans, and MRIs, used to visualize internal body structures, with DICOM serving as the standard protocol for storing and transmitting these images. The security problems with DICOM are connected to using legacy protocols on the internet as industries strive to align with the transition towards Cloud-based solutions.
  
  This talk will explain the security shortcomings of DICOM when it is exposed online and provide insights from our internet-wide research. We'll show how hackers can easily find, access, and exploit the exposed DICOM endpoints, extract all patients' data, and even alter medical records. Additionally, we'll explain how we were able to bypass DICOM security controls by gathering information from the statements provided by vendors and service providers regarding their adherence to DICOM standards.
  
  We'll conclude by providing practical recommendations for medical institutions, healthcare providers, and medical engineers to mitigate these security issues and safeguard patients' data.
  
  medical privacy digital imaging
Visit annotations in context

Tags

cloud computing

digital imaging

medical privacy

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240317T150339-CgJIxTP8ydQ-Millions_of_Patient_Records_at_Risk-_The_Perils_of_Legacy_Protocols/index.html
storage.courtlistener.com storage.courtlistener.com

OCLC Complaint

15
1. peter_murray 16 Mar 2024
  
  in Public
  
  109. On information and belief, in addition to her extensive online presence, she has aGitHub (a software code hosting platform) account called, “anarchivist,” and she developed arepository for a python module for interacting with OCLC’s WorldCat® Affiliate web services.
  
  Matienzo has a GitHub account with code that interacts with OCLC’s API
  
  Is this really the extent of the connection between Matienzo and Anna’s Archive? I don’t know what is required at the Complaint stage of a civil lawsuit to prove someone is connected to an anonymous collective, but surely something more than this plus public statements and an internet handle (“anarchivist”) is required to convict. Does Ohio have SLAPP protections?
2. peter_murray 16 Mar 2024
  
  in Public
  
  99. This includes metadata unique to WorldCat® records and created by OCLC. Forexample, the Anna’s Archive’s blog post indicates that Defendants harvested metadata that denotesassociations between records, such as between an original work and a parodying work. OCLCadds these associations data as part of its enrichment process.
  
  Example of enrichment process: association between original works and parodies
3. peter_murray 16 Mar 2024
  
  in Public
  
  1792. In total, WorldCat® has 1.4 billion OCNs, meaning Defendants claim that theywere able to harvest, to some extent, 97.4% of unique WorldCat® records
  
  In complaint, OCLC says it has 1.4b OCNs
4. peter_murray 16 Mar 2024
  
  in Public
  
  78. The bots also harvested data from WorldCat.org by pretending to be an internetbrowser, directly calling or “pinging” OCLC’s servers, and bypassing the search, or user interface,of WorldCat.org. More robust WorldCat® data was harvested directly from OCLC’s servers,including enriched data not available through the WorldCat.org user interface.
  
  Web scrapers web-scraped, but more robust data?
  
  The first sentence is the definition of a web scraper…having done an analysis of the URL structure, it goes directly to the page it is interested in rather than going through the search engine. (Does going through the search engine make a web scraper’s activities somehow legitimate?)
  
  The second sentence is weird…how did the web scraper harvest “more robust WorldCat data” that wasn’t available through WorldCat.org?
5. peter_murray 16 Mar 2024
  
  in Public
  
  38. This includes adding OCLC’s own unique identifying number, the “OCN,” whichenhances queries and serves as an authoritative index for specific items or works.
  
  OCN, OCLC’s unique identifier
  
  Remember…OCNs are in the public domain: OCLC Control Numbers - Lots of them; all public domain
6. peter_murray 16 Mar 2024
  
  in Public
  
  1476. These attacks were accomplished with bots (automated software applications) that“scraped” and harvested data from WorldCat.org and other WorldCat®-based research sites andthat called or pinged the server directly. These bots were initially masked to appear as legitimatesearch engine bots from Bing or Google.
  
  Bots initially masked themselves as search engine bots
  
  web search engine
7. peter_murray 16 Mar 2024
  
  in Public
  
  58. When an individual searches on WorldCat.org, the individual agrees to the OCLCWorldCat.org Services Terms and Conditions (attached here as Exhibit B).
  
  Terms and Conditions
  
  I just tried in a private browser window to be sure, but there isn’t an up-front display of the terms and conditions to click-through. I’m also not sure what the state of validity of website terms and conditions is. The Terms and Conditions link at the bottom of the page says it hasn’t been updated since 2009.
  
  click through licenses
8. peter_murray 16 Mar 2024
  
  in Public
  
  51. The information available through WorldCat.org on a result page includes data thatis freely accessible on the web, such as title, publication, copyright, author, and editor, and limiteddata “enriched” by OCLC, such as OCN, International Standard Book Number (“ISBN”),International Standard Serial Number (“ISSN”), and pagination. This enriched data is moredifficult to find outside of WorldCat® and varies by each result in WorldCat.org.52. Most WorldCat® data available in a WorldCat® record is unavailable to anindividual on WorldCat.org. This is because a full WorldCat® record is part of a member library’ssubscription for cataloging and other library services.
  
  Subset of data from WorldCat is on the public WorldCat.org site
  
  “Most WorldCat data” is not available on the public-facing website? That would be an interesting comparison study. Notably missing from the list of publicly available data is subject headings and notes…I’m pretty sure those fields are available, too.
9. peter_murray 16 Mar 2024
  
  in Public
  
  39. Of the entire WorldCat® collection, more than 93% of the records have beenmodified, improved, and/or enhanced by OCLC.
  
  Percentage of WorldCat that has been “modified, improved, and/or enhanced”
  
  worldcat
10. peter_murray 16 Mar 2024
  
  in Public
  
  OCLC is a non-profit, membership, computer library service and researchorganization dedicated to the public purposes of furthering access to the world’s information andreducing the rate of the rise in library costs
  
  How OCLC defines itself
  
  …with that phrase “reducing the rate of rise in library costs” — how about flat out reducing library costs, OCLC?
  
  OCLC
11. peter_murray 16 Mar 2024
  
  in Public
  
  By hacking WorldCat.org, scraping and harvesting OCLC’s valuable WorldCat
  
  Complain equates “hacking” with “scraping and harvesting”
  
  This is a matter of some debate—notably the recent LLM web scraping cases.
  
  web scraping
12. peter_murray 16 Mar 2024
  
  in Public
  
  In the blog post announcing their hacking and scraping of the data viaWorldCat.org, Defendants publicly thanked OCLC for “the decades of hard work you put intobuilding the collections that we now liberate. Truly: thank you.”
  
  Anna’s Archive blog post announcing data
  
  1.3B WorldCat scrape & data science mini-competition. Lead paragraph:
  
  TL;DR: Anna’s Archive scraped all of WorldCat (the world’s largest library metadata collection) to make a TODO list of books that need to be preserved, and is hosting a data science mini-competition.
13. peter_murray 16 Mar 2024
  
  in Public
  
  7. When OCLC member libraries subscribe to WorldCat® through OCLC’sWorldCat® Discovery Services/FirstSearch, the subscription includes the WorldCat.org service.Libraries are willing to pay for WorldCat.org as part of their WorldCat® subscription
  
  Libraries pay for WorldCat.org visibility
  
  The first sentence of the paragraph says that a subscription to WorldCat Discovery Services includes visibility on WorldCat.org. The second sentence says that libraries are willing to pay for this visibility. I’m not sure what else is included in a WorldCat Discovery Services subscription…is there a contradiction in these two sentences?
  
  worldcat
14. peter_murray 16 Mar 2024
  
  in Public
  
  6. To accomplish this, WorldCat.org allows individuals to search member libraries’catalogs as represented by their corresponding WorldCat® records in the WorldCat® database.When an individual views a search result from WorldCat.org, they see a more limited view of aWorldCat® record, i.e., with less metadata than is available for the record in the WorldCat®database for cataloging purposes
  
  WorldCat database is a subset of WorldCat.org
15. peter_murray 16 Mar 2024
  
  in Public
  
  Complaint
  
  OCLC Online Computer Library Center, Inc. v. Anna's Archive (2:24-cv-00144)
  
  District Court, Southern District of Ohio
  
  library metadata
Visit annotations in context

Tags

web search engine

click through licenses

web scraping

library metadata

worldcat

OCLC

Annotators

peter_murray

URL

storage.courtlistener.com/recap/gov.uscourts.ohsd.287709/gov.uscourts.ohsd.287709.1.0.pdf
www.bl.uk www.bl.uk

british-library-cyber-incident-review-8-march-2024.pdf

15
1. peter_murray 09 Mar 2024
  
  in Public
  
  Proactively manage staff and user wellbeing: Cyber-incident management plans shouldinclude provisions for managing staff and user wellbeing. Cyber-attacks are deeply upsettingfor staff whose data is compromised and whose work is disrupted, and for users whoseservices are interrupted
  
  Cybersecurity is a group effort
  
  It would be easy to pin this all on the tech who removed that block on the account that may have been the beachhead for this attack. As this report shows, the organization allowed the environment to flourish that culminated in that one bit-flip to bring the organization down.
  
  I’ve never been in that position. I’m mindful that I could someday be in that position looking back at what my action or inaction allowed to happen. I’ll probably be risk being in that position until the day I retire and destroy my production work credentials.
2. peter_murray 09 Mar 2024
  
  in Public
  
  Manage systems lifecycles to eliminate legacy technology: ‘Legacy’ systems are not justhard to maintain and secure, they are extremely hard to restore. Regular investment in thelifecycle of all critical systems – both infrastructure and applications – is essential toguarantee not just security but also organisational resilience
  
  What is cutting edge today is legacy tomorrow
  
  As our layers of technology get stacked higher, the bottom layers get squeezed and compressed to thin layers that we assume will always exist. We must maintain visibility in those layers and invest in their maintenance and robustness.
  
  change management
3. peter_murray 09 Mar 2024
  
  in Public
  
  Enhance intrusion response processes: An in-depth security review should be commissionedafter even the smallest signs of network intrusion. It is relatively easy for an attacker toestablish persistence after gaining access to a network, and thereafter evade routinesecurity precautions
  
  You have to be right 100% of the time; your attacker needs to be lucky once
4. peter_murray 09 Mar 2024
  
  in Public
  
  The need to embed security more deeply than ever into everything we do will requireinvestment in culture change across different parts of the Library. There is a risk that thedesire to return to ‘business as usual’ as fast as possible will compromise the changes intechnology, policy, and culture that will be necessary to secure the Library for the future. Astrong change management component in the Rebuild & Renew Programme will beessential to mitigate this risk, as will firm and well considered leadership from seniormanagers
  
  Actively avoiding a return to normal
  
  This will be among the biggest challenges, right? The I-could-do this-before-why-can’t-I-do-it-now question. Somewhere I read that the definition of “personal character” is the ability to see an action through after the emotion of the commitment to the action has passed. The British Library was a successful institution and will and to return to that position of being seen as a successful instituting as quick as it possibly can.
  
  change management
5. peter_murray 09 Mar 2024
  
  in Public
  
  a robust and resilient backup service, providing immutable and air-gapped copies, offsitecopies, and hot copies of data with multiple restoration points on a 4/3/2/1 model
  
  Backup models
  
  I’m familiar with the 3-2-1 strategy for backups (three copies of your data on two distinct media with one stored off-site), but I hadn’t heard of the 4-3-2-1 strategy. Judging from this article from Backblaze, the additional layer accounts for a fully air-gapped or unavailable-online copy. The AWS S3 “Object Lock” option noted earlier is one example: although the backed up object is online and can be read, there are technical controls that prevent its modification until a set period of time elapses. (Presumably a time period long enough for you to find and extricate anyone that has compromised your systems before the object lock expires.)
6. peter_murray 09 Mar 2024
  
  in Public
  
  The substantial disruption of the attack creates an opportunity to implement a significant number ofchanges to policy, processes, and technology that will address structural issues in ways that wouldpreviously have been too disruptive to countenance
  
  Never let a good crisis go to waste
  
  Oh, yeah.
7. peter_murray 09 Mar 2024
  
  in Public
  
  our reliance on legacy infrastructure is the primary contributor to the length of time that theLibrary will require to recover from the attack. These legacy systems will in many cases needto be migrated to new versions, substantially modified, or even rebuilt from the ground up,either because they are unsupported and therefore cannot be repurchased or restored, orbecause they simply will not operate on modern servers or with modern security controls
  
  Legacy infrastructure lengthens recovery time
  
  I wonder how much of this “legacy infrastructure” are bespoke software systems that were created internally and no longer have relevant or reliable documentation. Yes, they may have the data, but they can’t reconstruct the software development environments that would be used to upgrade or migrate to a new commercial or internally-developed system.
8. peter_murray 09 Mar 2024
  
  in Public
  
  some of our older applications rely substantially on manual extract, transform and load (ETL)processes to pass data from one system to another. This substantially increases the volumeof customer and staff data in transit on the network, which in a modern data managementand reporting infrastructure would be encapsulated in secure, automated end-to-end
  
  Reliance on ETL seen as risky
  
  I’m not convinced about this. Real-time API connectivity between systems is a great goal…very responsive to changes filtering through disparate systems. But a lot of “modern” processing is still done by ETL batches (sometimes daily, sometimes hourly, sometimes every minute).
  
  data processing
9. peter_murray 09 Mar 2024
  
  in Public
  
  our historically complex network topology (ie. the ‘shape’ of our network and how itscomponents connect to each other) allowed the attackers wider access to our network thanwould have been possible in a more modern network design, allowing them to compromisemore systems and services
  
  Historically complex network topology
  
  Reading between the lines, I think they dealt with the complexity by having a flat network…no boundaries between functions. If one needed high level access to perform a function on one system, hey had high level access across large segments of the network.
10. peter_murray 09 Mar 2024
  
  in Public
  
  viable sources of backups had been identified that wereunaffected by the cyber-attack and from which the Library’s digital and digitised collections,collection metadata and other corporate data could be recovered
  
  Viable backups
  
  I suddenly have a new respect for write-once-read-many (WORM) block storage like AWS’ Object Lock: https://aws.amazon.com/blogs/storage/protecting-data-with-amazon-s3-object-lock/
  
  digital storage
11. peter_murray 09 Mar 2024
  
  in Public
  
  The Library has not made any payment to the criminal actors responsible for the attack, nor engagedwith them in any way. Ransomware gangs contemplating future attacks such as this on publicly-funded institutions should be aware that the UK’s national policy, articulated by NCSC, isunambiguously clear that no such payments should be made
  
  Government policy not to reward or engage with cyber attackers
12. peter_murray 09 Mar 2024
  
  in Public
  
  The lack of MFA on thedomain was identified and raised as a risk at this time, but the possible consequences were perhapsunder-appraised.
  
  No MFA on the remote access server
  
  If you, dear reader, are in the same boat now, seriously consider reprioritizing your MFA rollout.
13. peter_murray 09 Mar 2024
  
  in Public
  
  The intrusion was first identified as a major incident at 07:35 on 28 October 2023
  
  Attack started overnight Friday-to-Saturday
  
  If there were network and service availability alarms, were they disabled in the attack? Were they on internal or external systems? Was the overnight hours into a weekend a factor in how fast the problem was found?
14. peter_murray 09 Mar 2024
  
  in Public
  
  The criminal gang responsible for the attack copied and exfiltrated (illegally removed) some 600GBof files, including personal data of Library users and staff. When it became clear that no ransomwould be paid, this data was put up for auction and subsequently dumped on the dark web. OurCorporate Information Management Unit is conducting a detailed review of the material included inthe data-dump, and where sensitive material is identified they are contacting the individualsaffected with advice and support.
  
  Ransom not paid and data published
  
  Not sure yet whether they will go into their thinking behind why they didn’t pay, but that is the recommended course of action.
15. peter_murray 09 Mar 2024
  
  in Public
  
  LEARNING LESSONS FROM THE CYBER-ATTACK British Library cyber incident review 8 MARCH 2024
  
  national libraries cybersecurity
Visit annotations in context

Tags

cybersecurity

change management

data processing

national libraries

digital storage

Annotators

peter_murray

URL

bl.uk/home/british-library-cyber-incident-review-8-march-2024.pdf
media.dltj.org media.dltj.org

Video: Actually, ChatGPT is INCREDIBLY Useful (15 Surprising Examples) by ThioJoe, annotated

1
1. peter_murray 03 Mar 2024
  
  in Public
  
  Actually, ChatGPT is INCREDIBLY Useful (15 Surprising Examples) by ThioJoe on YouTube, 8-Feb-2024
  
  0:00 - Intro
  
  0:28 - An Important Point
  
  1:26 - What If It's Wrong?
  
  1:54 - Explain Command Line Parameters
  
  2:36 - Ask What Command to Use
  
  3:04 - Parse Unformatted Data
  
  4:54 - Use As A Reverse Dictionary
  
  6:16 - Finding Hard-To-Search Information
  
  7:48 - Finding TV Show Episodes
  
  8:20 - A Quick Note
  
  8:37 - Multi-Language Translations
  
  9:21 - Figuring Out the Correct Software Version
  
  9:58 - Adding Code Comments
  
  10:18 - Adding Debug Print Statements
  
  10:42 - Calculate Subscription Break-Even
  
  11:40 - Programmatic Data Processing
  
  large language model
Visit annotations in context

Tags

large language model

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240216T100824-zIWqTJu9HaA-Actually-_ChatGPT_is_INCREDIBLY_Useful_-15_Surprising_Examples-/index.html
Feb 2024
nyuengelberg.org nyuengelberg.org

The Anti-Ownership Ebook Economy

2
1. peter_murray 12 Feb 2024
  
  in Public
  
  Bobbs-Merrill Company v. Straus, the 1908 Supreme Court case that established the First Sale Doctrine in United States common law, flowed directly from a publisher’s attempt to control the minimum price that the novel The Castaway could be sold for on the secondary market.15 In that case, The Castaway’s publisher, the Bobbs-Merrill Company, added a notice to each copy of the book that no dealer was “authorized” to sell the book for less than $1. When the Straus brothers purchased a number of copies and decided to sell them for less than $1, Bobbs-Merrill sued to enforce its $1 price floor. Ultimately, the US Supreme Court ruled that Straus did not need “authorization” from Bobbs-Merrill (or anyone else) to sell the books at whatever price they chose. Once Bobbs-Merrill sold the books, their preferences for how the books were used did not matter.
  
  1908 Supreme Court case established First Sale Doctrine
2. peter_murray 12 Feb 2024
  
  in Public
  
  Over the years, publishers have made many attempts to avoid this exchange, controlling both the purchase price and what purchasers do with the books after they are sold. For example, in the early 1900s, publishers tried to control resale prices on the books people bought from retailers by stamping mandatory resale prices on a book’s front page.6 (That attempt was rejected by the US Supreme Court).7 Publishers also tried to limit where people could resell books they bought, in one case claiming that a book sold in Thailand couldn’t be resold in the US.8 (That attempt was also rejected by the US Supreme Court, in 2013).9 These attempts failed because the publisher’s copyright does not give them absolute control of a book in perpetuity; the copyright system is a balance between publishers and purchasers.10 If publishers want the benefits of the copyright system, they also have to accept the limits it imposes on their power.
  
  Attempts by publishers to limit post-sale activities
  
  publishing copyright
Visit annotations in context

Tags

publishing copyright

Annotators

peter_murray

URL

nyuengelberg.org/outputs/the-anti-ownership-ebook-economy/
academic.oup.com academic.oup.com

Bugs in our pockets: the risks of client-side scanning

5
1. peter_murray 12 Feb 2024
  
  in Public
  
  Moving scanning from the server to the client pushes it across the boundary between what is shared (the cloud) and what is private (the user device). By creating the capability to scan files that would never otherwise leave a user device, CSS thus erases any boundary between a user’s private sphere and their shared (semi-)public sphere [6]. It makes what was formerly private on a user’s device potentially available to law enforcement and intelligence agencies, even in the absence of a warrant. Because this privacy violation is performed at the scale of entire populations, it is a bulk surveillance technology.
  
  Client-side scanning is a bulk surveillance technology
  
  government surveillance
2. peter_murray 12 Feb 2024
  
  in Public
  
  Many scanning systems make use of perceptual hash functions, which have several features that make them ideal for identifying pictures. Most importantly, they are resilient to small changes in the image content, such as re-encoding or changing the size of an image. Some functions are even resilient to image cropping and rotation.
  
  Perceptual hash function for content scanning
  
  One way to scan for target material: run a function on the content that results in a manipulation-resistant identifier that is easy to compare.
3. peter_murray 12 Feb 2024
  
  in Public
  
  The alternative approach to image classification uses machine-learning techniques to identify targeted content. This is currently the best way to filter video, and usually the best way to filter text. The provider first trains a machine-learning model with image sets containing both innocuous and target content. This model is then used to scan pictures uploaded by users. Unlike perceptual hashing, which detects only photos that are similar to known target photos, machine-learning models can detect completely new images of the type on which they were trained.
  
  Machine learning for content scanning
  
  machine learning
4. peter_murray 12 Feb 2024
  
  in Public
  
  In what follows, we refer to text, audio, images, and videos as “content,” and to content that is to be blocked by a CSS system as “targeted content.” This generalization is necessary. While the European Union (EU) and Apple have been talking about child sex-abuse material (CSAM)—specifically images—in their push for CSS [12], the EU has included terrorism and organized crime along with sex abuse [13]. In the EU’s view, targeted content extends from still images through videos to text, as text can be used for both sexual solicitation and terrorist recruitment. We cannot talk merely of “illegal” content, because proposed UK laws would require the blocking online of speech that is legal but that some actors find upsetting [14].
  
  Defining "content"
  
  How you define "content" in client-side scanning is key. The scope of any policies will depend on the national (and local?) laws in place.
  
  law enforcement
5. peter_murray 12 Feb 2024
  
  in Public
  
  Harold Abelson, Ross Anderson, Steven M Bellovin, Josh Benaloh, Matt Blaze, Jon Callas, Whitfield Diffie, Susan Landau, Peter G Neumann, Ronald L Rivest, Jeffrey I Schiller, Bruce Schneier, Vanessa Teague, Carmela Troncoso, Bugs in our pockets: the risks of client-side scanning, Journal of Cybersecurity, Volume 10, Issue 1, 2024, tyad020, https://doi.org/10.1093/cybsec/tyad020
  
  Abstract
  
  Our increasing reliance on digital technology for personal, economic, and government affairs has made it essential to secure the communications and devices of private citizens, businesses, and governments. This has led to pervasive use of cryptography across society. Despite its evident advantages, law enforcement and national security agencies have argued that the spread of cryptography has hindered access to evidence and intelligence. Some in industry and government now advocate a new technology to access targeted data: client-side scanning (CSS). Instead of weakening encryption or providing law enforcement with backdoor keys to decrypt communications, CSS would enable on-device analysis of data in the clear. If targeted information were detected, its existence and, potentially, its source would be revealed to the agencies; otherwise, little or no information would leave the client device. Its proponents claim that CSS is a solution to the encryption versus public safety debate: it offers privacy—in the sense of unimpeded end-to-end encryption—and the ability to successfully investigate serious crime. In this paper, we argue that CSS neither guarantees efficacious crime prevention nor prevents surveillance. Indeed, the effect is the opposite. CSS by its nature creates serious security and privacy risks for all society, while the assistance it can provide for law enforcement is at best problematic. There are multiple ways in which CSS can fail, can be evaded, and can be abused.
  
  Right off the bat, these authors are highly experienced and plugged into what is happening with technology.
  
  digital privacy client side scanning
Visit annotations in context

Tags

client side scanning

digital privacy

government surveillance

machine learning

law enforcement

Annotators

peter_murray

URL

academic.oup.com/cybersecurity/article/10/1/tyad020/7590463

Peter Murray

Annotations: 1,236

Joined: October 24, 2012

Location: Columbus, Ohio, United States

Link: dltj.org/

ORCID: 0000-0003-4284-508X

Tags

Annotators

URL

Unit costs for pennies and nickels is more than their face value

Tags

Annotators

URL

The difference between constructive action and destructive action

Tags

Annotators

URL

Starlink reported 49k risk-avoidance maneuvers in 6 months in 2024

Around a dozen orbital break-ups per year

Origin of the phrase "Kessler Syndrome"

Tags

Annotators

URL

Canada looks for its own path in response to Trump

The real-time death and resurrection story from the disciples' perspective

Tags

Annotators

URL

Covered data includes procured databases owned by a private company

Builds on the FOIA

From the Foundations for Evidence-Based Policymaking Act

Tags

Annotators

URL

First damage from a commercial space launch in US

Tags

Annotators

URL

“Definitions are a human conceit, not a natural occurrence”

IAU’s definitions of “planet” and “dwarf planet”

Tags

Annotators

URL

Distillation

Multi-head Latent Attention

Mixture-of-Experts

Tags

Annotators

URL

Empathy is antithetical to christian nationalism

Empathy is essential for the Christian life

Tags

Annotators

URL

"Whispering" technique of propaganda

Exploring archives, and knowing what you are looking for

Research, and selling/explaining the research

CIA recruits at the American Library Association conference

A writer was embedded in the Manhattan Project to make a book about the atomic bomb

Annotators

URL

Tags

Annotators

URL

Astronauts in orbit don't feel weightless, they feel constant falling

The Common Swift flies for 10 straight months

Episode Description

Tags

Annotators

URL

LLMs taking the joy out of the search for information

Religions use new technologies

LLM Confabulation of Religious Ideas

Sermon-writing as an art form

"Can God speak through A.I.?"

Tags

Annotators

URL

Number of jobs requiring physical strength decreasing

No gender gap in standardized tests, but gap in GPA

Tags

Annotators

URL

Proposed countermeasures

"Artificial Intelligent Personalities", forth generation warfare

Bot farms—automation, third generation mass information warfare