1,276 Matching Annotations

Mar 2024
media.dltj.org media.dltj.org

Video: Millions of Patient Records at Risk: The Perils of Legacy Protocols by Black Hat Europe 2023, annotated

2
1. peter_murray 17 Mar 2024
  
  in Public
  
  dcom is the standard that defines how these images should be digitally structured and stored also it defines a network protocol that says how these images can be transferred in a network
  
  DICOM is a file structure and a network protocol
  
  I knew of DICOM as using JPEG2000 for image formats, but I didn't know it was a network protocol, too.
2. peter_murray 17 Mar 2024
  
  in Public
  
  Millions of Patient Records at Risk: The Perils of Legacy Protocols
  
  Sina Yazdanmehr | Senior IT Security Consultant, Aplite GmbH Ibrahim Akkulak | Senior IT Security Consultant, Aplite GmbH Date: Wednesday, December 6, 2023
  
  Abstract
  
  Currently, a concerning situation is unfolding online: a large amount of personal information and medical records belonging to patients is scattered across the internet. Our internet-wide research on DICOM, the decade-old standard protocol for medical imaging, has revealed a distressing fact – Many medical institutions have unintentionally made the private data and medical histories of millions of patients accessible to the vast realm of the internet.
  
  Medical imaging encompasses a range of techniques such as X-Rays, CT scans, and MRIs, used to visualize internal body structures, with DICOM serving as the standard protocol for storing and transmitting these images. The security problems with DICOM are connected to using legacy protocols on the internet as industries strive to align with the transition towards Cloud-based solutions.
  
  This talk will explain the security shortcomings of DICOM when it is exposed online and provide insights from our internet-wide research. We'll show how hackers can easily find, access, and exploit the exposed DICOM endpoints, extract all patients' data, and even alter medical records. Additionally, we'll explain how we were able to bypass DICOM security controls by gathering information from the statements provided by vendors and service providers regarding their adherence to DICOM standards.
  
  We'll conclude by providing practical recommendations for medical institutions, healthcare providers, and medical engineers to mitigate these security issues and safeguard patients' data.
  
  medical privacy digital imaging
Visit annotations in context

Tags

digital imaging

medical privacy

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240317T150339-CgJIxTP8ydQ-Millions_of_Patient_Records_at_Risk-_The_Perils_of_Legacy_Protocols/index.html
storage.courtlistener.com storage.courtlistener.com

OCLC Complaint

15
1. peter_murray 16 Mar 2024
  
  in Public
  
  109. On information and belief, in addition to her extensive online presence, she has aGitHub (a software code hosting platform) account called, “anarchivist,” and she developed arepository for a python module for interacting with OCLC’s WorldCat® Affiliate web services.
  
  Matienzo has a GitHub account with code that interacts with OCLC’s API
  
  Is this really the extent of the connection between Matienzo and Anna’s Archive? I don’t know what is required at the Complaint stage of a civil lawsuit to prove someone is connected to an anonymous collective, but surely something more than this plus public statements and an internet handle (“anarchivist”) is required to convict. Does Ohio have SLAPP protections?
2. peter_murray 16 Mar 2024
  
  in Public
  
  99. This includes metadata unique to WorldCat® records and created by OCLC. Forexample, the Anna’s Archive’s blog post indicates that Defendants harvested metadata that denotesassociations between records, such as between an original work and a parodying work. OCLCadds these associations data as part of its enrichment process.
  
  Example of enrichment process: association between original works and parodies
3. peter_murray 16 Mar 2024
  
  in Public
  
  1792. In total, WorldCat® has 1.4 billion OCNs, meaning Defendants claim that theywere able to harvest, to some extent, 97.4% of unique WorldCat® records
  
  In complaint, OCLC says it has 1.4b OCNs
4. peter_murray 16 Mar 2024
  
  in Public
  
  78. The bots also harvested data from WorldCat.org by pretending to be an internetbrowser, directly calling or “pinging” OCLC’s servers, and bypassing the search, or user interface,of WorldCat.org. More robust WorldCat® data was harvested directly from OCLC’s servers,including enriched data not available through the WorldCat.org user interface.
  
  Web scrapers web-scraped, but more robust data?
  
  The first sentence is the definition of a web scraper…having done an analysis of the URL structure, it goes directly to the page it is interested in rather than going through the search engine. (Does going through the search engine make a web scraper’s activities somehow legitimate?)
  
  The second sentence is weird…how did the web scraper harvest “more robust WorldCat data” that wasn’t available through WorldCat.org?
5. peter_murray 16 Mar 2024
  
  in Public
  
  38. This includes adding OCLC’s own unique identifying number, the “OCN,” whichenhances queries and serves as an authoritative index for specific items or works.
  
  OCN, OCLC’s unique identifier
  
  Remember…OCNs are in the public domain: OCLC Control Numbers - Lots of them; all public domain
6. peter_murray 16 Mar 2024
  
  in Public
  
  1476. These attacks were accomplished with bots (automated software applications) that“scraped” and harvested data from WorldCat.org and other WorldCat®-based research sites andthat called or pinged the server directly. These bots were initially masked to appear as legitimatesearch engine bots from Bing or Google.
  
  Bots initially masked themselves as search engine bots
  
  web search engine
7. peter_murray 16 Mar 2024
  
  in Public
  
  58. When an individual searches on WorldCat.org, the individual agrees to the OCLCWorldCat.org Services Terms and Conditions (attached here as Exhibit B).
  
  Terms and Conditions
  
  I just tried in a private browser window to be sure, but there isn’t an up-front display of the terms and conditions to click-through. I’m also not sure what the state of validity of website terms and conditions is. The Terms and Conditions link at the bottom of the page says it hasn’t been updated since 2009.
  
  click through licenses
8. peter_murray 16 Mar 2024
  
  in Public
  
  51. The information available through WorldCat.org on a result page includes data thatis freely accessible on the web, such as title, publication, copyright, author, and editor, and limiteddata “enriched” by OCLC, such as OCN, International Standard Book Number (“ISBN”),International Standard Serial Number (“ISSN”), and pagination. This enriched data is moredifficult to find outside of WorldCat® and varies by each result in WorldCat.org.52. Most WorldCat® data available in a WorldCat® record is unavailable to anindividual on WorldCat.org. This is because a full WorldCat® record is part of a member library’ssubscription for cataloging and other library services.
  
  Subset of data from WorldCat is on the public WorldCat.org site
  
  “Most WorldCat data” is not available on the public-facing website? That would be an interesting comparison study. Notably missing from the list of publicly available data is subject headings and notes…I’m pretty sure those fields are available, too.
9. peter_murray 16 Mar 2024
  
  in Public
  
  39. Of the entire WorldCat® collection, more than 93% of the records have beenmodified, improved, and/or enhanced by OCLC.
  
  Percentage of WorldCat that has been “modified, improved, and/or enhanced”
  
  worldcat
10. peter_murray 16 Mar 2024
  
  in Public
  
  OCLC is a non-profit, membership, computer library service and researchorganization dedicated to the public purposes of furthering access to the world’s information andreducing the rate of the rise in library costs
  
  How OCLC defines itself
  
  …with that phrase “reducing the rate of rise in library costs” — how about flat out reducing library costs, OCLC?
  
  OCLC
11. peter_murray 16 Mar 2024
  
  in Public
  
  By hacking WorldCat.org, scraping and harvesting OCLC’s valuable WorldCat
  
  Complain equates “hacking” with “scraping and harvesting”
  
  This is a matter of some debate—notably the recent LLM web scraping cases.
  
  web scraping
12. peter_murray 16 Mar 2024
  
  in Public
  
  In the blog post announcing their hacking and scraping of the data viaWorldCat.org, Defendants publicly thanked OCLC for “the decades of hard work you put intobuilding the collections that we now liberate. Truly: thank you.”
  
  Anna’s Archive blog post announcing data
  
  1.3B WorldCat scrape & data science mini-competition. Lead paragraph:
  
  TL;DR: Anna’s Archive scraped all of WorldCat (the world’s largest library metadata collection) to make a TODO list of books that need to be preserved, and is hosting a data science mini-competition.
13. peter_murray 16 Mar 2024
  
  in Public
  
  7. When OCLC member libraries subscribe to WorldCat® through OCLC’sWorldCat® Discovery Services/FirstSearch, the subscription includes the WorldCat.org service.Libraries are willing to pay for WorldCat.org as part of their WorldCat® subscription
  
  Libraries pay for WorldCat.org visibility
  
  The first sentence of the paragraph says that a subscription to WorldCat Discovery Services includes visibility on WorldCat.org. The second sentence says that libraries are willing to pay for this visibility. I’m not sure what else is included in a WorldCat Discovery Services subscription…is there a contradiction in these two sentences?
  
  worldcat
14. peter_murray 16 Mar 2024
  
  in Public
  
  6. To accomplish this, WorldCat.org allows individuals to search member libraries’catalogs as represented by their corresponding WorldCat® records in the WorldCat® database.When an individual views a search result from WorldCat.org, they see a more limited view of aWorldCat® record, i.e., with less metadata than is available for the record in the WorldCat®database for cataloging purposes
  
  WorldCat database is a subset of WorldCat.org
15. peter_murray 16 Mar 2024
  
  in Public
  
  Complaint
  
  OCLC Online Computer Library Center, Inc. v. Anna's Archive (2:24-cv-00144)
  
  District Court, Southern District of Ohio
  
  library metadata
Visit annotations in context

Tags

web search engine

worldcat

OCLC

library metadata

click through licenses

web scraping

Annotators

peter_murray

URL

storage.courtlistener.com/recap/gov.uscourts.ohsd.287709/gov.uscourts.ohsd.287709.1.0.pdf
www.bl.uk www.bl.uk

british-library-cyber-incident-review-8-march-2024.pdf

15
1. peter_murray 09 Mar 2024
  
  in Public
  
  Proactively manage staff and user wellbeing: Cyber-incident management plans shouldinclude provisions for managing staff and user wellbeing. Cyber-attacks are deeply upsettingfor staff whose data is compromised and whose work is disrupted, and for users whoseservices are interrupted
  
  Cybersecurity is a group effort
  
  It would be easy to pin this all on the tech who removed that block on the account that may have been the beachhead for this attack. As this report shows, the organization allowed the environment to flourish that culminated in that one bit-flip to bring the organization down.
  
  I’ve never been in that position. I’m mindful that I could someday be in that position looking back at what my action or inaction allowed to happen. I’ll probably be risk being in that position until the day I retire and destroy my production work credentials.
2. peter_murray 09 Mar 2024
  
  in Public
  
  Manage systems lifecycles to eliminate legacy technology: ‘Legacy’ systems are not justhard to maintain and secure, they are extremely hard to restore. Regular investment in thelifecycle of all critical systems – both infrastructure and applications – is essential toguarantee not just security but also organisational resilience
  
  What is cutting edge today is legacy tomorrow
  
  As our layers of technology get stacked higher, the bottom layers get squeezed and compressed to thin layers that we assume will always exist. We must maintain visibility in those layers and invest in their maintenance and robustness.
  
  change management
3. peter_murray 09 Mar 2024
  
  in Public
  
  Enhance intrusion response processes: An in-depth security review should be commissionedafter even the smallest signs of network intrusion. It is relatively easy for an attacker toestablish persistence after gaining access to a network, and thereafter evade routinesecurity precautions
  
  You have to be right 100% of the time; your attacker needs to be lucky once
4. peter_murray 09 Mar 2024
  
  in Public
  
  The need to embed security more deeply than ever into everything we do will requireinvestment in culture change across different parts of the Library. There is a risk that thedesire to return to ‘business as usual’ as fast as possible will compromise the changes intechnology, policy, and culture that will be necessary to secure the Library for the future. Astrong change management component in the Rebuild & Renew Programme will beessential to mitigate this risk, as will firm and well considered leadership from seniormanagers
  
  Actively avoiding a return to normal
  
  This will be among the biggest challenges, right? The I-could-do this-before-why-can’t-I-do-it-now question. Somewhere I read that the definition of “personal character” is the ability to see an action through after the emotion of the commitment to the action has passed. The British Library was a successful institution and will and to return to that position of being seen as a successful instituting as quick as it possibly can.
  
  change management
5. peter_murray 09 Mar 2024
  
  in Public
  
  a robust and resilient backup service, providing immutable and air-gapped copies, offsitecopies, and hot copies of data with multiple restoration points on a 4/3/2/1 model
  
  Backup models
  
  I’m familiar with the 3-2-1 strategy for backups (three copies of your data on two distinct media with one stored off-site), but I hadn’t heard of the 4-3-2-1 strategy. Judging from this article from Backblaze, the additional layer accounts for a fully air-gapped or unavailable-online copy. The AWS S3 “Object Lock” option noted earlier is one example: although the backed up object is online and can be read, there are technical controls that prevent its modification until a set period of time elapses. (Presumably a time period long enough for you to find and extricate anyone that has compromised your systems before the object lock expires.)
6. peter_murray 09 Mar 2024
  
  in Public
  
  The substantial disruption of the attack creates an opportunity to implement a significant number ofchanges to policy, processes, and technology that will address structural issues in ways that wouldpreviously have been too disruptive to countenance
  
  Never let a good crisis go to waste
  
  Oh, yeah.
7. peter_murray 09 Mar 2024
  
  in Public
  
  our reliance on legacy infrastructure is the primary contributor to the length of time that theLibrary will require to recover from the attack. These legacy systems will in many cases needto be migrated to new versions, substantially modified, or even rebuilt from the ground up,either because they are unsupported and therefore cannot be repurchased or restored, orbecause they simply will not operate on modern servers or with modern security controls
  
  Legacy infrastructure lengthens recovery time
  
  I wonder how much of this “legacy infrastructure” are bespoke software systems that were created internally and no longer have relevant or reliable documentation. Yes, they may have the data, but they can’t reconstruct the software development environments that would be used to upgrade or migrate to a new commercial or internally-developed system.
8. peter_murray 09 Mar 2024
  
  in Public
  
  some of our older applications rely substantially on manual extract, transform and load (ETL)processes to pass data from one system to another. This substantially increases the volumeof customer and staff data in transit on the network, which in a modern data managementand reporting infrastructure would be encapsulated in secure, automated end-to-end
  
  Reliance on ETL seen as risky
  
  I’m not convinced about this. Real-time API connectivity between systems is a great goal…very responsive to changes filtering through disparate systems. But a lot of “modern” processing is still done by ETL batches (sometimes daily, sometimes hourly, sometimes every minute).
  
  data processing
9. peter_murray 09 Mar 2024
  
  in Public
  
  our historically complex network topology (ie. the ‘shape’ of our network and how itscomponents connect to each other) allowed the attackers wider access to our network thanwould have been possible in a more modern network design, allowing them to compromisemore systems and services
  
  Historically complex network topology
  
  Reading between the lines, I think they dealt with the complexity by having a flat network…no boundaries between functions. If one needed high level access to perform a function on one system, hey had high level access across large segments of the network.
10. peter_murray 09 Mar 2024
  
  in Public
  
  viable sources of backups had been identified that wereunaffected by the cyber-attack and from which the Library’s digital and digitised collections,collection metadata and other corporate data could be recovered
  
  Viable backups
  
  I suddenly have a new respect for write-once-read-many (WORM) block storage like AWS’ Object Lock: https://aws.amazon.com/blogs/storage/protecting-data-with-amazon-s3-object-lock/
  
  digital storage
11. peter_murray 09 Mar 2024
  
  in Public
  
  The Library has not made any payment to the criminal actors responsible for the attack, nor engagedwith them in any way. Ransomware gangs contemplating future attacks such as this on publicly-funded institutions should be aware that the UK’s national policy, articulated by NCSC, isunambiguously clear that no such payments should be made
  
  Government policy not to reward or engage with cyber attackers
12. peter_murray 09 Mar 2024
  
  in Public
  
  The lack of MFA on thedomain was identified and raised as a risk at this time, but the possible consequences were perhapsunder-appraised.
  
  No MFA on the remote access server
  
  If you, dear reader, are in the same boat now, seriously consider reprioritizing your MFA rollout.
13. peter_murray 09 Mar 2024
  
  in Public
  
  The intrusion was first identified as a major incident at 07:35 on 28 October 2023
  
  Attack started overnight Friday-to-Saturday
  
  If there were network and service availability alarms, were they disabled in the attack? Were they on internal or external systems? Was the overnight hours into a weekend a factor in how fast the problem was found?
14. peter_murray 09 Mar 2024
  
  in Public
  
  The criminal gang responsible for the attack copied and exfiltrated (illegally removed) some 600GBof files, including personal data of Library users and staff. When it became clear that no ransomwould be paid, this data was put up for auction and subsequently dumped on the dark web. OurCorporate Information Management Unit is conducting a detailed review of the material included inthe data-dump, and where sensitive material is identified they are contacting the individualsaffected with advice and support.
  
  Ransom not paid and data published
  
  Not sure yet whether they will go into their thinking behind why they didn’t pay, but that is the recommended course of action.
15. peter_murray 09 Mar 2024
  
  in Public
  
  LEARNING LESSONS FROM THE CYBER-ATTACK British Library cyber incident review 8 MARCH 2024
  
  national libraries cybersecurity
Visit annotations in context

Tags

digital storage

national libraries

change management

cybersecurity

data processing

Annotators

peter_murray

URL

bl.uk/home/british-library-cyber-incident-review-8-march-2024.pdf
media.dltj.org media.dltj.org

Video: Actually, ChatGPT is INCREDIBLY Useful (15 Surprising Examples) by ThioJoe, annotated

1
1. peter_murray 03 Mar 2024
  
  in Public
  
  Actually, ChatGPT is INCREDIBLY Useful (15 Surprising Examples) by ThioJoe on YouTube, 8-Feb-2024
  
  0:00 - Intro
  
  0:28 - An Important Point
  
  1:26 - What If It's Wrong?
  
  1:54 - Explain Command Line Parameters
  
  2:36 - Ask What Command to Use
  
  3:04 - Parse Unformatted Data
  
  4:54 - Use As A Reverse Dictionary
  
  6:16 - Finding Hard-To-Search Information
  
  7:48 - Finding TV Show Episodes
  
  8:20 - A Quick Note
  
  8:37 - Multi-Language Translations
  
  9:21 - Figuring Out the Correct Software Version
  
  9:58 - Adding Code Comments
  
  10:18 - Adding Debug Print Statements
  
  10:42 - Calculate Subscription Break-Even
  
  11:40 - Programmatic Data Processing
  
  large language model
Visit annotations in context

Tags

large language model

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240216T100824-zIWqTJu9HaA-Actually-_ChatGPT_is_INCREDIBLY_Useful_-15_Surprising_Examples-/index.html
Feb 2024
nyuengelberg.org nyuengelberg.org

The Anti-Ownership Ebook Economy

2
1. peter_murray 12 Feb 2024
  
  in Public
  
  Bobbs-Merrill Company v. Straus, the 1908 Supreme Court case that established the First Sale Doctrine in United States common law, flowed directly from a publisher’s attempt to control the minimum price that the novel The Castaway could be sold for on the secondary market.15 In that case, The Castaway’s publisher, the Bobbs-Merrill Company, added a notice to each copy of the book that no dealer was “authorized” to sell the book for less than $1. When the Straus brothers purchased a number of copies and decided to sell them for less than $1, Bobbs-Merrill sued to enforce its $1 price floor. Ultimately, the US Supreme Court ruled that Straus did not need “authorization” from Bobbs-Merrill (or anyone else) to sell the books at whatever price they chose. Once Bobbs-Merrill sold the books, their preferences for how the books were used did not matter.
  
  1908 Supreme Court case established First Sale Doctrine
2. peter_murray 12 Feb 2024
  
  in Public
  
  Over the years, publishers have made many attempts to avoid this exchange, controlling both the purchase price and what purchasers do with the books after they are sold. For example, in the early 1900s, publishers tried to control resale prices on the books people bought from retailers by stamping mandatory resale prices on a book’s front page.6 (That attempt was rejected by the US Supreme Court).7 Publishers also tried to limit where people could resell books they bought, in one case claiming that a book sold in Thailand couldn’t be resold in the US.8 (That attempt was also rejected by the US Supreme Court, in 2013).9 These attempts failed because the publisher’s copyright does not give them absolute control of a book in perpetuity; the copyright system is a balance between publishers and purchasers.10 If publishers want the benefits of the copyright system, they also have to accept the limits it imposes on their power.
  
  Attempts by publishers to limit post-sale activities
  
  publishing copyright
Visit annotations in context

Tags

publishing copyright

Annotators

peter_murray

URL

nyuengelberg.org/outputs/the-anti-ownership-ebook-economy/
academic.oup.com academic.oup.com

Bugs in our pockets: the risks of client-side scanning

5
1. peter_murray 12 Feb 2024
  
  in Public
  
  Moving scanning from the server to the client pushes it across the boundary between what is shared (the cloud) and what is private (the user device). By creating the capability to scan files that would never otherwise leave a user device, CSS thus erases any boundary between a user’s private sphere and their shared (semi-)public sphere [6]. It makes what was formerly private on a user’s device potentially available to law enforcement and intelligence agencies, even in the absence of a warrant. Because this privacy violation is performed at the scale of entire populations, it is a bulk surveillance technology.
  
  Client-side scanning is a bulk surveillance technology
  
  government surveillance
2. peter_murray 12 Feb 2024
  
  in Public
  
  Many scanning systems make use of perceptual hash functions, which have several features that make them ideal for identifying pictures. Most importantly, they are resilient to small changes in the image content, such as re-encoding or changing the size of an image. Some functions are even resilient to image cropping and rotation.
  
  Perceptual hash function for content scanning
  
  One way to scan for target material: run a function on the content that results in a manipulation-resistant identifier that is easy to compare.
3. peter_murray 12 Feb 2024
  
  in Public
  
  The alternative approach to image classification uses machine-learning techniques to identify targeted content. This is currently the best way to filter video, and usually the best way to filter text. The provider first trains a machine-learning model with image sets containing both innocuous and target content. This model is then used to scan pictures uploaded by users. Unlike perceptual hashing, which detects only photos that are similar to known target photos, machine-learning models can detect completely new images of the type on which they were trained.
  
  Machine learning for content scanning
  
  machine learning
4. peter_murray 12 Feb 2024
  
  in Public
  
  In what follows, we refer to text, audio, images, and videos as “content,” and to content that is to be blocked by a CSS system as “targeted content.” This generalization is necessary. While the European Union (EU) and Apple have been talking about child sex-abuse material (CSAM)—specifically images—in their push for CSS [12], the EU has included terrorism and organized crime along with sex abuse [13]. In the EU’s view, targeted content extends from still images through videos to text, as text can be used for both sexual solicitation and terrorist recruitment. We cannot talk merely of “illegal” content, because proposed UK laws would require the blocking online of speech that is legal but that some actors find upsetting [14].
  
  Defining "content"
  
  How you define "content" in client-side scanning is key. The scope of any policies will depend on the national (and local?) laws in place.
  
  law enforcement
5. peter_murray 12 Feb 2024
  
  in Public
  
  Harold Abelson, Ross Anderson, Steven M Bellovin, Josh Benaloh, Matt Blaze, Jon Callas, Whitfield Diffie, Susan Landau, Peter G Neumann, Ronald L Rivest, Jeffrey I Schiller, Bruce Schneier, Vanessa Teague, Carmela Troncoso, Bugs in our pockets: the risks of client-side scanning, Journal of Cybersecurity, Volume 10, Issue 1, 2024, tyad020, https://doi.org/10.1093/cybsec/tyad020
  
  Abstract
  
  Our increasing reliance on digital technology for personal, economic, and government affairs has made it essential to secure the communications and devices of private citizens, businesses, and governments. This has led to pervasive use of cryptography across society. Despite its evident advantages, law enforcement and national security agencies have argued that the spread of cryptography has hindered access to evidence and intelligence. Some in industry and government now advocate a new technology to access targeted data: client-side scanning (CSS). Instead of weakening encryption or providing law enforcement with backdoor keys to decrypt communications, CSS would enable on-device analysis of data in the clear. If targeted information were detected, its existence and, potentially, its source would be revealed to the agencies; otherwise, little or no information would leave the client device. Its proponents claim that CSS is a solution to the encryption versus public safety debate: it offers privacy—in the sense of unimpeded end-to-end encryption—and the ability to successfully investigate serious crime. In this paper, we argue that CSS neither guarantees efficacious crime prevention nor prevents surveillance. Indeed, the effect is the opposite. CSS by its nature creates serious security and privacy risks for all society, while the assistance it can provide for law enforcement is at best problematic. There are multiple ways in which CSS can fail, can be evaded, and can be abused.
  
  Right off the bat, these authors are highly experienced and plugged into what is happening with technology.
  
  digital privacy client side scanning
Visit annotations in context

Tags

client side scanning

machine learning

law enforcement

government surveillance

digital privacy

Annotators

peter_murray

URL

academic.oup.com/cybersecurity/article/10/1/tyad020/7590463
slate.com slate.com

College Students Don’t Know How to Read Anymore. We’re in Denial Over How Bad It’s Gotten.

1
1. peter_murray 11 Feb 2024
  
  in Public
  
  Less discussed than these broader cultural trends over which educators have little control are the major changes in reading pedagogy that have occurred in recent decades—some motivated by the ever-increasing demand to “teach to the test” and some by fads coming out of schools of education. In the latter category is the widely discussed decline in phonics education in favor of the “balanced literacy” approach advocated by education expert Lucy Calkins (who has more recently come to accept the need for more phonics instruction). I started to see the results of this ill-advised change several years ago, when students abruptly stopped attempting to sound out unfamiliar words and instead paused until they recognized the whole word as a unit. (In a recent class session, a smart, capable student was caught short by the word circumstances when reading a text out loud.) The result of this vibes-based literacy is that students never attain genuine fluency in reading. Even aside from the impact of smartphones, their experience of reading is constantly interrupted by their intentionally cultivated inability to process unfamiliar words.
  
  Vibe-based literacy
  
  Ouch! That is a pretty damming label.
  
  literacy
Visit annotations in context

Tags

literacy

Annotators

peter_murray

URL

slate.com/human-interest/2024/02/literacy-crisis-reading-comprehension-college.html
Jan 2024
media.dltj.org media.dltj.org

Video: How Concert LED Wristbands Work by WSJ Tech Behind, annotated

4
1. peter_murray 29 Jan 2024
  
  in Public
  
  And those light sticks aren't handed out as part of the event, they're mementos that fans will sometimes spend more than $100 on.
  
  Bluetooth technology
  
  Lighting devices are tied to an app on a phone via Bluetooth. The user also puts their location into the app.
  
  bluetooth
2. peter_murray 29 Jan 2024
  
  in Public
  
  But the more advanced wristbands, like what you see at the Super Bowl or Lady Gaga, use infrared technology.
  
  Infrared technology
  
  Transmitters on towers sweep over the audience with infrared signals. Masks in the transmitters can be used to create designs.
3. peter_murray 29 Jan 2024
  
  in Public
  
  Let's start off with the simplest, RF wristbands, that receive a radio frequency communicating the precise timing and colors for each band.
  
  RF technology
  
  Bands distributed to seating areas are assigned to one of several channels. The RF transmitter transmits the channel plus color information across a broad area.
4. peter_murray 29 Jan 2024
  
  in Public
  
  Jun 1, 2023
  
  Abstract
  
  WSJ goes behind the scenes with PixMob, a leading concert LED company, to see how they use “old tech” to build creative light shows, essentially turning the crowd into a video canvas.
  
  entertainment
Visit annotations in context

Tags

bluetooth

entertainment

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240128T201746-GCsmZA08oD8-How_Concert_LED_Wristbands_Work/index.html
media.dltj.org media.dltj.org

Video: How This Building Powers the Internet by Stewart Hicks, annotated

3
1. peter_murray 29 Jan 2024
  
  in Public
  
  it seems that one Wilshire has always had a crisis of location curiously the building isn't actually located at the address one Wilshire Boulevard it actually sits at 624 South Grand Avenue when Wilshire was a marketing name developed afterward
  
  One Wilshire isn't really on Wilshire Drive
2. peter_murray 29 Jan 2024
  
  in Public
  
  in 2013 this building sold for 437 million dollars the 660 dollars per square foot of leasable space that's by far the highest price paid of any office building in Downtown LA
  
  Most expensive commercial real estate in the US
  
  As a carrier hotel and co-location space for internet companies. 250 network service providers.
3. peter_murray 29 Jan 2024
  
  in Public
  
  Jun 1, 2023
  
  Abstract
  
  Sometimes buildings just don't look as important as they are. This the case of One Wilshire Blvd in Los Angeles. At first glance, its a generic office building in downtown. But, that blank facade is hiding one of the most important pieces of digital infrastructure within the United States. In this video we visit 1 Wilshire Blvd, explain how it works, and chat with Jimenez Lai who wrote a story about the building which explores its outsized role in our digital lives.
  
  internet infrastructure
Visit annotations in context

Tags

internet infrastructure

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240128T195325--wlCQ4g93oY-How_This_Building_Powers_the_Internet/index.html
media.dltj.org media.dltj.org

Video: How The World's Largest Paper Company Makes 1/3 of Cardboard Boxes In America by Business Insider, annotated

4
1. peter_murray 28 Jan 2024
  
  in Public
  
  Part of it is that old cardboard can't be recycled indefinitely. The EPA says it can only go through the process about seven times. Each time it goes through pulping and blending, the long, strong pine fibers get a bit shorter and weaker, and eventually the degraded paper bits simply wash through the screens and out of the process. So recycling is very important, but even if 100% of boxes got reused, making new ones would still mean cutting down trees.
  
  Paper degrades and can't be indefinitely recycled
2. peter_murray 28 Jan 2024
  
  in Public
  
  The southern US, sometimes called "America's wood basket," is home to 2% of the world's forested land, yet it produces nearly 20% of our pulp and paper products.
  
  Pulp and paper products produced overwhelmingly in the southern U.S.
3. peter_murray 28 Jan 2024
  
  in Public
  
  Forester Alex Singleton walked us through an area whose trees were sold to International Paper two years ago. Alex: It has since been replanted with longleaf pine. Narrator: But it will still take decades for the new crop to mature. For many foresters, we only see a site harvested once during our careers. From this stage to there would probably be around 30 years.
  
  30 years to get mature trees for corrugated packaging
4. peter_murray 28 Jan 2024
  
  in Public
  
  Sep 14, 2023
  
  Abstract
  
  Cardboard has a high recycling rate in the US. But it can't be reused forever, so the massive paper companies that make it also consume millions of trees each year.
  
  environment
Visit annotations in context

Tags

environment

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240128T180723--lsC0aXyY6g-How_The_World's_Largest_Paper_Company_Makes_1-3_of_Cardboard_Boxes_In_America/index.html
crsreports.congress.gov crsreports.congress.gov

Law Enforcement and Technology: The “Lawful Access” Debate

2
1. peter_murray 28 Jan 2024
  
  in Public
  
  Law enforcement contends that they want front dooraccess, where there is a clear understanding of when theyare accessing a device, as the notion of a back door soundssecretive. This front door could be opened by whomeverholds the key once investigators have demonstrated a lawfulbasis for access, such as probable cause that a crime isbeing committed. Whether front or back, however, buildingin an encrypted door that can be unlocked with a key—nomatter who maintains the key—adds a potentialvulnerability to exploitation by hackers, criminals, andother malicious actors. Researchers have yet to demonstratehow it would be possible to create a door that could only beaccessed in lawful circumstances.
  
  Rebranding law enforcement access as "front-door"
  
  ...because "back door" sounds secretive. And it would be secretive if the user didn't know that their service provider opened the door for law enforcement.
2. peter_murray 28 Jan 2024
  
  in Public
  
  Some observers say law enforcement’sinvestigative capabilities may be outpaced by the speed oftechnological change, preventing investigators fromaccessing certain information they may otherwise beauthorized to obtain. Specifically, law enforcement officialscite strong, end-to-end encryption, or what they have calledwarrant-proof encryption, as preventing lawful access tocertain data.
  
  "warrant-proof" encryption
  
  Law enforcement's name for "end-to-end encryption"
  
  end to end encryption law enforcement
Visit annotations in context

Tags

law enforcement

end to end encryption

Annotators

peter_murray

URL

crsreports.congress.gov/product/pdf/IF/IF11769
crsreports.congress.gov crsreports.congress.gov

Legal Consequences of Rescheduling Marijuana

4
1. peter_murray 22 Jan 2024
  
  in Public
  
  With respect to medical marijuana, a key difference between placement in Schedule I and ScheduleIII is that substances in Schedule III have an accepted medical use and may lawfully be dispensed byprescription, while Substances in Schedule I cannot.
  
  Legal issues remain even if marijuana is rescheduled
  
  Schedule III allows for "accepted medical use", but the FDA has not approved marijuana as a drug.
2. peter_murray 22 Jan 2024
  
  in Public
  
  In each budget cycle since FY2014, Congress has passed an appropriations rider barring theDepartment of Justice (DOJ) from using taxpayer funds to prevent states from “implementing their ownlaws that authorize the use, distribution, possession, or cultivation of medical marijuana.” Courts haveinterpreted the appropriations rider to prohibit federal prosecution of state-legal activities involvingmedical marijuana. However, it poses no bar to federal prosecution of activities involving recreationalmarijuana.
  
  Marijuana still illegal from a federal standpoint, but federal prosecution is prohibited
  
  In states that have passed medical marijuana statutes, Congress has said that DOJ cannot prosecute through an annual appropriations rider. (e.g., "no money can be used...")
3. peter_murray 22 Jan 2024
  
  in Public
  
  Congress placed marijuana in Schedule I in 1970 when it enacted the CSA. A lower schedulenumber carries greater restrictions under the CSA, with controlled substances in Schedule I subject to themost stringent controls. Schedule I controlled substances have no currently accepted medical use. It isillegal to produce, dispense, or possess such substances except in the context of federally approvedscientific studies, subject to CSA regulatory requirements designed to prevent abuse and diversion.
  
  Marijuana on CSA from the start
  
  Schedule I substances in the Controlled Substances Act have the most stringent regulations, and have no acceptable medical uses. Congress put Marijuana on the Schedule I list when it passed the CSA.
4. peter_murray 22 Jan 2024
  
  in Public
  
  Cannabis and its derivatives generally fall within one of two categories under federal law: marijuana orhemp.
  
  CSA definitions for marijuana and hemp
  
  Hemp is cannabis with a delta-9 tetrahydrocannobinol (THC) of less than 0.3%. Marijuana is everything else. Hemp is not a controlled substance while marijuana is.
Visit annotations in context

Annotators

peter_murray

URL

crsreports.congress.gov/product/pdf/LSB/LSB11105
media.dltj.org media.dltj.org

"How do you prevent a political campaign from getting hacked?" uncorrected transcript (Safe Mode from CyberScoop)

1
1. peter_murray 21 Jan 2024
  
  in Public
  
  So we have 50 independent electoral systems that kind of work in conjunction in tandem, but they're all slightly different and they're all run by the state.
  
  It is worse than that. In Ohio, each county has its own election system. Rules are set at the state level, but each county buys and maintains the equipment, hires and does training, and reports its results.
  
  election security
Visit annotations in context

Tags

election security

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20240121T152800-Safe_Mode_from_CyberScoop--How_do_you_prevent_a_political_campaign_from_getting_hacked-/index.html
www.eff.org www.eff.org

AI Art Generators and the Online Image Market

1
1. peter_murray 21 Jan 2024
  
  in Public
  
  Images of women are more likely to be coded as sexual in nature than images of men in similar states of dress and activity, because of widespread cultural objectification of women in both images and its accompanying text. An AI art generator can “learn” to embody injustice and the biases of the era and culture of the training data on which it is trained.
  
  Objectification of women as an example of AI bias
  
  bias in LLMs
Visit annotations in context

Tags

bias in LLMs

Annotators

peter_murray

URL

eff.org/deeplinks/2023/04/ai-art-generators-and-online-image-market
bylinetimes.com bylinetimes.com

Elon Musk's War Against Science, Evidence and Objective Truth – Byline Times

1
1. peter_murray 21 Jan 2024
  
  in Public
  
  “Information has become a battlespace, like naval or aerial”, Carl Miller, Research Director of the Centre for the Analysis of Social Media, once explained to me. Information warfare impacts how information is used, shared and amplified. What matters for information combatants is not the truth, reliability, relevance, contextuality or accuracy of information, but its strategic impact on the battlespace; that is, how well it manipulates citizens into adopting desired actions and beliefs.
  
  Information battlespace
  
  Sharing “information” not to spread truth, but to influence behavior
  
  disinformation
Visit annotations in context

Tags

disinformation

Annotators

peter_murray

URL

bylinetimes.com/2024/01/16/elon-musks-war-against-science-evidence-and-objective-truth/
media.dltj.org media.dltj.org

Video: The "Great Ivy League Nude Posture Photos Scandal" by The History Guy, annotated

3
1. peter_murray 18 Jan 2024
  
  in Public
  
  Harvard had posture tests as early as 1880 and many other colleges would follow suit Harvard's program was developed by da Sergeant who created a template of the statistically average American and measured Harvard students in an effort to get students to reach a perfect muscular form women's college Vasser began keeping posture and other physical records in 1884 the ideal test was for a patient to stand naked in front of a mirror or have a nude photo taken and for an expert to comment and offer remedies for poor posture
  
  Posture tests in Harvard (1880s) and Vasser (1884)
2. peter_murray 18 Jan 2024
  
  in Public
  
  Lord Chesterfield's widely published 1775 book lord Chesterfield's advice to his son on men and manners recommended new standards advising against odd motions strange postures and ental carriage
  
  Lord Chesterfield's advice to his son, on men and manners; in ... - Full View | HathiTrust Digital Library
3. peter_murray 18 Jan 2024
  
  in Public
  
  Dec 29, 2023
  
  Abstract
  
  Most of us can probably say we struggle with posture, but for a long period after the turn of the twentieth century an American obsession with posture led to dramatic efforts to make students “straighten up."
  
  higher education
Visit annotations in context

Tags

higher education

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240118T174404-njE7y-oOHi4-The_"Great_Ivy_League_Nude_Posture_Photos_Scandal"/index.html
nap.nationalacademies.org nap.nationalacademies.org

Read "Facial Recognition Technology: Current Capabilities, Future Prospects, and Governance" at NAP.edu

1
1. peter_murray 17 Jan 2024
  
  in Public
  
  Facial Recognition Technology: Current Capabilities, Future Prospects, and Governance (2024) 120 pages | 8.5 x 11 | PAPERBACK ISBN 978-0-309-71320-7 | DOI 10.17226/27397
  
  http://nap.nationalacademies.org/27397
  
  facial recognition
Visit annotations in context

Tags

facial recognition

Annotators

peter_murray

URL

nap.nationalacademies.org/read/27397/chapter/2
www.wired.com www.wired.com

In Defense of AI Hallucinations

1
1. peter_murray 15 Jan 2024
  
  in Public
  
  Santosh Vempala, a computer science professor at Georgia Tech, has also studied hallucinations. “A language model is just a probabilistic model of the world,” he says, not a truthful mirror of reality. Vempala explains that an LLM’s answer strives for a general calibration with the real world—as represented in its training data—which is “a weak version of accuracy.” His research, published with OpenAI’s Adam Kalai, found that hallucinations are unavoidable for facts that can’t be verified using the information in a model’s training data.
  
  “A language model is just a probabilistic model of the world”
  
  Hallucinations are a result of an imperfect model, or attempting answers without the necessary data in the model.
  
  LLM research
Visit annotations in context

Tags

LLM research

Annotators

peter_murray

URL

wired.com/story/plaintext-in-defense-of-ai-hallucinations-chatgpt/
spectrum.ieee.org spectrum.ieee.org

Generative AI Has a Visual Plagiarism Problem

7
1. peter_murray 15 Jan 2024
  
  in Public
  
  As with Midjourney, DALL-E 3 was capable of creating plagiaristic (near-identical) representations of trademarked characters, even when those characters were not mentioned by name.DALL-E 3 also created a whole universe of potential trademark infringements with this single two-word prompt: “animated toys” [bottom right].
  
  DALL-E 3 produced the same kinds of plagiaristic output
2. peter_murray 15 Jan 2024
  
  in Public
  
  Put slightly differently, if this speculation is correct, the very pressure that drives generative AI companies to gather more and more data and make their models larger and larger (in order to make the outputs more humanlike) may also be making the models more plagiaristic.
  
  Does the amount of training data affect the likelihood of plagiaristic output?
3. peter_murray 15 Jan 2024
  
  in Public
  
  Moreover, Midjourney apparently sought to suppress our findings, banning Southen from its service (without even a refund of his subscription fee) after he reported his first results, and again after he created a new account from which additional results were reported. It then apparently changed its terms of service just before Christmas by inserting new language: “You may not use the Service to try to violate the intellectual property rights of others, including copyright, patent, or trademark rights. Doing so may subject you to penalties including legal action or a permanent ban from the Service.” This change might be interpreted as discouraging or even precluding the important and common practice of red-team investigations of the limits of generative AI—a practice that several major AI companies committed to as part of agreements with the White House announced in 2023. (Southen created two additional accounts in order to complete this project; these, too, were banned, with subscription fees not returned.)
  
  Midjourney bans researchers and changes terms of service
  
  LLM red team
4. peter_murray 15 Jan 2024
  
  in Public
  
  One user on X pointed to the fact that Japan has allowed AI companies to train on copyright materials. While this observation is true, it is incomplete and oversimplified, as that training is constrained by limitations on unauthorized use drawn directly from relevant international law (including the Berne Convention and TRIPS agreement). In any event, the Japanese stance seems unlikely to be carry any weight in American courts.
  
  Specifics in Japan for training LLMs on copyrighted material
  
  LLM copyright
5. peter_murray 15 Jan 2024
  
  in Public
  
  Such examples are particularly compelling because they raise the possibility that an end user might inadvertently produce infringing materials. We then asked whether a similar thing might happen in the visual domain.
  
  Can a user inadvertently produce infringing material
  
  Presumably, the LLM has been trained on copyrighted material because it is producing these textual (New York Times) and visual (motion pictures) plagiaristic outputs.
6. peter_murray 14 Jan 2024
  
  in Public
  
  After a bit of experimentation (and in a discovery that led us to collaborate), Southen found that it was in fact easy to generate many plagiaristic outputs, with brief prompts related to commercial films (prompts are shown).
  
  Plagiaristic outputs from blockbuster films in Midjourney v6
  
  Was the LLM trained on copyrighted material?
  
  AI art LLM copyright
7. peter_murray 14 Jan 2024
  
  in Public
  
  We will call such near-verbatim outputs “plagiaristic outputs,” because if a human created them we would call them prima facie instances of plagiarism.
  
  Defining “plagiaristic outputs”
  
  LLM research
Visit annotations in context

Tags

LLM red team

LLM research

LLM copyright

AI art

Annotators

peter_murray

URL

spectrum.ieee.org/midjourney-copyright
cacm.acm.org cacm.acm.org

The Dilemma of Scale

1
1. peter_murray 14 Jan 2024
  
  in Public
  
  Newspaper and magazine publishers could curate their content, as could the limited number of television and radio broadcasters. As cable television advanced, there were many more channels available to specialize and reach smaller audiences. The Internet and WWW exploded the information source space by orders of magnitude. For example, platforms such as YouTube receive hundreds of hours of video per minute. Tweets and Facebook updates must number in the hundreds of millions if not billions per day. Traditional media runs out of time (radio and television) or space (print media), but the Internet and WWW run out of neither. I hope that a thirst for verifiable or trustable facts will become a fashionable norm and part of the soluti
  
  Broadcast/Print are limited by time and space; is digital infinite?
  
  broadcast media print media publishing
Visit annotations in context

Tags

publishing

print media

broadcast media

Annotators

peter_murray

URL

cacm.acm.org/magazines/2023/10/276618-the-dilemma-of-scale/fulltext
media.dltj.org media.dltj.org

"Do we see 10,000 adverts per day?" uncorrected transcript (BBC More or Less)

3
1. peter_murray 14 Jan 2024
  
  in Public
  
  So, Sam decided, why not count the adverts he himself saw. It's just one number and applies only to the editor at a marketing magazine living in London on one arbitrary day. And what I saw were 93 ads I tried to be as open as I could about the fact that it's likely that I didn't notice every ad I could have done, but equally, I didn't miss that many, I don't think Sam also persuaded other people in the industry to do their own count. The most I've seen is 100 and 54. And I think I was quite generous. The lowest I've seen is 26. The most interesting version of the experiment was that I tasked someone to see as many as he could in a day and he got to 512 what a way to spend the day. And you will have noticed it's nowhere close to 10,000 ads.
  
  One person counted 93 per day
2. peter_murray 14 Jan 2024
  
  in Public
  
  Sam Anderson didn't marketing gurus have been making claims about advertising numbers for a very long time and Sam followed the trail all the way back to the 19 sixties. The very start of it was this piece of research by a man called Edwin Abel, who was a marketer for General Foods. Edwin Abel wanted to do a rough calculation on how many adverts people saw. He looked at how many hours of TV and radio people watched or listened to every day and worked out the average number of ads per hour on those mediums. So he multiplied those two numbers together to come up with this number. And this is our 1500 this 1500 ads a day number is still kicking around today. Often as the lowest of the big numbers in the blogosphere. And it is potentially a kind of legitimate calculation for the number of ads seen or heard, albeit from a quite different time in history. But there's some fine print to consider that is a number for a family of four. So if you divided that between a family of four, you'd actually be looking at something like 375 ads in this estimation.
  
  Research from the 1960s suggests 375 ads/day
3. peter_murray 14 Jan 2024
  
  in Public
  
  One of the numbers we quoted was a high estimate at the time, which was 5000 ads per day and that's the number that got latched on to. So this 5000 number was not for the number of adverts that a consumer actually sees and registers each day, but rather for the clutter as Walker calls it, these large numbers are not counts of the number of ads that people pay attention to.
  
  Number of impressions of advertising
  
  This advertising impressions may roll over us, but we don't actually see and register them.
  
  advertising
Visit annotations in context

Tags

advertising

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20240114T074101-BBC_More_or_Less--Do_we_see_10-000_adverts_per_day-/index.html
www.wnycstudios.org www.wnycstudios.org

The Real Mission Behind Moms for Liberty | On the Media | WNYC Studios

6
1. peter_murray 09 Jan 2024
  
  in Public
  
  One of the arguments that they make is that the original parents' rights crusade in the US was actually in opposition to the effort to ban child labor. The groups really driving that opposition, like the National Association of Manufacturers, these conservative industry groups, what they were opposed to was an effort to muck up what they saw as the natural state of affairs that would be inequality.
  
  Earlier, "parents rights" promoted by National Association of Manufacturers
  
  The "conservative industry group" didn't want to have an enforced public education mandate take away from the inequitable "natural state of affairs".
2. peter_murray 09 Jan 2024
  
  in Public
  
  Well, there are some people who have never liked the idea of public education because it's the most socialist thing that we do in this country. We tax ourselves to pay for it and everybody gets to access it. That's not a very American thing to do. Then you have conservative religious activists. They see a real opening, thanks to a whole string of Supreme Court cases to use public dollars to fund religious education. Then you have people who don't believe in public education for other reasons. Education is the single largest budget item in most states. If your goal is to cut taxes way back, if your goal is to give a handout to the wealthiest people in your state, spending less on education is going to be an absolute requirement. The same states that are enacting these sweeping school voucher programs, if you look at states like Iowa and Arkansas, they've ushered in huge tax cuts for their wealthiest residents. That means that within the next few years, there will no longer be enough funds available to fund their public schools, even at a time when they have effectively picked up the tab for affluent residents of the state who already send their kids to private schools.
  
  Public education is the most socialist program in the U.S.
  
  Defunding it is seen as a positive among conservative/libertarian wings. Add tax cuts for the wealthy to sway the government more towards affluent people.
3. peter_murray 09 Jan 2024
  
  in Public
  
  As unregulated as these programs are, that minimal data is something we have access to. There's been great coverage of this, including a recent story in The Wall Street Journal by an education reporter named Matt Barnum, that what we are seeing in state after state is that in the early phases of these new programs, that the parents who are most likely to take advantage of them are not the parents of low-income and minority kids in the public schools despite that being the big sales pitch, that instead they are affluent parents whose kids already attended private school. When lawmakers are making the case for these programs, they are making the Moms for Liberty arguments.
  
  Benefits of state-based school choice programs are going to affluent parents
  
  The kinds are already going to private schools; the money isn't going to low-income parents. As a result, private schools are emboldened to raise tuition.
  
  private education
4. peter_murray 09 Jan 2024
  
  in Public
  
  The Heritage Foundation has been an early and very loud backer of Moms for Liberty. There, I think it's really instructive to see that they are the leader of the project 2025 that's laying out the agenda for a next Trump administration. You can look at their education platform, it is not about taking back school boards. It's about dismantling public education entirely.
  
  Heritage Foundation backing this effort as part of a goal to "dismantle" public education
5. peter_murray 09 Jan 2024
  
  in Public
  
  I would point you to something like recent Gallup polling. We know, it's no secret that American trust and institutions has plummeted across the board, but something like only 26% of Americans say that they have faith in public schools. Among Republicans, it's even lower, it's 14%. Groups like Moms for Liberty have played a huge part in exacerbating the erosion of that trust.
  
  Gallup polling shows drop in confidence in public schools, especially among Republicans
  
  Losses by Moms-for-Liberty candidates reinforce the notion that there is partisanship in public education.
  
  public education
6. peter_murray 09 Jan 2024
  
  in Public
  
  Annotations are on the Transcript tab of this web page
  
  Abstract
  
  Last month, it seemed like Moms for Liberty, the infamous political group behind the recent push for book bans in schools across the country, might be on the wane. In November, a series of Moms for Liberty endorsed candidates lost school board elections, and in local district elections, the group took hit after hit. In Iowa, 12 of 13 candidates backed by the Moms were voted out, and in Pennsylvania, Democrats won against at least 11 of their candidates. But recently, Moms for Liberty co-founder Tiffany Justice claimed in an interview, "we're just getting started," boasting about the group's plans to ramp up efforts in 2024.
  
  book banning
Visit annotations in context

Tags

book banning

public education

private education

Annotators

peter_murray

URL

wnycstudios.org/podcasts/otm/segments/real-mission-behind-moms-liberty-on-the-media
media.dltj.org media.dltj.org

Video: Email vs Capitalism, or, Why We Can't Have Nice Things by Dylan Beattie at NDC Oslo 2023, annotated

9
1. peter_murray 06 Jan 2024
  
  in Public
  
  mail trap
  
  Mailtrap email testing
  
  https://mailtrap.io/email-sandbox/
2. peter_murray 06 Jan 2024
  
  in Public
  
  Papercut is a Windows application that just sits in the corner of your screen and your system notification tray and it's an SMTP server that doesn't send email every email that you send it intercepts
  
  Papercut: Windows-based SMTP server/interceptor
  
  ChangemakerStudios/Papercut-SMTP: Papercut SMTP -- The Simple Desktop Email Server
3. peter_murray 06 Jan 2024
  
  in Public
  
  mailjet markup language
  
  Origins of Mailjet Markup Language for richly formatted emails
  
  MJML → CSHTML (Razor) → HTML
4. peter_murray 06 Jan 2024
  
  in Public
  
  John Gilmore John is a very interesting person he's one of those people who I agree with everything he does right up to the point where I think he turns into a bit of a dick and then he kind of stops just past that point John was employee number five at Sun Microsystems he was one of the founders of the Electronic Frontier Foundation he is a uh I've seen him described as an extreme libertarian cipherpunk activist and uh the most famous quote I've seen from John is this one the net interprets censorship as damage and Roots around it if you start blocking ports because you don't like what people are doing the internet is designed to find another way rounded and you know he has taken this philosophy to an extreme he runs an open mail relay if you go to hop.toad.com it will accept email from anyone on the planet on Port 25 and it will deliver it doesn't care who you are doesn't care where you came from which is kind of the libertarian ethos in a nutshell
  
  John Gilmore's open SMTP relay
5. peter_murray 06 Jan 2024
  
  in Public
  
  I went through to see how many email addresses I could register
  
  Attempt to register quirky usernames at major email providers
  
  The RFC allows for "strange" usernames, but some mail providers are more restrictive.
6. peter_murray 06 Jan 2024
  
  in Public
  
  in 1978 Gary Turk was working for the digital Equipment Corporation he was a sales rep and his job was to sell these the deck system 20. now this thing had built-in arpanet protocol support it was like you don't have to do anything special you could plug it into a network and it would just work and rightly or wrongly Gary thought well I reckon people who are on the upper net might be interested in knowing about this computer and digital didn't have a whole lot of sales going on on the US West Coast they had a big office on the East Coast but West Coast you know California Portland those kind of places they didn't really have much of a presence so he got his assistant to go through the arpanet directory and type in the email addresses of everybody on the American West Coast who had an email address 393 of them and put them now at this point they overflowed the header field so all the people who got this email got an email which started with about 250 other people's email addresses and then right down at the end of it it says hey we invite you to come and see the deck system 2020
  
  Gary Turk "invented" spam in 1978
  
  internet history
7. peter_murray 06 Jan 2024
  
  in Public
  
  one person whose Innovation is still a significant part of the way we work with it was this guy it's Ray Tomlinson and he was working on an opponent Mail system in 1971 and Rey invented at Rey is the person who went well hang on if we know the user's name and we know the arpanet host where they host their email we could put an at in the middle because it's Alice at the machine
  
  Ray Tomlinson invented the use of @ in 1971
  
  internet history
8. peter_murray 06 Jan 2024
  
  in Public
  
  this is the SMTP specification latest version October 2008 about how we should try to deliver email and what it says is if it doesn't work first time you should wait at least 30 minutes and then you should keep trying for four or five days before you finally decide that the email didn't work
  
  Email was not intended to be instantly delivered
  
  The standards say wait at least 30 minutes if initial delivery failed, then try again for 4-5 days.
9. peter_murray 06 Jan 2024
  
  in Public
  
  June 23, 2023
  
  Abstract
  
  We're not quite sure exactly when email was invented. Sometime around 1971. We do know exactly when spam was invented: May 3rd, 1978, when Gary Thuerk emailed 400 people an advertisement for DEC computers. It made a lot of people very angry... but it also sold a few computers, and so junk email was born.
  
  Fast forward half a century, and the relationship between email and commerce has never been more complicated. In one sense, the utopian ideal of free, decentralised, electronic communication has come true. Email is the ultimate cross-network, cross-platform communication protocol. In another sense, it's an arms race: mail providers and ISPs implement ever more stringent checks and policies to prevent junk mail, and if that means the occasional important message gets sent to junk by mistake, then hey, no big deal - until you're sending out event tickets and discover that every company who uses Mimecast has decided your mail relay is sending junk. Marketing teams want beautiful, colourful, responsive emails, but their customers' mail clients are still using a subset of HTML 3.2 that doesn't even support CSS rules. And let's not even get started on how you design an email when half your readers will be using "dark mode" so everything ends up on a black background.
  
  Email is too big to change, too broken to fix... and too important to ignore. So let's look at what we need to know to get it right. We'll learn about DNS, about MX and DKIM and SPF records. We'll learn about how MIME actually works (and what happens when it doesn't). We'll learn about tools like Papercut, Mailtrap, Mailjet, Foundation, and how to incorporate them into your development process. If you're lucky, you'll even learn about UTF-7, the most cursed encoding in the history of information systems. Modern email is hacks top of hacks on top of hacks... but, hey, it's also how you got your ticket to be here today, so why not come along and find out how it actually works?
  
  email network standards
Visit annotations in context

Tags

internet history

email

network standards

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240105T102744-mrGfahzt-4Q-Email_vs_Capitalism-_or-_Why_We_Can't_Have_Nice_Things/index.html
media.dltj.org media.dltj.org

Video: A Brief But Spectacular take on the future of the internet, from Vint Cerf by PBS NewsHour, annotated

3
1. peter_murray 04 Jan 2024
  
  in Public
  
  Jan 3, 2024
  
  Vint Cerf is known for his pioneering work as one of the fathers of the internet. He now serves as the vice president and chief internet evangelist for Google where he furthers global policy development and accessibility of the internet. He shares his Brief But Spectacular take on the future of the internet.
  
  This must have been cut up from a much longer piece. The way this has been edited together makes for a short video, but the concepts are all over the place.
  
  Vint Cerf
2. peter_murray 04 Jan 2024
  
  in Public
  
  About two-thirds of the world's population have access to it. We have to understand how to make all of these applications literally accessible to everyone.
  
  Universally accessible
  
  accessible technology
3. peter_murray 04 Jan 2024
  
  in Public
  
  The good part is that voices that might not have been heard can be heard in the Internet environment. The not-so-good thing is that some voices that you don't want to hear are also amplified, including truths and untruths. So we're being asked in some sense to pay for the powerful tool that we have available by using our brains to think critically about the content that we see.
  
  Combating disinformation with literacy
  
  For better or for worse—whether the driver was an idealistic view of the world or the effect of an experimental network that got bigger than anyone could imagine—the internet is a permissive technology. The intelligence of the network is built into the edges...the center core routes packets of data without understanding the contents. I think Cerf is arguing here that the evaluation of content is something best done at the edge, too...in the minds of the internet participants.
  
  information literacy
Visit annotations in context

Tags

Vint Cerf

information literacy

accessible technology

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240104T085500-F45bQrV1ix0-A_Brief_But_Spectacular_take_on_the_future_of_the_internet-_from_Vint_Cerf/index.html
Dec 2023
www.semafor.com www.semafor.com

Meta's CTO on how the generative AI craze has spurred the company to ‘change it up’ | Semafor

1
1. peter_murray 25 Dec 2023
  
  in Public
  
  Texas has a law called CUBI and Illinois has BIPA. They prevent me from even doing the scan on somebody to determine if they’re in the set. I think these are bad laws. They prevent a very useful, reasonable, completely sensible thing.The thing that people are worried about, I don’t think anyone is building. No one has been trying to build this ‘who are these random people?’
  
  Meta’s CTO doesn’t know about Clearview AI
  
  There are companies that are trying to build systems to recognize random faces.
  
  facial recognition
Visit annotations in context

Tags

facial recognition

Annotators

peter_murray

URL

semafor.com/article/12/20/2023/meta-cto-andrew-bosworth-on-the-generative-ai-craze
ital.corejournals.org ital.corejournals.org

View of Towards an Open Source-first Praxis in Libraries

1
1. peter_murray 20 Dec 2023
  
  in Public
  
  Open Source kinship with Librarianship
  
  The open-source movement, while sharing some of the same civic ideals as librarianship, is not as motivationally coherent. Some corners of the movement are motivated by industrial or market concerns. Therefore, as open source emerges as a common option for many libraries, it is in the interests of the profession to establish, early on, the terms on which it will critically engage with open source.
  
  The synergy between open source and librarianship seem natural, but this author points to a different motivation.
Visit annotations in context

Annotators

peter_murray

URL

ital.corejournals.org/index.php/ital/article/view/16025/11883
99percentinvisible.org 99percentinvisible.org

You Ain't Nothin But a Postmark - 99% Invisible

2
1. peter_murray 11 Dec 2023
  
  in Public
  
  When the U.S. implemented standardized postage rates, the cost to mail letters became more predictable and less expensive. Now a person could pay to send a letter in advance and the recipient could get it without a fee. A stamp was proof that a sender had paid for the mail.
  
  Postage stamps as a government document, proof of postalized rate paid
2. peter_murray 11 Dec 2023
  
  in Public
  
  In the early years of the post office, mail was often sent cash on delivery, like a collect call. If someone sent you a letter, you paid for it when you picked it up at the post office. But postage was expensive and the system for calculating fees was complicated. It was hard to know what a letter would end up costing in the end.
  
  Early mail was delivered C.O.D.
  
  postal service
Visit annotations in context

Tags

postal service

Annotators

peter_murray

URL

99percentinvisible.org/episode/you-aint-nothin-but-a-postmark/
Nov 2023
thehill.com thehill.com

Science is littered with zombie studies. Here’s how to stop their spread.

1
1. peter_murray 26 Nov 2023
  
  in Public
  
  Tracking down citations to retracted publications has gotten easier.
  
  NISO’s CREC project should make it even easier by automating some of the notification process: https://www.niso.org/standards-committees/crec
  
  scholarly communication
Visit annotations in context

Tags

scholarly communication

Annotators

peter_murray

URL

thehill.com/opinion/education/4326865-science-is-littered-with-zombie-studies-heres-how-to-stop-their-spread/
arstechnica.com arstechnica.com

Data broker’s “staggering” sale of sensitive info exposed in unsealed FTC filing

1
1. peter_murray 23 Nov 2023
  
  in Public
  
  The FTC has accused Kochava of violating the FTC Act by amassing and disclosing "a staggering amount of sensitive and identifying information about consumers," alleging that Kochava's database includes products seemingly capable of identifying nearly every person in the United States. According to the FTC, Kochava's customers, ostensibly advertisers, can access this data to trace individuals' movements—including to sensitive locations like hospitals, temporary shelters, and places of worship, with a promised accuracy within "a few meters"—over a day, a week, a month, or a year. Kochava's products can also provide a "360-degree perspective" on individuals, unveiling personally identifying information like their names, home addresses, phone numbers, as well as sensitive information like their race, gender, ethnicity, annual income, political affiliations, or religion, the FTC alleged.
  
  “Capable of identifying nearly every person in the U.S.”
  
  So you have nothing to hide?
  
  surveillance capitalism
Visit annotations in context

Tags

surveillance capitalism

Annotators

peter_murray

URL

arstechnica.com/tech-policy/2023/11/data-brokers-staggering-sale-of-sensitive-info-exposed-in-unsealed-ftc-filing/
media.dltj.org media.dltj.org

"Geoffrey Hinton: “It’s Far Too Late” to Stop Artificial Intelligence" uncorrected transcript (The New Yorker Radio Hour)

1
1. peter_murray 20 Nov 2023
  
  in Public
  
  One of the ways that, that chat G BT is very powerful is that uh if you're sufficiently educated about computers and you want to make a computer program and you can instruct uh chat G BT in what you want with enough specificity, it can write the code for you. It doesn't mean that every coder is going to be replaced by Chad GP T, but it means that a competent coder uh with an imagination can accomplish a lot more than she used to be able to, uh maybe she could do the work of five coders. Um So there's a dynamic where people who can master the technology can get a lot more done.
  
  ChatGPT augments, not replaces
  
  You have to know what you want to do before you can provide the prompt for the code generation.
  
  LLMs for programming
Visit annotations in context

Tags

LLMs for programming

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20231120T143812-The_New_Yorker_Radio_Hour--Geoffrey_Hinton-_Its_Far_Too_Late_to_Stop_Artificial_Intelligence/index.html
baekdal.com baekdal.com

How independent publishing has changed from the 1990s until today

4
1. peter_murray 19 Nov 2023
  
  in Public
  
  Up until this point, every publisher had focused on 'traffic at scale', but with the new direct funding focus, every individual publisher realized that traffic does not equal money, and you could actually make more money by having an audience who paid you directly, rather than having a bunch of random clicks for the sake of advertising.The ratio was something like 1:10,000. Meaning that for every one person you could convince to subscribe, donate, become a member, or support you on Patreon ... you would need 10,000 visitors to make the same amount from advertising. Or to put that into perspective, with only 100 subscribers, I could make the same amount of money as I used to earn from having one million visitors.
  
  Direct subscription to independent publishers can beat advertising revenue
2. peter_murray 19 Nov 2023
  
  in Public
  
  Again, this promise that personalized advertising would generate better results was just not happening. Every year, the ad performance dropped, and the amount of scale needed to make up for that decline was just out of reach for anyone doing any form of niche.So, yes, the level of traffic was up by a lot, but that still didn't mean we could make more money as smaller publishers. Third party display advertising has just never worked for smaller niche publishers.
  
  Third-party personal ads drove traffic to sites, but not income
3. peter_murray 19 Nov 2023
  
  in Public
  
  We have the difference between amateurs just publishing as a hobby, or professionals publishing as a business. And on the other vector we have whether you are publishing for yourself, or whether you are publishing for others.And these differences create very different focuses. For instance, someone who publishes professionally, but 'for themselves' is a brand. That's what defines a brand magazine.Meanwhile, independent publishers are generally professionals (or trying to be), who are producing a publication for others. In fact, in terms of focus, there is no difference between being an independent publisher and a regular traditional publisher. It's exactly the same focus, just at two very different sizes.Bloggers, however, were mostly amateurs, who posted about things as hobbies, often for their own sake, which is not publishing in the business sense.Finally, we have the teachers. This is the group of people who are not trying to run a publishing business, but who are publishing for the sake of helping others.
  
  Publishing: profession versus amateur and for-you versus for-others
  
  I think I aim DLTJ mostly for the amateur/for-others quadrant
  
  publishing
4. peter_murray 19 Nov 2023
  
  in Public
  
  There was no automatic advertising delivery. There was no personalization, or any kind of tracking. Instead, I go through all of this every morning, picking which ads I thought looked interesting today, and manually changing and updating the pages on my site.This also meant that, because there was no tracking, the advertising companies had no idea how many times an ad was viewed, and as such, we would only get paid per click.Now, the bigger sites had started to do dynamic advertising, which allowed them to sell advertising per view, but, as an independent publisher, I was limited to only click-based advertising.However, that was actually a good thing. Because I had to pick the ads manually, I needed to be very good at understanding my audience and what they needed when they visited my site. And so there was a link between audience focus and the advertising.Also, because it was click based, it forced me as an independent publisher to optimize for results, whereas a 'per view' model often encouraged publishers to lower their value to create more ad views.
  
  Per-click versus per-view advertising in the 1900s internet
  
  digital advertising internet history
Visit annotations in context

Tags

publishing

digital advertising

internet history

Annotators

peter_murray

URL

baekdal.com/newsletter/how-independent-publishing-has-changed-from-the-1990s-until-today/
themarkup.org themarkup.org

‘Unmasking AI’ and the Fight for Algorithmic Justice – The Markup

5
1. peter_murray 19 Nov 2023
  
  in Public
  
  How will people build professional callouses if the early work that may be viewed as mundane essentials are taken over by AI systems? Do we risk living in the age of the last masters, the age of the last experts?
  
  Professional callouses
  
  This is a paragraph too far. There are many unnecessary "callouses" that have been removed from work, and we are better for it. Should we go back to the "computers" of the 1950s and 1960s...women whose jobs were to make mathematical calculations?
  
  As technology advances, there are actions that are "pushed down the complexity stack" of what is assumed to exist and can be counted on.
  
  human computers
2. peter_murray 19 Nov 2023
  
  in Public
  
  I am even more attuned to creative rights. We can address algorithms of exploitation by establishing creative rights that uphold the four C’s: consent, compensation, control, and credit. Artists should be paid fairly for their valuable content and control whether or how their work is used from the beginning, not as an afterthought.
  
  Consent, compensation, control, and credit for creators whose content is used in AI models
  
  LLM ethics
3. peter_murray 19 Nov 2023
  
  in Public
  
  Generative AI systems that allow for biometric clones can easily exploit our likeness through the creation of synthetic media that propagate deep fakes. We need biometric rights that protect our faces and voices from algorithms of exploitation.
  
  On the need for biometric rights to prevent activities like deep fakes
  
  deepfake
4. peter_murray 19 Nov 2023
  
  in Public
  
  The nightmares of AI discrimination and exploitation are the lived reality of those I call the excoded
  
  Defining 'excoded'
  
  automated decision systems
5. peter_murray 19 Nov 2023
  
  in Public
  
  AI raises the stakes because now that data is not only used to make decisions about you, but rather to make deeply powerful inferences about people and communities. That data is training models that can be deployed, mobilized through automated systems that affect our fundamental rights and our access to whether you get a mortgage, a job interview, or even how much you’re paid. Thinking individually is only part of the equation now; you really need to think in terms of collective harm. Do I want to give up this data and have it be used to make decisions about people like me—a woman, a mother, a person with particular political beliefs?
  
  Adding your data to AI models is a collective decision
  
  automated decision systems
Visit annotations in context

Tags

LLM ethics

deepfake

human computers

automated decision systems

Annotators

peter_murray

URL

themarkup.org/hello-world/2023/11/18/unmasking-ai-and-the-fight-for-algorithmic-justice
Oct 2023
media.dltj.org media.dltj.org

Video: Why does Aviation have SO Many UNITS of Measurement? by Mentour Now, annotated

3
1. peter_murray 17 Oct 2023
  
  in Public
  
  the name knot is a direct reference to actual knots tied on the rope. One end of the rope was then attached to a big spool which the rope was wound up on and the other end of the rope was attached to a type of triangular wooden board in a very specific way. The idea was actually pretty simple. The wooden board which was called the chip log was then thrown overboard from the back of the ship and one of the sides of the wooden ship had a lead weight to keep it vertical on the surface of the sea. As the ship then sped forward, the wood was supposed to stay mostly still on the surface of the water, with the rope quickly unwinding from the spool. On the rope, there were knots spaced out on regular intervals either every 47 feet and 3 inches or every 48 ft, depending on which source you use. The idea was that the sailors would use a small hourglass which measured either 28 seconds or 30 seconds, again, depending on the source and they would count how many knots went by in that time and the answer that they then derived from this exercise was the ship's speed, measured in literal knots
  
  Origin of "knot" as a unit of length
  
  Also "knot" is nautical miles per hour.
  
  standards
2. peter_murray 17 Oct 2023
  
  in Public
  
  one being that a nautical mile is the meridian arc length, corresponding to one minute of a degree of latitude. In other words, the full circumference of the earth is 360° and a nautical mile is one 60th of one degree at the equator but that's a historical definition. Today, a nautical mile is defined using the metric system like everything else and it is exactly 1,852 meters.
  
  Historic and current definitions of "nautical mile"
  
  standards
3. peter_murray 17 Oct 2023
  
  in Public
  
  the United States officially started defining its own units by using the metric system as a reference, way back in 1893 with something called the Mendenhall Order. And that means that, for example, today, the inch is not defined as the length of three barley corns which genuinely was its official definition for centuries, instead the inch is now officially defined as 25.4 millimeters and all other US customary units have similar metric definitions
  
  United States Customary Units defined with metric measurements
  
  Also referred to as the Traditional System of Weights and Measures
  
  standards
Visit annotations in context

Tags

standards

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20231017T162235-BBq-WwmQ8KM-Why_does_Aviation_have_SO_Many_UNITS_of_Measurement-/index.html
Aug 2023
hangingtogether.org hangingtogether.org

OCLC Control Numbers - Lots of them; all public domain - Hanging Together

2
1. peter_murray 04 Aug 2023
  
  in Public
  
  Some people thought that the Control Number represented a mechanism for identifying a record as having originated with OCLC and therefore subject to the cooperative’s record use policy.
  
  Control number as mechanism for identifying a WorldCat record
  
  Yet isn't this what the [[OCLCvClarivate]] lawsuit said?
2. peter_murray 04 Aug 2023
  
  in Public
  
  Recently we recommended that OCLC declare OCLC Control Numbers (OCN) as dedicated to the public domain. We wanted to make it clear to the community of users that they could share and use the number for any purpose and without any restrictions. Making that declaration would be consistent with our application of an open license for our own releases of data for re-use and would end the needless elimination of the number from bibliographic datasets that are at the foundation of the library and community interactions. I’m pleased to say that this recommendation got unanimous support and my colleague Richard Wallis spoke about this declaration during his linked data session during the recent IFLA conference. The declaration now appears on the WCRR web page and from the page describing OCNs and their use.
  
  OCLC Control Numbers are in the public domain
  
  An updated link for the "page describing OCNs and their use" says:
  
  The OCLC Control Number is a unique, sequentially assigned number associated with a record in WorldCat. The number is included in a WorldCat record when the record is created. The OCLC Control Number enables successful implementation and use of many OCLC products and services, including WorldCat Discovery and WorldCat Navigator. OCLC encourages the use of the OCLC Control Number in any appropriate library application, where it can be treated as if it is in the public domain.
  
  WorldCat record use policy bibliographic tools
Visit annotations in context

Tags

bibliographic tools

WorldCat record use policy

Annotators

peter_murray

URL

hangingtogether.org/oclc-control-numbers-lots-of-them-all-public-domain/
Jul 2023
www.rfc-editor.org www.rfc-editor.org

RFC 8783: Distributed Denial-of-Service Open Threat Signaling (DOTS) Data Channel Specification

2
1. peter_murray 26 Jul 2023
  
  in Public
  
  It's noteworthy that RFC 7258 doesn't consider that bad actors are limited to governments, and personally, I think many advertising industry schemes for collecting data are egregious examples of pervasive monitoring and hence ought also be considered an attack on the Internet that ought be mitigated where possible. However, the Internet technical community clearly hasn't acted in that way over the last decade.
  
  Advertising industry schemes considered an attack
  
  Stephen Farrell's perspective.
  
  surveillance capitalism
2. peter_murray 26 Jul 2023
  
  in Public
  
  Many have written about how being under constant surveillance changes a person. When you know you're being watched, you censor yourself. You become less open, less spontaneous. You look at what you write on your computer and dwell on what you've said on the telephone, wonder how it would sound taken out of context, from the perspective of a hypothetical observer. You're more likely to conform. You suppress your individuality. Even though I have worked in privacy for decades, and already knew a lot about the NSA and what it does, the change was palpable. That feeling hasn't faded. I am now more careful about what I say and write. I am less trusting of communications technology. I am less trusting of the computer industry.
  
  How constant surveillance changes a person
  
  Bruce Schneier's perspective.
  
  surveillance
Visit annotations in context

Tags

surveillance capitalism

surveillance

Annotators

peter_murray

URL

rfc-editor.org/rfc/rfc9457
arxiv.org arxiv.org

2306.04141.pdf

11
1. peter_murray 21 Jul 2023
  
  in Public
  
  A second, complementary, approach relies on post-hoc machine learning and forensic anal-ysis to passively identify statistical and physical artifacts left behind by media manipulation.For example, learning-based forensic analysis techniques use machine learning to automati-cally detect manipulated visual and auditory content (see e.g. [94]). However, these learning-based approaches have been shown to be vulnerable to adversarial attacks [95] and contextshift [96]. Artifact-based techniques exploit low-level pixel artifacts introduced during synthe-sis. But these techniques are vulnerable to counter-measures like recompression or additivenoise. Other approaches involve biometric features of an individual (e.g., the unique motionproduced by the ears in synchrony with speech [97]) or behavioral mannerisms [98]). Biomet-ric and behavioral approaches are robust to compression changes and do not rely on assump-tions about the moment of media capture, but they do not scale well. However, they may bevulnerable to future generative-AI systems that may adapt and synthesize individual biometricsignals.
  
  Examples of methods for detecting machine generated visual media
  
  LLM detection
2. peter_murray 21 Jul 2023
  
  in Public
  
  tabula rasa
  
  Latin for "scraped tablet" meaning "clean slate". Tabula rasa | Britannica
3. peter_murray 21 Jul 2023
  
  in Public
  
  he new tools have sparked employment concerns for creative occupations suchas composers, designers, and writers. This conflict arises because SBTC fails to differentiatebetween cognitive activities like analytical work and creative ideation. Recent research [82, 83]demonstrates the need to quantify the specific activities of various artistic workers before com-paring them to the actual capabilities of technology. A new framework is needed to characterize
  
  Generative AI straddles analytical work and creative ideation
  
  the specific steps of the creative process, precisely which and how those steps might be impacted by generative AI tools, and the resulting effects on workplace requirements and activities of varying cognitive occupations.
  
  Unlike previous automation tools (which took on repetitive processes), generative AI encroaches on some parts of the creative process.
  
  SBTC: Skill-Biased Technological Change framework
4. peter_murray 21 Jul 2023
  
  in Public
  
  First, under a highly permissive view, theuse of training data could be treated as non-infringing because protected works are not directlycopied. Second, the use of training data could be covered by a fair-use exception because atrained AI represents a significant transformation of the training data [63, 64, 65, 66, 67, 68].1Third, the use of training data could require an explicit license agreement with each creatorwhose work appears in the training dataset. A weaker version of this third proposal, is to atleast give artists the ability to opt-out of their data being used for generative AI [69]. Finally,a new statutory compulsory licensing scheme that allows artworks to be used as training databut requires the artist to be remunerated could be introduced to compensate artists and createcontinued incentives for human creation [70].
  
  For proposals for how copyright affects generative AI training data
  
  Consider training data a non-infringing use
  
  Fair use exception
  
  Require explicit license agreement with each creator (or an opt-out ability)
  
  Create a new "statutory compulsory licensing scheme"
  
  LLM copyright
5. peter_murray 21 Jul 2023
  
  in Public
  
  AI-generated content may also feed future generative models, creating a self-referentialaesthetic flywheel that could perpetuate AI-driven cultural norms. This flywheel may in turnreinforce generative AI’s aesthetics, as well as the biases these models exhibit.
  
  AI bias becomes self-reinforcing
  
  Does this point to a need for more diversity in AI companies? Different aesthetic/training choices leads to opportunities for more diverse output. To say nothing of identifying and segregating AI-generated output from being used i the training data of subsequent models.
  
  building LLMs
6. peter_murray 21 Jul 2023
  
  in Public
  
  In traditional artforms characterized by direct manipulation [32]of a material (e.g., painting, tattoo, or sculpture), the creator has a direct hand in creating thefinal output, and therefore it is relatively straightforward to identify the creator’s intentions andstyle in the output. Indeed, previous research has shown the relative importance of “intentionguessing” in the artistic viewing experience [33, 34], as well as the increased creative valueafforded to an artwork if elements of the human process (e.g., brushstrokes) are visible [35].However, generative techniques have strong aesthetics themselves [36]; for instance, it hasbecome apparent that certain generative tools are built to be as “realistic” as possible, resultingin a hyperrealistic aesthetic style. As these aesthetics propagate through visual culture, it can bedifficult for a casual viewer to identify the creator’s intention and individuality within the out-puts. Indeed, some creators have spoken about the challenges of getting generative AI modelsto produce images in new, different, or unique aesthetic styles [36, 37].
  
  Traditional artforms (direct manipulation) versus AI (tools have a built-in aesthetic)
  
  Some authors speak of having to wrestle control of the AI output from its trained style, making it challenging to create unique aesthetic styles. The artist indirectly influences the output by selecting training data and manipulating prompts.
  
  As use of the technology becomes more diverse—as consumer photography did over the last century, the authors point out—how will biases and decisions by the owners of the AI tools influence what creators are able to make?
  
  To a limited extent, this is already happening in photography. The smartphones are running algorithms on image sensor data to construct the picture. This is the source of controversy; see Why Dark and Light is Complicated in Photographs | Aaron Hertzmann’s blog and Putting Google Pixel's Real Tone to the test against other phone cameras - The Washington Post.
  
  AI art photography algorithm bias
7. peter_murray 21 Jul 2023
  
  in Public
  
  n order to be considered meaningful human control, a generative system shouldbe capable of incorporating a human author’s intent into its output. If a user starts with no spe-cific goal, the system should allow for open-ended, curiosity-driven exploration. As the user’sgoal becomes clearer through interaction, the system should be able to both guide and deliverthis intent. Such systems should have a degree of predictability, allowing users to graduallyunderstand the system to the extent that they can learn to anticipate the results of their actions.Given these conditions, we can consider the human user as accountable for the outputs of thegenerative system. In other words, MHC is achieved if human creators can creatively expressthemselves through the generative system, leading to an outcome that aligns with their inten-tions and carries their personal, expressive signature. Future work is needed to investigate inwhat ways generative systems and interfaces can be developed that allow more meaningful hu-man control by adding input streams that provide users fine-grained causal manipulation overoutputs.
  
  Meaningful Human Control of AI
  
  A concept originally from autonomous weapons, MHC is a design concept where the tool gradually adapts its output to the expectations of its users. The results are the a creative output that "aligns with [the users'] intentions and carries their personal, expressive signature."
8. peter_murray 21 Jul 2023
  
  in Public
  
  Anthropomorphizing AI can pose challenges to the ethicalusage of this technology [12]. In particular, perceptions of human-like agency can underminecredit to the creators whose labor underlies the system’s outputs [13] and deflect responsibil-ity from developers and decision-makers when these systems cause harm [14]. We, therefore,discuss generative AI as a tool to support human creators [15], rather than an agent capableof harboring its own intent or authorship. In this view, there is little room for autonomousmachines being “artists” or “creative” in their own right.
  
  Problems with anthropomorphizing AI
9. peter_murray 21 Jul 2023
  
  in Public
  
  Unlike past disruptions, however, generative AI relies on training data made by people
  
  Generative AI is different from past innovations
  
  The output of creators is directly input into the technology, which mages generative AI different. And creates questions that don't have parallels to past innovations
10. peter_murray 21 Jul 2023
  
  in Public
  
  Generative AI tools, at first glance, seem to fully automate artistic production—an impres-sion that mirrors past instances when traditionalists viewed new technologies as threatening“art itself.” In fact, these moments of technological change did not indicate the “end of art,” buthad much more complex effects, recasting the roles and practices of creators and shifting theaesthetics of contemporary media [3].
  
  Examples of how new technology displaced traditional artists
  
  photography versus painting: replacing portrait painters
  
  music production: digital sampling and sound synthesis
  
  computer animation and digital photography
11. peter_murray 21 Jul 2023
  
  in Public
  
  Epstein, Ziv, Hertzmann, Aaron, Herman, Laura, Mahari, Robert, Frank, Morgan R., Groh, Matthew, Schroeder, Hope et al. "Art and the science of generative AI: A deeper dive." ArXiv, (2023). Accessed July 21, 2023. https://doi.org/10.1126/science.adh4451.
  
  Abstract
  
  A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation. The generative capabilities of these tools are likely to fundamentally alter the creative processes by which creators formulate ideas and put them into production. As creativity is reimagined, so too may be many sectors of society. Understanding the impact of generative AI - and making policy decisions around it - requires new interdisciplinary scientific inquiry into culture, economics, law, algorithms, and the interaction of technology and creativity. We argue that generative AI is not the harbinger of art's demise, but rather is a new medium with its own distinct affordances. In this vein, we consider the impacts of this new medium on creators across four themes: aesthetics and culture, legal questions of ownership and credit, the future of creative work, and impacts on the contemporary media ecosystem. Across these themes, we highlight key research questions and directions to inform policy and beneficial uses of the technology.
  
  generative artificial intelligence
Visit annotations in context

Tags

building LLMs

generative artificial intelligence

LLM detection

LLM copyright

photography

algorithm bias

AI art

Annotators

peter_murray

URL

arxiv.org/pdf/2306.04141.pdf
newlaborforum.cuny.edu newlaborforum.cuny.edu

An Accident in History: The Rana Plaza collapse | | New Labor Forum

4
1. peter_murray 19 Jul 2023
  
  in Public
  
  The original law stipulated a fifty–fifty split of the regular minimum wage: employers paid a subminimum cash wage that was half of the regular minimum wage; the other half was provided via customers’ tips. Even today, many customers do not know that a tip intended as a gratuity is often a wage-subsidy provided to the employer.
  
  Effect of the 1966 Fair Labor Standards Act amendment
  
  Introduction of the sub-minimum wage, which was originally at 50% but is now lower.
2. peter_murray 19 Jul 2023
  
  in Public
  
  The Pullman Train Company, for instance, hired many formerly enslaved people and fought hard to keep paid wages very low. When investigated by the Railroad Commission of California in 1914, Pullman argued that they “paid adequate wages and did not expect their employees to exact tips”—an assertion unfounded by payroll data and strongly rejected by the commission.[3] Pullman, in fact, left it to the mostly white customers to determine the workers’ compensation through voluntary and unpredictable tips.
  
  Tipping of service workers in post-slavery America
  
  …allowing those served—whites—to determine the compensation of those serving—blacks.
3. peter_murray 19 Jul 2023
  
  in Public
  
  The practice of tipping is traced to the Middle Ages and the European feudal system, when masters would sporadically give pocket change to their servants. The practice outlasted the feudal era, becoming a habit between customers—often upper class—and service workers. It also spread more generally. The modern custom of tipping was imported to the United States in the nineteenth century by upper-class American travelers to Europe. At the same time, an influx of European workers more acquainted with the practice helped to establish and spread the practice of tipping in the United States.
  
  Origins of tipping
4. peter_murray 19 Jul 2023
  
  in Public
  
  Found via https://www.npr.org/2023/07/17/1187275511/tipping-minimum-wage-tips-tip-screen
  
  employment
Visit annotations in context

Tags

employment

Annotators

peter_murray

URL

newlaborforum.cuny.edu/2015/10/15/marxism-and-consumer-culture/
media.dltj.org media.dltj.org

Video: Win-win in the Hallertau: sow hops, harvest electricity by Der Spiegel, annotated

2
1. peter_murray 05 Jul 2023
  
  in Public
  
  which cost around 1.5 million euros
  
  Average German household electricity consumption: 3,113 kilowatt-hours (2018, source)
  
  Energy consumption for 250 households: 778,250 kilowatt-hours (or 778.25 mwh).
  
  Wholesale electricity price in Germany is 102.4 euros per megawatt-hour (2023, source)
  
  Yearly revenue: €79,692.80.
  
  Payback period on €1.5M: about 19 years, not including maintenance.
2. peter_murray 05 Jul 2023
  
  in Public
  
  4-Jul-2023 — Transcript is translated from German by YouTube.
  
  Description (translated from German):
  
  The problem of a Bavarian farmer: his hops are thirsty and the energy transition is progressing too slowly. The solution: a solar system that provides shade over the fields - and a sustainable second source of income.
  
  climate change
Visit annotations in context

Tags

climate change

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20230705T134203-XWe4GPw87NQ-Win-win_in_the_Hallertau-_sow_hops-_harvest_electricity/index.html
media.dltj.org media.dltj.org

Video: UMass Professor Explains the Internet in 5 Levels of Difficulty by Wired, annotated

5
1. peter_murray 03 Jul 2023
  
  in Public
  
  in that sense the internet is very decentralized that the control of the network is up to whoever owns the network the two percent where you said there's nobody in control there's a little bit of centralized control there's an organization called the internet Corporation for assigned names and numbers
  
  ICANN as the single centralized point on the internet
2. peter_murray 03 Jul 2023
  
  in Public
  
  if you think about sending letters through the US Postal Service how you've got an address on it so every packet that flows from the Netflix server to you has an address on it says this is going to Jenna it's going to the What's called the Internet Protocol address of your device think of all the the range of devices that are hooked up to the Internet it's totally amazing right every single one of them has one thing in common and that is they speak the IP protocol the Internet Protocol
  
  IP address networking like postal addresses
3. peter_murray 03 Jul 2023
  
  in Public
  
  the internet is a lot like that it's an interconnection of local roads local networks like the network in your house for example how does like all of the um networks in my house connect to all the city networks
  
  Internetworking as a network of roads
4. peter_murray 03 Jul 2023
  
  in Public
  
  Protocols are you up for one yeah knock knock who's there lettuce lettuce who let us go on a knock knock joke is an example of a protocol
  
  Explaining protocols as a knock-knock joke
5. peter_murray 03 Jul 2023
  
  in Public
  
  Nov 23, 2022
  
  The internet is the most technically complex system humanity has ever built. Jim Kurose, Professor at UMass Amherst, has been challenged to explain the internet to 5 different people; a child, a teen, a college student, a grad student, and an expert.
  
  internet infrastructure
Visit annotations in context

Tags

internet infrastructure

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20230703T191658-0EqKnvzo3no-UMass_Professor_Explains_the_Internet_in_5_Levels_of_Difficulty/index.html
www.theverge.com www.theverge.com

Who killed Google Reader?

5
1. peter_murray 01 Jul 2023
  
  in Public
  
  But Reader was also very much a product of Google’s infrastructure. Outside Google, there wouldn’t have been access to the company’s worldwide network of data centers, web crawlers, and excellent engineers. Reader existed and worked because of Google’s search stack, because of the work done by Blogger and Feedburner and others, and most of all, the work done by dozens of Google employees with 20 percent of their time to spare and some ideas about how to make Reader better. Sure, Google killed Reader. But nearly everyone I spoke to agreed that without Google, Reader could never have been as good as it was.
  
  Reader could not have existed without Google infrastructure and talent
2. peter_murray 01 Jul 2023
  
  in Public
  
  At its peak, Reader had just north of 30 million users, many of them using it every day. That’s a big number — by almost any scale other than Google’s. Google scale projects are about hundreds of millions and billions of users, and executives always seemed to regard Reader as a rounding error. Internally, lots of workers used and loved it, but the company’s leadership began to wonder whether Reader was ever going to hit Google scale. Almost nothing ever hits Google scale, which is why Google kills almost everything.
  
  Google Scale is needed for a product to survive
  
  Google
3. peter_murray 01 Jul 2023
  
  in Public
  
  One feature took off immediately, for power users and casual readers alike: a simple sharing system that let users subscribe to see someone else’s starred items or share their collection of subscriptions with other people. The Reader team eventually built comments, a Share With Note feature, and more.
  
  Simple social sharing made the product take off
  
  social networks
4. peter_murray 01 Jul 2023
  
  in Public
  
  It wasn’t until the team launched a redesign in 2006 that added infinite scrolling, unread counts, and some better management tools for heavy readers that Reader took off.
  
  Be prepared to throw out the first version
5. peter_murray 01 Jul 2023
  
  in Public
  
  he bristles thinking about the fight and the fact that Google Reader is known as “an RSS reader” and not the ultra-versatile information machine it could have become. Names matter, and Reader told everyone that it was for reading when it could have been for so much more
  
  Product names matter
  
  It sets the perception for the boundaries of what something can be.
  
  marketing
Visit annotations in context

Tags

Google

marketing

social networks

Annotators

peter_murray

URL

theverge.com/23778253/google-reader-death-2013-rss-social
Jun 2023
media.dltj.org media.dltj.org

Video: The Magnitude 9.1 Meltdown at Fukushima by Nickolas Means (RubyConf 2022), annotated

4
1. peter_murray 26 Jun 2023
  
  in Public
  
  think that the folks at the top of the org chart know more than we do and they often do have helpful holistic perspective about the company and the industry it operates in but all of the actual work of an organization all of its output happens at the bottom of the org chart and the teams at the edge of the organization leaders at the top may have a wide perspective but the edges are where an organization's detailed knowledge lives
  
  Knowledge of the organization lives at the edges, not at the top
2. peter_murray 26 Jun 2023
  
  in Public
  
  documenting the steps conversations tools and other activities required to complete a Core Business activity like processing an insurance claim or launching a new product these Maps were always surprising for the teams that created them and they often raised existential questions about employees roles in the organization the reason was that they revealed the actual structure of the organization the relationships the information Pathways that were responsible for the organization actually being able to get work done they learned the actual structure of the organization was an organic emergent phenomenon constantly shifting and changing based on the work to be done and often bearing little resemblance to the formal hierarchy of the organization
  
  Mapping organizational processes
  
  In the process of mapping these processes, the organization learns the real structure of the organization—beyond what is in the org chart.
3. peter_murray 26 Jun 2023
  
  in Public
  
  paper published by Dr Ruth Ann heising has some insight for us
  
  Ruthanne Huising (2019) Moving off the Map: How Knowledge of Organizational Operations Empowers and Alienates. Organization Science 30(5):1054-1075. https://doi.org/10.1287/orsc.2018.1277
4. peter_murray 26 Jun 2023
  
  in Public
  
  Description
  
  It was mid-afternoon on Friday, March 11, 2011 when the ground in Tōhoku began to shake. At Fukushima Daiichi nuclear power plant, it seemed like the shaking would never stop. Once it did, the reactors had automatically shut down, backup power had come online, and the operators were well on their way to having everything under control. And then the tsunami struck. They found themselves facing something beyond any worse-case scenario they imagined, and their response is a study in contrasts. We can learn a lot from the extremes they experienced about finding happiness and satisfaction at work.
  
  systemic thinking
Visit annotations in context

Tags

systemic thinking

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20230625T225546-sTxQ5x54JO0-The_Magnitude_9-1_Meltdown_at_Fukushima/index.html
media.dltj.org media.dltj.org

Video: Taking the 737 to the Max by Nickolas Means (RubyConf 2021), annotated

3
1. peter_murray 26 Jun 2023
  
  in Public
  
  In the intro to thinking and its, meadows says everyone on everything in a system can act dutifully and rationally, yet all of these well meaning actions often lead up to a perfectly terrible result. This is what happened here. Every actor in the system borrowed a little bit of safety to optimize something else
  
  Rational actions can lead to disastrous results
  
  "All of life is a system"
2. peter_murray 26 Jun 2023
  
  in Public
  
  n her fantastic book, thinking in systems, we are given great tools to pick apart the situation using systems thinking. Meadows introduces the stocks and flows.
  
  Stocks and Flows
  
  From Donella Meadows' Thinking in Systems. In this case, the "stock" is safety and "flows" are actions that make the reservoir of safety go up (more rigorous review of the aircraft design) or down (increased focus on the bottom line over engineering concerns).
  
  systemic thinking
3. peter_murray 26 Jun 2023
  
  in Public
  
  one of the first I learned to spot was the Boeing 737. It's the best selling commercial aircraft of all time, it's everywhere. Once you know the trick, it's incredibly easy to identify in the air. It has no doors over the landing gear, if it's too big to be a regional, it's a 737. You can see how the gears swing out from the wheel shaped cavities in the center of the fuselage.
  
  Boeing 737s don't have doors for the main landing gear
  
  commercial aviation
Visit annotations in context

Tags

systemic thinking

commercial aviation

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20230625T222349-Taking_the_737_to_the_Max/index.html
deliverypdf.ssrn.com deliverypdf.ssrn.com

SSRN-id1998119.pdf

2
1. peter_murray 18 Jun 2023
  
  in Public
  
  On the advice of my lawyer, I respectfullydecline to answer on the basis of the FifthAmendment, which—according to theUnited States Supreme Court—protects ev-eryone, even innocent people, from the needto answer questions if the truth might beused to help create the misleading impres-sion that they were somehow involved in acrime that they did not commit.
  
  Suggested wording
  
  Exercise a Fifth Amendment right without using the word "incriminating"
2. peter_murray 18 Jun 2023
  
  in Public
  
  Citation
  
  Duane, James, The Right to Remain Silent: A New Answer to an Old Question (February 2, 2012). Criminal Justice, Vol. 25, No. 2, 2010, Available at SSRN: https://ssrn.com/abstract=1998119
  
  fifth amendment law enforcement
Visit annotations in context

Tags

law enforcement

fifth amendment

Annotators

peter_murray

URL

deliverypdf.ssrn.com/delivery.php
media.dltj.org media.dltj.org

"NISO Unfettered Access: A Conversation Series - Featuring Marshall Breeding" uncorrected transcript (NISO Media Library)

3
1. peter_murray 09 Jun 2023
  
  in Public
  
  to what extent is the vendor community adopting bib frame in their systems architectures
  
  On BIBFRAME adoption
  
  Library of Congress and ShareVDE. The big guys are doing it, but will there become a split between the big libraries and small libraries over BIBFRAME? Will the diversity of metadata models inhibit interoperability?
  
  BIBFRAME
2. peter_murray 09 Jun 2023
  
  in Public
  
  10% more or less of academic libraries in the US use an open source system after all that time. And about either 17 or 14, I'd have the number in front of me for and to public libraries are using an open source I L S
  
  Percentage of open source ILS in academic and public libraries
  
  open source software
3. peter_murray 09 Jun 2023
  
  in Public
  
  The industry has changed from being fragmented to consolidated
  
  On the consolidation of the library automation field
  
  library automation
Visit annotations in context

Tags

BIBFRAME

library automation

open source software

Annotators

peter_murray

URL

media.dltj.org/unchecked-transcript/20230609T153516-NISO_Media_Library--NISO_Unfettered_Access-_A_Conversation_Series_-_Featuring_Marshall_Breeding/index.html
crsreports.congress.gov crsreports.congress.gov

R47583

4
1. peter_murray 02 Jun 2023
  
  in Public
  
  Such global and regional climate statementsdiffer from attributing specific extremeweather events to specific human influences,which scientists once considered infeasiblewith then-existing data and methods.18 Thischanged with the publication of an article in2003 proposing a method of establishing legalliability for climate change by determining how much human influence had changed theprobability of an undesirable event, such as flooding.
  
  Extreme Event Attribution begins in 2003
  
  Allen, Myles. "Liability for climate change." Nature 421, no. 6926 (2003): 891-892. Accessed June 2, 2023. https://doi.org/10.1038/421891a
2. peter_murray 02 Jun 2023
  
  in Public
  
  The use of probability and risk interchangeably can causeconfusion. For example, two common methods toestimate the probability of occurrence of a naturalhazard event include the term risk in their names: theRisk Ratio (RR) and the Fraction of Attributable Risk(FAR). In this report, when referring to RR and FAR,the term risk refers to the climatic or meteorologicalprobability of an event of a specific magnitude, not tothe potential impact of the event on human systems.Apart from discussing these specific terms that use riskin their definitions, this report uses the term hazard asthe probability of a particular event occurring, such as ahurricane, and risk as the hazard combined with thevulnerability of humans and human systems to thathazard. In this sense, the risk is the likelihood ofadverse outcomes from the hazard. For example, thehazard of a major hurricane striking the Florida coasttoday and 100 years ago may be the same, but the riskis much higher today because of the growth in theamount of exposed infrastructure.
  
  Definitions of probability/risk and hazard
  
  Hazard == probability of a particular event occurring
  
  Risk == hazard plus impact on humans and human systems
3. peter_murray 02 Jun 2023
  
  in Public
  
  Climate change attribution is the study of whether, or to what degree, human influence may havecontributed to extreme climate or weather events.Advances in the science of climate attribution now allow scientists to make estimates of thehuman contribution to such events.
  
  Definition of Climate change attribution
4. peter_murray 02 Jun 2023
  
  in Public
  
  Is That Climate Change? The Science of Extreme Event Attribution
  
  Congressional Research Service R47583 June 1, 2023
  
  climate change
Visit annotations in context

Tags

climate change

Annotators

peter_murray

URL

crsreports.congress.gov/product/pdf/R/R47583
Apr 2023
doctorow.medium.com doctorow.medium.com

How To Make the Least-Worst Mastodon Threads

1
1. peter_murray 17 Apr 2023
  
  in Public
  
  Twitter is a neat illustration of the problem with benevolent dictatorships: they work well, but fail badly. Because they are property — not protocols — they can change hands, and overnight, you get a new, malevolent dictator who wants to retool the system for extraction, rather than collaboration.
  
  Benevolent dictatorships: work well; fail badly
  
  Twitter is the example listed here. But I wonder about benevolent dictatorships in open source. One example: does Linus have a sound succession plan for Linux? (Can such a succession plan even be tested and adjusted?)
  
  open source software online community management
Visit annotations in context

Tags

open source software

online community management

Annotators

peter_murray

URL

doctorow.medium.com/how-to-make-the-least-worst-mastodon-threads-daa33943ac31
arxiv.org arxiv.org

Eight Things to Know about Large Language Models

10
1. peter_murray 16 Apr 2023
  
  in Public
  
  LLMs predictably get more capable withincreasing investment, even withouttargeted innovation
  
  Three variables guide capability: the amount of data ingested, the number of parameters, and the computing power used to train the model. This assumes there are no fundamental changes in the system design. This allows engineers to predict the rough capabilities before the effort and expense of building the model.
2. peter_murray 16 Apr 2023
  
  in Public
  
  Discussion and Limitations
  
  The author identifies these limitations to the current knowledge and predicting advancement:
  
  We should expect some of the prominent flaws of current LLMs to improve significantly
  
  There will be incentives to deploy LLMs as agents that flexibly pursue goals
  
  LLM developers have limited influence over what is developed
  
  LLMs are likely to produce a rapidly growing array of risks
  
  Negative results with LLMs can be difficult to interpret but point to areas of real weakness
  
  The science and scholarship around LLMs is especially immature
3. peter_murray 16 Apr 2023
  
  in Public
  
  Brief interactions with LLMs are oftenmisleading
  
  Instruction-following behavior aren't native to the models, and changes in prompt phrasing can have a dramatic impact on the output.
4. peter_murray 16 Apr 2023
  
  in Public
  
  LLMs need not express the values of theircreators nor the values encoded in web text
  
  How a model is pre-trained has a greater influence over the output than the text it was trained on. This opens the possibility for interventions in the form of "constitutional AI"—a set of norms and values as constraints in the pre-training stages. There remains the problem of how there are no reliable ways to guarantee behavior (see the fourth point).
5. peter_murray 16 Apr 2023
  
  in Public
  
  Human performance on a task isn’t anupper bound on LLM performance
  
  Models process far more information that any human can see. Also, "LLMs appear to be much better than humans at their pretraining task...and humans can teach LLMs to do some simple tasks more accurately than the humans themselves."
6. peter_murray 16 Apr 2023
  
  in Public
  
  Experts are not yet able to interpret theinner workings of LLMs
  
  The number of connections between tokens (billions) make a deterministic understanding of how an answer is derived impossible for humans to understand. There are techniques that, at some level, help with understanding models, but that understanding breaks down with later models.
7. peter_murray 16 Apr 2023
  
  in Public
  
  There are no reliable techniques forsteering the behavior of LLMs
  
  Fine-tuning and reinforced learning clearly effect the output of models, but they are not completely effective and the effect of such training cannot be predicted with sufficient certainty. This is the source of concern by many researchers for losing control over LLMs (presumably when LLMs are more tightly integrated with external actions.
8. peter_murray 16 Apr 2023
  
  in Public
  
  LLMs often appear to learn and userepresentations of the outside world
  
  The models show evidence of reasoning about abstract concepts, including color perception, adaptation based on what an author knows or believes, spatial layouts, and distinguishing misconceptions from facts. The paper notes that they conflicts with the "next-word-predictor" way that LLMs are explained.
9. peter_murray 16 Apr 2023
  
  in Public
  
  Specific important behaviors in LLM tendto emerge unpredictably as a byproduct ofincreasing investment
  
  Engineers can not (yet?) predict the capabilities that will emerge for a given quantity of data, model size, and computing power. Although they know if will be more capable, they don't know what those capabilities will be. The paper notes that surveys of researchers underestimated the capabilities of emerging models. Researchers were surveyed in 2021; the capabilities that were expected to be possible in 2024 were actually seen in 2022, and GPT-4's capabilities were not expected until 2025.
10. peter_murray 16 Apr 2023
  
  in Public
  
  Bowman, Samuel R.. "Eight Things to Know about Large Language Models." arXiv, (2023). https://doi.org/https://arxiv.org/abs/2304.00612v1.
  
  Abstract
  
  The widespread public deployment of large language models (LLMs) in recent months has prompted a wave of new attention and engagement from advocates, policymakers, and scholars from many fields. This attention is a timely response to the many urgent questions that this technology raises, but it can sometimes miss important considerations. This paper surveys the evidence for eight potentially surprising such points: 1. LLMs predictably get more capable with increasing investment, even without targeted innovation. 2. Many important LLM behaviors emerge unpredictably as a byproduct of increasing investment. 3. LLMs often appear to learn and use representations of the outside world. 4. There are no reliable techniques for steering the behavior of LLMs. 5. Experts are not yet able to interpret the inner workings of LLMs. 6. Human performance on a task isn't an upper bound on LLM performance. 7. LLMs need not express the values of their creators nor the values encoded in web text. 8. Brief interactions with LLMs are often misleading.
  
  Found via: Taiwan's Gold Card draws startup founders, tech workers | Semafor
  
  large language models
Visit annotations in context

Tags

large language models

Annotators

peter_murray

URL

arxiv.org/pdf/2304.00612.pdf
techcrunch.com techcrunch.com

Researchers discover a way to make ChatGPT consistently toxic

2
1. peter_murray 14 Apr 2023
  
  in Public
  
  ai bias
2. peter_murray 14 Apr 2023
  
  in Public
  
  study co-authored by scientists at the Allen Institute for AI,
  
  https://doi.org/10.48550/arXiv.2304.05335
Visit annotations in context

Tags

ai bias

Annotators

peter_murray

URL

techcrunch.com/2023/04/12/researchers-discover-a-way-to-make-chatgpt-consistently-toxic/
crsreports.congress.gov crsreports.congress.gov

Generative Artificial Intelligence and Copyright Law

5
1. peter_murray 12 Apr 2023
  
  in Public
  
  Do AI Outputs Infringe Copyrights in Other Works?
2. peter_murray 12 Apr 2023
  
  in Public
  
  Does the AI Training Process Infringe Copyright in Other Works?
3. peter_murray 12 Apr 2023
  
  in Public
  
  Who Owns the Copyright to Generative AI Outputs?
4. peter_murray 12 Apr 2023
  
  in Public
  
  Do AI Outputs Enjoy Copyright Protection?
5. peter_murray 12 Apr 2023
  
  in Public
  
  Abstract
  
  Recent innovations in artificial intelligence (AI) are raising new questions about how copyright law principles such as authorship, infringement, and fair use will apply to content created or used by AI. So-called “generative AI” computer programs—such as Open AI’s DALL-E 2 and ChatGPT programs, Stability AI’s Stable Diffusion program, and Midjourney’s self-titled program—are able to generate new images, texts, and other content (or “outputs”) in response to a user’s textual prompts (or “inputs”). These generative AI programs are “trained” to generate such works partly by exposing them to large quantities of existing works such as writings, photos, paintings, and other artworks. This Legal Sidebar explores questions that courts and the U.S. Copyright Office have begun to confront regarding whether the outputs of generative AI programs are entitled to copyright protection as well as how training and using these programs might infringe copyrights in other works.
  
  generative artificial intelligence copyright
Visit annotations in context

Tags

generative artificial intelligence

copyright

Annotators

peter_murray

URL

crsreports.congress.gov/product/pdf/LSB/LSB10922
Mar 2023
arxiv.org arxiv.org

2302.07459.pdf

3
1. peter_murray 29 Mar 2023
  
  in Public
  
  On the other hand, our results are surprising in that they show we can steer models to avoid bias and dis-crimination by requesting an unbiased or non-discriminatory response in natural language. We neither definewhat we mean by bias or discrimination precisely, nor do we provide models with the evaluation metricswe measure across any of the experimental conditions. Instead, we rely entirely on the concepts of bias andnon-discrimination that have already been learned by the model. This is in contrast to classical machinelearning models used in automated decision making, where precise definitions of fairness must be describedin statistical terms, and algorithmic interventions are required to make models fair.
  
  Reduction in bias comes without defining bias
2. peter_murray 29 Mar 2023
  
  in Public
  
  Taken together, our experiments suggest that models with more than 22B parameters, and a sufficient amountof RLHF training, are indeed capable of a form of moral self-correction. In some ways, our findings areunsurprising. Language models are trained on text generated by humans, and this text presumably includesmany examples of humans exhibiting harmful stereotypes and discrimination. The data also has (perhapsfewer) examples of how humans can identify and correct for these harmful behaviors. The models can learnto do both.
  
  22B parameters and sufficient RLHF training enable self-correction
3. peter_murray 29 Mar 2023
  
  in Public
  
  Ganguli, Deep, Askell, Amanda, Schiefer, Nicholas, Liao, Thomas I., Lukošiūtė, Kamilė, Chen, Anna, Goldie, Anna et al. "The Capacity for Moral Self-Correction in Large Language Models." arXiv, (2023). https://doi.org/https://arxiv.org/abs/2302.07459v2.
  
  Abstract
  
  We test the hypothesis that language models trained with reinforcement learning from human feedback (RLHF) have the capability to "morally self-correct" -- to avoid producing harmful outputs -- if instructed to do so. We find strong evidence in support of this hypothesis across three different experiments, each of which reveal different facets of moral self-correction. We find that the capability for moral self-correction emerges at 22B model parameters, and typically improves with increasing model size and RLHF training. We believe that at this level of scale, language models obtain two capabilities that they can use for moral self-correction: (1) they can follow instructions and (2) they can learn complex normative concepts of harm like stereotyping, bias, and discrimination. As such, they can follow instructions to avoid certain kinds of morally harmful outputs. We believe our results are cause for cautious optimism regarding the ability to train language models to abide by ethical principles.
  
  computing ethics large language models
Visit annotations in context

Tags

large language models

computing ethics

Annotators

peter_murray

URL

arxiv.org/pdf/2302.07459.pdf
www.quantamagazine.org www.quantamagazine.org

The Unpredictable Abilities Emerging From Large AI Models | Quanta Magazine

7
1. peter_murray 28 Mar 2023
  
  in Public
  
  In an analysis of LLMs released last June, researchers at Anthropic looked at whether the models would show certain types of racial or social biases, not unlike those previously reported in non-LLM-based algorithms used to predict which former criminals are likely to commit another crime. That study was inspired by an apparent paradox tied directly to emergence: As models improve their performance when scaling up, they may also increase the likelihood of unpredictable phenomena, including those that could potentially lead to bias or harm.
  
  "Larger models abrupty become more biased"
  
  Since it isn't understood how LLMs work, this becomes an unquantifiable risk when using LLMs.
2. peter_murray 28 Mar 2023
  
  in Public
  
  But the researchers quickly realized that a model’s complexity wasn’t the only driving factor. Some unexpected abilities could be coaxed out of smaller models with fewer parameters — or trained on smaller data sets — if the data was of sufficiently high quality. In addition, how a query was worded influenced the accuracy of the model’s response.
  
  Influence of data quality and better prompts
  
  Models with fewer parameters show better abilities when they trained with better data and had a quality prompt. Improvements to the prompt, including "chain-of-the-thought reasoning" where the model can explain how it reached an answer, improved the results of BIG-bench testing.
  
  prompt engineering
3. peter_murray 28 Mar 2023
  
  in Public
  
  In 2020, Dyer and others at Google Research predicted that LLMs would have transformative effects — but what those effects would be remained an open question. So they asked the research community to provide examples of difficult and diverse tasks to chart the outer limits of what an LLM could do. This effort was called the Beyond the Imitation Game Benchmark (BIG-bench) project, riffing on the name of Alan Turing’s “imitation game,” a test for whether a computer could respond to questions in a convincingly human way. (This would later become known as the Turing test.) The group was especially interested in examples where LLMs suddenly attained new abilities that had been completely absent before.
  
  Origins of "BIG-bench"
  
  AI researchers were asked to create a catalog of tasks that would challenge LLMs. This benchmark is used to assess the effectiveness of model changes and scaling up of the number of parameters.
4. peter_murray 28 Mar 2023
  
  in Public
  
  Many of these emergent behaviors illustrate “zero-shot” or “few-shot” learning, which describes an LLM’s ability to solve problems it has never — or rarely — seen before.
  
  Defining "zero-shot"
  
  The ability for a model to solve a problem it hasn't seen before.
5. peter_murray 28 Mar 2023
  
  in Public
  
  In 2017, researchers at Google Brain introduced a new kind of architecture called a transformer. While a recurrent network analyzes a sentence word by word, the transformer processes all the words at the same time. This means transformers can process big bodies of text in parallel.
  
  Introduction of Google's "transformer" architecture
  
  The introduction of the "transformer" architecture out of research from Google changed how LLMs were created. Instead of a "recurrent" approach—where sentences were processed word-by-word, the transformer process looks at large groups of words at the same time. That enabled parallel processing on the text.
6. peter_murray 28 Mar 2023
  
  in Public
  
  Biologists, physicists, ecologists and other scientists use the term “emergent” to describe self-organizing, collective behaviors that appear when a large collection of things acts as one. Combinations of lifeless atoms give rise to living cells; water molecules create waves; murmurations of starlings swoop through the sky in changing but identifiable patterns; cells make muscles move and hearts beat. Critically, emergent abilities show up in systems that involve lots of individual parts. But researchers have only recently been able to document these abilities in LLMs as those models have grown to enormous sizes.
  
  Definition of Emergent Behavior
  
  From smaller components, larger and more complex systems are built. In other fields, emergent behavior can be predicted. In LLMs, this ability has been unpredictable as of yet.
  
  emergent behavior
7. peter_murray 28 Mar 2023
  
  in Public
  
  It’s surprising because these models supposedly have one directive: to accept a string of text as input and predict what comes next, over and over, based purely on statistics. Computer scientists anticipated that scaling up would boost performance on known tasks, but they didn’t expect the models to suddenly handle so many new, unpredictable ones.
  
  Unexpected emergent abilities from large LLMs
  
  Larger models can complete tasks that smaller models can't. An increase in complexity can also increase bias and inaccuracies. Researcher Jason Wei has cataloged 137 emergent abilities of large language models.
  
  large language model
Visit annotations in context

Tags

prompt engineering

emergent behavior

large language model

Annotators

peter_murray

URL

quantamagazine.org/the-unpredictable-abilities-emerging-from-large-ai-models-20230316/
www.federalregister.gov www.federalregister.gov

Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence

4
1. peter_murray 23 Mar 2023
  
  in Public
  
  For example, when an AI technology receives solely a prompt [27] from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user.
  
  LLMs meet Copyright guidance
  
  See comparison later in the paragraph to "commissioned artist" and the prompt "write a poem about copyright law in the style of William Shakespeare"
  
  large language model
2. peter_murray 23 Mar 2023
  
  in Public
  
  And in the current edition of the Compendium, the Office states that “to qualify as a work of `authorship' a work must be created by a human being” and that it “will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”
  
  Copyright Office's definition of authorship
  
  From the Compendium of Copyright Office Practices, section 313.2.
3. peter_murray 23 Mar 2023
  
  in Public
  
  The Court defined an “author” as “he to whom anything owes its origin; originator; maker; one who completes a work of science or literature.” [14] It repeatedly referred to such “authors” as human, describing authors as a class of “persons” [15] and a copyright as “the exclusive right of a man to the production of his own genius or intellect.”
  
  Supreme Court definition of "Author"
  
  From Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53 (1884)
4. peter_murray 23 Mar 2023
  
  in Public
  
  In the Office's view, it is well-established that copyright can protect only material that is the product of human creativity. Most fundamentally, the term “author,” which is used in both the Constitution and the Copyright Act, excludes non-humans. The Office's registration policies and regulations reflect statutory and judicial guidance on this issue.
  
  Copyright Office's Statement of Human Creativity
  
  copyright
Visit annotations in context

Tags

copyright

large language model

Annotators

peter_murray

URL

federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence
idlewords.com idlewords.com

Remarks at the SASE Panel On The Moral Economy of Tech

6
1. peter_murray 13 Mar 2023
  
  in Public
  
  we have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias. It's a clean, mathematical apparatus that gives the status quo the aura of logical inevitability. The numbers don't lie.
  
  Machine learning like money laundering for bias
  
  machine learning algorithm bias
2. peter_murray 12 Mar 2023
  
  in Public
  
  Techies will complain that trivial problems of life in the Bay Area are hard because they involve politics. But they should involve politics. Politics is the thing we do to keep ourselves from murdering each other. In a world where everyone uses computers and software, we need to exercise democratic control over that software.
  
  Politics defy modeling, but that makes its reality so important
  
  politics
3. peter_murray 12 Mar 2023
  
  in Public
  
  Companies that perform surveillance are attempting the same mental trick. They assert that we freely share our data in return for valuable services. But opting out of surveillance capitalism is like opting out of electricity, or cooked foods—you are free to do it in theory. In practice, it will upend your life.
  
  Opting-out of surveillance capitalism?
  
  digital privacy
4. peter_murray 12 Mar 2023
  
  in Public
  
  We started out collecting this information by accident, as part of our project to automate everything, but soon realized that it had economic value. We could use it to make the process self-funding. And so mechanized surveillance has become the economic basis of the modern tech industry.
  
  Surveillance Capitalism by Accident
  
  surveillance capitalism
5. peter_murray 12 Mar 2023
  
  in Public
  
  First we will instrument, then we will analyze, then we will optimize. And you will thank us. But the real world is a stubborn place. It is complex in ways that resist abstraction and modeling. It notices and reacts to our attempts to affect it. Nor can we hope to examine it objectively from the outside, any more than we can step out of our own skin. The connected world we're building may resemble a computer system, but really it's just the regular old world from before, with a bunch of microphones and keyboards and flat screens sticking out of it. And it has the same old problems. Approaching the world as a software problem is a category error that has led us into some terrible habits of mind.
  
  Reality actively resists modeling
6. peter_murray 12 Mar 2023
  
  in Public
  
  this intellectual background can also lead to arrogance. People who excel at software design become convinced that they have a unique ability to understand any kind of system at all, from first principles, without prior training, thanks to their superior powers of analysis. Success in the artificially constructed world of software design promotes a dangerous confidence.
  
  Risk of thinking software design experience is generally transferable
  
  software development
Visit annotations in context

Tags

surveillance capitalism

machine learning

politics

algorithm bias

digital privacy

software development

Annotators

peter_murray

URL

idlewords.com/talks/sase_panel.htm
www.wired.com www.wired.com

How to Start an AI Panic

1
1. peter_murray 12 Mar 2023
  
  in Public
  
  the apocalypse they refer to is not some kind of sci-fi takeover like Skynet, or whatever those researchers thought had a 10 percent chance of happening. They’re not predicting sentient evil robots. Instead, they warn of a world where the use of AI in a zillion different ways will cause chaos by allowing automated misinformation, throwing people out of work, and giving vast power to virtually anyone who wants to abuse it. The sin of the companies developing AI pell-mell is that they’re recklessly disseminating this mighty force.
  
  Not Skynet, but social disruption
  
  artificial intelligence social structure
Visit annotations in context

Tags

artificial intelligence

social structure

Annotators

peter_murray

URL

wired.com/story/plaintext-how-to-start-an-ai-panic/

Peter Murray

Annotations: 1,276

Joined: October 24, 2012

Location: Columbus, Ohio, United States

Link: dltj.org/

ORCID: 0000-0003-4284-508X

DICOM is a file structure and a network protocol

Abstract

Tags

Annotators

URL

Matienzo has a GitHub account with code that interacts with OCLC’s API

Example of enrichment process: association between original works and parodies

In complaint, OCLC says it has 1.4b OCNs

Web scrapers web-scraped, but more robust data?

OCN, OCLC’s unique identifier

Bots initially masked themselves as search engine bots

Terms and Conditions

Subset of data from WorldCat is on the public WorldCat.org site

Percentage of WorldCat that has been “modified, improved, and/or enhanced”

How OCLC defines itself

Complain equates “hacking” with “scraping and harvesting”

Anna’s Archive blog post announcing data

Libraries pay for WorldCat.org visibility

WorldCat database is a subset of WorldCat.org

Tags

Annotators

URL

Cybersecurity is a group effort

What is cutting edge today is legacy tomorrow

You have to be right 100% of the time; your attacker needs to be lucky once

Actively avoiding a return to normal

Backup models

Never let a good crisis go to waste

Legacy infrastructure lengthens recovery time

Reliance on ETL seen as risky

Historically complex network topology

Viable backups

Government policy not to reward or engage with cyber attackers

No MFA on the remote access server

Attack started overnight Friday-to-Saturday

Ransom not paid and data published

Tags

Annotators

URL

Tags

Annotators

URL

1908 Supreme Court case established First Sale Doctrine

Attempts by publishers to limit post-sale activities

Tags

Annotators

URL

Client-side scanning is a bulk surveillance technology

Perceptual hash function for content scanning

Machine learning for content scanning

Defining "content"

Abstract

Tags

Annotators

URL

Vibe-based literacy

Tags

Annotators

URL

Bluetooth technology

Infrared technology

RF technology

Abstract

Tags

Annotators

URL

One Wilshire isn't really on Wilshire Drive

Most expensive commercial real estate in the US

Abstract

Tags

Annotators

URL

Paper degrades and can't be indefinitely recycled

Pulp and paper products produced overwhelmingly in the southern U.S.

30 years to get mature trees for corrugated packaging

Abstract

Tags

Annotators

URL

Rebranding law enforcement access as "front-door"