815 Matching Annotations
  1. Last 7 days
    1. Staff and students are rarely in a position to understand the extent to which data is being used, nor are they able to determine the extent to which automated decision-making is leveraged in the curation or amplification of content.

      Is this a data (or privacy) literacy problem? A lack of regulation by experts in this field?

    1. Certainly it would not be possible if the LLM were doing nothing more than cutting-and-pasting fragments of text from its training set and assembling them into a response. But this is not what an LLM does. Rather, an LLM models a distribution that is unimaginably complex, and allows users and applications to sample from that distribution.

      LLMs are not cut and paste; the matrix of token-following-token probabilities is "unimaginably complex"

      I wonder how this fact will work its way into the LLM copyright cases that have been filed. Is this enough to make the LLM output a "derivative work"?

    2. Including a prompt prefix in the chain-of-thought style encourages the model to generate follow-on sequences in the same style, which is to say comprising a series of explicit reasoning steps that lead to the final answer. This ability to learn a general pattern from a few examples in a prompt prefix, and to complete sequences in a way that conforms to that pattern, is sometimes called in-context learning or few-shot prompting. Chain-of-thought prompting showcases this emergent property of large language models at its most striking.

      Emulating deductive reasoning with prompt engineering

      I think "emulating deductive reasoning" is the correct shorthand here.
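
      The prompt-prefix idea in the quoted passage can be sketched as plain string assembly. This is a minimal illustration, not the paper's code: the function name `build_few_shot_prompt`, the "Q:"/"A:" layout, and the worked example are all assumptions made for the sketch, and no model is called.

```python
def build_few_shot_prompt(examples, question):
    """Join worked examples into a prompt prefix, then append the new question.

    Each example is a (question, reasoning, answer) tuple. The layout below is
    a hypothetical convention chosen for illustration.
    """
    parts = []
    for ex_question, ex_reasoning, ex_answer in examples:
        parts.append(
            f"Q: {ex_question}\nA: {ex_reasoning} The answer is {ex_answer}."
        )
    # The final "A:" is left open so a model would continue in the same style.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# One made-up worked example with explicit reasoning steps.
examples = [
    (
        "Ann has 3 apples and buys 2 more. How many apples does she have?",
        "Ann starts with 3 apples. She buys 2 more, and 3 + 2 = 5.",
        "5",
    ),
]
prompt = build_few_shot_prompt(
    examples, "Bob has 4 pens and loses 1. How many pens are left?"
)
print(prompt)
```

      The string that results would be sent to a model as-is; the reasoning style of the examples is what the passage says the model tends to imitate.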

    3. Human language users can consult the world to settle their disagreements and update their beliefs. They can, so to speak, “triangulate” on objective reality. In isolation, an LLM is not the sort of thing that can do this, but in application, LLMs are embedded in larger systems. What if an LLM is embedded in a system capable of interacting with a world external to itself? What if the system in question is embodied, either physically in a robot or virtually in an avatar?

      Humans reach an objective reality; can an LLM embedded in a system interacting with the external world also find an objective reality?

    4. Vision-language models (VLMs) such as VilBERT (Lu et al., 2019) and Flamingo (Alayrac et al., 2022), for example, combine a language model with an image encoder, and are trained on a multi-modal corpus of text-image pairs. This enables them to predict how a given sequence of words will continue in the context of a given image

      Definition of "vision-language models"

    5. The real issue here is that, whatever emergent properties it has, the LLM itself has no access to any external reality against which its words might be measured, nor the means to apply any other external criteria of truth, such as agreement with other language-users.

      The LLM cannot see beyond its training to measure its sense of "truth"

      One can embed the LLM in a larger system that might have capabilities that look into the outer world.

    6. Shanahan, Murray. "Talking About Large Language Models." arXiv, (2022). https://doi.org/10.48550/arXiv.2212.03551.

      Found via Simon Wilson.


      Thanks to rapid progress in artificial intelligence, we have entered an era when technology and philosophy intersect in interesting ways. Sitting squarely at the centre of this intersection are large language models (LLMs). The more adept LLMs become at mimicking human language, the more vulnerable we become to anthropomorphism, to seeing the systems in which they are embedded as more human-like than they really are. This trend is amplified by the natural tendency to use philosophically loaded terms, such as "knows", "believes", and "thinks", when describing these systems. To mitigate this trend, this paper advocates the practice of repeatedly stepping back to remind ourselves of how LLMs, and the systems of which they form a part, actually work. The hope is that increased scientific precision will encourage more philosophical nuance in the discourse around artificial intelligence, both within the field and in the public sphere.

    7. A bare-bones LLM doesn’t “really” know anything because all it does, at a fundamental level, is sequence prediction. Sometimes a predicted sequence takes the form of a proposition. But the special relationship propositional sequences have to truth is apparent only to the humans who are asking questions, or to those who provided the data the model was trained on. Sequences of words with a propositional form are not special to the model itself in the way they are to us. The model itself has no notion of truth or falsehood, properly speaking, because it lacks the means to exercise these concepts in anything like the way we do.

      An LLM relies on statistical probability to construct a word sequence without regard to truth and falsehood

      The LLM's motivation is not truth or falsehood; it has no motivation. Humans anthropomorphize motivation and assign truth or belief to the generated statements. "Knowing" is the wrong word to ascribe to the LLM's capabilities.

    8. Turning an LLM into a question-answering system by a) embedding it in a larger system, and b) using prompt engineering to elicit the required behaviour exemplifies a pattern found in much contemporary work. In a similar fashion, LLMs can be used not only for question-answering, but also to summarise news articles, to generate screenplays, to solve logic puzzles, and to translate between languages, among other things. There are two important takeaways here. First, the basic function of a large language model, namely to generate statistically likely continuations of word sequences, is extraordinarily versatile. Second, notwithstanding this versatility, at the heart of every such application is a model doing just that one thing: generating statistically likely continuations of word sequences.

      LLM characteristics that drive their usefulness

    9. Dialogue is just one application of LLMs that can be facilitated by the judicious use of prompt prefixes. In a similar way, LLMs can be adapted to perform numerous tasks without further training (Brown et al., 2020). This has led to a whole new category of AI research, namely prompt engineering, which will remain relevant until we have better models of the relationship between what we say and what we want.

      Prompt engineering

    10. In the background, the LLM is invisibly prompted with a prefix along the following lines.

      Pre-work to make the LLM conversational

    11. However, in the case of LLMs, such is their power, things can get a little blurry. When an LLM can be made to improve its performance on reasoning tasks simply by being told to “think step by step” (Kojima et al., 2022) (to pick just one remarkable discovery), the temptation to see it as having human-like characteristics is almost overwhelming.

      Intentional stance meets uncanny valley

      Intentional stance language becomes problematic when we can no longer distinguish the inanimate object's behavior from human behavior.

    12. “The intentional stance is the strategy of interpreting the behavior of an entity ... by treating it as if it were a rational agent”

      Definition of "intentional stance"

      We use anthropomorphic language as a shortcut for conveying a concept: giving an inanimate object the agency to interact with the world as humans do is a plain-language way of explaining what is happening.

    13. To the human user, each of these examples presents a different sort of relationship to truth. In the case of Neil Armstrong, the ultimate grounds for the truth or otherwise of the LLM’s answer is the real world. The Moon is a real object and Neil Armstrong was a real person, and his walking on the Moon is a fact about the physical world. Frodo Baggins, on the other hand, is a fictional character, and the Shire is a fictional place. Frodo’s return to the Shire is a fact about an imaginary world, not a real one. As for the little star in the nursery rhyme, well that is barely even a fictional object, and the only fact at issue is the occurrence of the words “little star” in a familiar English rhyme.

      How LLMs can deal with real-world, fictional-world, and imaginary-world concepts

    14. What we are really asking the model is the following question: Given the statistical distribution of words in the vast public corpus of (English) text, what words are most likely to follow the sequence “The first person to walk on the Moon was ”? A good reply to this question is “Neil Armstrong”.

      Example of how an LLM arrives at an answer

    15. LLMs are generative mathematical models of the statistical distribution of tokens in the vast public corpus of human-generated text, where the tokens in question include words, parts of words, or individual characters including punctuation marks. They are generative because we can sample from them, which means we can ask them questions. But the questions are of the following very specific kind. “Here’s a fragment of text. Tell me how this fragment might go on. According to your model of the statistics of human language, what words are likely to come next?”

      LLM definition
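
      To make "sample from the statistical distribution of tokens" concrete, here is a toy sketch. It is a deliberate simplification: it uses bigram counts over a three-sentence made-up corpus, whereas a real LLM uses a neural network conditioned on a long context. All names (`train_bigram`, `next_token_distribution`, `sample_continuation`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each token, how often each other token follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def next_token_distribution(counts, token):
    """Turn raw follow-on counts into probabilities (empty if token is unseen)."""
    followers = counts[token]
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


def sample_continuation(counts, start, max_tokens, rng):
    """Extend `start` one token at a time by sampling from the distribution."""
    tokens = [start]
    for _ in range(max_tokens):
        dist = next_token_distribution(counts, tokens[-1])
        if not dist:  # no known continuation for the last token
            break
        choices, weights = zip(*dist.items())
        tokens.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(tokens)


# Made-up corpus, chosen only to echo the Moon example in the notes above.
corpus = [
    "the first person on the moon was neil armstrong",
    "the first person on the moon was an astronaut",
    "the moon was bright",
]
counts = train_bigram(corpus)
print(next_token_distribution(counts, "was"))
print(sample_continuation(counts, "the", 8, random.Random(0)))
```

      Nothing in the sketch stores or retrieves whole sentences; every output is assembled token by token from probabilities, which is the sense in which the model is "generative".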

    16. As we build systems whose capabilities more and more resemble those of humans, despite the fact that those systems work in ways that are fundamentally different from the way humans work, it becomes increasingly tempting to anthropomorphise them. Humans have evolved to co-exist over many millions of years, and human culture has evolved over thousands of years to facilitate this co-existence, which ensures a degree of mutual understanding. But it is a serious mistake to unreflectingly apply to AI systems the same intuitions that we deploy in our dealings with each other, especially when those systems are so profoundly different from humans in their underlying operation

      AI systems are fundamentally different from humans, who evolved to co-exist

    17. First, the performance of LLMs on benchmarks scales with the size of the training set (and, to a lesser degree with model size). Second, there are qualitative leaps in capability as the models scale. Third, a great many tasks that demand intelligence in humans can be reduced to next token prediction with a sufficiently performant model. It is the last of these three surprises that is the focus of the present paper.

      Surprising LLM findings

    1. There's this old idea in photography called the decisive moment - that the world is filled with these far-off realities. But every so often, a photograph can capture a moment that, boom, takes you there. This is one of those photos. In the picture, you see all these men and women standing in kind of a loose semicircle. Some of them still have their blue surgical gloves on. They look totally spent. They're all looking in different directions. And they all look like they're not even there, like they're totally lost in their own thoughts.

      Defining "Decisive Moment"

    1. The breakthroughs are all underpinned by a new class of AI models that are more flexible and powerful than anything that has come before. Because they were first used for language tasks like answering questions and writing essays, they’re often known as large language models (LLMs). OpenAI’s GPT3, Google’s BERT, and so on are all LLMs. But these models are extremely flexible and adaptable. The same mathematical structures have been so useful in computer vision, biology, and more that some researchers have taken to calling them "foundation models" to better articulate their role in modern AI.

      Foundation Models in AI

      Large language models, more generally, are “foundation models”. They got the large-language name because that is where they were first applied.

    2. The OpenAI researchers discovered that in making the models bigger, they didn’t just get better at producing text. The models could learn entirely new behaviors simply by being shown new training data. In particular, the researchers discovered that GPT3 could be trained to follow instructions in plain English without having to explicitly design the model that way

      Bigger models meant that the GPT could program itself

      Emergent capabilities in interpreting the input.

    3. The basic workflow of these models is this: generate, evaluate, iterate. As anyone who’s played with making AI art knows, you typically have to create many examples to get something you like. When working with these AI models, you have to remember that they’re slot machines, not calculators. Every time you ask a question and pull the arm, you get an answer that could be marvelous… or not. The challenge is that the failures can be extremely unpredictable.

      These models are not deterministic

      This isn’t a calculator; it is a slot machine.
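
      The "slot machine, not calculator" point can be illustrated by comparing a deterministic pick with a sampled one. This is a sketch under assumptions: the option labels and scores are invented, and temperature-scaled softmax sampling is one common way such randomness is introduced, not something the article specifies.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(options, scores):
    """Calculator-like: always return the highest-scoring option."""
    return options[scores.index(max(scores))]


def sampled_pick(options, scores, temperature, rng):
    """Slot-machine-like: draw an option in proportion to its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(options, weights=probs, k=1)[0]


# Hypothetical outputs and scores, for illustration only.
options = ["marvelous", "fine", "odd", "wrong"]
scores = [2.0, 1.5, 0.5, 0.1]

rng = random.Random(42)
greedy = [greedy_pick(options, scores) for _ in range(5)]
sampled = [sampled_pick(options, scores, 1.0, rng) for _ in range(5)]
print("greedy: ", greedy)
print("sampled:", sampled)
```

      The greedy list repeats the same answer every time, while the sampled list can vary from pull to pull; that variation is why the generate, evaluate, iterate workflow matters.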

    4. Or see drug discovery, where biotech companies are training AIs that can design new drugs. But these new drugs are often exploring new areas of biology—for example, proteins that are unlike naturally evolved samples. AI design has to move hand in hand with huge amounts of physical experiments in labs because the data needed to feed these models just doesn’t exist yet

      AI in drug discovery

    5. The latest image models like Stable Diffusion use a process called latent diffusion. Instead of directly generating the latent representation, a text prompt is used to incrementally modify initial images. The idea is simple: If you take an image and add noise to it, it will eventually become a noisy blur. However, if you start with a noisy blur, you can “subtract” noise from it to get an image back. You must “denoise” smartly—that is, in a way that moves you closer to a desired image.

      How Stable Diffusion works, using latent diffusion

      Starting with noise and making meaning from there.

    6. Dall-E is actually a combination of a few different AI models. A transformer translates between that latent representation language and English, taking English phrases and creating “pictures” in the latent space. A latent representation model then translates between that lower-dimensional “language” in the latent space and actual images. Finally, there’s a model called CLIP that goes in the opposite direction; it takes images and ranks them according to how close they are to the English phrase.

      How Dall-E works

    7. A deep learning model can learn what’s called a "latent space" representation of images. The model learns to extract important features from the images and compresses them into a lower-dimensional representation, called a latent space or latent representation. A latent representation takes all the possible images at a given resolution and reduces them to a much lower dimension. You can think of it like the model learning an immensely large set of basic shapes, lines, and patterns—and then rules for how to put them together coherently into objects.

      Latent representations

    8. OpenAI pushed this approach with GPT2 and then GPT3. GPT stands for "generative pre-trained transformer." The "generative" part is obvious—the models are designed to spit out new words in response to inputs of words. And "pre-trained" means they're trained using this fill-in-the-blank method on massive amounts of text.

      Defining Generative Pre-trained Transformer (GPT)

    9. Of course, you don’t have to have English as the input and Japanese as the output. You can also translate between English and English! Think about many of the common language AI tasks, like summarizing a long essay into a few short paragraphs, reading a customer’s review of a product and deciding if it was positive or negative, or even something as complex as taking a story prompt and turning it into a compelling essay. These problems can all be structured as translating one chunk of English to another.

      Summarizing, sentiment analysis, and story prompts can be thought of as English-to-English translation

    10. An AI model that can learn and work with this kind of problem needs to handle order in a very flexible way. The old models—LSTMs and RNNs—had word order implicitly built into the models. Processing an input sequence of words meant feeding them into the model in order. A model knew what word went first because that’s the word it saw first. Transformers instead handled sequence order numerically, with every word assigned a number. This is called "positional encoding." So to the model, the sentence “I love AI; I wish AI loved me” looks something like (I 1) (love 2) (AI 3) (; 4) (I 5) (wish 6) (AI 7) (loved 8) (me 9).

      Google’s “the transformer”

      One breakthrough was positional encoding versus having to handle the input in the order it was given. Second, using a matrix rather than vectors. This research came from Google Translate.
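
      The numbering in the quoted example can be reproduced directly, as a minimal sketch. The second function shows the sinusoidal scheme from the original Transformer paper, which encodes each position as a vector rather than a bare integer; the function names are hypothetical, and that paper counts positions from 0 while the article's example counts from 1.

```python
import math


def tag_positions(tokens):
    """Pair each token with its 1-based position, as in the quoted example."""
    return [(tok, i) for i, tok in enumerate(tokens, start=1)]


def sinusoidal_encoding(position, dim):
    """Sinusoidal positional encoding: sine on even indices, cosine on odd."""
    vec = []
    for i in range(dim):
        # Each sine/cosine pair shares one frequency.
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


tokens = ["I", "love", "AI", ";", "I", "wish", "AI", "loved", "me"]
for tok, pos in tag_positions(tokens):
    print(f"({tok} {pos})", end=" ")
print()

# The two occurrences of "AI" get different position vectors.
print(sinusoidal_encoding(3, 4))
print(sinusoidal_encoding(7, 4))
```

      Because position travels with each token as data, the model no longer needs to receive the words one at a time in order to know which came first.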

    11. The problem of understanding and working with language is fundamentally different from that of working with images. Processing language requires working with sequences of words, where order matters. A cat is a cat no matter where it is in an image, but there’s a big difference between “this reader is learning about AI” and “AI is learning about this reader.”

      How language is a different problem from images

    12. There’s a holy trinity in machine learning: models, data, and compute. Models are algorithms that take inputs and produce outputs. Data refers to the examples the algorithms are trained on. To learn something, there must be enough data with enough richness that the algorithms can produce useful output. Models must be flexible enough to capture the complexity in the data. And finally, there has to be enough computing power to run the algorithms.

      “Holy trinity” of machine learning: models, data, and compute

      Models in 1990s, starting with convolutional neural networks for computer vision.

      Data in 2009 in the form of labeled images from Stanford AI researchers.

      Compute in 2006 with Nvidia’s CUDA programming language for GPUs.

      AlexNet in 2012 combined all of these.

    1. Character AI

      From one of the authors of the Transformer paper from Google. Intended to be able to talk with dead and/or fictional characters.

    2. Google: LaMDA

      From Google; once considered by one of its researchers to be sentient.

    3. DeepMind: Sparrow

      From an Alphabet subsidiary, it is meant to be a conversation agent. Claims safer, less-biased machine learning (ML) systems, thanks to its application of reinforcement learning based on input from human research participants for training. Can search Google for answers.

      Considered a proof-of-concept that is not ready for wide deployment.

    4. Anthropic: Claude

      Tied to FTX’s Sam Bankman-Fried and the “effective altruism” movement. Uses “Constitutional AI,” which it says is based on concepts such as beneficence, non-maleficence and autonomy.

  2. Jan 2023
    1. I'm like listening to this and thinking okay there's no bad

      There's no bad blockchain; there's only bad blockchain users

      blockchain there's only like bad blockchain users

    2. the Assumption of your motivation is looking at this in terms of like um elevating visibility of our Collections

      Question: If the motivation is visibility, how will this be different?

    3. can have a pretty outsized carbon footprint and I'm wondering how you reconcile the vast amount of computational power necessary to accomplish this work and its negative impact on the environment and whether or not this is something you all are considering

      Question: has the project considered the energy impact?

    4. I'm really nervous about the idea that we would be selling what amounts to

      Question: Does it make sense for GLAM institutions to sell speculative assets?

      speculative assets into crypto space for our cultural heritage collections doesn't make a lot of sense to me

      Joe Lucia also points out that there are not a lot of trustworthy actors in the blockchain space.

    5. lib nft is an applied research project that asks and seeks to answer a fundamental empirical question can blockchain technology and nft specifically facilitate the economically sustainable use storage long-term preservation and accessibility of a library's special Collections

      Research project question

      This is in the whitepaper.

    6. created an nft of a Nobel prize winning formula by Jim Allison and they were able to sell that for fifty thousand dollars as a singular item
    7. I think we have a lot of things in our collection that are undiscoverable

      NFTs to address a discoverability problem

      Can NFTs in a closed system provide more visibility to holdings?

    8. if you make a surrogate

      Deed of gift versus NFT digital surrogate ownership

      and then you do an nft it's a different type of ownership

      Old deeds of gifts may not cover the online posting of digital surrogates (and it sounds like the speakers have experience with this problem). And there are certainly needs for clarity around what an NFT "ownership" means relative to the original work.

    9. the holders of the nfts is to receive presents or gifts or perks that Justin Bieber or whoever's managing I'm sure it's not himself to them

      NFTs for a limited-access perks club

      Rather than ownership, Michael Meth is proposing an opportunity for some kind of special engagement? Again, is there value here—if you don't hold the Mona Lisa or Da Vinci's "first touch"—that justifies the expense and overhead of an NFT infrastructure? Is the only one to make real money here the provider of that infrastructure?

    10. we've digitized that surrogate and you get into that and you buy that nft then you own a piece of that right and it's identifiable to just you

      NFT "ownership"

      The use of a blockchain transaction to link a wallet address with a URL has not been proven to transfer "ownership" (at least in a copyright sense). I suppose there is a sense of ownership in a closed system, as was done with the NBA Top Shot project.

    11. the blockchain itself has been proven to be resilient against any kinds of attacks

      The blockchain has never been hacked

      I think this is true? For a proof-of-work protocol such as Bitcoin, there has never been anything like a 51% miner attack. The rewriting of the Ethereum blockchain by community governance might be a hack.

    12. is there a way that we can use these collections in a way that is driven by us from within the library to create policies rules Etc that allow us to turn these unique collections that are already digitized in many cases or can be into the nfts that we're talking about and then find a way to maybe even monetize them

      Monetize digital assets

      What rules or policies could be encoded on the blockchain in any way that is more effective and cheaper than non-blockchain methods?

      No, I think "monetization" is the key... and as the current wave of NFT project failures shows, only those extracting rent by owning the transactional infrastructure are making money in the long term.

    13. the underlying premise that we got to is if libraries already working on digital asset management and the blockchain is a way to manage digital assets is there not a connection

      Thesis for why blockchain and digital assets

      To what extent is NFT technology managing digital assets, and is that kind of management the same as how libraries manage digital assets? On the surface, these are barely related. An NFT, at best, signals ownership of a URL. (Since digital assets themselves are so big, no one puts the asset itself on the blockchain.) To what extent are libraries going to "trade" URLs? Management of a digital asset, for libraries, means so much more than this.

    14. The LibNFT Project: Leveraging Blockchain-Based Digital Asset Technology to Sustainably Preserve Distinctive Collections and Archives

      CNI Fall 2022 Project Briefings

      YouTube recording

      K. Matthew Dames, Edward H. Arnold Dean, Hesburgh Libraries and University of Notre Dame Press, University of Notre Dame, President, Association of Research Libraries

      Meredith Evans, President, Society of American Archivists

      Michael Meth, University Library Dean, San Jose State University

      Nearly 12 months ago, celebrities relentlessly touted cryptocurrency during Super Bowl television ads, urging viewers to buy now instead of missing out. Now, digital currency assets like Bitcoin and Ethereum are worth half what they were this time last year. We believe, however, that the broader public attention on cryptocurrency’s volatility obscures the relevance and applicability of non-fungible tokens (NFTs) within the academy. For example, Ingram has announced plans to invest in Book.io, a company that makes e-books available on the blockchain where they can be sold as NFTs. The famed auction house Christie’s launched Christie’s 3.0, a blockchain auction platform that is dedicated to selling NFT-based art, and Washington University in St. Louis and the University of Wyoming have invested in Strike, a digital payment provider built on Bitcoin’s Lightning Network. Seeking to advance innovation in the academy and to find ways to mitigate the costs of digitizing and digitally preserving distinctive collections and archives, the discussants have formed the LibNFT collaboration. The LibNFT project seeks to work with universities to answer a fundamental question: can blockchain technology generally, and NFTs specifically, facilitate the economically sustainable use, storage, long-term preservation, and accessibility of a library’s special collections and archives? Following up on a January 2022 Twitter Spaces conversation on the role of blockchain in the academy, this session will introduce LibNFT, discuss the project’s early institutional partners, and address the risks academic leaders face by ignoring blockchain, digital assets, and the metaverse.

    1. At the cloud computing company VMWare, for example, writers use Jasper as they generate original content for marketing, from email to product campaigns to social media copy. Rosa Lear, director of product-led growth, said that Jasper helped the company ramp up our content strategy, and the writers now have time to do better research, ideation, and strategy.

      Generative AI for marketing makes for more productive writers

    2. Then, once a model generates content, it will need to be evaluated and edited carefully by a human. Alternative prompt outputs may be combined into a single document. Image generation may require substantial manipulation.

      After generation, results need evaluation

      Is this also a role of the prompt engineer? In the digital photography example, the artist spent 80 hours and created 900 versions as the prompts were fine-tuned.

    3. To start with, a human must enter a prompt into a generative model in order to have it create content. Generally speaking, creative prompts yield creative outputs. “Prompt engineer” is likely to become an established profession, at least until the next generation of even smarter AI emerges.

      Generative AI requires prompt engineering, likely a new profession

      What domain experience does a prompt engineer need? How might this relate to a specialty in librarianship?

    1. "HEAL jobs." So that's 'health, education, administration, and literacy.' Almost, if you like, the opposite side of the coin to STEM jobs- and that's where a lot of the jobs are coming from.

      HEAL jobs: Health, Education, Administration, and Literacy

      Complementary bundle of jobs to STEM, and projections are for a 3:1 creation of HEAL jobs versus STEM jobs by 2030.

    2. we've also seen a drop in the acquisition of skills, the kinds of skills and education that boys and men need. If boys don't get educated and men don't get skilled, they will struggle in the labor market. And across all of those domains, we've seen a downwards turn for men in the last four or five decades.

      Disadvantage in education turns into struggle to learn skills

    3. There's quite a fierce debate about the differences between male and female brains. And in adulthood, I think there's not much evidence that the brains are that different in ways that we should worry about, or that are particularly consequential. But where there's no real debate is in the timing of brain development. It is quite clear that girls' brains develop more quickly than boys' brains do, and that the biggest difference seems to occur in adolescence.

      Pre-frontal cortex develops faster in female brains

      The cumulative effect of this is that girls get a head start once the societally imposed impact of gender inequality is removed. Girls are rewarded for this higher level of control and boys are now at a disadvantage at the same grade levels.

    4. So if you look at the U.S., for example, in the average school district in the U.S., girls are almost a grade level ahead of boys in English, and have caught up in math. If we look at those with the highest GPA scores, the top 10%, two-thirds of those are girls. If we look at those at the bottom, two-thirds of those are boys.

      Impact when looking at secondary education statistics

    5. The overall picture is, that on almost every measure, at almost every age, and in almost every advanced economy in the world, the girls are leaving the boys way behind, and the women leaving the men.

      Top-line summary

    6. Male inequality, explained by an expert, Richard Reeves, Big Think

      Jan 4, 2023

      Modern males are struggling. Author Richard Reeves outlines the three major issues boys and men face and shares possible solutions.

      Boys and men are falling behind. This might seem surprising to some people, and maybe ridiculous to others, considering that discussions on gender disparities tend to focus on the structural challenges faced by girls and women, not boys and men.

      But long-term data reveal a clear and alarming trend: In recent decades, American men have been faring increasingly worse in many areas of life, including education, workforce participation, skill acquisition, wages, and fatherhood.

      Gender politics is often framed as a zero-sum game: Any effort to help men takes away from women. But in his 2022 book Of Boys and Men, journalist and Brookings Institution scholar Richard V. Reeves argues that the structural problems contributing to male malaise affect everybody, and that shying away from these tough conversations is not a productive path forward.

      About Richard Reeves: Richard V. Reeves is a senior fellow at the Brookings Institution, where he directs the Future of the Middle Class Initiative and co-directs the Center on Children and Families. His Brookings research focuses on the middle class, inequality and social mobility.

    1. Defenders can design their own infrastructure to be immutable and ephemeral, as is becoming an emerging trend in private sector defense through the practice of Security Chaos Engineering

      Immutable and ephemeral as defensive measures

      Immutable: unchangeable infrastructure components, such as ssh access disabled by default.

      Ephemeral: short-lived servers for single processes, serverless infrastructure

    2. We propose a Sludge Strategy for cyber defense that prioritizes investments into techniques, tools, and technologies that add friction into attacker workflows and raise the cost of conducting operations

      "sludge" choice architecture for cybersecurity

      Scarce information, high monetary cost, psychological impact, and time cost.
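
      As one hedged illustration of the "time cost" kind of friction, a defender could make each failed attempt progressively slower. This is not taken from the paper: the doubling schedule, the 60-second cap, and the function names are all assumptions made for the sketch.

```python
def sludge_delay(failed_attempts, base_seconds=1.0, cap_seconds=60.0):
    """Seconds to wait before the next attempt, doubling per prior failure.

    Hypothetical schedule: no delay with zero failures, then 1s, 2s, 4s, ...
    up to the cap.
    """
    if failed_attempts <= 0:
        return 0.0
    return min(cap_seconds, base_seconds * 2 ** (failed_attempts - 1))


def total_time_cost(attempts):
    """Total waiting time imposed across a run of consecutive failed attempts."""
    # Before attempt k (counting from 1) there have been k - 1 failures.
    return sum(sludge_delay(n) for n in range(attempts))


for n in (1, 5, 10, 20):
    print(f"{n:>2} attempts -> {total_time_cost(n):.0f}s of imposed waiting")
```

      A legitimate user who mistypes once barely notices the delay, while an automated guessing run pays a cost that grows with every try.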

    3. Thaler and Sunstein use the phrase “choice architecture” to describe the design in which choices are presented to people. These choices are made easier by nudges and more difficult by sludge. To understand choice architecture, it is necessary to examine both nudge and sludge. Nudges gently steer people in a direction that increases welfare, including cybersecurity, and are commonly intended to make good outcomes easy. Traditionally, nudges have been used to encourage well-intentioned users to behave in a way that they are better off for doing so. These choices are not guaranteed, but research shows that they are selected more often.

      Define: "choice architecture" and "nudge" in that context

      Nudges include, for instance, password strength meters

    4. Dykstra, J., Shortridge, K., Met, J., & Hough, D. (2022). Sludge for Good: Slowing and Imposing Costs on Cyber Attackers. arXiv. https://doi.org/10.48550/arXiv.2211.16626

      Choice architecture describes the design by which choices are presented to people. Nudges are an aspect intended to make "good" outcomes easy, such as using password meters to encourage strong passwords. Sludge, on the contrary, is friction that raises the transaction cost and is often seen as a negative to users. Turning this concept around, we propose applying sludge for positive cybersecurity outcomes by using it offensively to consume attackers' time and other resources. To date, most cyber defenses have been designed to be optimally strong and effective and prohibit or eliminate attackers as quickly as possible. Our complimentary approach is to also deploy defenses that seek to maximize the consumption of the attackers' time and other resources while causing as little damage as possible to the victim. This is consistent with zero trust and similar mindsets which assume breach. The Sludge Strategy introduces cost-imposing cyber defense by strategically deploying friction for attackers before, during, and after an attack using deception and authentic design features. We present the characteristics of effective sludge, and show a continuum from light to heavy sludge. We describe the quantitative and qualitative costs to attackers and offer practical considerations for deploying sludge in practice. Finally, we examine real-world examples of U.S. government operations to frustrate and impose cost on cyber adversaries.

      Found via author post: Kelly Shortridge: "How can we waste attackers’ ti…" - Hachyderm.io

    1. Birhane and Prabhu note, echoing Ruha Benjamin [15], “Feeding AI systems on the world’s beauty, ugliness, and cruelty, but expecting it to reflect only the beauty is a fantasy.”

      LMs can't return only the good parts

      Large LMs trained on uncurated data take in the good, the bad, and the ugly, so it is illogical to expect them to return only the good.

    2. When such consumers therefore mistake the meaning attributed to the MT output as the actual communicative intent of the original text’s author, real-world harm can ensue.

      Harm from Machine Translation (MT) models

      MT models can create fluent and coherent blocks of text that mask the meaning in the original text and the intent of the original speaker.

    3. human communication relies on the interpretation of implicit meaning conveyed between individuals. The fact that human-human communication is a jointly constructed activity [29, 128] is most clearly true in co-situated spoken or signed communication, but we use the same facilities for producing language that is intended for audiences not co-present with us (readers, listeners, watchers at a distance in time or space) and in interpreting such language when we encounter it. It must follow that even when we don’t know the person who generated the language we are interpreting, we build a partial model of who they are and what common ground we think they share with us, and use this in interpreting their words.

      Human-to-human communication is based on each building a model of the other

      The intention and interpretation of language relies on common ground, and so communication requires each party to understand the perspective of the other. LM-generated text offers no such counter-party.

    4. However, no actual language understanding is taking place in LM-driven approaches to these tasks, as can be shown by careful manipulation of the test data to remove spurious cues the systems are leveraging [21, 93]. Furthermore, as Bender and Koller [14] argue from a theoretical perspective, languages are systems of signs [37], i.e. pairings of form and meaning. But the training data for LMs is only form; they do not have access to meaning. Therefore, claims about model abilities must be carefully characterized.

      NLP is not Natural Language Understanding

    5. In this section, we discuss how large, uncurated, Internet-based datasets encode the dominant/hegemonic view, which further harms people at the margins, and recommend significant resource allocation towards dataset curation and documentation practices.

      Issues with training data

      1. Size doesn't guarantee diversity
      2. Static data versus changing social views
      3. Encoding bias
      4. Curation, documentation, and accountability
    1. the world of crypto offers an incentive for VC firms to invest in a crypto company receive a percentage of their tokens and then sell those tokens to

      Sceptics: Crypto companies offer unregulated securities that allow for returns in months rather than years

      retail traders when it becomes publicly available

    2. Axie embodies a new generation of games

      Axie Infinity play-to-earn game

      Funded by a16z, this game allowed character owners to farm out their characters to "scholars" who would play them for a cut of the player's earnings. It has been accused of relying on predatory mechanisms to extract rent from lower-income populations.

    3. Andreessen Horowitz otherwise known as a16z they're one of the largest Venture Capital firms they have a crypto fund with around 7.6 billion to be invested in crypto and web 3 startups and have been investing in crypto companies dating back to 2013

      Andreessen Horowitz (a16z)

      The venture capital firm behind many of the crypto projects. They think of crypto as a social, cultural, and technological innovation.

    4. I mean people suggested that you could replace legal contracts with smart contracts which are programs that are built on the blockchain and that's usually accompanied with the phrase code is law this is a smart contract and this is a legal contract these two things aren't the same right you can't have law be enacted by computer code because law inherently requires third parties to assess evidence intentions and a bunch of other variables that you just can't Outsource

      Fundamental difference between legal contracts and "smart contracts"

      Legal contracts are subject to judgement of evidence and intention. "Code as law" can't do that.

    5. now web 3.0 is coming

      Web 3.0 as the next cryptocurrency pumping scheme

      A combination of services with blockchains and tokens at some fundamental level. It is being pumped as a replacement for big tech social media that is "Web 2.0"

    6. in some of the announcements from celebrities about their nft purchases there was this constant reference to a company called moonpay thanking them for help with purchasing

      Announced sponsorships by celebrities and influencers

      One such arrangement was with Moonpay. Earlier was the example of Bieber's manager.

    7. people have accused that sale of essentially being a giant marketing stunt to increase the value of the B20 token

      Sale of Beeple's "Everydays: the first 5,000 days" as a marketing stunt for a digital museum

      Holders of the B20 token would be fractional owners of a set of digital art in an online museum. Value rose to $30/coin, then crashed.

      B20 (B20) Price, Charts, and News | Coinbase

    8. crypto kitties was probably one of the first nft projects to make it into the spotlight

      Crypto kitties as one of the first NFT projects in the spotlight

      Each kitty image was a token—a "non fungible" token. It was traded as an asset; people bought them expecting them to go up in value. The supply outgrew demand and the market crashed.

    9. you could just use ethereum's blockchain to create your own cryptocurrencies or crypto assets as people called them these assets that existed without their own native blockchain were called tokens

      Tokens are assets without a native blockchain

      They leverage another blockchain, like Ethereum. Businesses launch a token with an initial coin offering for a project—explaining the purposes of the project with a whitepaper.

    10. now Ponzi schemes are kept Alive by continuously finding new recruits and new markets to tap into so that money is continuously being poured into the scheme and so you start to see this similar incentive develop in the crypto space this desperate need to find a use case for this stuff and that use case has to be revolutionary enough to justify its Rising

      A rising price requires people to have a reason to use Bitcoin

      If people hold Bitcoin as an investment instead of using it as a currency, then someone must have a reason to buy it for more than the holder paid. That requires new money to keep coming into the system.

    11. despite its failings Bitcoin still survived and is very much present to this day but that's just the thing the Bitcoin that we see today is a shell of its former identity where most of the purchasing of Bitcoin today happens on centralized exchanges that have to comply with the laws of centralized institutions in the instances where you've heard about Bitcoin or if you've ever been encouraged to buy Bitcoin under what context is that always framed

      Bitcoin, by 2014, is a shell of its former identity

      Most of the trading is happening on centralized exchanges that have to comply with laws. It starts being used not as a currency, but as an investment. It is the start of the Ponzi era.

    12. "The Great Crypto Scam." James Jani, Jan 1, 2023

      Bitcoin to Blockchains, to NFTs, to Web 3.0... it's time to find out if it's really all the hype or just part of one of the greatest scams in human history.

      Original video: https://youtu.be/ORdWE_ffirg

  3. Dec 2022
    1. For instance, GPT-2’s training data is sourced by scraping outbound links from Reddit, and Pew Internet Research’s 2016 survey reveals 67% of Reddit users in the United States are men, and 64% between ages 18 and 29. Similarly, recent surveys of Wikipedians find that only 8.8–15% are women or girls [9]. Furthermore, while user-generated content sites like Reddit, Twitter, and Wikipedia present themselves as open and accessible to anyone, there are structural factors including moderation practices which make them less welcoming to marginalized populations.

      Scraped data does not come from representative websites

    2. the voices of people most likely to hew to a hegemonic viewpoint are also more likely to be retained. In the case of US and UK English, this means that white supremacist and misogynistic, ageist, etc. views are overrepresented in the training data, not only exceeding their prevalence in the general population but also setting up models trained on these datasets to further amplify biases and harms.

      Extreme positions are disproportionately represented in training data

    3. While the average human is responsible for an estimated 5t CO2e per year, the authors trained a Transformer (big) model [136] with neural architecture search and estimated that the training procedure emitted 284t of CO2. Training a single BERT base model (without hyperparameter tuning) on GPUs was estimated to require as much energy as a trans-American flight.

      Energy consumption on NLP model training

      Training that Transformer model emitted about 57 times the annual CO2 emissions of an average person (284t versus 5t).

    4. we understand the term language model (LM) to refer to systems which are trained on string prediction tasks: that is, predicting the likelihood of a token (character, word or string) given either its preceding context or (in bidirectional and masked LMs) its surrounding context. Such systems are unsupervised and when deployed, take a text as input, commonly outputting scores or string predictions.

      Definition of "Language Model"

      Notes that this is fundamentally a string prediction algorithm with unsupervised training.
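
      A minimal sketch of "string prediction" in the sense of the definition above, assuming a toy bigram model over whitespace-split words. The corpus and the function names are made up for illustration; the models discussed in the paper are large neural networks, not count tables. The point is only that training needs nothing but raw text, and the output is a likelihood for each candidate next token.

```python
from collections import Counter, defaultdict


def train_bigram_counts(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Return the estimated likelihood of each next token given `prev`."""
    following = counts[prev]
    total = sum(following.values())
    if total == 0:
        return {}  # the context was never seen in training
    return {tok: n / total for tok, n in following.items()}


# Hypothetical toy corpus: unlabeled text is the only training signal.
corpus = "the cat sat on the mat the cat ran".split()
counts = train_bigram_counts(corpus)
probs = next_token_probs(counts, "the")
print(probs)
```

      In this toy corpus "the" is followed by "cat" twice and "mat" once, so the model scores "cat" as twice as likely as "mat". It has no notion of what a cat or a mat is.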

    5. Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21). Association for Computing Machinery, New York, NY, USA, 610–623. https://doi.org/10.1145/3442188.3445922

    1. OpenAI is perhaps one of the oddest companies to emerge from Silicon Valley. It was set up as a non-profit in 2015 to promote and develop "friendly" AI in a way that "benefits humanity as a whole". Elon Musk, Peter Thiel and other leading tech figures pledged US$1 billion towards its goals. Their thinking was we couldn't trust for-profit companies to develop increasingly capable AI that aligned with humanity's prosperity. AI therefore needed to be developed by a non-profit and, as the name suggested, in an open way. In 2019 OpenAI transitioned into a capped for-profit company (with investors limited to a maximum return of 100 times their investment) and took a US$1 billion investment from Microsoft so it could scale and compete with the tech giants.

      Origins of OpenAI

      First a non-profit started with funding from Musk, Thiel, and others. It has since transitioned to a "capped for-profit company".

    2. ChatGPT is a bit like autocomplete on your phone. Your phone is trained on a dictionary of words so it completes words. ChatGPT is trained on pretty much all of the web, and can therefore complete whole sentences – or even whole paragraphs.

      ChatGPT is like autocomplete

    1. Information scientist Stefanie Haustein at the University of Ottawa in Canada, who has studied the impact of Twitter on scientific communication, says the changes show why it’s concerning that scientists embraced a private, for-profit firm’s platform to communicate on. “We’re in the hands of actors whose main interest is not the greater good for scholarly communication,” she says.

      “Public square, private land”

      Competing values: a desire for openness versus a desire for profit. Satisfying the needs of users versus satisfying the needs of shareholders.

    2. despite Twitter’s self-styled reputation as a public town square — where everyone gathers to see the same messages — in practice, the pandemic showed how users segregate to follow mostly those with similar views, argues information scientist Oliver Johnson at the University of Bristol, UK. For instance, those who believed that COVID-19 was a fiction would tend to follow others who agreed, he says, whereas others who argued that the way to deal with the pandemic was to lock down for a ‘zero COVID’ approach were in their own bubble.

      Digital town square meets filter bubble effect

      During the COVID-19 pandemic, Twitter gave voice to researchers, but the platform’s algorithms allowed users to self sort into groups based on what they wanted to hear.

    3. Rodrigo Costas Comesana, an information scientist at Leiden University in the Netherlands, and his colleagues published a data set of half a million Twitter users who are probably researchers. (The team used software to try to match details of Twitter profiles to those of authors on scientific papers.) In a similar, smaller 2020 study, Costas and others estimated that at least 1% of paper authors in the Web of Science had profiles on Twitter, with the proportion varying by country. A 2014 Nature survey found that 13% of researchers used Twitter regularly, although respondents were mostly English-speaking and there would have been self-selection bias (see Nature 512, 126–129; 2014).

      Perhaps few researchers on Twitter

    4. Nature 613, 19-21 (2023)

      doi: https://doi.org/10.1038/d41586-022-04506-6

    1. Recipes are usually not protected by copyright due to the idea-expression dichotomy. The idea-expression dichotomy creates a dividing line between ideas, which are not protected by copyright law, and the expression of those ideas, which can be protected by copyright law.

      Copyright’s idea-expression dichotomy

      In the context of the fediverse thread: So the list of ingredients and the steps to reproduce a dish are not covered by U.S. copyright law. If I described the directions in iambic pentameter, that expression is subject to copyright. But someone could reinterpret the directions in limerick form, and that expression would not violate the first copyright and would itself be copyrightable.

    2. Found via a fediverse post by Carl Malamud: https://code4lib.social/@carlmalamud@official.resource.org/109574978910574688

    1. Three weeks ago, an experimental chat bot called ChatGPT made its case to be the industry’s next big disrupter. It can serve up information in clear, simple sentences, rather than just a list of internet links. It can explain concepts in ways people can easily understand. It can even generate ideas from scratch, including business strategies, Christmas gift suggestions, blog topics and vacation plans.

      ChatGPT's synthesis of information versus Google Search's list of links

      The key difference here, though, is that with a list of links, one can follow the links and evaluate the sources. With a ChatGPT response, there are no citations to the sources—just an amalgamation of statements that may or may not be true.

    1. Here’s an example of what homework might look like under this new paradigm. Imagine that a school acquires an AI software suite that students are expected to use for their answers about Hobbes or anything else; every answer that is generated is recorded so that teachers can instantly ascertain that students didn’t use a different system. Moreover, instead of futilely demanding that students write essays themselves, teachers insist on AI. Here’s the thing, though: the system will frequently give the wrong answers (and not just on accident — wrong answers will be often pushed out on purpose); the real skill in the homework assignment will be in verifying the answers the system churns out — learning how to be a verifier and an editor, instead of a regurgitator. What is compelling about this new skillset is that it isn’t simply a capability that will be increasingly important in an AI-dominated world: it’s a skillset that is incredibly valuable today. After all, it is not as if the Internet is, as long as the content is generated by humans and not AI, “right”; indeed, one analogy for ChatGPT’s output is that sort of poster we are all familiar with who asserts things authoritatively regardless of whether or not they are true. Verifying and editing is an essential skillset right now for every individual.

      What homework could look like in a ChatGPT world

      Critical editing becomes a more important skill than summation. When the summation synthesis comes for free, students distinguish themselves by understanding what is correct and correcting what is not. Sounds a little bit like "information literacy".

    2. That there, though, also shows why AI-generated text is something completely different; calculators are deterministic devices: if you calculate 4,839 + 3,948 - 45 you get 8,742, every time. That’s also why it is a sufficient remedy for teachers to requires students show their work: there is one path to the right answer and demonstrating the ability to walk down that path is more important than getting the final result. AI output, on the other hand, is probabilistic: ChatGPT doesn’t have any internal record of right and wrong, but rather a statistical model about what bits of language go together under different contexts. The base of that context is the overall corpus of data that GPT-3 is trained on, along with additional context from ChatGPT’s RLHF training, as well as the prompt and previous conversations, and, soon enough, feedback from this week’s release.

      Difference between a calculator and ChatGPT: deterministic versus probabilistic
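
      A minimal sketch of that contrast, assuming a made-up three-word distribution standing in for a language model's next-token probabilities. The function names, the words, and the weights are hypothetical, and nothing here reflects how ChatGPT is actually implemented. The arithmetic is the example from the quoted passage.

```python
import random


def calculator():
    """Deterministic: the same expression yields the same result on every call."""
    return 4839 + 3948 - 45


def sample_next_word(probs, rng):
    """Probabilistic: draw one word according to the weights in `probs`."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical distribution over continuations of some prompt.
toy_distribution = {"right": 0.6, "plausible": 0.3, "wrong": 0.1}

print(calculator())  # 8742, every time
for seed in range(5):
    # Different random states can produce different continuations.
    print(sample_next_word(toy_distribution, random.Random(seed)))
```

      "Show your work" is a check on the first function because there is one path to the answer. For the second, the same input can lead to different outputs, so the check has to be on the output itself.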

    1. The criticisms of ChatGPT pushed Andreessen beyond his longtime position that Silicon Valley ought only to be celebrated, not scrutinized. The simple presence of ethical thinking about AI, he said, ought to be regarded as a form of censorship. “‘AI regulation’ = ‘AI ethics’ = ‘AI safety’ = ‘AI censorship,’” he wrote in a December 3 tweet. “AI is a tool for use by people,” he added two minutes later. “Censoring AI = censoring people.” It’s a radically pro-business stance even by the free market tastes of venture capital, one that suggests food inspectors keeping tainted meat out of your fridge amounts to censorship as well.

      Marc Andreessen objects to AI regulation

    2. It’s tempting to believe incredible human-seeming software is in a way superhuman, Bloch-Wehba warned, and incapable of human error. “Something scholars of law and technology talk about a lot is the ‘veneer of objectivity’ — a decision that might be scrutinized sharply if made by a human gains a sense of legitimacy once it is automated,” she said.

      Veneer of Objectivity

      Quote by Hannah Bloch-Wehba, TAMU law professor

    1. The numbers have gone up a bit since then, but black Americans still own less than 2% of commercial radio stations in the country and it wreaked havoc on local ownershi

      Less than 2% of commercial radio stations were Black-owned in the 2010s

    2. Shock jocks were rarely political, especially in the early days were mostly just kind of like lewd or gross, shocking, you know, as is in the name, But as that sort of brash new style got popular, it became clear that political talk could bring that shock jock energy to program. This is Alan Berg. He was a liberal talk radio host out of Denver. He got started in the late 1970s and was really taking off in the 80s. He was Jewish and was known for being pretty vitriolic and calling out racism and bigotry are still firmly in control of the Soviet Union, responsible for the murder of 50 million Christians. Think your ability to reason and your program and you are a Nazi by your very own, given what we know about AM and FM talk radio. Now, this is very surprising to hear. I mean, it's got the in your face confrontational talk show vibe, but it's from the opposite side of the political spectrum, totally. It is surprising and to paint a picture of how he was received. At one point, there's this poll that goes out in Denver that asks residents to name the city's most beloved media personality and its most despised. And Alan Berg won both awards. That's a feat that's kind of incredible. And Alan Berg was super well known.

      Alan Berg, recognized as the first widely distributed political talk radio voice

      He hosted a liberal talk radio show out of Denver, Colorado. He was murdered by someone who called him on the air and whom Berg called a Nazi.

    3. the FCC starts giving broadcasters recommendations of how they can avoid that same fate, how they can satisfy that longstanding vague requirement to serve the public interest. And they really start pushing this thing called Ascertainment. Ascertainment was when people in local communities were interviewed by station officials, people who were never asked before, what do you think ought to be on radio, what do you think ought to be on tv. Now they were being asked these questions, this was done by radio stations and television stations, commercial stations, public stations across the country, You know, this seems so simple and so revolutionary at the same time, like just ask people what they want to hear about and maybe that would shape the broadcasting accordingly.

      FCC Ascertainment guidance

    4. So it's the fifties and sixties, the stations, they're cutting the NBC coverage of the civil rights movement and it's not just, you know, morally dubious, it's actually against the policies of the FCC. Exactly, right. So civil rights activists decided to put that to the test and they ended up challenging WLBT's license for repeatedly denying them airtime. At first, the FCC dismissed the case, but then the activists sued the FCC and they won. And eventually, years later, a federal court decided that WLBT could stay on the air, but their license would be transferred to a nonprofit, multiracial group of broadcasters.

      Enforcement of the FCC Fairness Doctrine

      The case was decided at the D.C. Circuit court level with the future Chief Justice Warren Burger writing the opinion. The opinion forced action at the FCC.

      Office of Commun., United Ch., Christ v. FCC. 425 F.2d 543 (D.C. Cir. 1969). (see https://casetext.com/case/office-of-commun-united-ch-christ-v-fcc)

    5. Okay, so flashback to the 1920s and the emergence of something called the public interest mandate, basically when radio was new, a ton of people wanted to broadcast the demand for space on the dial outstripped supply. So to narrow the field, the federal government says that any station using the public airwaves needs to serve the public interest. So what do they mean by the public interest? Yeah, right? It's like super vague, right? But the FCC clarified what it meant by public interest in the years following World War Two. They had seen how radio could be used to promote fascism in Europe, and they didn't want US radio stations to become propaganda outlets. And so in 1949, the FCC basically says to stations in order to serve the public, you need to give airtime to coverage of current events and you have to include multiple perspectives in your coverage. This is the basis of what comes to be known as the fairness doctrine.

      Origin of the FCC Fairness Doctrine

    1. The presence of Twitter’s code — known as the Twitter advertising pixel — has grown more troublesome since Elon Musk purchased the platform. That’s because under the terms of Musk’s purchase, large foreign investors were granted special privileges. Anyone who invested $250 million or more is entitled to receive information beyond what lower-level investors can receive. Among the higher-end investors include a Saudi prince’s holding company and a Qatari fund.

      Twitter investors may get access to user data

      I'm surprised but not surprised that Musk's dealings to get investors in his effort to take Twitter private may include sharing of personal data about users. This article makes it sound almost normal that this kind of information-sharing happens with investors (inclusion of the phrase "information beyond what lower-level investors can receive").

    1. Instead, we should adopt a vision for the best possible America for this century, one that acknowledges that people, money, goods, and expressions are going to flow across borders and oceans, but that embraces justice and human flourishing as the ends of that process instead of morally vacant values like efficiency or productivity.

      Vision for America after the current constitutional crisis

    2. We often misdiagnose our current malady as one of “polarization.” That’s wrong. We have one rogue, ethno-authoritarian party and one fairly stable and diverse party. It just looks like polarization when you map it red and blue or consider these parties to be equal in levels of mercenary commitment, which they overwhelmingly are not. In one sense, America has always been polarized, just not along partisan lines. It’s also been more polarized rather recently, as in 1919 or 1968. Instead, we suffer from judicial tyranny fueled by white supremacy. One largely unaccountable branch of government has been captured by ideologues who have committed themselves to undermining the will of the electorate on matters ranging from women’s bodily autonomy to voting rights to the ability of the executive branch to carry out the policy directives of Congress by regulating commerce and industry.

      Thesis: not polarization but white-supremacy-filled judicial tyranny

      It isn’t clear to me that the judiciary is filled with white supremacists, but it is increasingly swinging conservative, with judges appointed by far-right ideologues fueled by white supremacism.

    1. While the datagram has served very well in solving the most important goals of the Internet, it has not served so well when we attempt to address some of the goals which were further down the priority list. For example, the goals of resource management and accountability have proved difficult to achieve in the context of datagrams. As the previous section discussed, most datagrams are a part of some sequence of packets from source to destination, rather than isolated units at the application level. However, the gateway cannot directly see the existence of this sequence, because it is forced to deal with each packet in isolation. Therefore, resource management decisions or accounting must be done on each packet separately. Imposing the datagram model on the internet layer has deprived that layer of an important source of information which it could use in achieving these goals.

      Datagrams solved the higher priority goals

      ...but 34 years later we have the same challenges with the lower priority goals.

    2. There is a mistaken assumption often associated with datagrams, which is that the motivation for datagrams is the support of a higher level service which is essentially equivalent to the datagram. In other words, it has sometimes been suggested that the datagram is provided because the transport service which the application requires is a datagram service. In fact, this is seldom the case. While some applications in the Internet, such as simple queries of date servers or name servers, use an access method based on an unreliable datagram, most services within the Internet would like a more sophisticated transport model than simple datagram. Some services would like the reliability enhanced, some would like the delay smoothed and buffered, but almost all have some expectation more complex than a datagram. It is important to understand that the role of the datagram in this respect is as a building block, and not as a service in itself.

      Datagram as the fundamental building block

      Is it any wonder then that QUIC—a TCP-like stateful connection—is being engineered using UDP?

    3. This problem was particularly aggravating because the goal of the Internet project was to produce specification documents which were to become military standards. It is a well known problem with government contracting that one cannot expect a contractor to meet any criteria which is not a part of the procurement standard. If the Internet is concerned about performance, therefore, it was mandatory that performance requirements be put into the procurement specification. It was trivial to invent specifications which constrained the performance, for example to specify that the implementation must be capable of passing 1,000 packets a second. However, this sort of constraint could not be part of the architecture, and it was therefore up to the individual performing the procurement to recognize that these performance constraints must be added to the specification, and to specify them properly to achieve a realization which provides the required types of service

      Procurement standards meet experimental factors

      I'm finding it funny to read this artifact of its time with the construction of the protocol standards and the purchase of hardware that met that standard. As I can imagine, it was all new in the 1970s and 1980s, and it was evolving quickly. Procurement rules are a pain no matter what the decade they are in.

    4. Put another way, the architecture tried very hard not to constrain the range of service which the Internet could be engineered to provide. This, in turn, means that to understand the service which can be offered by a particular implementation of an Internet, one must look not to the architecture, but to the actual engineering of the software within the particular hosts and gateways, and to the particular networks which have been incorporated.

      Upper part of the hourglass protocol shape

    5. Another possible source of inefficiency is retransmission of lost packets. Since Internet does not insist that lost packets be recovered at the network level, it may be necessary to retransmit a lost packet from one end of the Internet to the other. This means that the retransmitted packet may cross several intervening nets a second time, whereas recovery at the network level would not generate this repeat traffic. This is an example of the tradeoff resulting from the decision, discussed above, of providing services from the end-points. The network interface code is much simpler, but the overall efficiency is potentially less. However, if the retransmission rate is low enough (for example, 1%) then the incremental cost is tolerable. As a rough rule of thumb for networks incorporated into the architecture, a loss of one packet in a hundred is quite reasonable, but a loss of one packet in ten suggests that reliability enhancements be added to the network if that type of service is required.

      Inefficiency of end-to-end packet re-transmission is accepted
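
      The rule of thumb in the passage above can be checked with a little arithmetic. This is a minimal sketch, not anything from the paper: it assumes each end-to-end send is lost independently with a fixed probability and that the sender keeps retransmitting until the packet arrives, so the number of sends per delivered packet follows a geometric distribution. The function names are hypothetical.

```python
def expected_transmissions(loss_rate: float) -> float:
    """Mean number of end-to-end sends needed to deliver one packet.

    Assumes independent losses and retransmission until success,
    which gives the mean of a geometric distribution: 1 / (1 - p).
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    return 1.0 / (1.0 - loss_rate)


def overhead_percent(loss_rate: float) -> float:
    """Extra traffic caused by retransmission, as a percentage."""
    return (expected_transmissions(loss_rate) - 1.0) * 100.0


if __name__ == "__main__":
    # The two loss rates named in the quoted rule of thumb.
    for rate in (0.01, 0.10):
        print(
            f"loss {rate:.0%}: "
            f"{expected_transmissions(rate):.3f} sends per packet, "
            f"{overhead_percent(rate):.1f}% extra traffic"
        )
```

      Under these assumptions, one lost packet in a hundred costs about 1% extra traffic, while one in ten costs about 11%, and every retransmission crosses the whole path again. That is consistent with the paper treating the first as tolerable and the second as a sign that the network itself needs reliability enhancements.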

    6. On the other hand, some of the most significant problems with the Internet today relate to lack of sufficient tools for distributed management, especially in the area of routing. In the large internet being currently operated, routing decisions need to be constrained by policies for resource usage. Today this can be done only in a very limited way, which requires manual setting of tables. This is error-prone and at the same time not sufficiently powerful. The most important change in the Internet architecture over the next few years will probably be the development of a new generation of tools for management of resources in the context of multiple administrations.

      Internet routing problems

      This was written in 1988, and is still somewhat true today.

    7. This goal caused TCP and IP, which originally had been a single protocol in the architecture, to be separated into two layers. TCP provided one particular type of service, the reliable sequenced data stream, while IP attempted to provide a basic building block out of which a variety of types of service could be built. This building block was the datagram, which had also been adopted to support survivability. Since the reliability associated with the delivery of a datagram was not guaranteed, but “best effort,” it was possible to build out of the datagram a service that was reliable (by acknowledging and retransmitting at a higher level), or a service which traded reliability for the primitive delay characteristics of the underlying network substrate. The User Datagram Protocol (UDP) [13] was created to provide a application-level interface to the basic datagram service of Internet.

      Origin of UDP as the split of TCP and IP

      This is the center of the hourglass protocol stack shape.
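
      To make "acknowledging and retransmitting at a higher level" concrete, here is a minimal stop-and-wait sketch. It is an illustration under stated assumptions, not how TCP works: the network is simulated in-process rather than using real UDP sockets, a lost datagram stands in for a timeout, and every class and function name is hypothetical.

```python
import random


class LossyDatagramChannel:
    """Simulated best-effort network: drops each datagram with a fixed probability."""

    def __init__(self, loss_rate: float, seed: int = 0) -> None:
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def deliver(self, datagram):
        """Return the datagram, or None if the network dropped it."""
        return None if self.rng.random() < self.loss_rate else datagram


class Receiver:
    """Accepts datagrams in order, discards duplicates, and always acknowledges."""

    def __init__(self) -> None:
        self.expected_seq = 0
        self.received = []

    def on_datagram(self, datagram):
        seq, payload = datagram
        if seq == self.expected_seq:
            self.received.append(payload)
            self.expected_seq += 1
        # Re-acknowledge duplicates too, so a sender whose earlier
        # acknowledgment was lost can still move on.
        return seq


def send_reliably(messages, channel, receiver, max_tries=50):
    """Resend each datagram until its acknowledgment comes back."""
    transmissions = 0
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            arrived = channel.deliver((seq, payload))
            if arrived is None:
                continue  # data lost: the sender would time out and resend
            ack = channel.deliver(receiver.on_datagram(arrived))
            if ack == seq:
                break  # acknowledged: move to the next message
            # acknowledgment lost: resend, and the receiver drops the duplicate
        else:
            raise RuntimeError(f"gave up on datagram {seq}")
    return transmissions


if __name__ == "__main__":
    messages = ["alpha", "bravo", "charlie", "delta"]
    receiver = Receiver()
    channel = LossyDatagramChannel(loss_rate=0.3, seed=1)
    sent = send_reliably(messages, channel, receiver)
    print(receiver.received == messages, sent)
```

      The network in the sketch never promises delivery; all of the reliability lives in the two endpoints, which is the division of labor the passage describes.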

    8. It was very important for the success of the Internet architecture that it be able to incorporate and utilize a wide variety of network technologies, including military and commercial facilities. The Internet architecture has been very successful in meeting this goal: it is operated over a wide variety of networks, including long haul nets (the ARPANET itself and various X.25 networks), local area nets (Ethernet, ringnet, etc.), broadcast satellite nets (the DARPA Atlantic Satellite Network [14, 15] operating at 64 kilobits per second and the DARPA Experimental Wideband Satellite Net [16], operating within the United States at 3 megabits per second), packet radio networks (the DARPA packet radio network, as well as an experimental British packet radio net and a network developed by amateur radio operators), a variety of serial links, ranging from 1200 bit per second asynchronous connections to T1 links, and a variety of other ad hoc facilities, including intercomputer busses and the transport service provided by the higher layers of other network suites, such as IBM’s HASP.

      Lower part of the hourglass protocol stack shape

    9. Another service which did not fit TCP was real time delivery of digitized speech, which was needed to support the teleconferencing aspect of command and control applications. In real time digital speech, the primary requirement is not a reliable service, but a service which minimizes and smooths the delay in the delivery of packets.

      Considerations for digital speech in 1988

    10. There are two consequences to the fate-sharing approach to survivability. First, the intermediate packet switching nodes, or gateways, must not have any essential state information about on-going connections. Instead, they are stateless packet switches, a class of network design sometimes called a “datagram” network. Secondly, rather more trust is placed in the host machine than in an architecture where the network ensures the reliable delivery of data. If the host resident algorithms that ensure the sequencing and acknowledgment of data fail, applications on that machine are prevented from operation.

      Fate-sharing approach to survivability

    11. It was an assumption in this architecture that synchronization would never be lost unless there was no physical path over which any sort of communication could be achieved. In other words, at the top of transport, there is only one failure, and it is total partition. The architecture was to mask completely any transient failure.

      Never a failure until there was no path

      I remember being online for the Northridge earthquake in the Los Angeles area in January 1994. IRC was a robust tool for getting information in and out: text-based (so low bandwidth), ability to route around circuit failure.

    12. For example, since this network was designed to operate in a military context, which implied the possibility of a hostile environment, survivability was put as a first goal, and accountability as a last goal. During wartime, one is less concerned with detailed accounting of resources used than with mustering whatever resources are available and rapidly deploying them in an operational manner. While the architects of the Internet were mindful of accountability, the problem received very little attention during the early stages of the design, and is only now being considered. An architecture primarily for commercial deployment would clearly place these goals at the opposite end of the list.

      Military context first

      In order of priority, a network designed to be resilient in a hostile environment is more important than a network that has an accountable architecture. The paper even goes on to say that a commercial network would have a different architecture.

    13. From these assumptions comes the fundamental structure of the Internet: a packet switched communications facility in which a number of distinguishable networks are connected together using packet communications processors called gateways which implement a store and forward packet forwarding algorithm.

      Fundamental structure of the internet

      Effective network characteristics (in order of importance, from the paper):

      1. Internet communication must continue despite loss of networks or gateways.
      2. The Internet must support multiple types of communications service.
      3. The Internet architecture must accommodate a variety of networks.
      4. The Internet architecture must permit distributed management of its resources.
      5. The Internet architecture must be cost effective.
      6. The Internet architecture must permit host attachment with a low level of effort.
      7. The resources used in the Internet architecture must be accountable.

    14. The technique selected for multiplexing was packet switching. An alternative such as circuit switching could have been considered, but the applications being supported, such as remote login, were naturally served by the packet switching paradigm, and the networks which were to be integrated together in this project were packet switching networks. So packet switching was accepted as a fundamental component of the Internet architecture.

      Packet-switched versus circuit-switched

      The first networks were packet-switched over circuits. (I remember the 56Kbps circuit modems that were upgraded to T1 lines.) Of course, it has switched now—circuits are emulated over packet switched networks.

    15. Further, networks represent administrative boundaries of control, and it was an ambition of this project to come to grips with the problem of integrating a number of separately administrated entities into a common utility.

      Integrating separately administered networks

      This is prefaced with the word "further," but I think this agreement about interconnectivity was perhaps the design principle most key to the ultimate strength of the "inter-net." The devolution of control and the rise of the internet exchange points (IXPs) certainly fueled growth faster than a top-down approach would have.

    16. The components of the Internet were networks, which were to be interconnected to provide some larger service. The original goal was to connect together the original ARPANET with the ARPA packet radio network, in order to give users on the packet radio network access to the large service machines on the ARPANET.

      Original goal to connect ARPA packet radio network with ARPANET

      I hadn't heard this before. As I was coming up in my internet education in the late 1980s, I remember discussions about connectivity with ALOHAnet in Hawaii.

    17. The connectionless configuration of ISO protocols has also been colored by the history of the Internet suite, so an understanding of the Internet design philosophy may be helpful to those working with ISO.

      ISO protocols

      At one point, the Open Systems Interconnection model (OSI model) was the leading contender for the network standard. It didn't survive in competition with the more nimble TCP/IP stack design.

    18. D. Clark. 1988. The design philosophy of the DARPA internet protocols. In Symposium proceedings on Communications architectures and protocols (SIGCOMM '88). Association for Computing Machinery, New York, NY, USA, 106–114. https://doi.org/10.1145/52324.52336

      The Internet protocol suite, TCP/IP, was first proposed fifteen years ago. It was developed by the Defense Advanced Research Projects Agency (DARPA), and has been used widely in military and commercial systems. While there have been papers and specifications that describe how the protocols work, it is sometimes difficult to deduce from these why the protocol is as it is. For example, the Internet protocol is based on a connectionless or datagram mode of service. The motivation for this has been greatly misunderstood. This paper attempts to capture some of the early reasoning which shaped the Internet protocols.

    1. Just for reference I believe that the "more speech" idea originated with Louis Brandeis, who was a brilliant thinker and one of the important liberal Supreme Court justices of the 20th century. The actual quote is: "If there be time to expose through discussion the falsehood and fallacies, to avert the evil by the process of education, the remedy to be applied is more speech, not enforced silence." [0] Louis Brandeis did believe that context and specifics are important, so I think the technology of the online platform is significant especially with respect to the first part of that quote. [0] https://tile.loc.gov/storage-services/service/ll/usrep/usrep...

      Brandeis more speech quote context

    2. The hypothesis is that hate speech is met with other speech in a free marketplace of ideas. That hypothesis only functions if users are trapped in one conversational space. What happens instead is that users choose not to volunteer their time and labor to speak around or over those calling for their non-existence (or for the non-existence of their friends and loved ones) and go elsewhere... Taking their money and attention with them. As those promulgating the hate speech tend to be a much smaller group than those who leave, it is in the selfish interest of most forums to police that kind of signal jamming to maximize their possible user-base. Otherwise, you end up with a forum full mostly of those dabbling in hate speech, which is (a) not particularly advertiser friendly, (b) hostile to further growth, and (c) not something most people who get into this gig find themselves proud of.

      Battling hate speech is different when users aren't trapped

      When targeted users are not trapped on a platform, they have the choice to leave rather than explain themselves and/or overwhelm the hate speech. When those users leave, the platform becomes less desirable for others (the concentration of hate speech increases) and it becomes a vicious cycle downward.

    1. The ability for users to choose if they wish to be collateral damage is what makes Mastodon work. If an instance is de-federated due to extremism, the users can pressure their moderators to act in order to gain re-federation. Otherwise, they must make the decision if to go down with the ship or simply move. This creates a healthy self-regulating ecosystem where once an instance starts to get de-federated, reasonable users will move their accounts, leaving behind unreasonable ones, which further justifies de-federation, and will lead to more and more instances choosing to de-federate the offending one.

      De-federation feedback loop

      If an instance owner isn't moderating effectively, other instances will start de-federating. Users on the de-federated instance can "go down with the ship or simply move". When users move off an instance, it increases the concentration of bad actors on that instance and increases the likelihood that others will de-federate.

    2. Most Mastodon servers run on donations, which creates a very different dynamic. It is very easy for a toxic platform to generate revenue through ad impressions, but most people are not willing to pay hard-earned money to get yelled at by extremists all day. This is why Twitter’s subscription model will never work. With Mastodon, people find a community they enjoy, and thus are happy to donate to maintain. Which add a new dynamic. Since Mastodon is basically a network of communities, it is expected that moderators are responsible for their own community, lowering the burden for everyone. Let’s say you run a Mastodon instance and a user of another instance has become problematic towards your users. You report them to their instance’s moderators, but the moderators decline to act. What can you do? Well a lot, actually.

      Accountability economy

      Assuming instance owners want their instance to thrive, they are accountable to the users—who are also donating funds to run the server. Mastodon also provides easy ways to block users or instances, and if bad actors start populating an instance, the instance gets a bad name and is de-federated by others. Users on the de-federated instance now have the option to stick around or go to another instance so they are reachable again.

    3. What I missed about Mastodon was its very different culture. Ad-driven social media platforms are willing to tolerate monumental volumes of abusive users. They’ve discovered the same thing the Mainstream Media did: negative emotions grip people’s attention harder than positive ones. Hate and fear drives engagement, and engagement drives ad impressions. Mastodon is not an ad-driven platform. There is absolutely zero incentives to let awful people run amok in the name of engagement. The goal of Mastodon is to build a friendly collection of communities, not an attention leeching hate mill. As a result, most Mastodon instance operators have come to a consensus that hate speech shouldn’t be allowed. Already, that sets it far apart from twitter, but wait, there’s more. When it comes to other topics, what is and isn’t allowed is on an instance-by-instance basis, so you can choose your own adventure.

      Attention economy

      Twitter drivers: Hate/fear → Engagement → Impressions → Advertiser money. Since there is no advertising money in Mastodon, it operates on different drivers. Since there is no advertising money, a Mastodon operator isn't driven to get the most impressions. Because there isn't a need to get a high number of impressions, there isn't a need to fuel the hate/fear drivers.

  4. Nov 2022
    1. As users begin migrating to the noncommercial fediverse, they need to reconsider their expectations for social media — and bring them in line with what we expect from other arenas of social life. We need to learn how to become more like engaged democratic citizens in the life of our networks.

      Fediverse should mean engaged citizens

    2. Because Mastodon is designed more for chatter than governance, we use a separate platform, Loomio, for our deliberation and decision-making.

      social.coop uses Loomio for governance

    3. We believe that it is time to embrace the old idea of subsidiarity, which dates back to early Calvinist theology and Catholic social teaching. The European Union’s founding documents use the term, too. It means that in a large and interconnected system, people in a local community should have the power to address their own problems. Some decisions are made at higher levels, but only when necessary. Subsidiarity is about achieving the right balance between local units and the larger systems.

      Defining "subsidiarity"

      The FOLIO community operates like this: the Special Interest Groups have the power to decide for their functional area, and topics that cross functional areas are decided between SIGs or are brought to a higher level council.

    1. Nevertheless, from the standpoint of learning theory, these and other authors have it backward, because a steep learning curve, i.e., a curve with a large positive slope, is associated with a skill that is acquired easily and rapidly (Hopper et al., 2007).

      Steep learning curve

      I don't think I'll ever hear this phrase the same again. A steep learning curve is a good thing...meaning over time that it was very easy to learn (less time on the x axis).

    2. Nevertheless, even ardent proponents of the view that DID is a naturally occurring condition that stems largely from childhood trauma (e.g., Ross, 1994) acknowledge that “multiple personality disorder” is a misnomer (Lilienfeld and Lynn, 2015), because individuals with DID do not genuinely harbor two or more fully developed personalities

      Multiple personality disorder

      The preferred term has been dissociative identity disorder since 1994.

    3. There is no known “optimal” level of neurotransmitters in the brain, so it is unclear what would constitute an “imbalance.” Nor is there evidence for an optimal ratio among different neurotransmitter levels. Moreover, although serotonin reuptake inhibitors, such as fluoxetine (Prozac) and sertraline (Zoloft), appear to alleviate the symptoms of severe depression, there is evidence that at least one serotonin reuptake enhancer, namely tianeptine (Stablon), is also efficacious for depression (Akiki, 2014). The fact that two efficacious classes of medications exert opposing effects on serotonin levels raises questions concerning a simplistic chemical imbalance model.

      Chemical imbalance

      We don't (yet) know what the proper balance of brain chemistry would be, so saying that mental illness is caused by a chemical imbalance is problematic. There are drugs that effectively treat depression that both decrease and increase serotonin, so because of these opposite effects it is hard to say what the proper amount should be.

    4. Furthermore, there are ample reasons to doubt whether “brainwashing” permanently alters beliefs (Melton, 1999). For example, during the Korean War, only a small minority of the 3500 American political prisoners subjected to intense indoctrination techniques by Chinese captors generated false confessions. Moreover, an even smaller number (probably under 1%) displayed any signs of adherence to Communist ideologies following their return to the US, and even these were individuals who returned to Communist subcultures


      The techniques of "brainwashing" aren't that much different from other persuasion methods. This term originated in the Korean War, and subsequent studies suggested that there are no permanent alterations to beliefs.

    5. numerous scholars have warned of the jingle and jangle fallacies, the former being the error of referring to different constructs by the same name and the latter the error of referring to the same construct by different names (Kelley, 1927; Block, 1995; Markon, 2009). As an example of the jingle fallacy, many authors use the term “anxiety” to refer interchangeably to trait anxiety and trait fear. Nevertheless, research consistently shows that fear and anxiety are etiologically separable dispositions and that measures of these constructs are only modestly correlated (Sylvers et al., 2011). As an example of the jangle fallacy, dozens of studies in the 1960s focused on the correlates of the ostensibly distinct personality dimension of repression-sensitization (e.g., Byrne, 1964). Nevertheless, research eventually demonstrated that this dimension was essentially identical to trait anxiety (Watson and Clark, 1984). In the field of social psychology, Hagger (2014) similarly referred to the “deja variable” problem, the ahistorical tendency of researchers to concoct new labels for phenomena that have long been described using other terminology (e.g., the use of 15 different terms to describe the false consensus effect; see Miller and Pedersen, 1999).

      Jingle and Jangle Fallacies

      Jingle: referring to different things by the same word

      Jangle: referring to a single thing with different words

    6. Lilienfeld, S. O., Sauvigné, K. C., Lynn, S. J., Cautin, R. L., Latzman, R. D., & Waldman, I. D. (2015). Fifty psychological and psychiatric terms to avoid: a list of inaccurate, misleading, misused, ambiguous, and logically confused words and phrases. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2015.01100

    1. The Great Depression and its aftermath offered Schmitz and a colleague one such opportunity. By comparing markers of ageing in around 800 people who were born throughout the 1930s, the team observed that those born in US states hit hardest by the recession — where unemployment and wage reductions were highest — have a pattern of markers that make their cells look older than they should. The impact was diminished in people who were born in states that fared better during the 1930s. The cells could have altered the epigenetic tags during early childhood or later in life. But the results suggest that some sort of biological foundation was laid before birth for children of the Great Depression that affected how they would age, epigenetically, later in life.

      Aging markers affected in utero.

    1. A 2020 study by the European Union found that contrails and other non-CO2 aircraft emissions warm the planet twice as much as the carbon dioxide released by airplanes.

      From the intermediate linked blog post:

      Using a derivative metric of the Global Warming Potential (100), the GWP, aviation emissions are currently warming the climate at approximately three times the rate of that associated with CO2 emissions alone.

      pp. 35-36 of EASA report for European Commission, (2020). Updated analysis of the non-CO2 climate impacts of aviation and potential policy measures pursuant to the EU Emissions Trading System Directive Article 30(4). https://eur-lex.europa.eu/resource.html?uri=cellar:7bc666c9-2d9c-11eb-b27b-01aa75ed71a1.0001.02/DOC_1&format=PDF

    1. Meta collects so much data even the company itself sometimes may be unaware of where it ends up. Earlier this year Vice reported on a leaked Facebook document written by Facebook privacy engineers who said the company did not “have an adequate level of control and explainability over how our systems use data,” making it difficult to promise it wouldn’t use certain data for certain purposes.

      Poor data controls at Facebook

    2. Some of the sensitive data collection analyzed by The Markup appears linked to default behaviors of the Meta Pixel, while some appears to arise from customizations made by the tax filing services, someone acting on their behalf, or other software installed on the site. For example, Meta Pixel collected health savings account and college expense information from H&R Block’s site because the information appeared in webpage titles and the standard configuration of the Meta Pixel automatically collects the title of a page the user is viewing, along with the web address of the page and other data. It was able to collect income information from Ramsey Solutions because the information appeared in a summary that expanded when clicked. The summary was detected by the pixel as a button, and in its default configuration the pixel collects text from inside a clicked button. The pixels embedded by TaxSlayer and TaxAct used a feature called “automatic advanced matching.” That feature scans forms looking for fields it thinks contain personally identifiable information like a phone number, first name, last name, or email address, then sends detected information to Meta. On TaxSlayer’s site this feature collected phone numbers and the names of filers and their dependents. On TaxAct it collected the names of dependents.

      Meta Pixel default behavior is to parse and send sensitive data

      Wait, wait, wait... the software has a feature that scans for personally identifiable information and sends that detected info to Meta? And in other cases, the users of the Meta Pixel decided to send private information to Meta?

    1. We’re not mandating content warnings. I think I’ve kind of had every single opinion that one can have about this. My first response, which I think is most journalists’ first response, was, “Who are these precious snowflakes?” Then a bunch of people said, “No, that’s not how to think about it; it’s really just the subject line of an email,” and if I had the right to send you an email where you had to see the whole thing, that’d be kind of annoying. But then a lot of people in the BIPOC community said, “The way this is being used on Mastodon is often to shield White people from racism and homophobia and other issues.” And so I’m very sympathetic to that as well. I think the solution Eugen came up with is the right solution: It’s a tool, and you can use it if you want to.

      Content Warnings

      What Davidson doesn't mention here is a Mastodon feature that I find fascinating. Sure, the person who creates the post can have a content warning, but the viewer also has the ability to set keywords that they want hidden behind a Content Warning (or simply blocked).

    2. Everyone who goes through the exercise of “what is journalism?” quickly learns there are no obvious, uncontroversial answers. We had a conversation this morning about somebody who has a blog about beer. We said, well, this person does reporting, they actually interview people, they look at statistics, they’re not just sharing their opinion on beer. And it felt like, yeah, that’s journalism. Now, would we make that decision a month from now? I don’t know. I don’t think it’s appropriate for me to get into specifics, but we’ve had some tricky edge cases. Inherently, it’s tricky.

      Distributed verification, or "What is Journalism?"

      The admins of the journa.host server are now taking on the verification task. The example Davidson uses is a beer blog; the blog is more than opinion, so for the moment that person is added.

      So what is the role of professional organizations and societies to create a fediverse home for recognized members? This doesn't seem sustainable...particularly since people set the dividing lines between their professional and personal interests in different places.

      Spit-balling here...this reminds me somewhat of the Open Badges effort of Mozilla and IMS Global. If something like that was built into the Mastodon profile, then there would be transparency with a certifying agency.

    3. The Twitter blue check, for all the hate I have given Twitter over the years, is a public good. It is good, in my view, that when you read a news article or view a post, you can know with confidence it’s the journalist at that institution. It doesn’t mean they’re 100 percent right or 100 percent ethical, but it does mean that’s a person who is in some way constrained by journalism ethics.

      Twitter Blue Check as a public good

      There was some verification process behind the pre-Musk blue check, and that was of benefit to those reading and evaluating the veracity of the information. Later, Davidson points out that "journalism had outsourced that whole process...to whoever happened to work at Twitter."

    4. Davidson: I think the interface on Mastodon makes me behave differently. If I have a funny joke or a really powerful statement and I want lots of people to hear it, then Twitter’s way better for that right now. However, if something really provokes a big conversation, it’s actually fairly challenging to keep up with the conversation on Twitter. I find that when something gets hundreds of thousands of replies, it’s functionally impossible to even read all of them, let alone respond to all of them. My Twitter personality, like a lot of people’s, is more shouting. Whereas on Mastodon, it’s actually much harder to go viral. There’s no algorithm promoting tweets. It’s just the people you follow. This is the order in which they come. It’s not really set up for that kind of, “Oh my god, everybody’s talking about this one post.” It is set up to foster conversation. I have something like 150,000 followers on Twitter, and I have something like 2,500 on Mastodon, but I have way more substantive conversations on Mastodon even though it’s a smaller audience. I think there’s both design choices that lead to this and also just the vibe of the place where even pointed disagreements are somehow more thoughtful and more respectful on Mastodon.

      Twitter for Shouting; Mastodon for Conversation

      Having many, many followers on Twitter makes it hard for conversations to happen, as does the algorithm-driven promotion. Fewer followers and anti-viral UX make for more conversations even if the reach isn't as far.

    1. I just learned this idea of anchor institution at the Association of Rural and Small Libraries Conference. There are institutions that anchor communities. Right. So that the hospital is one. Lots of people work there. Everyone goes there at some point, has a role to play in the community and the library is similar. You'll often get people who will say that libraries are irrelevant, but that just means that they can afford not to use a public service. And I don't know why they are the people we ask to share their expertise on the use of public services. But most of us use the public library. Our kids get their picture books there. We maybe do passport services. Maybe the library has tech training. One of my first jobs at the public library was teaching senior citizens how to do mouse and keyboarding skills. So where else are you going to learn those things? You learn them at the library.

      Libraries as anchor institutions

      Public libraries, in particular, are the places where anyone in the community can go for services. The mission of the library is to serve the needs of the specific community it is in.

    2. BROOKE GLADSTONE It's always framed as parents rights, but according to Summer Lopez, who's the chief program officer of free expression at PEN America, most of these book bans are on books that families and children can elect to read. They're not required to read them. They just exist.   EMILY DRABINSKI One of the things I loved about libraries when I first started is that they are non-coercive learning spaces. You don't have to read anything. You can choose from anything on the shelf. And if your kid checks out something you don't want them to read, that's between you and your child and the way that you're parenting. And it just isn't something that the state needs to be involved in.

      Libraries as non-coercive learning spaces

      Citing parents' rights is a false choice. The parents do have the right to supervise what their children read. But the book is just on the shelf..."they just exist".

    3. Well, librarians are professionals. We go through a library master's degree program, and we're trained on the job to make book selections for our communities. We build collections that are responsive to the needs of the people we serve. So right now, I'm talking to you from the Graduate Center in midtown Manhattan. My liaison responsibilities here to the School of Labor and Urban Studies and to our urban education program. I'm not going to choose Gender Queer to purchase for our library, not because I'm a censor, but because that's not a book that we need in our collection right now. But I think you can tell that it's not really about the books if you look to some of the particular cases. So, for example, attacks on the Boundary County Library in Northern Idaho. This was the same set of 300 books that they want banned. The extremist right in that part of the state came after the public library there. That library didn't own any of the books that were on the list.

      Librarian training in material selection appropriate for the library's audience

    4. BROOKE GLADSTONE In the Tennessee State Assembly last April, Representative Jerry Sexton took on this question.   [CLIP]   JERRY SEXTON Let's say you take these books out of the library. What are you going to do with them? You can put them on the street, let them on fire.    JERRY SEXTON I don't have a clue, but I would burn them.

      Tennessee State Representative would burn banned books

      It's true: Representative says he would burn books deemed inappropriate by state – Tennessee Lookout

    1. “In a way, Twitter has become a kind of aggregator of information,” says Eliot Higgins, founder of open-source investigators Bellingcat, who helped bring the perpetrators who downed MH17 to justice. “A lot of this stuff you see from Ukraine, the footage comes from Telegram channels that other people are following, but they're sharing it on Twitter.” Twitter has made it easier to categorize and consume content from almost any niche in the world, tapping into a real-time news feed of relevant information from both massive organizations and small, independent voices. Its absence would be keenly felt.

      Twitter's role in aggregating world news (and reactions)

    2. For eight years, the US Library of Congress took it upon itself to maintain a public record of all tweets, but it stopped in 2018, instead selecting only a small number of accounts’ posts to capture.  “It never, ever worked,” says William Kilbride, executive director of the Digital Preservation Coalition. The data the library was expected to store was too vast, the volume coming out of the firehose too great. “Let me put that in context: it’s the Library of Congress. They had some of the best expertise on this topic. If the Library of Congress can’t do it, that tells you something quite important,” he says.

      Library of Congress' role in archiving Twitter

    3. Part of what makes Twitter’s potential collapse uniquely challenging is that the “digital public square” has been built on the servers of a private company, says  O’Connor’s colleague Elise Thomas, senior OSINT analyst with the ISD. It’s a problem we’ll have to deal with many times over the coming decades, she says: “This is perhaps the first really big test of that.”

      Public Square content on the servers of a private company

    1. Judith Cremer, the library director, said the book was added to the library after it made the William Allen White Award 2017-2018 Master List for grades 3-5, and has only been checked out four times. Cremer said parents have the option of filtering which books their children check out, and can speak to staff about limiting their children’s access to certain books. She stressed that she and her staff aren’t trying to fight the council and aren’t interested in divisive matters. She’s been at the library for almost 20 years, and just wants to serve the community. “We just are doing what public libraries do,” Cremer said. “We don’t really judge information, we are a reflection of the world and things that are in the world. We have information that has been published and mediated and checked for facts. So it’s a safe place that people can go to get access to that information. It’s not like we are handing out or advocating it in any way. It’s just there.”

      Not advocacy...just there

    2. St. Marys resident Hannah Stockman, a stay-at-home mom looking after 13 kids, said the move would be devastating for her and others like her. “At this point, it’s the only space left that we have for the public,” Stockman said. “We don’t have any pool or any other amenities through the community center. So people come here for many, many different reasons.”

      Library as community space

    1. Although complicated, Gen Z’s relationship with data privacy should be a consideration for brands when strategizing their data privacy policies and messaging for the future. Expectations around data privacy are shifting from something that sets companies apart in consumers’ minds to something that people expect the same way one might expect a service or product to work as advertised. For Gen Zers, this takes the form of skepticism that companies will keep their data safe, and their reluctance to give companies credit for getting it right means that good data privacy practices will increasingly be more about maintaining trust than building it.

      Gen-Z expectations are complicated

      Gen-Z adults have notably different expectations about data privacy than previous generations. "Libraries" wasn't among the industries that showed up in the survey results. That Gen-Z expects privacy built in makes that factor a less differentiating characteristic as compared to older generations. It might also be harder to get trust back from members of the Gen-Z population if libraries surprise those users with data handling practices that they didn't expect.

    2. The notable exception: social media companies. Gen Zers are more likely to trust social media companies to handle their data properly than older consumers, including millennials, are.

      Gen-Z is more trusting of data handling by social media companies

      For most categories of businesses, Gen Z adults are less likely to trust a business to protect the privacy of their data as compared to other generations. Social media is the one exception.

    3. Furthermore, the youngest generation is more jaded than others on this topic, finding it less surprising if a company stores data in a way that exposes it to a breach or asks for information without explaining what it will be used for.

      Disconnect between generations on privacy expectations

      This chart from the article summarizes the responses of a survey across generations of adults. It shows that Gen-Z adults are less surprised by privacy-enhancing settings, tools, and techniques, and more surprised in situations when their data is used for advertising (as compared to other generations).

    4. Gen Z came of age during a major moment in data privacy. The global conversation around the topic shifted following the Facebook/Cambridge Analytica scandal in 2018. The incident altered the public’s view of major technology companies from great innovators and drivers of the economy to entities that need more oversight. Massive regulatory efforts like the General Data Protection Regulation in Europe and the California Consumer Privacy Act have since gone into effect, with more on the way, representing a break in consumers’ trust in tech companies to sufficiently safeguard people’s data.

      Gen Z expectations for data privacy are different

      They came of age during the Facebook/Cambridge Analytica scandal. [[GDPR]] has been a thing for them and set expectations for how user data is treated (at least in Europe). These become table stakes for Gen Z users...what else is a company doing to differentiate itself?

    1. In other words, the community of early users were super serious about consent. They don’t like their utterances circulating in ways they don’t like. You could say, “well, tough; you’re posting stuff in public, right?” But since this is Mastodon, users have powerful tools for responding to actions they don’t like. If the folks on server A don’t like the behavior of people on server B, they can “defederate” from server B; everyone on server B can no longer see what folks on A are doing, and vice versa. (“Defederating” is another deep part of Mastodon’s design that is, ultimately, powerfully antiviral.)

      Defederation to combat breaches of norms

      I wonder if this sort of thing would happen if someone created an ActivityPub node that did exhibit some of these viral behaviors. Would that node be shunned by the rest of the fediverse? Will that answer be the same a year from now when the fediverse is more mainstream?
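
      The visibility rule described in the quote can be sketched roughly as follows. This is a toy model under stated assumptions: each server keeps a set of blocked servers, and a block by either side hides posts in both directions. The `Server` class and `can_see` function are hypothetical names for illustration, not Mastodon or ActivityPub APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for a fediverse server (not a real Mastodon API)."""
    name: str
    blocked: set[str] = field(default_factory=set)

    def defederate(self, other: "Server") -> None:
        # The block is recorded on this server only; can_see treats it as mutual.
        self.blocked.add(other.name)


def can_see(viewer: Server, author: Server) -> bool:
    """Assumed rule: posts flow only if neither server has blocked the other."""
    return author.name not in viewer.blocked and viewer.name not in author.blocked


a, b, c = Server("A"), Server("B"), Server("C")
a.defederate(b)  # server A dislikes server B's behavior

# Visibility between every ordered pair of distinct servers.
visibility = {
    (v.name, w.name): can_see(v, w)
    for v in (a, b, c)
    for w in (a, b, c)
    if v is not w
}
```

      In this sketch only the A/B pair loses visibility; server C still sees both, which is why a norm-breaking node would have to be blocked server by server before it was effectively shunned.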

    2. Another big, big difference with Mastodon is that it has no algorithmic ranking of posts by popularity, virality, or content. Twitter’s algorithm creates a rich-get-richer effect: Once a tweet goes slightly viral, the algorithm picks up on that, pushes it more prominently into users’ feeds, and bingo: It’s a rogue wave. On Mastodon, in contrast, posts arrive in reverse chronological order. That’s it. If you’re not looking at your feed when a post slides by? You’ll miss it.

      No algorithmic ranking on Mastodon

      To make the site sticky and drive ads, Twitter used its algorithmic ranking to find and amplify viral content.
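
      The difference between the two feed orderings can be illustrated with a small sketch. The `Post` fields and the "sort by boosts" rule are assumptions made for illustration; Twitter's actual ranking is not described in the article and is certainly more involved than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    posted_at: int  # seconds since an arbitrary epoch; larger means newer
    boosts: int     # hypothetical engagement count


posts = [
    Post("alice", posted_at=100, boosts=2),
    Post("bob", posted_at=200, boosts=900),
    Post("carol", posted_at=300, boosts=5),
]


def chronological_feed(items):
    """Newest first, ignoring engagement (the Mastodon-style ordering)."""
    return sorted(items, key=lambda p: p.posted_at, reverse=True)


def engagement_feed(items):
    """Most-boosted first: a toy stand-in for popularity ranking."""
    return sorted(items, key=lambda p: p.boosts, reverse=True)


chrono = [p.author for p in chronological_feed(posts)]
ranked = [p.author for p in engagement_feed(posts)]
```

      Under the chronological ordering, bob's heavily boosted post slides down as soon as anything newer arrives; under the engagement ordering it stays on top, which is the rich-get-richer effect the quote describes.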

    3. For example, Mastodon has no analogue of Twitter’s “quote-tweet” option. On Mastodon, you can retweet a post (they call it “boosting”). But you can’t append your own comment while boosting. You can’t quote-tweet. Whyever not? Because Mastodon’s original designer (and the community of early users) worried that quote-tweeting on Twitter had too often encouraged a lot of “would you look at this bullshit?” posts. And that early Mastodon community didn’t much like those dynamics.

      No Quote-Tweet equivalent on Mastodon

    4. As Beschizza said … “I wanted something where people could publish their thoughts without any false game of social manipulation, one-upmanship, and favor-trading.” It was, as I called it, “antiviral design”.

      Definition of "antiviral design"

      Later, Thompson says: "[Mastodon] was engineered specifically to create _friction_ — to slow things down a bit. This is a big part of why it behaves so differently from mainstream social networks."

      The intentional design decisions on Mastodon slow user activity.

    1. CASE NO. 2:22-cv-2470 docket #29

      From OCLC Online Computer Library Center, Inc. v. Clarivate, Plc, 2:22-cv-02470, the PACER mirror on CourtListener.com.

    1. By the mid-2010s, Chinese people in big cities had generally switched from using cash to using Alipay and WeChat Pay. By the end of 2021, about 64 percent of Chinese people were using mobile payment systems, according to a report by Daxue Consulting, with Alipay and WeChat Pay handling most payments. For city dwellers, the figure was 80 percent. One reason China’s government is pushing the digital yuan is to try to gain more control of how citizens make payments. For years, big tech companies were able to operate almost like public utilities, creating and effectively regulating large parts of the financial industry.

      Already high adoption of commercial digital payment systems

      Digital payments were previously in the hands of companies; the government's digital cash system could usurp those commercial systems.

    2. The central bank is building the infrastructure needed to enable sweeping adoption in years to come, signing up merchants, adapting the banking system, and developing applications such as a way to earmark money for health care or transit, he says. That lays the groundwork for eCNY to be China’s default payment system in 10 to 15 years, and it has been enough to put the project ahead of any other government-backed digital currency.

      Infrastructure for controlling spending

      Not only is the government putting the raw transaction infrastructure in place, but this sentence makes it sound like they will be able to control how money is spent. Perhaps the government could make a cash transfer to a citizen, but limit where the citizen can use that cash.

    3. Unlike a cryptocurrency like Bitcoin, the digital yuan is issued directly by China’s central bank and does not depend on a blockchain. The currency has the same value as its analog equivalent, the yuan or RMB, and for consumers the experience of using the digital yuan is not that different from any other mobile payment system or credit card. But on the back end, payments are not routed through a bank and can sometimes move without transaction fees, jumping from one e-wallet to another as easily as cash changes hands.

      Not a cryptocurrency, not a bank card

    4. The hope for government-sanctioned digital currencies is that they will improve efficiency and spur innovation in financial services. But tech and China experts watching the country’s project say that eCNY, also known as the electronic Chinese yuan or digital yuan, also opens up new forms of government surveillance and social control. The head of UK intelligence agency GCHQ, Jeremy Fleming, warned in a speech last month that Beijing could use its digital currency to monitor its citizens and eventually evade international sanctions.

      Improve economic efficiency, but also surveillance

    5. Government officials are urging citizens to adopt the official digital currency in a bid to gain more control over the economy.

    1. This circular process of issuing new shares to employees and then buying those shares back with company money – MY money as a shareholder – is called ‘sterilization’.

      Definition of stock “sterilization”

    1. “There is growing evidence that the legislative and executive branch officials are using social media companies to engage in censorship by surrogate,” said Jonathan Turley, a professor of law at George Washington University, who has written about the lawsuit. “It is axiomatic that the government cannot do indirectly what it is prohibited from doing directly. If government officials are directing or facilitating such censorship, it raises serious First Amendment questions.”

      Censorship by surrogate

      Is the government using private corporations to censor the speech of Americans?

    2. Under President Joe Biden, the shifting focus on disinformation has continued. In January 2021, CISA replaced the Countering Foreign Influence Task force with the “Misinformation, Disinformation and Malinformation” team, which was created “to promote more flexibility to focus on general MDM.” By now, the scope of the effort had expanded beyond disinformation produced by foreign governments to include domestic versions. The MDM team, according to one CISA official quoted in the IG report, “counters all types of disinformation, to be responsive to current events.” Jen Easterly, Biden’s appointed director of CISA, swiftly made it clear that she would continue to shift resources in the agency to combat the spread of dangerous forms of information on social media.

      MDM == Misinformation, Disinformation, and Malinformation.

      These definitions are from earlier in the article:

      * misinformation (false information spread unintentionally)
      * disinformation (false information spread intentionally)
      * malinformation (factual information shared, typically out of context, with harmful intent)

    3. The stepped up counter-disinformation effort began in 2018 following high-profile hacking incidents of U.S. firms, when Congress passed and President Donald Trump signed the Cybersecurity and Infrastructure Security Agency Act, forming a new wing of DHS devoted to protecting critical national infrastructure. An August 2022 report by the DHS Office of Inspector General sketches the rapidly accelerating move toward policing disinformation. From the outset, CISA boasted of an “evolved mission” to monitor social media discussions while “routing disinformation concerns” to private sector platforms.

      High-profile hacking opens door

      In response to the foreign election interference in 2016 and high-profile hacking of U.S. corporations, the 2018 Cybersecurity and Infrastructure Security Agency Act expanded the DHS powers to protect critical national infrastructure. The article implies that the social media monitoring is grounded in that act.

    4. DHS’s mission to fight disinformation, stemming from concerns around Russian influence in the 2016 presidential election, began taking shape during the 2020 election and over efforts to shape discussions around vaccine policy during the coronavirus pandemic. Documents collected by The Intercept from a variety of sources, including current officials and publicly available reports, reveal the evolution of more active measures by DHS. According to a draft copy of DHS’s Quadrennial Homeland Security Review, DHS’s capstone report outlining the department’s strategy and priorities in the coming years, the department plans to target “inaccurate information” on a wide range of topics, including “the origins of the COVID-19 pandemic and the efficacy of COVID-19 vaccines, racial justice, U.S. withdrawal from Afghanistan, and the nature of U.S. support to Ukraine.”

      DHS pivots as "war on terror" winds down

      The U.S. Department of Homeland Security pivots from externally-focused terrorism to domestic social media monitoring.

    1. the container ship was simply becoming so large so unwieldy that much of the infrastructure around them is struggling to cope a lot of the decisions to build Supply chains were really based on

      Impact of cheap transportation

      production costs and transport costs

      With transportation costs so low and logistics assumed, manufacturers chased cheaper production costs. They would outsource manufacturing to low-cost countries without considering the complexity risks.

    2. the industry agreed that the standard container sizes would be 20

      Standardization effort

      feet and 40 feet.

      In 10 years of negotiation, the standards committee agreed on the container specifications, including dimensions and corner post attachments.

    3. at the same time the US Army had been experimenting and seeing success with their smaller Container Express or CONEX boxes during the Korean war

      U.S. Army containerization efforts in the Korean War

      Somewhere I read about how containerization was driven by the U.S. Army's needs to standardize transport to Korea, and the Oakland, California, docks were the first to see container cranes. I can't find the source of this anymore, though.

    4. he understood that rather than adapting the container to suit the industry it was the industry in its entirety that would have to adapt trucks trains and ships ports and dockyards would all have to

      Containerization's disruptive innovation

      fit the container not the other way around.

      McLean's big contribution was recognizing the need for an upheaval in the industry: the standardized container was the building block, and everything else around it had to change.

      The resulting disruption affected dock workers, the support infrastructure around ports, and even the port cities themselves.

    5. he'd over time built a very large trucking company in the United States he became worried in the

      Malcom McLean

      early 1950s because there was an automotive boom in the United States there were a lot more cars on the road this was slowing down his lorries and he thought that maybe if he were able to put his trucks onto a ship and carry them down the Atlantic coast that he'd have lower costs and more reliable delivery.

      Malcom McLean - Wikipedia

      Malcom's thought was to put the whole truck on a ship, but that wasn't effective. Instead, he put just the container from the truck body on the ship. The multimodal innovation between truck and ship proved crucial to the standardization of shipping containers. He built on earlier work by Keith Tantlinger to modernize containers.

    6. it took about 11 and a half days to actually go across the Atlantic and it took about six or four days to actually unload it all in Germany

      About 12% of the cost was the actual ship movement while almost 40% was the work of the longshoremen on either end.

    7. break bulk shipping tended to be slow a vessel could spend a week or more at the dock

      Breakbulk Cargo defined

      being unloaded and reloaded as each of the individual items in the hold had to be removed and then each of the individual outgoing items had to be stowed away in the hold

      Breakbulk cargo - Wikipedia

      The inefficiencies caused international shipping to be slow, expensive, and subject to damage and theft. The U.S. government conducted a study in 1954 that quantified the problems with breakbulk shipping. Not only was the act of loading and unloading the cargo ship inefficient, but the need to warehouse, palletize, and store the inconsistently-shaped items was a problem.

    8. Mark Levinson who has literally written the book on the shipping container called The Box how the shipping container made the world smaller and the
    9. How Shipping Containers Took Over the World (then broke it) by Calum on YouTube

      Oct 5, 2022

      The humble shipping container changed our society - it made international shipping cheaper, economies larger and the world much, much smaller. But what did the shipping container replace, how did it take over shipping and where has our dependence on these simple metal boxes led us?

  5. Oct 2022
    1. Mastodon gained 22,139 new accounts this past week and 10,801 in the day after Musk took over, said Mastodon chief executive Eugen Rochko. The site now has more than 380,000 monthly active users, while Twitter has 237.8 million daily active users.

      Comparison of Mastodon and Twitter active user counts

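      Nearly three orders of magnitude apart: Twitter's figure is roughly 600 times Mastodon's. The comparison understates the gap, because Mastodon's number counts monthly active users while Twitter's counts daily active users.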

    1. Claudia requested support through the Teleperformance scheme, which had to be approved by a supervisor, but she did not receive any help for two months. When the company’s mental health support staff finally did get in touch, they said they were unable to help her and told her to seek out support through the Colombian healthcare system.

      Company redirects employees to national healthcare system for mental health support

    2. Some social media platforms struggle with even relatively simple tasks, such as detecting copies of terrorist videos that have already been removed. But their task becomes even harder when they are asked to quickly remove content that nobody has ever seen before. “The human brain is the most effective tool to identify toxic material,” said Roi Carthy, the chief marketing officer of L1ght, a content moderation AI company. Humans become especially useful when harmful content is delivered in new formats and contexts that AI may not identify. “There’s nobody that knows how to solve content moderation holistically, period,” Carthy said. “There’s no such thing.”

      Marketing officer for an AI content moderation company says it is an unsolved problem

    1. Advocate Aurora Health says it embedded pixel tracking technologies into its patient portals and some of its scheduling widgets in a bid to "better understand patient needs and preferences."

      Alternate: “Springfield USA Library says it embedded pixel tracking technologies into its discovery portals and some of its contact-a-librarian widgets in a bid to ‘better understand customer needs and preferences.’”

    2. A Midwestern hospital system is treating its use of Google and Facebook web tracking technologies as a data breach, notifying 3 million individuals that the computing giants may have obtained patient information.

      Substitute “library” for “hospital”

      In an alternate universe: “A Midwestern library system is treating its use of Google and Facebook web tracking technologies as a data breach, notifying 3 million individuals that the computing giants may have obtained search and borrowing histories.”

    1. It is the work itself that is copyrighted, not the form.56 While works must be in a fixed form to qualify for copyright protection, that protection is for the work itself. Some forms are necessarily part of some types of works (e.g., sculpture), but this cannot be said of most printed works.57 The form in which a work is fixed is irrelevant, and Congress recognized the importance of media neutrality when it adopted the language in the Copyright Act.58 Digitization changes only the form, and “the ‘transfer of a work between media’ does not ‘alte[r] the character of’ that work for copyright purposes.”

      Content, not form, is Copyrighted

      Wu's comment on New York Times Co. v. Tasini: "Digitization changes only the form, and 'the transfer of a work between media does not alter the character of that work for copyright purposes.'"

    2. First, digitization and distribution would not be done for commercial gain and would be handled in a manner completely consistent with a library’s function. Because the library would not be increasing the number of copies available for use at any given time, the digital copy would not serve as a substitute for an additional subscription or purchase. Should demand be so great that multiple copies were needed simultaneously, TALLO would need to purchase or license additional copies or individual libraries within the consortium would need to make local purchases.

      Origin of own-to-loan concept
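
      The constraint in this passage (never more copies in circulation than the library owns) amounts to a simple counter. A minimal sketch, assuming a hypothetical `OwnedTitle` class that is not drawn from the paper or from any lending system:

```python
class OwnedTitle:
    """Hypothetical model: loans (print or digital) may never exceed copies owned."""

    def __init__(self, title: str, copies_owned: int) -> None:
        self.title = title
        self.copies_owned = copies_owned
        self.on_loan = 0

    def checkout(self) -> bool:
        """Lend one copy if any are free; otherwise refuse."""
        if self.on_loan >= self.copies_owned:
            return False
        self.on_loan += 1
        return True

    def checkin(self) -> None:
        """Return one copy; raises if nothing is on loan."""
        if self.on_loan == 0:
            raise ValueError("no copies are on loan")
        self.on_loan -= 1


# Illustrative title and count; the consortium owns two copies.
treatise = OwnedTitle("Example Treatise", copies_owned=2)
results = [treatise.checkout() for _ in range(3)]  # third request must wait
treatise.checkin()
after_return = treatise.checkout()
```

      The third simultaneous request is refused until a copy comes back, which is the point at which the passage says the consortium would need to purchase or license additional copies.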

    3. Instead of the current practice of forming regional or bilateral agreements for resource sharing, law libraries could form a national consortium through which a centralized collection would be established. The TALLO consortium would serve as a kind of jointly owned acquisitions department for member libraries. The discussion here is limited to a collection of print and microform acquisitions and donations

      TALLO's vision included a "jointly owned acquisitions department"

      The jointly owned acquisitions department would have dedicated staff, centralized storage and collection development policies (including preservation), and digitization capabilities.

    4. With each purchase decision, libraries risk either losing future access to databases (including retrospective content) and experiencing greater restrictions on use through license terms than are

      Library-acquired information at long-term risk

      available to publishers under copyright, or keeping materials in print even though they might not be used as often as an online equivalent.

    5. I believe it is possible to build a digital library that respects both of the intended beneficiaries of the Copyright Clause—copyright owners and society—while testing commonly held assumptions about the limitations of copyright law. In balancing these goals, TALLO permits circulation of the exact number of copies purchased, thereby acknowledging the rights inherent in copyright, but it liberates the form of circulation from the print format.

      Liberating purchased information from the form in which it was purchased

    6. academic law libraries pool resources, through a consortium, to create a centralized collection of legal materials, including copyrighted materials, and to digitize those materials for easy, cost-effective access by all consortium members. For the sake of expediency, this proposal will be referred to here as TALLO (Taking Academic Law Libraries Online) and the proposed consortium as the TALLO consortium.

      Coining "TALLO" (Taking Academic Law Libraries Online)

      The [[Controlled Digital Lending]] theory was first proposed as a way for academic law libraries to form a consortium to share the expense of collection-building.

    1. An oracle is a conventional program which runs off the blockchain and which periodically publishes information about the world onto the blockchain. The problem is trust. Using an oracle turns your clever blockchain program into a fairly pointless appendage to the much more important (and subjective) conventional program: the one which is interpreting the world and drawing conclusions.

      Almost all smart contracts require an oracle

      The oracle becomes the trusted centralized entity that advocates wanted to remove. Can you trust the oracle? Can the oracle be subverted...even for just the short time needed to execute an encoded contract program on the blockchain?
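
      The trust problem can be shown with a toy example. Everything here is hypothetical: `settle_bet` stands in for a smart contract, and the two oracle functions stand in for off-chain programs reporting a temperature. No real blockchain or oracle API is used.

```python
from typing import Callable


def settle_bet(oracle: Callable[[], float], threshold: float, stake: int) -> dict[str, int]:
    """Toy 'contract': pays alice if the reported value exceeds the threshold, else bob.

    The contract logic is fixed and transparent, but it cannot check the
    reported value against the real world; it simply believes the oracle.
    """
    reported = oracle()
    winner = "alice" if reported > threshold else "bob"
    return {winner: stake}


def honest_oracle() -> float:
    # Assumed true reading for this illustration.
    return 18.0


def subverted_oracle() -> float:
    # Same interface, but reporting a false reading.
    return 35.0


honest_payout = settle_bet(honest_oracle, threshold=30.0, stake=100)
subverted_payout = settle_bet(subverted_oracle, threshold=30.0, stake=100)
```

      The contract code is identical in both calls; only the oracle's report differs, and that alone flips who gets paid. A subverted oracle only has to lie once, at the moment the contract executes.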

    2. If people have been doing international transfers for a thousand years, why are they still so complicated? The reason is largely KYC/AML, the compliance processes that the world financial system uses to ensure you aren't transferring money to economically sanctioned individuals, criminals, terrorists, etc. Banks won't send money to just anywhere, they first want to check that it's not at risk of going to the baddies. This can take a long time and often requires exchange of lots of complicated documents. Any blockchain-based financial transfer system that grows in popularity will be pressured by governments to implement KYC/AML and will then start to resemble traditional international transfers, except with higher charges and smaller economies of scale. Many Bitcoin brokerages have long since required identity verification for the account owner. Some are starting to require details of who you're sending money to.

      Bank transfers require compliance processes

      Know-your-client and anti-money-laundering compliance are based on laws that sanction individuals and criminal organizations. A blockchain version of bank transfers would require the same compliance workflows. As more money moves by blockchain, there will be more pressure on the intermediaries to comply with these laws. Unless you support the funding of criminal enterprises, I suppose.