23 Matching Annotations
  1. Dec 2024
    1. Clio processes raw conversations in a secure private environment with restricted access, and only aggregate clusters are made available outside of this secure environment. A small number of authorized staff have access to this private environment, which is why they can access individual records in clusters but other members of staff cannot. In addition, Clio’s aggregate outputs do not include personal information (see Appendix G.8 and Appendix D).

      What is this "secure environment"?

    2. To use Clio, one typically begins with a target dataset. This dataset is typically an unfiltered sample of Claude traffic, but one could also choose other datasets, including filtering down the target dataset using a regular expression or an AI model to analyze a more narrow distribution of data.

      What does this really mean or imply?

    3. Alex Tamkin led the project, including proposing the idea, building the initial proof of concept system (with Deep Ganguli), leading design and analysis of the experiments, and writing the paper. Miles McCain built the scalable Clio system used in the paper, ran the majority of the experiments, led the engagement with civil society organizations, and made deep contributions to the experimental design, analysis, and writing. Kunal Handa ran the high-level use case experiments, and contributed to the experiments in the safety and multilingual sections and the writing of the paper. Esin Durmus designed and ran the initial versions of the multilingual experiments in the paper. Liane Lovitt contributed to the framing and execution of the experiments, organizational support, and feedback. Ankur Rathi provided deep contributions to the privacy framing and experiments. Jack Clark and Jared Kaplan provided high level guidance and support throughout the process. Deep Ganguli provided detailed guidance, organizational support, and feedback throughout all stages of the project, including the initial proof of concept, design of the experiments, analysis, and feedback on drafts.

      Google/LinkedIn these folks.
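
Note on annotation 2 above (filtering the target dataset with a regular expression): a minimal Python sketch of what such a filter could look like. The function name `filter_conversations`, the sample strings, and the pattern are all made up for illustration; the paper does not say how Clio implements this step.

```python
import re


def filter_conversations(conversations, pattern):
    """Keep only the conversations whose text matches the regex.

    Hypothetical helper: matching is case-insensitive and uses
    re.search, so the pattern may appear anywhere in the text.
    """
    compiled = re.compile(pattern, re.IGNORECASE)
    return [text for text in conversations if compiled.search(text)]


# Invented sample data standing in for a target dataset.
sample = [
    "How do I undo a git rebase?",
    "Write a poem about autumn",
    "Explain Git cherry-pick",
]

# Narrow the dataset to conversations that mention Git as a whole word.
narrowed = filter_conversations(sample, r"\bgit\b")
print(narrowed)
```

The narrowed list would then be the input to the rest of the analysis, so the "more narrow distribution" is simply whatever subset survives the filter.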

    1. AI providers have a dual responsibility: to maintain the safety of their systems while protecting user privacy.

      I think that extensive monitoring (almost surveillance) of model usage at the expense of users' privacy can be morally defensible, given the relative risks of under-monitoring technologies whose capabilities are not yet fully understood.

    2. Clio also helps us monitor novel uses and risks during periods of uncertainty or high-stakes events. For example, while we conducted a wide range of safety tests in advance of launching a new computer use feature, we used Clio to screen for emergent capabilities and harms we might have missed.

      How do they go about doing this? What is the computer use model?

    3. Conversation topics that appeared more frequently in three selected languages (compared to the base rate of that language), as revealed by Clio.

      I think these findings are, to a degree, at odds with the notion of privacy preservation: they can trace back demographic data not from direct and quasi-identifiers as we traditionally think of them (explicit fields of provided information), but from quasi-identifiers implicit in the nature of our interactions with the models. I don't think that can be abstracted away entirely from model governance.

    4. Claude usage varies considerably across languages, reflecting varying cultural contexts and needs.

      Again, it doesn't seem that personal identifiers are left out of Clio when there are multiple instances where they attribute a finding to a given demographic. If the data was anonymized, how can they go as far as to say they have "cultural" insights?

    5. Counting the r’s in the word “strawberry”.

      I think this is a fantastic example of how people engage with the system when influenced by other users who spot interesting model behaviour, for the sake of fun and experimentation.

    6. Software developers use Claude for tasks ranging from debugging code to explaining Git operations and concepts.

      How can they say "software devs" explicitly? Wasn't the data anonymized? Or to what extent?

    7. This revealed a particular emphasis on coding-related tasks

      This makes a lot of sense, as it would reflect the first adopters being technical people (who code, for instance) and/or students (who were among the first drawn to the technology). However, it is interesting that there is an explicit focus on web and mobile app dev. Is this an indicator that non-technical people are trying to leverage it for entrepreneurial purposes, for instance? Again, this notion of intent seems relevant in the context of evaluating use.

    8. which may look different than usage of other AI systems due to differences in user bases and model types

      Interesting point. How true is this for the "user bases" part of the statement? The model type portion sounds reasonably intuitive to me.

    9. While public datasets like WildChat and LMSYS-Chat-1M provide useful information on how people use language models, they only capture specific contexts and use cases.

      Very very interesting look into these datasets.

    10. As a final check, Claude verifies that cluster summaries don’t contain any overly specific or identifying information before they’re displayed to the human user.

      So the end-to-end process is performed by AI? AI to govern AI?

    11. We also have a minimum threshold for the number of unique users or conversations, so that low-frequency topics (which might be specific to individuals) aren’t inadvertently exposed.

      I don't understand this. What do they mean?

    12. This is part of our privacy-first design of Clio, with multiple layers to create “defense in depth.” For example, Claude is instructed to extract relevant information from conversations while omitting private details.

      What does this even mean? So the model itself does data selection and extraction?

    13. A summary of Clio’s analysis steps, using imaginary conversation examples for illustration.

      It is interesting that they chose to demonstrate the case of a user with a facet on information about a genetic syndrome, and to show how that wouldn't be a high-level cluster visible to analysts. How do they separate intent in these clusters? If this conversation did not imply that the user has the syndrome, why would it be excluded or considered private if it's not explicitly about them?

    14. Knowing how people actually use language models is important for safety reasons: providers put considerable effort into pre-deployment testing, and use Trust and Safety systems to prevent abuses.

      Totally, AKA evals and red teaming.

    15. It gives us insights into the day-to-day uses of claude.ai in a way that’s analogous to tools like Google Trends. It’s also already helping us improve our safety measures. In this post—which accompanies a full research paper—we describe Clio and some of its initial results.

      This is fascinating. How do competitors tackle this same problem?

    16. Claude insights and observations, or “Clio,” is our attempt to answer this question. Clio is an automated analysis tool that enables privacy-preserving analysis of real-world language model use.

      How is this technically achieved? (Read paper.) What data minimization and anonymization techniques are implemented? How do they go about categorizing the different types of data by degree of sensitivity?

    17. Claude models are not trained on user conversations by default,

      This notion of the model not being trained on conversations by default ties in with the opt-in/opt-out notion discussed in class.

    18. There’s also a crucially important factor standing in the way of a clear understanding of AI model use: privacy. At Anthropic, our Claude models are not trained on user conversations by default, and we take the protection of our users’ data very seriously. How, then, can we research and observe how our systems are used while rigorously maintaining user privacy?

      Oh this intersection of model governance and privacy implications is something I hadn't thought of, aside from the data for training perspective.

    19. the sheer scale and diversity of what language models can do makes understanding their uses—not to mention any kind of comprehensive safety monitoring—very difficult.

      How is safety monitoring currently done, besides guardrails set for a given model?
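
Note on annotation 11 above (the minimum threshold on unique users): one way to read it is that a topic cluster is only shown if enough distinct users contributed to it, so a topic tied to one or two people never surfaces. A minimal Python sketch under that reading follows. The function `visible_clusters`, the threshold value of 3, and the sample data are assumptions for illustration; the post does not state the actual threshold or how it is implemented.

```python
from collections import defaultdict


def visible_clusters(assignments, min_users=3):
    """Return {cluster: unique_user_count} for clusters that meet the threshold.

    `assignments` is an iterable of (user_id, cluster_name) pairs.
    `min_users` is a hypothetical threshold, not a value from the source.
    """
    users_by_cluster = defaultdict(set)
    for user_id, cluster in assignments:
        # A set counts each user once, however many conversations they had.
        users_by_cluster[cluster].add(user_id)
    return {
        cluster: len(users)
        for cluster, users in users_by_cluster.items()
        if len(users) >= min_users
    }


# Invented sample: three users discuss debugging, one asks about a rare condition.
assignments = [
    ("u1", "debugging code"),
    ("u2", "debugging code"),
    ("u3", "debugging code"),
    ("u3", "debugging code"),  # repeat conversation, same user
    ("u4", "rare genetic condition"),
]

shown = visible_clusters(assignments)
print(shown)  # {'debugging code': 3}
```

The single-user cluster is dropped from the output entirely, which is the sense in which low-frequency topics "aren't inadvertently exposed".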