22 Matching Annotations
  1. Last 7 days
    1. s its verbose regression outputs contain large execution traces with sparse hypothesis-relevant content. MetaScience resists condensation entirely; its compact, already-aggregated outputs leave little noise to remove.

      do some qualitative validation!

    1. Stage 3: Interesting findings extraction. To build the culture layer, a separate week-batched pass uses elevated temperature to identify 3–10 qualitatively interesting moments per week—humor, prescience, fun facts, cultural artifacts—each with a catchy title, description, category, excitement rating (1–10), and source email references. Findings below excitement level 6 are discarded, yielding 382 curated findings across 119 weeks.

      .............

    2. 3) Temporal density—email provides daily or even hourly granularity over months or years;

      Yes, but do you establish how frequently? I want to see this data. Taking the corpus size of 345k for 150 employees over 4 years works out as ~1.6 emails/employee-day. Is the claim here genuinely that people's work is summarized by two emails a day?
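      The ~1.6 figure above can be reproduced in a couple of lines (using the note's own inputs: 345k emails, 150 employees, 4 years; the ~250 working days/year variant is my assumption):

      ```python
      corpus_size = 345_000   # total emails in the corpus
      employees = 150
      years = 4

      # Calendar-day rate: the ~1.6 emails/employee-day quoted in the note
      per_calendar_day = corpus_size / (employees * years * 365)
      print(round(per_calendar_day, 2))   # 1.58

      # Restricting to working days (~250/year, an assumption) gives a higher rate
      per_workday = corpus_size / (employees * years * 250)
      print(round(per_workday, 2))        # 2.3
      ```

      Either way the point stands: the corpus averages out to only one or two emails per employee per day.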

    3. 2) Comprehensiveness—email captures strategic discussions, operational coordination, social exchanges, and administrative processes within a single medium;

      This cannot possibly have evidence

  2. May 2026
  3. Apr 2026
    1. the ability to directly manipulate model weights allows users to bypass safety training entirely, modifying models so that they no longer refuse any request

      open models may not even have safety training, in that they may just be base model releases rather than RLHF'd ones.

  4. Oct 2025
    1. Some participants argued that more data is necessary to capture the full range of cultural expressions, while others contended that the focus should be on developing thicker development pipelines that incorporate expertise and context. They discussed the limitations of current models, which often operate on crude metrics and may not adequately represent the richness of cultural data.

      This more vs thicker data debate is a good one, but it is also begging the question - do we want to model all cultural variation? e.g. https://aclanthology.org/2025.naacl-long.273.pdf

    2. They highlighted the importance of interdisciplinary teaming at specific points in the development pipeline, imagining pairs of experts from technical and qualitative fields working together step-by-step to negotiate approaches that meet shared goals. This collaborative approach would ensure that both qualitative insights and quantitative rigor are incorporated into AI development.

      hell yeah, this is what we address in our interdisciplinarity piece

    3. common basic language for evaluating iterations of AI that do not assume there is a linear or universal path of improving AI for all users, regardless of context. This shared language would help clarify where disciplinary specificity is needed and where interdisciplinary collaboration can be most effective.

      Again, the thing about culture is that it cannot be done with a monopolar view from the West, which is the circumstance that allows for the scalarization of "bad" and "good".

    4. how small language models can contribute to responsible innovation, and how to design decentralized infrastructure architectures that enable users to choose how they share and distribute their data and models

      These feel like really key questions that do strike at the centralization/hegemonic nature of LLM development.

    5. the need for alternative approaches to AI development that prioritize sustainability, justice, and inclusivity

      How alternative are we talking? I'd love some more context here.

    6. ocused on the gaps between community knowledge and computational knowledge

      What does this mean? Can community knowledge be encoded computationally?

    7. Institutional aspects and interdisciplinarity play a significant role in the cultures of AI production. There is a need for alternative imaginaries of technology that go beyond the corporate inclusion of data.

      This feels like a real throughline regardless - how do we do technology that is not adapted noblesse oblige, but built by the stakeholders who use it?