248 Matching Annotations
  1. May 2026
    1. The tool also provided reflective value. Participants reported that it helped articulate what matters to them and why. Beyond research settings, individuals can use the framework to audit which dimensions drive their own sense of ownership, select AI tools that respect those priorities (e.g., suggestion-only assistance for high-Control creators), and mediate collaboration by visualizing divergent ownership profiles when teammates disagree about contribution and credit.

      IMPLICATIONS

  2. Apr 2026
    1. MDP is a formalism that originates from studies of sequential decision-making in artificial intelligence and operations research. Instead of the choice between n actions, MDP deals with environments where rewards are delayed (or distal). This requires an ability to plan actions as part of sequences instead of one-shot choices.

      sentence that mentions implicitly or explicitly a particular theory about computing or information
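
The point about delayed rewards can be made concrete with a toy example. The sketch below is not from the highlighted text: the chain of states, the reward values, and the discount factor are all hypothetical, and value iteration is used only as one standard way to plan over sequences. It shows a case where the action with the best immediate reward ("stay") is worse than the action whose reward arrives later ("advance").

```python
# Hypothetical toy MDP: states 0..2 in a chain, state 3 is terminal.
# "stay" pays 1 immediately; "advance" pays 0 until the terminal state, which pays 20.
GAMMA = 0.9          # discount factor (assumed)
TERMINAL = 3
STATES = [0, 1, 2]
ACTIONS = ["stay", "advance"]


def step(state, action):
    """Return (next_state, reward) for the deterministic toy chain."""
    if action == "stay":
        return state, 1.0
    next_state = state + 1
    reward = 20.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-9):
    """Repeat Bellman backups until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    values[TERMINAL] = 0.0
    while True:
        delta = 0.0
        for s in STATES:
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[ns] for ns, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, in each state, the action with the best one-step lookahead value."""
    policy = {}
    for s in STATES:
        def lookahead(a, s=s):
            ns, r = step(s, a)
            return r + GAMMA * values[ns]
        policy[s] = max(ACTIONS, key=lookahead)
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values, policy)
```

A one-shot chooser would pick "stay" everywhere because it pays 1 now; planning over the sequence prefers "advance" in every state.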

    2. Rational analysis is a theory of rational behavior proposed by Anderson and Schooler [21]. It examines the distribution of rewards in the environment to explain how users adapt their behavior. According to rational analysis, behavior is sensitive to the statistical distribution of rewards in the environment that a user has experienced.

      sentence that mentions implicitly or explicitly a particular theory about how humans think or act

    3. The term satisficing is used to describe how users tend to behave when facing a complex decision-making problem. It refers to settling on a satisfactory but not optimal solution in the normative sense.

      sentence that mentions implicitly or explicitly a particular concept relevant to HCI

    1. Our design was motivated by two major goals for notation authoring. These goals followed from recent studies of notation augmentation [30, 71] and conversations with scientists who had experience writing notation in instructional materials and research communications (4 professors, 2 graduate students, R1–6).

      sentence that describes who the system is designed for

    2. We define the key projections as markup (in this case, LaTeX), an annotatable render, and a structure hierarchy view. Augmentations are made easy to invoke, and projections are kept synchronized and co-present so that authors can shift between representations as is expedient to them.

      sentence that describes the characteristics that define the proposed system

    3. the challenge of using these tools is that annotations are unmoored from the structure of the formula and must be redone whenever the formula changes. Authors must perform precision positioning and sizing operations that could be inferred from the coordinates of the augmented expressions.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    4. these markup languages can require cumbersome and error-prone editing, arising from the intermixing of annotation markup with the underlying formula. Participants in a study by Wu et al. [71] identified difficulty with debugging nested braces and locating markup to edit.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    5. FreeForm, a projectional editor wherein authors can augment formulas—with color, labels, spacing, and more—across multiple synchronized representations. Augmentations are created graphically using direct selections and compact menus. Those augmentations propagate to LaTeX markup, which can itself be edited and easily exported.

      sentence that describes the characteristics that define the proposed system

    1. designing complex behavior can be a difficult programming task, and program representations in end-user programming tools may not be well-suited for heavy programs.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    2. It encourages program decomposition into "layer" abstractions, it automatically creates visualizations of event payloads at layer boundaries to help users understand layer behavior without having to read the underlying generated code, and it constructs ad hoc parametrization interfaces that allow users to configure important dimensions of the behavior of each layer without having to re-author it.

      sentence that describes the characteristics that define the proposed system

    3. However, such LLM-authored code, especially when implementing nontrivial logic, can be difficult to specify, understand or debug. Users need appropriate tools and handles to understand and make changes to the computation that is being performed in such code.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    4. Trigger-action programming has been a success in end-user programming. Traditionally, the simplicity of links between triggers and actions limits the expressivity of such systems. LLM-based code generation promises to enable users to specify more complex behavior in natural language. However, users need appropriate ways to understand and control this added expressive power.

      sentence that describes the conditions for which the system is designed

    1. by triangulating our empirical findings with existing theoretical models from the literature, we found out that the existing models of technology adoption require new theory components to be able to describe technology adoption processes of our participants. In particular, we identified an additional phase that is prominent among the participants, intention to learn, but did not appear in prior models. Then, we identified three new factors that significantly influence their technology acceptance but which are, again, not represented in the existing models: self-efficacy, conversion readiness, and peer support.

      sentences about extending existing theoretical models with research findings

    2. Our preliminary results indicate that there is an additional phase, the intention to learn, and three relating factors, self-efficacy, conversion readiness, and peer support, that significantly influence the acceptance of mobile technologies among the participants, but are not represented in the existing models. With these findings, we propose a tentative theoretical model that extends the existing theories to explain the ways in which our participants came to accept mobile technologies.

      sentences about extending existing theoretical models with research findings

    1. Then, by triangulating our empirical findings with existing theoretical models from the literature, we found out that the existing models of technology adoption require new theory components to be able to describe technology adoption processes of our participants.

      sentences about extending existing theoretical models with research findings

    2. We identified three distinct factors that influence older adults' technology acceptance behaviors, particularly the intention to learn phase, that are not represented in prior models: self-efficacy, conversion readiness, and peer support.

      sentences about extending existing theoretical models with research findings

  3. Mar 2026
    1. We inductively analyzed the first-round interview data using thematic analysis based on a grounded theory approach [33]. Grounded theory methods build theory iteratively from the data, using rigorous coding practices. Initial open codes are primarily descriptive. These may be combined into more sophisticated related sets of descriptors, in which each set is referred to as an axial code. Subsequently, axial codes are combined into more theoretically powerful code complexes, called selective codes. Our approach included a process of open coding, axial coding, and selective coding.

      sentences that use or mention grounded theory

    2. Triangulating the empirical findings from our preliminary results with the existing theoretical models, we proposed an extension of the existing theoretical models that explains the technology acceptance behavior of our participants who were aged 60 or over.

      sentences that implicitly or explicitly mention theory

    3. Consolidating our preliminary findings with the existing models, we propose an extended technology acceptance model for older adults, illustrated in Figure 3. Extending the predecessor theories, our tentative model introduces the perceived effort of learning a new technology as an obstacle for older adults' technology acceptance, which has not been reported in any studies of younger adults' technology acceptance.

      sentences that implicitly or explicitly mention theory

    4. Ajzen's theory of planned behavior [1, 2] posits that a specific behavior is the result of an intention to carry it out, and that intention is determined by attitudes, norms, and the perception of control over the behavior. Drawing upon this theory of planned behavior, Davis et al. developed the technology acceptance model (TAM) [10].

      sentences that implicitly or explicitly mention theory

    5. Then, by triangulating our empirical findings with existing theoretical models from the literature, we found out that the existing models of technology adoption require new theory components to be able to describe technology adoption processes of our participants.

      sentences that implicitly or explicitly mention theory

    1. Established theories of human cognition describe how exposure to variation and consistency within prescribed structures can help people more robustly form mental models of a phenomenon, e.g., how an LLM behaves. Specifically, in line with Variation Theory [35], the features we instantiate identify patterns of consistency (Figure 1d, "Exact Matches"), variation (Figure 1c, "Unique Words"), or both (Figures 1a, 1b, "Positional Diction Clustering (PDC)"—a novel algorithm we introduce in this paper). In line with Analogical Learning Theory [13], PDC highlights analogous text across LLM responses, i.e., positionally consistent and similar in diction, such that users can see emergent relationships.

      sentences that implicitly or explicitly mention theory

    2. users may want to select the best option from among many, compose their own response through bricolage, consider many ideas during ideation, audit a model by looking at the variety of possible responses, or compare the functionality of different models or prompts.

      sentences about intended user's goals

    1. dialogue, as a form of interaction, is not limited to speech and language even though this is often our first interpretation of the term "dialogue."... the concepts of dialogue are applicable across modalities.

      highlight the most important assumptions, conclusions, and points of the paper

    2. Formal models of computation are suitable for describing discrete, moded dialogues. A mode refers to the variation in the interpretation of a user's input according to an internal state. In a modeless dialogue, all inputs are possible in all states and their interpretation is always the same.

      gimme some software concepts that are color coded and categories
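
As a rough illustration of the moded versus modeless distinction in the highlight above, the sketch below interprets the same key sequence in two ways. The two-mode editor, its key bindings, and the function names are hypothetical and chosen only for the example.

```python
def moded_editor(keys):
    """Two-mode editor: the meaning of a key depends on the internal state (the mode)."""
    mode, text = "command", []
    for key in keys:
        if mode == "command":
            if key == "i":
                mode = "insert"       # here 'i' is a command: enter insert mode
            elif key == "x" and text:
                text.pop()            # here 'x' is a command: delete the last character
        else:  # insert mode
            if key == "ESC":
                mode = "command"      # leave insert mode
            else:
                text.append(key)      # every other key is literal text
    return "".join(text)


def modeless_editor(keys):
    """Modeless variant: every key is always interpreted the same way, as literal text."""
    return "".join(keys)


print(moded_editor(["i", "x", "i", "ESC", "x"]))
print(modeless_editor(["i", "x", "i"]))
```

In the moded version the keys "i" and "x" either edit the text or become part of it, depending on the current mode; in the modeless version their interpretation never changes.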

    3. One thing that is missing is an account of how beliefs about the computer are formed and updated and how they drive action specification. The current understanding is that users form internal models that predict how their actions produce perceived outputs, and they learn to minimize prediction errors.

      I want to highlight things that are novel (not simply tool stuff)

    4. both the computer and the human participate in establishing a shared context. The computer does not simply receive a message; it also communicates the effects of that message.

      I want to highlight things that are novel (not simply tool stuff)

    5. Dialogue can be understood as computation, goal-directed action, communication, or embodied action. Each perspective provides specific methods for the analysis and design of dialogue.

      Highlight the sentences that capture the main point of this chapter

    6. The key idea in the dialogue view of interaction is the organization of communication as a series of turns. Dialogue evolves through communication turns between two or more partners. In one turn, an appropriate communication act is made by one partner based on the communication context. The act aims to get the other partner to do or understand something. This understanding then forms the context within which the other partner takes their turn.

      Highlight the sentences that capture the main point of this chapter

    1. TAM posits that the intention to adopt a particular technology is driven by two kinds of perceptions: (1) how easy it is to use a system and (2) how useful it will be to use it [180]. Furthermore, the perceived ease of use affects the perceived usefulness: If technology is hard to use, it is less useful.

      Highlight what you think good software concepts would be and segment them by color coded categories.

    2. it is perfectly possible to have a program which is structured, modular, readable, flexible, self-documenting, maintainable, which performs its specified function, and which is a source of constant frustration and irritation to its users.

      Highlight what you think good software concepts would be and segment them by color coded categories.

    1. Such points about the origins of data and the processes of their collection are a key factor in civic text visualization. Indeed, a shift to emphasizing paradata can help draw attention to the representativeness of data.

      Show alternative approaches to text visualization beyond analytics

    2. In contrast, we could consider designing explicitly for multiple users. Doing so requires more than designing for different levels of expertise (see the following subsection for more on expertise) or designing for collaborative use, though both those things may be valuable in their own right. Rather, this dimension encourages accounting for the different types of relationalities that users may have with a system [cf. BB17].

      Show alternative approaches to text visualization beyond analytics

    3. Civic text visualizations similarly designed to foreground interpretation could help make clearer who is making these interpretive decisions, thereby highlighting the lack of neutrality and objectivity in data [DK20].

      Show alternative approaches to text visualization beyond analytics

    4. It is informative to contrast this analytic emphasis with other evolving discourses in information visualization. The prior work reviewed above illustrates a few alternative orientations, including rhetoric [HD11], feminism [DK16; DK20], ethics [Cor19], and others [DFCC13; VW08].

      Show alternative approaches to text visualization beyond analytics

    5. For example, CommunityPulse [JHSM21] uses common, simple visualizations and iconography, such as bar charts and emojis, to provide overviews of people's emotions towards civic agendas and ideas. Similarly, ConsiderIt [KMF*12b] uses bar charts to visualize people's stance towards ballot measures.

      Find civic text visualization systems that are explicitly named.

    6. Tools such as ConsiderIt [KMF*12b] or Opinion Space [FBRG10] are designed specifically for the public. In contrast, tools such as CommunityPulse [JHSM21] or CommunityClick [JKW*21] are focused more on supporting community leaders and decision makers.

      Find civic text visualization systems that are explicitly named.

    7. For example, MultiConVis [HC16b] makes prescriptive statements not only as to the sentimental valence of individual conversations but also as to the topics that each conversation is about. Similarly, ConsiderIt [KMF*12b] asks participants to place individual statements as either supporting or opposing a given ballot proposition.

      Find civic text visualization systems that are explicitly named.

    8. Improving the public input process has become an important goal in the field of digital civics [MNC*19; VCL*16; OW15]. To that end, researchers and practitioners have developed a variety of systems for, e.g., sharing public opinions [FBRG10], building consensus [KMF*12a; ZNB15], summarizing public input [19], or identifying people's priorities, reflections, and hidden insights [JHSM21].

      Highlight all civic participation approaches

    9. Previous work has introduced several online engagement platforms to enable the public to asynchronously provide their comments, ideas, and feedback around civic issues [19; 20b; MJN*18]. These engagement tools have used micro-tasks [MJN*18], visualizations [19], and forum-like discussions [20b] to engage disconnected and disenfranchised populations [MNC*19]. Others have proposed technologies to promote in-person engagement of reticent participants during town halls [JKW*21] and public meetings [LLS] using clicker-like devices.

      Highlight all civic participation approaches

    10. Despite their central importance in the civic engagement process, members of the general public are not necessarily involved in the analysis process. Hence, they are often left out of the loop when designing civic text visualizations—their requirements, aptitudes, knowledge, etc. are not given central consideration. Integrating participatory approaches in civic text visualization could pave the way not only for more inclusive analysis but also for leveraging the general public's knowledge to gather richer insights.

      Highlight all civic participation approaches

    1. social dynamics, such as shyness and tendency to avoid confrontation with dominant personalities, can also hinder opinion sharing in town halls by favoring privileged individuals who are comfortable or trained to take part in contentious public discussions [27, 127].

      Highlight all civic participation approaches

    2. town halls inadvertently cater to a small number of privileged individuals, and silent participants often become disengaged despite physically attending the meetings [61]. Due to the lack of inclusivity, the outcome of such meetings often tends to feel unjust and opaque for the general public [39, 54].

      Highlight all civic participation approaches

    3. designing communitysourcing technologies to include marginalized opinions and amplify participation alone may not be enough to solve inequality of sharing opinions in the civic domain [26, 126]. Despite the success of previous works [25, 53, 90], technology is rarely integrated with existing manual practices and follow-ups of engagements between government officials and community members are seldom propagated to the community.

      Highlight all civic participation approaches

    4. Marginalization can be broadly defined as the exclusion of a population from mainstream social, economic, cultural, or political life [58], which still stands as a barrier to inclusive participation in the civic domain [48, 94]. Researchers in HCI and CSCW have explored various communitysourcing approaches to include marginalized populations in community activities, proceedings, and designs [48, 53, 81, 93, 132].

      Highlight all civic participation approaches

    5. Prior investigations by Bryan [29] and Gastil [56] showed a steady decline in civic participation in town halls due to the growing disconnect between local government and community members and the decline in social capital [43, 111, 113]. Despite the introduction of online methods to increase public engagement in the last decade [4, 5, 7, 37, 81, 93], government officials continue to prefer face-to-face meetings to engage the community in the decision-making process [32, 52, 94].

      Highlight all civic participation approaches

    6. Traditional community consultation methods, such as town halls, public forums, and workshops, are the modus operandi for public engagement [52, 94]. For fair and impartial civic decision-making, the inclusivity of community members' feedback is paramount [60, 94, 126]. However, traditional methods rarely provide opportunities for inclusive public participation [30, 87, 95].

      Highlight all civic participation approaches

    7. Murphy used such systems to promote democracy and community partnerships [103]. Similarly, Boulianne et al. deployed clicker devices in contentious public discussions about climate change to gauge public opinions [25]. Bergstrom et al. used a single button device where the attendees anonymously voted (agree/disagree) on issues during the meeting. They showed that back-channel voting helped underrepresented users get more involved in the meeting [22].

      Highlight all civic participation approaches

    1. Again, p is the probability of seeing results as extreme (or more extreme) as those actually observed if the null hypothesis were true. So p is computed under the assumption that the null hypothesis is true. Yet it is common for researchers, teachers and even textbooks to think of p as the probability of the null hypothesis being true (or equivalently, of the results being due to chance), an error called the "fallacy of the transposed conditional" (Haller and Krauss, 2002; Cohen, 1994, p.999).

      p-value is misinterpreted and confusing
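
A small worked example may help keep the conditional the right way round. The sketch below is not from the highlighted paper: the coin-flip setting, the one-sided test, and the function name are assumptions made for illustration. It computes p as the probability of data at least as extreme as observed, given that the null hypothesis is true.

```python
from math import comb


def upper_tail_p_value(successes, trials, p_null=0.5):
    """One-sided exact binomial p-value: P(X >= successes | null is true).

    Note the direction of the conditional: the null is assumed, and the
    probability is over possible data, not over hypotheses.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 heads in 10 flips, null hypothesis "the coin is fair".
p = upper_tail_p_value(9, 10)
print(p)  # 11/1024, about 0.0107
```

The result says that a fair coin produces 9 or more heads in 10 flips about 1% of the time. It does not say there is a 1% chance that the coin is fair; that statement would need a prior over hypotheses, which the p-value never uses.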

    1. This assessment raises two issues. First, it is arbitrary. If 10 of the 15 CIs included the predicted values, would the results also support the theory, or instead refute it? If one instead used 99% CIs, would positive results for 12 of the 15 predictions be enough to support the theory? This arbitrariness arises because CIs offer no principled method for generating an inference regarding the theory.

      Estimation is too messy / complex and not clear enough

    1. To illustrate this point Oakes posed a series of true/false questions regarding the interpretation of p-values to seventy experienced researchers and discovered that only two had a sound understanding of the underlying concept of significance [25].

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    2. failure to check assumptions about the data required by particular tests, over-testing and using inappropriate tests

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    3. abusing statistical tests, making illogical arguments as a result of tests, deriving inappropriate conclusions from nonsignificant results, and confusing the size of p-values with effect sizes.

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    4. This approach, fiercely promoted by Fisher in the 1930s [9], has become the gold standard in many disciplines including quantitative evaluations in HCI. However, the approach is rather counter-intuitive; many researchers misinterpret the meaning of the p-value.

      Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition

    1. We found that using MINE directly gave identical performance when the task was nontrivial, but became very unstable if the target was easy to predict from the context (e.g., when predicting a single step in the future and the target overlaps with the context).

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    2. We note that better [49, 27] results have been published on these target datasets, by transfer learning from a different source task.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    3. We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead, the accuracy increased from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    4. For lasertag_three_opponents_small, contrastive loss neither helps nor hurts. We suspect that this is due to the task design, which does not require memory and thus yields a purely reactive policy.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    5. Although this is a standard transfer learning benchmark, we found that models that learn better relationships in the children's books did not necessarily perform better on the target tasks (which are very different: movie reviews etc).

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    6. We found that more advanced sentence encoders did not significantly improve the results, which may be due to the simplicity of the transfer tasks (e.g., in MPQA most datapoints consist of one or a few words), and the fact that bag-of-words models usually perform well on many NLP tasks [48].

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    7. It is important to note that the window size (maximum context size for the GRU) has a big impact on the performance, and longer segments would give better results. Our model had a maximum of 20480 timesteps to process, which is slightly longer than a second.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    8. Interestingly, CPCs capture both speaker identity and speech contents, as demonstrated by the good accuracies attained with a simple linear classifier, which also gets close to the oracle, fully supervised networks.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    9. Figure 6 shows that for 4 out of the 5 games performance of the agent improves significantly with the contrastive loss after training on 1 billion frames.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    10. Despite being relatively domain agnostic, CPCs improve upon state-of-the-art by 9% absolute in top-1 accuracy, and 4% absolute in top-5 accuracy.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    11. We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead, the accuracy increased from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate

    1. Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\nConfidence: <description of confidence, without any extra commentary whatsoever; just a short phrase!>\n\nThe question is: ${THE_QUESTION}

      please find the barebones practical information i need to implement this system or strategy

    2. Provide your ${k} best guesses and the probability that each is correct (0.0 to 1.0) for the following question. Give ONLY the guesses and probabilities, no other words or explanation. For example:\n\nG1: <first most likely guess, as short as possible; not a complete sentence, just the guess!>\n\nP1: <the probability between 0.0 and 1.0 that G1 is correct, without any extra commentary whatsoever; just the probability!>

      please find the barebones practical information i need to implement this system or strategy

    3. Each linguistic likelihood expression is mapped to a probability using responses from a human survey on social media with 123 respondents (Fagen-Ulmschneider, 2023). Ling. 1S-opt. uses a held out set of calibration questions and answers to compute the average accuracy for each likelihood expression, using these 'optimized' values instead.

      please find the barebones practical information i need to implement this system or strategy
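
The "optimized" mapping described above (average accuracy per likelihood expression on held-out calibration questions) could be computed roughly as follows. This is a minimal sketch: the function name, the data layout, and the example records are hypothetical, and the paper's exact procedure may differ.

```python
from collections import defaultdict


def fit_expression_probabilities(calibration):
    """Map each likelihood expression to the average accuracy observed for it.

    `calibration` is an iterable of (expression, was_correct) pairs taken
    from a held-out set of questions with known answers.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for expression, was_correct in calibration:
        totals[expression] += 1
        hits[expression] += int(was_correct)
    return {expression: hits[expression] / totals[expression] for expression in totals}


# Hypothetical held-out records: (expression the model used, whether its guess was right).
records = [
    ("Almost certain", True),
    ("Almost certain", True),
    ("Almost certain", False),
    ("Likely", True),
    ("Likely", False),
    ("Unlikely", False),
]
print(fit_expression_probabilities(records))
```

At test time, the model's chosen expression would be replaced by the fitted value instead of the survey-derived probability.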

    4. Finally, our study is limited to short-form question-answering; future work should extend this analysis to longer-form generation settings.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    5. While our work demonstrates a promising new approach to generating calibrated confidences through verbalization, there are limitations that could be addressed in future work. First, our experiments are focused on factual recall-oriented problems, and the extent to which our observations would hold for reasoning-heavy settings is an interesting open question.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    6. the 1-stage and 2-stage verbalized numerical confidence prompts sometimes differ drastically in the calibration of their confidences. How can we reduce sensitivity of a model's calibration to the prompt?

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    7. Provide your best guess and the probability that it is correct (0.0 to 1.0) for the following question. Give ONLY the guess and probability, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\n Probability: <the probability between 0.0 and 1.0 that your guess is correct, without any extra commentary whatsoever; just the probability!>\n\nThe question is: ${THE_QUESTION}

      please find the barebones practical information i need to implement this system or strategy
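
One way to use the single-guess prompt above is to fill the `${THE_QUESTION}` placeholder and then parse the two-line reply. The sketch below adapts the template text from the highlight; the parsing rules, the clamping to [0, 1], and the function names are assumptions, and no model API is called.

```python
import re
from string import Template

# Template adapted from the highlighted prompt; ${THE_QUESTION} is filled per question.
PROMPT = Template(
    "Provide your best guess and the probability that it is correct (0.0 to 1.0) "
    "for the following question. Give ONLY the guess and probability, no other "
    "words or explanation. For example:\n\n"
    "Guess: <most likely guess, as short as possible; not a complete sentence, "
    "just the guess!>\n"
    "Probability: <the probability between 0.0 and 1.0 that your guess is correct, "
    "without any extra commentary whatsoever; just the probability!>\n\n"
    "The question is: ${THE_QUESTION}"
)


def build_prompt(question):
    """Fill the template with one question."""
    return PROMPT.substitute(THE_QUESTION=question)


def parse_reply(reply):
    """Extract (guess, probability) from a reply, or None if the format is not followed."""
    guess = re.search(r"Guess:\s*(.+)", reply)
    prob = re.search(r"Probability:\s*([0-9]*\.?[0-9]+)", reply)
    if not guess or not prob:
        return None
    value = min(max(float(prob.group(1)), 0.0), 1.0)  # clamp to a valid probability
    return guess.group(1).strip(), value


print(build_prompt("What is the capital of France?"))
print(parse_reply("Guess: Paris\nProbability: 0.95"))
```

Returning None for malformed replies keeps format failures separate from low-confidence answers, which matters when computing calibration metrics afterwards.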

    8. Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation.

      please find the barebones practical information i need to implement this system or strategy

    9. To fit the temperature that is used to compute ECE-t and BS-t we split our total data into 5 folds. For each fold, we use it once to fit a temperature and evaluate metrics on the remaining folds. We find that fitting the temperature on 20% of the data yields relatively stable temperatures across folds.

      please find the barebones practical information i need to implement this system or strategy
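
The fold procedure above (fit on one fold, evaluate on the other four) can be sketched as below. Several details are assumptions not stated in the highlight: temperature is applied to the logit of each confidence, it is fitted by a grid search that minimizes negative log-likelihood, and the Brier score stands in for the evaluated metrics. All function names and the example data are hypothetical.

```python
import math


def scale(p, temperature):
    """Temperature-scale a confidence via its logit (assumed form of scaling)."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    logit = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-logit / temperature))


def nll(data, temperature):
    """Mean negative log-likelihood of (confidence, correct) pairs after scaling."""
    total = 0.0
    for p, correct in data:
        q = min(max(scale(p, temperature), 1e-12), 1 - 1e-12)
        total -= math.log(q if correct else 1 - q)
    return total / len(data)


def brier(data, temperature):
    """Mean squared error between scaled confidence and correctness."""
    return sum((scale(p, temperature) - correct) ** 2 for p, correct in data) / len(data)


def fit_temperature(data, grid=None):
    """Grid search over temperatures 0.25 .. 10.0 (assumed search strategy)."""
    grid = grid or [0.25 * i for i in range(1, 41)]
    return min(grid, key=lambda t: nll(data, t))


def cross_fit(data, n_folds=5):
    """Fit on each fold once and evaluate on the remaining folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    results = []
    for i, fit_fold in enumerate(folds):
        rest = [x for j, fold in enumerate(folds) if j != i for x in fold]
        temperature = fit_temperature(fit_fold)
        results.append((temperature, brier(rest, temperature)))
    return results


# Hypothetical overconfident model: always says 0.9 but is right half the time.
data = [(0.9, 1), (0.9, 0)] * 10
for temperature, score in cross_fit(data):
    print(temperature, round(score, 3))
```

Comparing the fitted temperatures across the five folds is a quick check of the stability the highlight mentions.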

    10. Additionally, the lack of technical details available for many state-of-the-art closed RLHF-LMs may limit our ability to understand what factors enable a model to verbalize well-calibrated confidences and differences in this ability across different models.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    11. With Llama2-70B-Chat, verbalized calibration provides improvement over conditional probabilities across some metrics, but the improvement is much less consistent compared to GPT-* and Claude-*.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    12. The verbal calibration of the open source model Llama-2-70b-chat is generally weaker than that of closed source models but still demonstrates improvement over its conditional probabilities by some metrics, and does so most clearly on TruthfulQA.

      all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper

    13. Among the methods for verbalizing probabilities directly, we observe that generating and evaluating multiple hypotheses improves calibration (see Figure 1), similarly to humans (Lord et al., 1985), and corroborating a similar finding in LMs (Kadavath et al., 2022).

      please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
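
      The fold procedure quoted in item 9 (fit a temperature on one fold, evaluate on the remaining four) can be sketched as follows. This is a minimal illustration, not the paper's code. The logit-scaling form of the temperature, the grid search, the 10-bin ECE, and every function name here are assumptions made for the example.

```python
import math
import random


def scale(p, t):
    """Assumed form of temperature scaling: divide the logit of p by t."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    logit = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-logit / t))


def nll(confs, correct, t):
    """Mean negative log-likelihood of correctness under scaled confidences."""
    total = 0.0
    for p, y in zip(confs, correct):
        q = min(max(scale(p, t), 1e-12), 1 - 1e-12)
        total -= math.log(q if y else 1 - q)
    return total / len(confs)


def fit_temperature(confs, correct):
    """Grid search over t in 0.1..10.0 (an assumption; any 1-D optimizer works)."""
    grid = [0.1 * k for k in range(1, 101)]
    return min(grid, key=lambda t: nll(confs, correct, t))


def ece(confs, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(confs, correct):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    err = 0.0
    for b in bins:
        if b:
            avg_conf = sum(p for p, _ in b) / len(b)
            acc = sum(y for _, y in b) / len(b)
            err += len(b) / len(confs) * abs(avg_conf - acc)
    return err


def ece_t_over_folds(confs, correct, n_folds=5, seed=0):
    """Fit t on each fold (20% of the data) and report ECE on the other folds."""
    idx = list(range(len(confs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    results = []
    for k, fit_idx in enumerate(folds):
        eval_idx = [i for j, f in enumerate(folds) if j != k for i in f]
        t = fit_temperature([confs[i] for i in fit_idx],
                            [correct[i] for i in fit_idx])
        scaled = [scale(confs[i], t) for i in eval_idx]
        results.append((t, ece(scaled, [correct[i] for i in eval_idx])))
    return results


# Hypothetical data: an overconfident model that says 0.9 but is right 60% of the time.
confs = [0.9] * 200
correct = [1] * 120 + [0] * 80
for t, e in ece_t_over_folds(confs, correct):
    print(f"temperature={t:.1f}  ECE-t={e:.3f}")
```

      Comparing the five fitted temperatures is one way to check the stability the quoted passage reports. If they differ widely, 20% of the data may be too little to fit on.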

    1. Variation Theory provides the conceptual basis for generating structurally consistent differences, while Structural Alignment Theory (SAT) enhances the user's ability in recognizing and processing these differences.

      return any single sentence that describes an explicit or implicit connection to theory

    2. This finding is consistent with previous work that supports users' sense-making of text, e.g., by modulating text saliency. Specifically, Gu et al. [32] and Gero et al. [29] both found improved reading efficiency and comprehension with saliency-modulating text renderings.

      any single sentence that compares and contrasts this work with prior work.

    3. In decision making, SAT argues that people tend to focus on alignable differences—features that can be directly compared—rather than on differences that cannot be easily aligned.

      return any single sentence that describes an explicit or implicit connection to theory

    4. Specifically, we use Variation Theory of learning [44] which states that for learning to occur, some aspects that define the concept being learned must vary while others are held constant.

      return any single sentence that describes an explicit or implicit connection to theory

    5. According to SAT, humans compare two similar entities by trying to find structural alignments between them, and then comparing corresponding elements, with a special focus on differing aligned elements.

      return any single sentence that describes an explicit or implicit connection to theory

    6. VT posits that human learning occurs when learners experience variation across critical and superficial aspects of a concept—through exposure to contrasting examples that systematically vary along different critical and superficial feature dimensions.

      return any single sentence that describes an explicit or implicit connection to theory

    7. To analyze the annotation efficiency, we first conducted a Kruskal-Wallis rank sum test [39] to determine if there were statistically significant differences in annotation time across the three conditions, because our data violated the homogeneity of variances assumption, making non-parametric methods more appropriate.

      return any single sentence that describes data analysis done on data collected by the authors when running human subjects experiments.
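
      For reference, the Kruskal-Wallis H statistic named in item 7 can be computed from pooled ranks as in the sketch below. The annotation times are made-up placeholders, not data from the paper, and the function names are hypothetical. The p-value shortcut uses the fact that the chi-square survival function with 2 degrees of freedom is exp(-H/2), so it applies only to a three-group comparison.

```python
import math
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H with the standard tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = Counter(pooled)
    correction = 1 - sum(c ** 3 - c for c in ties.values()) / (n ** 3 - n)
    return h / correction


def p_value_three_groups(h):
    """Chi-square survival function for df = 2 (three groups only)."""
    return math.exp(-h / 2)


# Hypothetical annotation times in seconds for three conditions.
baseline = [41.0, 38.5, 44.2, 39.9]
condition_a = [30.1, 33.4, 29.8, 35.0]
condition_b = [27.3, 26.1, 31.2, 28.8]

h = kruskal_wallis_h(baseline, condition_a, condition_b)
print(f"H = {h:.3f}, p = {p_value_three_groups(h):.4f}")
```

      Because the test works on ranks rather than raw times, it does not need the equal-variance assumption that the quoted sentence says the data violated.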

    1. Interviews were video and audio recorded. We transcribed the audio using OpenAI's Whisper automatic speech recognition system and anonymized the transcript before analysis. We analyzed the interview data using thematic analysis [1]. First, two members of the research team independently coded four (25% of collected data) randomly chosen participant data to generate low-level codes. The inter-coder reliability between the coders was 0.88 using Krippendorff's alpha [37]. The two coders then met together to cross-check, resolve coding conflicts, and consolidate the codes into a codebook across two sessions. Using the codebook, the two coders analyzed six randomly selected participant data each. The research team then met, discussed the analysis outcomes, and finalized themes over three sessions.

      sentence describing how analysis was performed on data collected by the authors of this paper
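
      The inter-coder reliability figure in the passage above is a Krippendorff's alpha. A minimal sketch of how that statistic is computed is given below, assuming two coders, nominal codes, and no missing values. The passage does not state which variant the authors used, and the codes and function name are hypothetical.

```python
from collections import Counter


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal data, no missing values."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same units")

    # Each unit contributes two ordered pairs, so there are 2N pairable values.
    n_total = 2 * len(coder_a)

    # Observed disagreement: ordered pairs whose two codes differ.
    observed = 2 * sum(1 for a, b in zip(coder_a, coder_b) if a != b)

    # Expected disagreement from the pooled frequency of each code.
    freq = Counter(coder_a) + Counter(coder_b)
    expected = sum(freq[c] * freq[k] for c in freq for k in freq if c != k)
    if expected == 0:
        raise ValueError("alpha is undefined when only one code is used")

    return 1 - (n_total - 1) * observed / expected


# Hypothetical low-level codes assigned by two coders to the same four excerpts.
coder_1 = ["trust", "trust", "effort", "effort"]
coder_2 = ["trust", "trust", "effort", "trust"]
print(round(krippendorff_alpha_nominal(coder_1, coder_2), 3))
```

      Alpha is 1 for perfect agreement and 0 when agreement is at the level expected by chance.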

    1. This research follows a constructionist approach to musical affect (Cespedes-Guevara & Eerola, 2018). That is, although we are interested in the 'bottom-up' influence of certain musical features on musical affect, we believe these cannot be adequately evaluated without considering the 'top-down' effects of context and individual differences that are present when affects are constructed. The perception or induction of affect does not merely arise in response to a stimulus but is also formed in relation to the individual and the context.

      makes an explicit connection between a music theory concept and cognition

    1. Although there are many idiosyncrasies in what may trigger a person with misophonia, the most common triggers are created by other humans, such as the sound of someone chewing, clearing their throat, tapping their foot, or typing on a keyboard.

      any sentences referring to misophonia verbatim

    1. Composers and music researchers had previously analyzed and annotated 65 movements from the Classical, Romantic, and early Modern repertoire in terms of the Taxonomy of Orchestral Grouping Effects (McAdams et al., 2022).

      please find any claims that depend on citations referring to works by any of the present authors