4,771 Matching Annotations
  1. Mar 2026
    1. an fMRI study found that people with misophonia show increased response in the anterior insular cortex (AIC) in response to misophonic sounds, compared to control participants and other unpleasant or neutral sounds (Kumar et al., 2017).

      any sentences referring to misophonia verbatim

    2. Both the subjective judgment of aversiveness and the physiological measure of skin conductance response (SCR) increase when people with misophonia are presented with triggers (Edelstein et al., 2013).

      any sentences referring to misophonia verbatim

    3. The disorder is not yet recognized by the Diagnostic and Statistical Manual − 5th version (DSM-5; American Psychiatric Association, 2013), but there has been an increasing amount of research on the characterization and treatment of misophonia (Vitoratou et al., 2021; see also Brout et al., 2018, for a review).

      any sentences referring to misophonia verbatim

    1. Composers and music researchers had previously analyzed and annotated 65 movements from the Classical, Romantic, and early Modern repertoire in terms of the Taxonomy of Orchestral Grouping Effects (McAdams et al., 2022).

      please find any claims that depend on citations referring to works by any of the present authors

    2. These results confirm with orchestral excerpts the findings of studies on isolated tones with dyads or triads of instruments in which the presence of impulsive instruments reduces the perception of blend (Lembke et al., 2019; Reuter, 1996; Tardieu & McAdams, 2012).

      please find any claims that depend on citations referring to works by any of the present authors

    3. structuring by affecting sequential grouping through the segregation of auditory streams played by different instruments and segmental grouping through timbral contrasts (McAdams et al., 2022).

      please find any claims that depend on citations referring to works by any of the present authors

    4. Several other spectral and spectrotemporal descriptors were found to play a role in blend perception in orchestral works by Fischer et al. (2021). These include spectral flatness and spectral crest (different measures of the degree to which the spectrum is denser or has more emergence of spectral components), and spectral variation (the degree of variation of the spectral shape over time).

      please find any claims that depend on citations referring to works by any of the present authors

    5. Fischer et al. (2021) studied the blends of multi-instrument streams in the context of orchestral stream segregation in predominantly Romantic orchestral excerpts. They found that within-family instrument combinations blended better than between-family combinations. They demonstrated the role played by overlap in timbre correlates of spectral flatness (a measure of the tonalness/noisiness or density of the spectrum), spectral skewness (related to the shape of the spectral envelope), and spectral variation (evolution of the spectral envelope over time), as well as cues derived from the scores such as onset synchrony and the consonance of concurrent pitch relations.

      please find any claims that depend on citations referring to works by any of the present authors

    1. When the sudden drop to a pianissimo occurred towards the ending of the piece, the perceived arousal responses of CHM and WM dropped slightly but rose again immediately to end on a high arousal. These two groups of listeners appear to have anticipated a return to a loud and majestic close and therefore kept their arousal responses higher than those of the NM.

      please highlight anything related to music performance practice

    2. CHM, who are more experienced with the instruments and compositional techniques used in Chinese orchestral music, might have had an idea of which features figure more prominently in the communication of particular intentions, and therefore would have more information available for their judgments.

      please highlight anything related to music performance practice

    3. The perception of affective intentions in music is influenced by the degree of familiarity listeners have with a musical tradition, the content implicated in the music, and the complex sonic environment created by the composer's creation and the musicians' interpretation.

      please highlight anything related to music performance practice

    4. Iqa' (plural iqa'at) is used to describe a rhythmic cycle. Iqa'at are made up of two different basic building blocks, the dum and tak, onomatopoeias derived from the sound produced on membranophones such as the darabuka.

      please highlight anything related to music theory

    5. H5. Being more culturally bound, musical cues that are learned, such as modal structures, metrical relations, and so on, will exert a greater influence on listeners' perceived valence ratings than on their arousal ratings.

      please highlight anything related to music theory

    1. We also ran evaluations of model latency and classification performance under varying false positive rates for the following LLMs by OpenAI: GPT-4o, GPT-4o-mini, and o3-mini.

      sentences describing methods the authors used; one sentence at a time

    2. We ensured each list was 30 items long as our pilot studies suggested this was long enough that manual detection starts to become unwieldy (users need to scroll up and down the document), but short enough that participants could become familiar in a short period.

      sentences describing methods the authors used; one sentence at a time

    3. We adapted two intent specifications from our evals: Mars Game Design Document and Financial Advice AI Agent Memory, as these tasks mapped to the two paradigmatic types covered in Sections 2 and 2.1 (design documents, and AI memory of the user).

      sentences describing methods the authors used; one sentence at a time

    4. We chose OpenAI's ChatGPT Canvas as a baseline for five reasons: (i) it is a popular, commercially available tool, hence it is likely familiar to users; (ii) it provides a document editing view, where users can select text and ask GPT to rewrite it, or chat with an AI to make global edits; (iii) it employs a similar class of model (GPT-4o); (iv) it supports similar editing features as SemanticCommit like inline text selection, conflict highlighting, and a diff view, while adding free-form editing; and (v) similar interfaces like Anthropic Artifacts tended to rewrite the specification entirely, and did not offer Canvas's "diff" view to allow for a fair comparison.

      sentences describing methods the authors used; one sentence at a time

    5. Our explorations went through substantial iterations and prompt prototyping over a period of eight months, evolving in response to two pilot studies and progressing from a card-based interface to a list of texts.

      sentences describing methods the authors used; one sentence at a time

    6. We iterated on prompts using ChainForge [5] by setting up an evaluation pipeline against our datasets, which allowed us to observe the effects of prompt changes and model choices.

      sentences describing methods the authors used; one sentence at a time

    7. For qualitative analysis, the first author performed open coding on participant responses and audio transcripts to identify themes, which were used to interpret the qualitative results.

      sentences describing methods the authors used; one sentence at a time

    8. In the post-task surveys, we collected self-reported NASA Task Load Index (TLX) scores, Likert-scale ratings for ease of use, and responses on how well the AI helped participants identify, understand, and resolve semantic conflicts.

      sentences describing methods the authors used; one sentence at a time

    9. We run end-to-end on our four eval datasets using GPT-4o and GPT-4o-mini and report the mean ± stddev for accuracy, precision, recall, and F1 scores for the three approaches in Figure 5.

      sentences describing methods the authors used; one sentence at a time

    10. We compare our end-to-end system against two simpler methods: (i) DropAllDocs, which adds all documents to the context for conflict classification; and (ii) InkSync [56] which generates a JSON list of string-replace operations.

      sentences describing methods the authors used; one sentence at a time

    11. Through a within-subjects study with 12 participants comparing SemanticCommit to a chat-with-document baseline (OpenAI Canvas), we find differences in workflow: half of our participants adopted a workflow of impact analysis when using SemanticCommit, where they would first flag conflicts without AI revisions then resolve conflicts locally, despite having access to a global revision feature.

      sentences describing methods the authors used; one sentence at a time

    12. We compare our end-to-end system against two simpler methods: (i) DropAllDocs, which adds all documents to the context for conflict classification; and (ii) InkSync [56] which generates a JSON list of string-replace operations.

      sentences describing methods the authors used; one sentence at a time

    13. In the post-task surveys, we collected self-reported NASA Task Load Index (TLX) scores, Likert-scale ratings for ease of use, and responses on how well the AI helped participants identify, understand, and resolve semantic conflicts.

      sentences describing methods the authors used; one sentence at a time

    14. Our explorations went through substantial iterations and prompt prototyping over a period of eight months, evolving in response to two pilot studies and progressing from a card-based interface to a list of texts.

      sentences describing methods the authors used; one sentence at a time

    15. These semantic conflicts require dedicated support to detect, visualize, and resolve. Semantic conflict resolution interfaces must go beyond visualizing what changes were made, to what changes could be made, where they should be made, and what the effects might be. This resembles feedforward: affordances that help the user foresee the impact of an action [67, 93].

      sentences describing connections to theory; one sentence at a time

    16. This reflects the principle of feedforward [67, 93] in communication theory—"a needed prescription or plan for a feedback, to which the actual feedback may or may not confirm" [79]—where a communicator provides "the context of what one was planning to talk about" [64, p. 179-80] in order to "pre-test the impact of [its output]" on the listener [34, p. 65].

      sentences describing connections to theory; one sentence at a time

    1. We consider common sequences of chunk roles to be alignable structures that could be used to support users in identifying structural similarities and differences across sentences in different abstracts, in line with Structure-Mapping Theory [17].

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    2. Like prior Structural Mapping Theory (SMT)-informed work in text corpora representation, AbstractExplorer's features have enabled some users to see more of both the overview and the details at the same time, facilitating abstraction without losing context.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    3. This ordering prioritizes dominant structural patterns (largest groups first) while exposing fine-grained variations (via length-sorted triplets), mirroring how humans compare sentences, if SMT is an accurate description in this domain of comparative close reading.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    4. Structural mappings between objects are part of the cognitive process of comparison according to the Structure-Mapping Theory [17], and juxtaposition can facilitate humans in recognizing particular possible structural mappings between objects [75].

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    5. In SMT terminology, rendering and arranging according to corresponding chunks reify "commonalities in structure," while variation within corresponding chunks are "alignable differences" that users are predicted to notice.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    6. The prior SMT-informed tools in Section 2.3 for both code and natural language corpora suggest that the cognitive process of comparing texts may be no exception to the cognitive processes SMT predicts.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    7. SMT posits that visual alignment helps people perceive relational similarities and differences more clearly, thereby improving their ability to make meaningful comparisons and understand underlying patterns [28, 38, 47].

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    8. Structural Mapping Theory (SMT) is a long-standing well-vetted theory from Cognitive Science that describes how humans attend to and try to compare objects by finding mental representations of them that can be structurally mapped to each other (analogies).

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    9. This SMT-informed approach, which AbstractExplorer shares, tries to give this mental machinery "a leg up," letting users perhaps skip some steps by accepting reified cross-document relationships identified by the computer.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    10. The human perceptual, comparative mental machinery that SMT describes is part of what enables humans to form more abstract structured mental models from concrete examples, among other critical knowledge tasks.

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    11. These examples of text-centric lossless techniques do not abstract away or summarize; they strategically re-organize and re-render the existing text to help enhance readers' own perceptual cognition, informed by Structural Mapping Theory (SMT) [17].

      sentences that mention theory, explicitly or implicitly; one sentence at a time

    12. We process this data in a three-stage pipeline (Figure 6). In the first stage, Sentence Segmentation and Categorization, abstracts are split into individual sentences using the NLTK package, and each sentence is classified into one of the five pre-defined aspects as listed in Section 4.1.1. Classification is performed by prompting an LLM (see prompt used in Appendix D.1) with the sentence and its full abstract.

      sentence relating to methodology

    13. Then, we segment sentences within each aspect into grammar-preserving chunks (see prompt used in Appendix D.2). This results in grammatically coherent chunks that are the basis of structure patterns. After identifying chunk boundaries, we again prompt an LLM to generate labels for chunks in a human-in-the-loop approach: starting from an initial set of labels for chunk roles, when a new label is generated, a researcher from the research team examines the new label and merges it with existing labels if appropriate, controlling for the total number of labels.

      sentence relating to methodology

    14. In this study, we allowed participants to experience views of same-aspect sentences (Section 4.1.1) with different combinations of highlighting, ordering, and alignment (as described in Section 4.1.2 and Section 4.1.4) enabled or not, in order to understand which and/or what combinations most effectively supported users' ability to skim and read laterally across documents.

      sentence relating to methodology

    15. Inspired by GP-TSM [24], AbstractExplorer first segments sentences into grammar-preserving chunks—segments that respect grammatical boundaries, i.e., an LLM judges that the sentence can be truncated at that chunk boundary without breaking the grammatical integrity of the preceding text. Each chunk is then classified by an LLM as having one of nine pre-defined roles, each of which has its own assigned color.

      sentence relating to methodology

    16. We conducted a qualitative analysis of user study transcripts and survey responses using a Grounded Theory approach [8]. First, the lead researcher collected a list of participants' behaviors, approaches, reflections on their experience, and feedback about the interface. The researcher then systematically coded this data, revisiting the data multiples times and refining the codes to ensure consistency and coherence. Through this process, high-level themes were identified and organized using affinity diagramming. Once the thematic structure was finalized, the researcher gathered supporting evidence for each theme and synthesized the findings, which were reviewed by the research team to ensure agreement on the results.

      sentence describing how analysis was performed on data collected by the authors of this paper

    17. Activity log data, which revealed how participants actually used the interface, echoed the above findings. According to the log data, participants spent most of their reading time (66.31%) with vertical alignment on the second element in structure pairs, followed by alignment on the first element (29.19%), and left-justified alignment (5.13%). Highlighting usage showed a similar preference: 91.13% of time with all chunks highlighted, 8.25% with partial highlighting, and minimal time (0.63%) without highlights.

      sentence describing how analysis was performed on data collected by the authors of this paper

    18. In this section, we present findings on how AbstractExplorer supports comparative close reading at scale by integrating quantitative survey responses and log data with qualitative analysis of transcripts and open-ended responses. The qualitative analysis process is described in detail in Appendix H.

      sentence describing how analysis was performed on data collected by the authors of this paper

    19. Throughout the two tasks, we also collected detailed interaction logs including counts of user-defined aspects created, duration of highlighting usage, and time allocation across the three possible alignment options.

      sentence describing how analysis was performed on data collected by the authors of this paper

    20. Both gaze data and the semi-structured interviews revealed that lower NFC participants were more willing to be guided by the three features and took advantage of them consciously.

      sentence describing how analysis was performed on data collected by the authors of this paper

    21. Using a two-tailed Mann-Whitney U Test, we found that participants who reported their lowest perceived cognitive load when all three features were enabled had significantly lower NFC than participants who reported their lowest cognitive load level when skimming with no features enabled—in the baseline interface (p=0.03).

      sentence describing how analysis was performed on data collected by the authors of this paper

    22. For simplicity of analysis, we denote participants with NFC scores above the overall participants' median NFC of 5.42 (IQR = 0.583) as higher NFC, and lower NFC otherwise.

      sentence describing how analysis was performed on data collected by the authors of this paper

    23. To contrast participants' gaze patterns in each condition, we used a Tobii Pro Spark eye-tracker placed below the desktop monitor used by all subjects; Tobii Pro Lab software recorded each participant's gaze over time in each condition.

      sentence describing how analysis was performed on data collected by the authors of this paper

    24. We collected 80 sentences from our abstracts dataset labeled by our system as "Methodology/Contribution." Participants viewed the same 80 sentences in each condition—often with a different subset of sentences initially visible due to ordering changes—but only had two minutes to look at them in each condition.

      sentence describing how analysis was performed on data collected by the authors of this paper

    25. After obtaining an expanded set of high-level chunk labels, we assign them to each of the sentence chunks by using LLMs in a multiclass classification few-shot learning task, with the initial labels and assignment as examples (see prompt used in Appendix D.3).

      sentence describing how analysis was performed on data collected by the authors of this paper

    26. Then, we segment sentences within each aspect into grammarpreserving chunks (see prompt used in Appendix D.2). This results in grammatically coherent chunks that are the basis of structure patterns. After identifying chunk boundaries, we again prompt an LLM to generate labels for chunks in a human-in-the-loop approach: starting from an initial set of labels for chunk roles, when a new label is generated, a researcher from the research team examines the new label and merges it with existing labels if appropriate, controlling for the total number of labels.

      sentence describing how analysis was performed on data collected by the authors of this paper

    27. We process this data in a three-stage pipeline (Figure 6). In the first stage, Sentence Segmentation and Categorization, abstracts are split into individual sentences using the NLTK package, and each sentence is classified into one of the five pre-defined aspects as listed in Section 4.1.1. Classification is performed by prompting an LLM (see prompt used in Appendix D.1) with the sentence and its full abstract.

      sentence describing how analysis was performed on data collected by the authors of this paper