McAdams et al. (2022) distinguish two subcategories of these two types of blend: stable and transforming.
please find any claims that depend on citations referring to works by any of the present authors
Iqa' (plural iqa'at) is used to describe a rhythmic cycle. Iqa'at are made up of two different basic building blocks, the dum and tak, onomatopoeias derived from the sound produced on membranophones such as the darabuka.
please highlight anything related to music theory
H5. Being more culturally bound, musical cues that are learned, such as modal structures, metrical relations, and so on, will exert a greater influence on listeners' perceived valence ratings than on their arousal ratings.
please highlight anything related to music theory
This is a simple PDF file. Fun fun fun.
testing something new today
Using telemetry in SemanticCommit, we recorded every edit, "check for conflicts" action, "make change" action, and local and global resolution action.
sentences describing methods the authors used; one sentence at a time
A sub-task was considered a failure if the participant was unable to complete it within the time limit.
sentences describing methods the authors used; one sentence at a time
Finally, we conducted an informal interview about their experience.
sentences describing methods the authors used; one sentence at a time
After both tasks were completed, participants filled out a final survey to compare the two conditions.
sentences describing methods the authors used; one sentence at a time
We chose GPT-4o for performance and latency reasons, as it performed optimally against our evals.
sentences describing methods the authors used; one sentence at a time
We also ran evaluations of model latency and classification performance under varying false positive rates for the following LLMs by OpenAI: GPT-4o, GPT-4o-mini, and o3-mini.
sentences describing methods the authors used; one sentence at a time
For each task, participants were asked to integrate three new pieces of information into the memory, one at a time ("sub-tasks").
sentences describing methods the authors used; one sentence at a time
We ensured each list was 30 items long as our pilot studies suggested this was long enough that manual detection starts to become unwieldy (users need to scroll up and down the document), but short enough that participants could become familiar in a short period.
sentences describing methods the authors used; one sentence at a time
We adapted two intent specifications from our evals: Mars Game Design Document and Financial Advice AI Agent Memory, as these tasks mapped to the two paradigmatic types covered in Sections 2 and 2.1 (design documents, and AI memory of the user).
sentences describing methods the authors used; one sentence at a time
We recruited 12 participants (7 female, 5 male) through the mailing lists of two research universities and one multinational technology company.
sentences describing methods the authors used; one sentence at a time
We chose OpenAI's ChatGPT Canvas as a baseline for five reasons: (i) it is a popular, commercially available tool, hence it is likely familiar to users; (ii) it provides a document editing view, where users can select text and ask GPT to rewrite it, or chat with an AI to make global edits; (iii) it employs a similar class of model (GPT-4o); (iv) it supports editing features similar to SemanticCommit's, such as inline text selection, conflict highlighting, and a diff view, while adding free-form editing; and (v) similar interfaces like Anthropic Artifacts tended to rewrite the specification entirely, and did not offer Canvas's "diff" view to allow for a fair comparison.
sentences describing methods the authors used; one sentence at a time
With participant consent, we recorded audio and screen-casts, and participants were encouraged to think aloud.
sentences describing methods the authors used; one sentence at a time
Four coauthors created the evals, and two coauthors manually double-checked all conflicts, a process that took several days.
sentences describing methods the authors used; one sentence at a time
We ran one pilot study with five users of our card-based interface, and a second with four users of a revised interface.
sentences describing methods the authors used; one sentence at a time
Our explorations went through substantial iterations and prompt prototyping over a period of eight months, evolving in response to two pilot studies and progressing from a card-based interface to a list of texts.
sentences describing methods the authors used; one sentence at a time
We iterated on prompts using ChainForge [5] by setting up an evaluation pipeline against our datasets, which allowed us to observe the effects of prompt changes and model choices.
sentences describing methods the authors used; one sentence at a time
To measure statistical significance, we used Mann–Whitney–Wilcoxon tests and report the p-values.
sentences describing methods the authors used; one sentence at a time
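The Mann–Whitney–Wilcoxon test mentioned above can be sketched in a few lines. This is a minimal illustration only, not the authors' analysis code: it assumes two independent samples, uses tie-averaged ranks, and approximates the two-sided p-value with a normal distribution (continuity correction, no tie correction). The function names are hypothetical; in practice a library routine such as `scipy.stats.mannwhitneyu` would likely be used instead.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for sample x, approximate two-sided p-value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Assumption: normal approximation without a correction for ties.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = max((abs(u_x - mean_u) - 0.5) / sd_u, 0.0)  # continuity correction
    p_two_sided = math.erfc(z / math.sqrt(2))
    return u_x, p_two_sided
```

With samples as small as those in a 12-participant study, an exact test is usually preferable to the normal approximation used in this sketch.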
For qualitative analysis, the first author performed open coding on participant responses and audio transcripts to identify themes, which were used to interpret the qualitative results.
sentences describing methods the authors used; one sentence at a time
In the post-task surveys, we collected self-reported NASA Task Load Index (TLX) scores, Likert-scale ratings for ease of use, and responses on how well the AI helped participants identify, understand, and resolve semantic conflicts.
sentences describing methods the authors used; one sentence at a time
Each condition had a time limit of 15 minutes, after which the participant completed a post-task survey.
sentences describing methods the authors used; one sentence at a time
Before each task, participants received a tutorial on the assigned tool and were given five minutes to explore it using a test document.
sentences describing methods the authors used; one sentence at a time
Both the order of task assignment and tool assignment were counterbalanced and randomly assigned.
sentences describing methods the authors used; one sentence at a time
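One way to read the counterbalancing described above is as a balanced assignment over every combination of task order and tool order. The sketch below is an assumption about how such a schedule could be generated, not the authors' procedure; the function name and the seeded shuffle are hypothetical.

```python
import itertools
import random


def counterbalanced_schedule(participants, tasks, tools, seed=0):
    """Map each participant to a (task order, tool order) cell.

    Cells are used equally often when the number of participants is a
    multiple of the number of cells; which participant gets which cell
    is randomized with a seeded shuffle.
    """
    cells = list(itertools.product(itertools.permutations(tasks),
                                   itertools.permutations(tools)))
    repeats = -(-len(participants) // len(cells))  # ceiling division
    pool = (cells * repeats)[:len(participants)]
    random.Random(seed).shuffle(pool)
    return dict(zip(participants, pool))
```

With two tasks and two tools there are four cells, so 12 participants would place three participants in each cell.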
We conducted a controlled within-subjects study with mixed methods, comparing SemanticCommit with a baseline interface.
sentences describing methods the authors used; one sentence at a time
We run end-to-end on our four eval datasets using GPT-4o and GPT-4o-mini and report the mean ± stddev for accuracy, precision, recall, and F1 scores for the three approaches in Figure 5.
sentences describing methods the authors used; one sentence at a time
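The reported metrics can be illustrated with a short sketch for binary conflict labels. It assumes each run yields paired true and predicted labels and that the spread is the sample standard deviation; the source does not state which estimator was used, and both function names are hypothetical.

```python
import statistics


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    # Guard against empty denominators by reporting 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_and_std(values):
    """Return (mean, sample standard deviation) over repeated runs."""
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return statistics.mean(values), spread
```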
We compare our end-to-end system against two simpler methods: (i) DropAllDocs, which adds all documents to the context for conflict classification; and (ii) InkSync [56] which generates a JSON list of string-replace operations.
sentences describing methods the authors used; one sentence at a time
In order to minimize relevance assessment issues, we apply a PageRank-based relevance ranking over the KG, akin to HippoRAG [36].
sentences describing methods the authors used; one sentence at a time
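A PageRank-based relevance ranking of this kind can be sketched with plain power iteration. The version below is a personalized PageRank that restarts at a set of seed nodes; that variant, the adjacency-dict graph representation, the damping factor, and the function name are all assumptions for illustration, since the source does not give these details.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Rank nodes of a directed graph by relevance to a set of seed nodes.

    graph: dict mapping each node to a list of its out-neighbors
           (every neighbor must also appear as a key).
    seeds: non-empty collection of nodes that receive the restart mass.
    Returns (nodes sorted by descending score, dict of scores).
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    ordered = sorted(nodes, key=lambda n: rank[n], reverse=True)
    return ordered, rank
```

Seeding the walk at the nodes matched by an incoming edit means nodes that are only weakly connected to those seeds receive low scores and can be dropped before conflict classification.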
We implement the back-end using a knowledge-graph (KG) RAG architecture [36] consisting of two phases: pre-processing and inference.
sentences describing methods the authors used; one sentence at a time
This ordering prioritizes dominant structural patterns (largest groups first) while exposing fine-grained variations (via length-sorted triplets), mirroring how humans compare sentences, if SMT is an accurate description in this domain of comparative close reading.
sentences that mention theory, explicitly or implicitly; one sentence at a time
Structural mappings between objects are part of the cognitive process of comparison according to the Structure-Mapping Theory [17], and juxtaposition can facilitate humans in recognizing particular possible structural mappings between objects [75].
sentences that mention theory, explicitly or implicitly; one sentence at a time
In SMT terminology, rendering and arranging according to corresponding chunks reify "commonalities in structure," while variations within corresponding chunks are "alignable differences" that users are predicted to notice.
sentences that mention theory, explicitly or implicitly; one sentence at a time
The prior SMT-informed tools in Section 2.3 for both code and natural language corpora suggest that the cognitive process of comparing texts may be no exception to the cognitive processes SMT predicts.
sentences that mention theory, explicitly or implicitly; one sentence at a time
SMT posits that visual alignment helps people perceive relational similarities and differences more clearly, thereby improving their ability to make meaningful comparisons and understand underlying patterns [28, 38, 47].
sentences that mention theory, explicitly or implicitly; one sentence at a time
SMT provides a framework for understanding how humans compare two or more objects by finding common structural alignments between objects.
sentences that mention theory, explicitly or implicitly; one sentence at a time
Structural Mapping Theory (SMT) is a long-standing well-vetted theory from Cognitive Science that describes how humans attend to and try to compare objects by finding mental representations of them that can be structurally mapped to each other (analogies).
sentences that mention theory, explicitly or implicitly; one sentence at a time
This SMT-informed approach, which AbstractExplorer shares, tries to give this mental machinery "a leg up," letting users perhaps skip some steps by accepting reified cross-document relationships identified by the computer.
sentences that mention theory, explicitly or implicitly; one sentence at a time
The human perceptual, comparative mental machinery that SMT describes is part of what enables humans to form more abstract structured mental models from concrete examples, among other critical knowledge tasks.
sentences that mention theory, explicitly or implicitly; one sentence at a time
These examples of text-centric lossless techniques do not abstract away or summarize; they strategically re-organize and re-render the existing text to help enhance readers' own perceptual cognition, informed by Structural Mapping Theory (SMT) [17].
sentences that mention theory, explicitly or implicitly; one sentence at a time
Theory (SMT) to facilitate seeing both the overview and the details at the same time, facilitating abstraction without losing context.
sentences that mention theory, explicitly or implicitly; one sentence at a time
Inspired by GP-TSM [24], AbstractExplorer first segments sentences into grammar-preserving chunks—segments that respect grammatical boundaries, i.e., an LLM judges that the sentence can be truncated at that chunk boundary without breaking the grammatical integrity of the preceding text. Each chunk is then classified by an LLM as having one of nine pre-defined roles, each of which has its own assigned color.
sentence relating to methodology
AbstractExplorer classifies sentences into five pre-defined aspects common in CHI abstracts: Problem Domain, Gaps in Prior Work, Methodology/Contribution, Results/Findings, and Discussion/Conclusion.
sentence relating to methodology
We conducted a qualitative analysis of user study transcripts and survey responses using a Grounded Theory approach [8]. First, the lead researcher collected a list of participants' behaviors, approaches, reflections on their experience, and feedback about the interface. The researcher then systematically coded this data, revisiting the data multiple times and refining the codes to ensure consistency and coherence. Through this process, high-level themes were identified and organized using affinity diagramming. Once the thematic structure was finalized, the researcher gathered supporting evidence for each theme and synthesized the findings, which were reviewed by the research team to ensure agreement on the results.
sentence describing how analysis was performed on data collected by the authors of this paper
Activity log data, which revealed how participants actually used the interface, echoed the above findings. According to the log data, participants spent most of their reading time (66.31%) with vertical alignment on the second element in structure pairs, followed by alignment on the first element (29.19%), and left-justified alignment (5.13%). Highlighting usage showed a similar preference: 91.13% of time with all chunks highlighted, 8.25% with partial highlighting, and minimal time (0.63%) without highlights.
sentence describing how analysis was performed on data collected by the authors of this paper
In this section, we present findings on how AbstractExplorer supports comparative close reading at scale by integrating quantitative survey responses and log data with qualitative analysis of transcripts and open-ended responses. The qualitative analysis process is described in detail in Appendix H.
sentence describing how analysis was performed on data collected by the authors of this paper
Throughout the two tasks, we also collected detailed interaction logs including counts of user-defined aspects created, duration of highlighting usage, and time allocation across the three possible alignment options.
sentence describing how analysis was performed on data collected by the authors of this paper
Both gaze data and the semi-structured interviews revealed that lower NFC participants were more willing to be guided by the three features and took advantage of them consciously.
sentence describing how analysis was performed on data collected by the authors of this paper
Using a two-tailed Mann-Whitney U test, we found that participants who reported their lowest perceived cognitive load when all three features were enabled had significantly lower NFC than participants who reported their lowest cognitive load when skimming with no features enabled, i.e., in the baseline interface (p=0.03).
sentence describing how analysis was performed on data collected by the authors of this paper
The raw NASA-TLX score is the sum of all 6 NASA-TLX questions after reversing the appropriate questions.
sentence describing how analysis was performed on data collected by the authors of this paper
To compute a participant's NFC score, we averaged their response to the six questions, each ranging from 1 to 7, after reversing the appropriate questions.
sentence describing how analysis was performed on data collected by the authors of this paper
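The NFC scoring rule above (average of six 1-to-7 responses after reversing some items) can be sketched as follows. Which items are reverse-keyed is not stated in the source, so the `reversed_items` argument and both function names are hypothetical.

```python
def reverse_item(response, low=1, high=7):
    """Reverse-score a Likert response, e.g. 1 becomes 7 on a 1-to-7 scale."""
    return low + high - response


def nfc_score(responses, reversed_items):
    """Average the responses after reversing the items listed in reversed_items.

    responses: list of ratings on a 1-to-7 scale.
    reversed_items: set of 0-based positions that are reverse-keyed
                    (hypothetical; the source does not list them).
    """
    adjusted = [reverse_item(r) if i in reversed_items else r
                for i, r in enumerate(responses)]
    return sum(adjusted) / len(adjusted)
```

The raw NASA-TLX score described earlier follows the same reversal step but sums the adjusted responses instead of averaging them.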
For simplicity of analysis, we denote participants with NFC scores above the overall participants' median NFC of 5.42 (IQR = 0.583) as higher NFC, and lower NFC otherwise.
sentence describing how analysis was performed on data collected by the authors of this paper
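The median split described above can be written as a short helper. This is a sketch under the stated rule (strictly above the median counts as higher NFC, everything else as lower); the function name and the participant-ID keys are hypothetical.

```python
import statistics


def median_split(scores):
    """Label each participant 'higher' if above the group median, else 'lower'.

    scores: dict mapping a participant ID to that participant's NFC score.
    """
    cutoff = statistics.median(scores.values())
    return {pid: ("higher" if score > cutoff else "lower")
            for pid, score in scores.items()}
```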
To contrast participants' gaze patterns in each condition, we used a Tobii Pro Spark eye-tracker placed below the desktop monitor used by all subjects; Tobii Pro Lab software recorded each participant's gaze over time in each condition.
sentence describing how analysis was performed on data collected by the authors of this paper
We collected 80 sentences from our abstracts dataset labeled by our system as "Methodology/Contribution." Participants viewed the same 80 sentences in each condition—often with a different subset of sentences initially visible due to ordering changes—but only had two minutes to look at them in each condition.
sentence describing how analysis was performed on data collected by the authors of this paper
After obtaining an expanded set of high-level chunk labels, we assign them to each of the sentence chunks by using LLMs in a multiclass classification few-shot learning task, with the initial labels and assignment as examples (see prompt used in Appendix D.3).
sentence describing how analysis was performed on data collected by the authors of this paper
Then, we segment sentences within each aspect into grammar-preserving chunks (see prompt used in Appendix D.2). This results in grammatically coherent chunks that are the basis of structure patterns. After identifying chunk boundaries, we again prompt an LLM to generate labels for chunks in a human-in-the-loop approach: starting from an initial set of labels for chunk roles, when a new label is generated, a researcher from the research team examines the new label and merges it with existing labels if appropriate, controlling for the total number of labels.
sentence describing how analysis was performed on data collected by the authors of this paper
We process this data in a three-stage pipeline (Figure 6). In the first stage, Sentence Segmentation and Categorization, abstracts are split into individual sentences using the NLTK package, and each sentence is classified into one of the five pre-defined aspects as listed in Section 4.1.1. Classification is performed by prompting an LLM (see prompt used in Appendix D.1) with the sentence and its full abstract.
sentence describing how analysis was performed on data collected by the authors of this paper
After the interviews, we analyzed the data using the process described in Appendix B.
sentence describing how analysis was performed on data collected by the authors of this paper
We conducted a study with 12 blind SR users.
sentence about testing
We conducted a collaborative, user-centered design study with a team of scientific researchers.
sentence about testing
We conducted quantitative (n=70) and qualitative (n=30) studies with healthcare experts.
sentence about testing
We conducted a qualitative study with 16 blind and visually impaired (BI) developers.
sentence about testing
We conducted role-playing exercises with 24 US journalists.
sentence about testing
We conducted a study with 32 blind SR users.
sentence about testing
We conducted quantitative (N=79) and qualitative (N=93) studies with healthcare experts.
sentence about testing
We conducted a qualitative study with 35 blind and visually impaired (VI) developers.
sentence about testing
We conducted a study with 32 blind US users.
sentence about testing