Notations that seem obvious today often have relatively short histories: for instance, arrows in diagrams emerged around the 18th century.
- Last 7 days
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
-
Notations are deployed and embedded throughout the process of HCI and software development.
One or more sentences contextualizing the current work with typically uncited statements about the past.
-
Almost everything we do with computers involves notations.
One or more sentences contextualizing the current work with typically uncited statements about the past.
-
These informal interactions can then lead to formal representations, but depend upon pre-existing formalisms known to both humans and AI.
-
Many notations are culturally learned and inherited.
One or more sentences contextualizing the current work with typically uncited statements about the past.
-
From our analysis, we derive a set of initial implications for the design of future systems that create new abstractions (Section 5), including that notations primarily originate through linking metaphors and most often in a social—rather than a technical—context, and that notation design decisions around what to include as "meaningful" (and thus what to exclude) are often left implicit by inventors, but could be made explicit and become manipulable objects through reification [10].
-
Our analysis identifies 33 patterns of how notations are created, evolved, and formalized over time, which are largely shared across histories and loosely categorized into three social stages of development (invention/incubation, dispersion/divergence, and institutionalization/sanctification) and three functional stages (descriptive, generative, and evaluative).
-
What about novel formalisms and notations? How are new abstractions created, evolved, and incrementally formalized over time—and how might new systems, in turn, be explicitly designed to support these processes?
-
How might we co-create a new notation with a machine, and thereafter communicate through that notation, even share out the notation to broader communities?
-
While current AI systems support "horizontal" translations from informal ideas to established notations, how should we ensure that the "vertical" process of creation—new notations, new abstractions—is also supported?
-
How do humans ultimately develop new notations, new formalisms, and new abstractions, that they use to communicate with machines and each other?
-
The use of notation happens every day in small ways, e.g., whenever people work together over a whiteboard or paper towards a joint objective. People jot down X's, boxes, and arrows to stand for the concepts they are working through.
-
Human-computer interactions have historically been mediated by formally-defined structures—such as command-line interfaces, graphical user interfaces, and programming languages—that provide an unambiguous mapping to an underlying formal model.
-
- Jun 2026
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
In supervised and semi-supervised machine learning (ML) pipelines, labeled data is a vital component of training and validating models [46].
An individual sentence describing the setting in which this work was done.
-
- May 2026
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
The tool also provided reflective value. Participants reported that it helped articulate what matters to them and why. Beyond research settings, individuals can use the framework to audit which dimensions drive their own sense of ownership, select AI tools that respect those priorities (e.g., suggestion-only assistance for high-Control creators), and mediate collaboration by visualizing divergent ownership profiles when teammates disagree about contribution and credit.
IMPLICATIONS
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
In multitasking, tasks compete for limited sensory, motor, and central (cognitive) capacities.
-
Visual objects that are unique in their visual primitives attract users' attention.
-
Interaction is a concept that is fundamental in HCI and specific to this field [357]. Intuitively, it refers to the reciprocal influence between people and an interactive system that takes place through the user interface.
-
Users continuously adapt their social behavior to compensate for the lack of social cues in computer-mediated communication.
-
Users' performance in providing input to a computer is limited by a speed–accuracy trade-off.
-
A key technical construct in HCI is the user interface. It refers to the parts of an interactive system that the user comes into contact with or that in other ways shape the user's perception of the system.
a sentence that describes a concept
-
In HCI, evaluation refers to the application of some systematic methodology to attribute human-related values to an artifact, prototype, system, or process.
a sentence that describes a concept
-
A special part of a computing system is the user interface. It is the part that the user can see and utilize to control the computer.
a sentence that describes a concept
-
Programmability lends computers their power as tools.
a sentence that describes a concept
-
- Apr 2026
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
Theories of rationality have increased our understanding of how users fail to be optimal.
sentence that describes theories in the abstract
-
MDP is a formalism that originates from studies of sequential decision-making in artificial intelligence and operations research. Instead of the choice between n actions, MDP deals with environments where rewards are delayed (or distal). This requires an ability to plan actions as part of sequences instead of one-shot choices.
sentence that mentions implicitly or explicitly a particular theory about computing or information
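The contrast between one-shot choices and planning over sequences can be sketched in a few lines. The following is a minimal illustration, not taken from the source: the three states, the action names, the reward values, and the discount factor are all hypothetical. It uses value iteration, a standard way to solve small MDPs, to show that an action with an immediate cost can still be the best choice once delayed rewards are counted.

```python
# Toy deterministic MDP. States, actions, and rewards are hypothetical.
# transitions[state][action] = (next_state, immediate_reward)
transitions = {
    "start": {"wait": ("start", 0.0), "go": ("middle", -1.0)},
    "middle": {"back": ("start", 0.0), "go": ("goal", 10.0)},
    "goal": {},  # terminal state: no actions available
}
GAMMA = 0.9  # discount factor (assumed value)


def value_iteration(transitions, gamma, sweeps=100):
    """Estimate the long-run value of each state by repeated backups."""
    values = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        for state, actions in transitions.items():
            if actions:
                values[state] = max(
                    reward + gamma * values[nxt]
                    for nxt, reward in actions.values()
                )
    return values


def greedy_policy(transitions, values, gamma):
    """Pick, in each state, the action with the best reward-plus-future-value."""
    return {
        state: max(
            actions,
            key=lambda a: actions[a][1] + gamma * values[actions[a][0]],
        )
        for state, actions in transitions.items()
        if actions
    }


values = value_iteration(transitions, GAMMA)
policy = greedy_policy(transitions, values, GAMMA)
print(policy)  # the planned action in "start" is "go", despite its immediate cost
```

A chooser that looked only at immediate rewards would prefer "wait" (reward 0) over "go" (reward -1) in the start state; planning over the sequence reverses that preference.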
-
Information scent refers to a user's intuition that a cue in the interface represents the information needed. It is an estimation of relevance based on a proximal cue.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
-
IFT proposes that information-seeking behavior develops to maximize the rate of information gained per unit of time or effort invested.
sentence that mentions implicitly or explicitly a particular theory about how humans think or act
-
Information foraging refers to information-seeking activities such as navigating, exploring, comparing, searching, or manipulating information contents in an information space.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
-
A payoff refers to the benefits that are left after the costs have been subtracted.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
-
To state that a user's choice is rational means that it is selected with the expectation that it yields the highest utility out of the available options.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
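The two definitions above (payoff as benefit minus cost, and rational choice as picking the option with the highest expected utility) compose naturally. The sketch below is an illustration under stated assumptions: the option names, probabilities, benefits, and costs are invented, and utility is simply equated with expected payoff.

```python
def payoff(benefit, cost):
    """Benefit that is left after the cost has been subtracted."""
    return benefit - cost


def expected_utility(outcomes):
    """outcomes: list of (probability, benefit, cost) tuples for one option."""
    return sum(p * payoff(benefit, cost) for p, benefit, cost in outcomes)


def rational_choice(options):
    """Return the name of the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Hypothetical options: a shortcut that sometimes fails vs. a slower, reliable menu.
options = {
    "keyboard_shortcut": [(0.8, 5.0, 1.0), (0.2, 0.0, 1.0)],
    "menu": [(1.0, 4.0, 1.5)],
}

print(rational_choice(options))
```

With these made-up numbers the shortcut has the higher expected utility even though it fails one time in five, which is the sense in which a choice can be rational without being guaranteed to pay off.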
-
Rational analysis is a theory of rational behavior proposed by Anderson and Schooler [21]. It examines the distribution of rewards in the environment to explain how users adapt their behavior. According to rational analysis, behavior is sensitive to the statistical distribution of rewards in the environment that a user has experienced.
sentence that mentions implicitly or explicitly a particular theory about how humans think or act
-
They share a focus on the emergence of interactive behavior; in other words, they predict how users choose to behave in certain given circumstances.
sentence that describes theories in the abstract
-
Utility refers to the agent's consideration of positive and negative rewards when deciding how to act.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
-
Bounded rationality states that we are only rational to the extent allowed by the involved constraints, or bounds.
sentence that mentions implicitly or explicitly a particular theory about how humans think or act
-
The term satisficing is used to describe how users tend to behave when facing a complex decision-making problem. It refers to settling on a satisfactory but not optimal solution in the normative sense.
sentence that mentions implicitly or explicitly a particular concept relevant to HCI
-
-
andrewhead.info andrewhead.info
-
The author wants to augment the formula to explain the meaning of the terms on either side of the arrow—first
sentence that describes the goals of the intended user
-
they often benefit from being augmented with descriptive elements, such as labels describing the meaning of an expression or colors linking an identifier to its description in the text.
sentence that describes the goals of the intended user
-
In this walkthrough, the author is trying to add labels to the formula V(s_t) ← R_t to describe the meaning of its terms in an article they are writing.
sentence that describes the goals of the intended user
-
Our design was motivated by two major goals for notation authoring. These goals followed from recent studies of notation augmentation [30, 71] and conversations with scientists who had experience writing notation in instructional materials and research communications (4 professors, 2 graduate students, R1–6).
sentence that describes who the system is designed for
-
We define the key projections as markup (in this case, LaTeX), an annotatable render, and a structure hierarchy view. Augmentations are made easy to invoke, and projections are kept synchronized and co-present so that authors can shift between representations as is expedient to them.
sentence that describes the characteristics that define the proposed system
-
the challenge of using these tools is that annotations are unmoored from the structure of the formula and must be redone whenever the formula changes. Authors must perform precision positioning and sizing operations that could be inferred from the coordinates of the augmented expressions.
sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals
-
these markup languages can require cumbersome and error-prone editing, arising from the intermixing of annotation markup with the underlying formula. Participants in a study by Wu et al. [71] identified difficulty with debugging nested braces and locating markup to edit.
sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals
-
FreeForm, a projectional editor wherein authors can augment formulas—with color, labels, spacing, and more—across multiple synchronized representations. Augmentations are created graphically using direct selections and compact menus. Those augmentations propagate to LaTeX markup, which can itself be edited and easily exported.
sentence that describes the characteristics that define the proposed system
-
Authors of typeset formulas augment those formulas to make them easier to understand.
sentence that describes who the system is designed for
-
-
people.eecs.berkeley.edu people.eecs.berkeley.edu
-
Ply pairs this LLM-supported program decomposition with visualization and parameterization UIs, permitting users to use interactions beyond chat to compose their programs incrementally.
sentence that describes the characteristics that define the proposed system
-
designing complex behavior can be a difficult programming task, and program representations in end-user programming tools may not be well-suited for heavy programs.
sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals
-
users can develop, test, and tweak program components, exploring possibilities for how data can be transformed and composed to discover and achieve goals.
sentence that describes the goals of the intended user
-
It encourages program decomposition into "layer" abstractions, it automatically creates visualizations of event payloads at layer boundaries to help users understand layer behavior without having to read the underlying generated code, and it constructs ad hoc parametrization interfaces that allow users to configure important dimensions of the behavior of each layer without having to re-author it.
sentence that describes the characteristics that define the proposed system
-
Ply maintains the simplicity of a straightforward connection between a trigger and action but provides a structure within which users can enlist an LLM to specify the behavior of each trigger and action.
sentence that describes the characteristics that define the proposed system
-
However, such LLM-authored code, especially when implementing nontrivial logic, can be difficult to specify, understand or debug. Users need appropriate tools and handles to understand and make changes to the computation that is being performed in such code.
sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals
-
Trigger-action programming offers an elegant interface to construct simple programs that result in customized behavior for software or devices.
sentence that describes the conditions for which the system is designed
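For readers unfamiliar with the pattern, a trigger-action program can be sketched as a list of rules, each pairing a predicate over events with an effect. This is a generic illustration only, not Ply's implementation: the `Rule` class, the `dispatch` helper, the event fields, and both example rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    trigger: Callable[[dict], bool]  # predicate over an incoming event
    action: Callable[[dict], str]    # effect, returned here as a description


def dispatch(rules, event):
    """Run the action of every rule whose trigger matches the event."""
    return [rule.action(event) for rule in rules if rule.trigger(event)]


# Hypothetical smart-home rules.
rules = [
    Rule(
        "lights_on_at_dusk",
        trigger=lambda e: e.get("type") == "sunset",
        action=lambda e: "turn on porch light",
    ),
    Rule(
        "notify_on_motion",
        trigger=lambda e: e.get("type") == "motion" and e.get("room") == "garage",
        action=lambda e: f"notify: motion in {e['room']}",
    ),
]

print(dispatch(rules, {"type": "motion", "room": "garage"}))
```

Each rule is a single link from one trigger to one action, which is what keeps such programs simple and also what limits their expressivity.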
-
Trigger-action programming has been a success in end-user programming. Traditionally, the simplicity of links between triggers and actions limits the expressivity of such systems. LLM-based code generation promises to enable users to specify more complex behavior in natural language. However, users need appropriate ways to understand and control this added expressive power.
sentence that describes the conditions for which the system is designed
-
-
kgajos.seas.harvard.edu kgajos.seas.harvard.edu
-
by triangulating our empirical findings with existing theoretical models from the literature, we found that the existing models of technology adoption require new theory components to be able to describe the technology adoption processes of our participants. In particular, we identified an additional phase that is prominent among the participants, intention to learn, but does not appear in prior models. Then, we identified three new factors that significantly influence their technology acceptance but which are, again, not represented in the existing models: self-efficacy, conversion readiness, and peer support.
sentences about extending existing theoretical models with research findings
-
Our preliminary results indicate that there is an additional phase, the intention to learn, and three related factors, self-efficacy, conversion readiness, and peer support, that significantly influence the acceptance of mobile technologies among the participants, but are not represented in the existing models. With these findings, we propose a tentative theoretical model that extends the existing theories to explain the ways in which our participants came to accept mobile technologies.
sentences about extending existing theoretical models with research findings
-
-
www.eecs.harvard.edu www.eecs.harvard.edu
-
Then, by triangulating our empirical findings with existing theoretical models from the literature, we found that the existing models of technology adoption require new theory components to be able to describe the technology adoption processes of our participants.
sentences about extending existing theoretical models with research findings
-
We identified three distinct factors that influence older adults' technology acceptance behaviors, particularly the intention to learn phase, that are not represented in prior models: self-efficacy, conversion readiness, and peer support.
sentences about extending existing theoretical models with research findings
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
The performance of the system must be reliable and controllable. Its behavior should be safe, and the way it is designed and used should be ethical [768]. Users need to trust the system's decisions and ability. It should be made clear to the user what it can and cannot do.
statements that describe assertions of desirable system properties
-
- Mar 2026
-
www.eecs.harvard.edu www.eecs.harvard.edu
-
We also identified the factors that are critical to older adults but did not appear in the existing models. Finally, we applied the existing vocabulary to our model to comply with the conventional terms in the field.
sentences that implicitly or explicitly mention theory
-
Again following grounded theory practices from [33], we compared the model that emerged from our data with existing theoretical models of technology acceptance to determine differences and similarities between them.
sentences that use or mention grounded theory
-
Again following grounded theory practices from [33], we compared the model that emerged from our data with existing theoretical models of technology acceptance to determine differences and similarities between them.
sentences that implicitly or explicitly mention theory
-
Employing the grounded theory method [33], we allowed recurring themes and concepts in relation to technology acceptance behaviors to arise from the data itself.
sentences that use or mention grounded theory
-
We inductively analyzed the first-round interview data using thematic analysis based on a grounded theory approach [33]. Grounded theory methods build theory iteratively from the data, using rigorous coding practices. Initial open codes are primarily descriptive. These may be combined into more sophisticated related sets of descriptors, in which each set is referred to as an axial code. Subsequently, axial codes are combined into more theoretically powerful code complexes, called selective codes. Our approach included a process of open coding, axial coding, and selective coding.
sentences that use or mention grounded theory
-
With these findings, we propose a tentative theoretical model that extends the existing theories to explain the ways in which our participants came to accept mobile technologies.
sentences about extending existing theoretical models with research findings
-
Triangulating the empirical findings from our preliminary results with the existing theoretical models, we proposed an extension of the existing theoretical models that explains the technology acceptance behavior of our participants who were aged 60 or over.
sentences that implicitly or explicitly mention theory
-
Consolidating our preliminary findings with the existing models, we propose an extended technology acceptance model for older adults, illustrated in Figure 3. Extending its predecessor theories, our tentative model introduces the perceived effort of learning a new technology as an obstacle to older adults' technology acceptance, which has not been reported in any studies of younger adults' technology acceptance.
sentences that implicitly or explicitly mention theory
-
Using TAM, UTAUT, and several other works as theoretical underpinnings, Renaud and van Biljon proposed a model to explain older adults' mobile phone adoption.
sentences that implicitly or explicitly mention theory
-
Although many researchers have sought to understand and predict technology acceptance behavior, there has been relatively little effort to build a theoretical model for older adults, with one exception (STAM).
sentences that implicitly or explicitly mention theory
-
Extending the original TAM and consolidating the constructs of several other existing models, Venkatesh et al. proposed the Unified Theory of Acceptance and Use of Technology (UTAUT) [37].
sentences that implicitly or explicitly mention theory
-
Ajzen's theory of planned behavior [1, 2] posits that a specific behavior is the result of an intention to carry it out, and that intention is determined by attitudes, norms, and the perception of control over the behavior. Drawing upon this theory of planned behavior, Davis et al. developed the technology acceptance model (TAM) [10].
sentences that implicitly or explicitly mention theory
-
Then, by triangulating our empirical findings with existing theoretical models from the literature, we found that the existing models of technology adoption require new theory components to be able to describe the technology adoption processes of our participants.
sentences that implicitly or explicitly mention theory
-
Technology acceptance has been widely studied, and several models have been proposed and tested [10, 37]. However, the HCI literature lacks a comprehensive explanation of technology acceptance among older adults.
sentences that implicitly or explicitly mention theory
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
Established theories of human cognition describe how exposure to variation and consistency within prescribed structures can help people more robustly form mental models of a phenomenon, e.g., how an LLM behaves. Specifically, in line with Variation Theory [35], the features we instantiate identify patterns of consistency (Figure 1d, "Exact Matches"), variation (Figure 1c, "Unique Words"), or both (Figures 1a, 1b, "Positional Diction Clustering (PDC)"—a novel algorithm we introduce in this paper). In line with Analogical Learning Theory [13], PDC highlights analogous text across LLM responses, i.e., positionally consistent and similar in diction, such that users can see emergent relationships.
sentences that implicitly or explicitly mention theory
-
users may want to select the best option from among many, compose their own response through bricolage, consider many ideas during ideation, audit a model by looking at the variety of possible responses, or compare the functionality of different models or prompts.
sentences about intended user's goals
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
dialogue, as a form of interaction, is not limited to speech and language even though this is often our first interpretation of the term "dialogue."... the concepts of dialogue are applicable across modalities.
highlight the most important assumptions, conclusions, and points of the paper
-
An FSM is a model of discrete computation applicable to dialogues. In computer science, an FSM is a special case of a Turing machine that reads but does not write on the tape.
Please highlight key definitions.
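A dialogue modeled as an FSM can be written down as a transition table. The sketch below is an assumption-laden illustration, not drawn from the source: the login dialogue, its state names, and its input symbols are invented, and inputs that have no transition from the current state are simply ignored.

```python
# Hypothetical login dialogue. Keys are (current_state, input_symbol).
TRANSITIONS = {
    ("await_username", "text"): "await_password",
    ("await_password", "text"): "logged_in",
    ("await_password", "cancel"): "await_username",
    ("logged_in", "logout"): "await_username",
}


def run(start, inputs):
    """Feed input symbols to the machine and return the final state."""
    state = start
    for symbol in inputs:
        # Assumption: an input with no defined transition leaves the state unchanged.
        state = TRANSITIONS.get((state, symbol), state)
    return state


print(run("await_username", ["text", "text"]))
```

The machine only reads its inputs and keeps a single current state; it has no writable memory, which is the restriction the definition above refers to.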
-
Formal models of computation are suitable for describing discrete, moded dialogues. A mode refers to the variation in the interpretation of a user's input according to an internal state. In a modeless dialogue, all inputs are possible in all states and their interpretation is always the same.
gimme some software concepts that are color coded and categories
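The definition of a mode, the same input being interpreted differently depending on internal state, can be made concrete with a two-mode text editor. This is a hypothetical sketch loosely inspired by modal editors; the key names and behaviors are assumptions chosen for illustration.

```python
def press(mode, buffer, key):
    """Return the (mode, buffer) pair that results from pressing one key."""
    if mode == "command":
        if key == "i":
            return "insert", buffer        # "i" switches to insert mode
        if key == "x":
            return "command", buffer[:-1]  # "x" deletes the last character
        return "command", buffer           # other keys are ignored here
    # Insert mode: every key except ESC is typed into the buffer.
    if key == "ESC":
        return "command", buffer
    return "insert", buffer + key


mode, buffer = "command", ""
for key in ["i", "x", "i", "ESC", "x"]:
    mode, buffer = press(mode, buffer, key)

print(mode, repr(buffer))
```

The keys "i" and "x" each appear twice in the sequence and mean something different each time, depending on the mode in force. A modeless design would give each key one interpretation in every state.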
-
One thing that is missing is an account of how beliefs about the computer are formed and updated and how they drive action specification. The current understanding is that users form internal models that predict how their actions produce perceived outputs, and they learn to minimize prediction errors.
I want to highlight things that are novel (not simply tool stuff)
-
both the computer and the human participate in establishing a shared context. The computer does not simply receive a message; it also communicates the effects of that message.
I want to highlight things that are novel (not simply tool stuff)
-
Robustness refers to the communication partners' ability to achieve shared understanding even in light of misunderstandings and other unanticipated troubles.
Highlight sentences that give a definition of a concept.
-
Communication repair refers to the "work of restoring shared understanding" when conversational partners misunderstand each other.
Highlight sentences that give a definition of a concept.
-
Mixed-initiative interaction is the idea of organizing interaction in dialogue where both the computer and the human can take initiative.
Highlight sentences that give a definition of a concept.
-
Dialogue can be understood as computation, goal-directed action, communication, or embodied action. Each perspective provides specific methods for the analysis and design of dialogue.
Highlight the sentences that capture the main point of this chapter
-
Dialogue interaction includes speech-based and graphical interactions.
Highlight the sentences that capture the main point of this chapter
-
The core elements of dialogue are communication turns, the communication context, and turn interpretation.
Highlight the sentences that capture the main point of this chapter
-
Dialogue is about the organization of communication as a series of turns between communication partners.
Highlight the sentences that capture the main point of this chapter
-
The key idea in the dialogue view of interaction is the organization of communication as a series of turns. Dialogue evolves through communication turns between two or more partners. In one turn, an appropriate communication act is made by one partner based on the communication context. The act aims to get the other partner to do or understand something. This understanding then forms the context within which the other partner takes their turn.
Highlight the sentences that capture the main point of this chapter
-
Interaction may be viewed as a dialogue, that is, a conversation that occurs between two partners in a context for some purpose.
Highlight the sentences that capture the main point of this chapter
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
TAM posits that the intention to adopt a particular technology is driven by two kinds of perceptions: (1) how easy it is to use a system and (2) how useful it will be to use it [180]. Furthermore, the perceived ease of use affects the perceived usefulness: If technology is hard to use, it is less useful.
Highlight what you think good software concepts would be and segment them by color coded categories.
-
it is perfectly possible to have a program which is structured, modular, readable, flexible, self-documenting, maintainable, which performs its specified function, and which is a source of constant frustration and irritation to its users.
Highlight what you think good software concepts would be and segment them by color coded categories.
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
Text entry can also be seen as a task where different subtasks are shared between the human and the computer (Chapter 20).
a statement that describes a type of user task
-
One example is autocorrect, which automatically corrects typing errors while the user is typing. Another example is the use of word predictions, which allow the user to select a word from a set of word suggestions instead of typing out the word in full.
sentence describing examples of a concept
-
For example, text entry methods such as eye typing are designed to allow nonspeaking users with motor disabilities to enter text using their eye movements only.
sentence describing examples of a concept
-
Text entry is also a good example of tool use (Chapter 19). A text entry method is a tool that allows the user to communicate with someone or something, typically other people or a service, using asynchronous text messages and longer documents.
a statement that describes a type of user task
-
In teleological determination, goals or purposes determine interaction in some way.
a sentence defining a concept
-
What happens in interaction is mutually determined by the human and the computer. In other words, what happens in interaction cannot be attributed solely to the human or the computer—the two must be considered together.
a sentence describing the concept of interaction
-
"Average movement time can be predicted as linear regression to the index of difficulty."
a statement that is a claim about the world as described by a particular theory
-
"The difficulty of selecting a target is proportional to its distance and inversely proportional to its width (index of difficulty)."
a statement that is a claim about the world as described by a particular theory
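The two quoted claims describe Fitts' law. A common way to write it, assumed here since the quotes do not give the formula, is the Shannon formulation: ID = log2(D / W + 1) and MT = a + b * ID. The sketch below uses that formulation; the intercept `a` and slope `b` are hypothetical placeholders, since in practice they are fitted by linear regression to measured movement times.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of the index of difficulty, in bits."""
    return math.log2(distance / width + 1)


def movement_time(distance, width, a, b):
    """Predicted average movement time: intercept a plus slope b per bit."""
    return a + b * index_of_difficulty(distance, width)


# Hypothetical regression coefficients: a in seconds, b in seconds per bit.
A, B = 0.1, 0.15

# A target 100 units wide at a distance of 700 units: log2(700 / 100 + 1) = 3 bits.
print(index_of_difficulty(700, 100))
print(movement_time(700, 100, A, B))
```

Doubling the target width or halving the distance lowers the index of difficulty, and the predicted movement time falls with it.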
-
"If the user tries to increase speed, accuracy will be compromised, and vice versa: An increase in accuracy reduces speed."
a statement that is a claim about the world as described by a particular theory
-
For example, they can talk about information, difficulty, working memory, and so on.
sentence describing examples of a concept
-
A proposition is a claim about the world.
a sentence defining a concept
-
Interaction also occurs in different contexts, including work, leisure, and in-between contexts such as commuting.
sentence describing examples of a concept
-
Interaction is a dynamic phenomenon that unfolds over time as users and computers influence each other.
sentence describing the concept of interaction
-
It has been used to describe individuals, groups, and communities using computers.
sentence describing examples of a concept
-
Interaction is a core notion in HCI and refers to the mutual influence between people and computers.
sentence describing the concept of interaction
-
Pressing a button takes about a hundred milliseconds; adopting an information system in a large organization can easily take months.
sentence describing examples of a concept
-
We have used it to discuss various applications, from a user typing on a smartphone to a team of information workers communicating via email.
sentence describing examples of a concept
-
-
ericbaumer.com ericbaumer.com
-
Such points about the origins of data and the processes of their collection are a key factor in civic text visualization. Indeed, a shift to emphasizing paradata can help draw attention to the representativeness of data.
Show alternative approaches to text visualization beyond analytics
-
On the other side of this spectrum, at the detail level, articulating nuanced information present in raw text data can enable civic leaders to peruse and sublimate critical insights.
Show alternative approaches to text visualization beyond analytics
-
In contrast, we could consider designing explicitly for multiple users. Doing so requires more than designing for different levels of expertise (see the following subsection for more on expertise) or designing for collaborative use, though both those things may be valuable in their own right. Rather, this dimension encourages accounting for the different types of relationalities that users may have with a system [cf. BB17].
Show alternative approaches to text visualization beyond analytics
-
Civic text visualizations similarly designed to foreground interpretation could help make clearer who is making these interpretive decisions, thereby highlighting the lack of neutrality and objectivity in data [DK20].
Show alternative approaches to text visualization beyond analytics
-
work on visualization evaluation [SP06; IZCC08; LBI*12] has emphasized the importance of close attention to the various contexts in which a visualization will be applied.
Show alternative approaches to text visualization beyond analytics
-
It is informative to contrast this analytic emphasis with other evolving discourses in information visualization. The prior work reviewed above illustrates a few alternative orientations, including rhetoric [HD11], feminism [DK16; DK20], ethics [Cor19], and others [DFCC13; VW08].
Show alternative approaches to text visualization beyond analytics
-
For instance, CommunityPulse provides a scaffolding for multifaceted public input analysis using visualizations [JHSM21], and MultiConVis enables multilevel exploration and analysis of threaded conversations [HC16b].
Find civic text visualization systems that are explicitly named.
-
For example, CommunityPulse [JHSM21] uses common, simple visualizations and iconography, such as bar charts and emojis, to provide overviews of people's emotions towards civic agendas and ideas. Similarly, ConsiderIt [KMF*12b] uses bar charts to visualize people's stance towards ballot measures.
Find civic text visualization systems that are explicitly named.
-
For instance, visual analytic systems such as MultiConVis [HC16b] use multiple connected views to enable analysts to filter and explore text data at multiple levels.
Find civic text visualization systems that are explicitly named.
-
Tools such as ConsiderIt [KMF*12b] or Opinion Space [FBRG10] are designed specifically for the public. In contrast, tools such as CommunityPulse [JHSM21] or CommunityClick [JKW*21] are focused more on supporting community leaders and decision makers.
Find civic text visualization systems that are explicitly named.
-
For example, MultiConVis [HC16b] makes prescriptive statements not only as to the sentimental valence of individual conversations but also as to the topics that each conversation is about. Similarly, ConsiderIt [KMF*12b] asks participants to place individual statements as either supporting or opposing a given ballot proposition.
Find civic text visualization systems that are explicitly named.
-
Consider how systems such as MultiConVis [HC16b] and CommunityClick [JKW*21] provide visual representations to help the viewer understand the structure and content of conversations.
Find civic text visualization systems that are explicitly named.
-
tools such as ConsiderIt [KMF*12b] and CommunityPulse [JHSM21] prominently feature specific comments from members of the public (i.e., the data).
Find civic text visualization systems that are explicitly named.
-
Improving the public input process has become an important goal in the field of digital civics [MNC*19; VCL*16; OW15]. To that end, researchers and practitioners have developed a variety of systems for, e.g., sharing public opinions [FBRG10], building consensus [KMF*12a; ZNB15], summarizing public input [19], or identifying people's priorities, reflections, and hidden insights [JHSM21].
Highlight all civic participation approaches
-
Previous work has introduced several online engagement platforms to enable the public to asynchronously provide their comments, ideas, and feedback around civic issues [19; 20b; MJN*18]. These engagement tools have used micro-tasks [MJN*18], visualizations [19], and forum-like discussions [20b] to engage disconnected and disenfranchised populations [MNC*19]. Others have proposed technologies to promote in-person engagement of reticent participants during town halls [JKW*21] and public meetings [LLS] using clicker-like devices.
Highlight all civic participation approaches
-
Despite their central importance in the civic engagement process, members of the general public are not necessarily involved in the analysis process. Hence, they are often left out of the loop when designing civic text visualizations—their requirements, aptitudes, knowledge, etc. are not given central consideration. Integrating participatory approaches in civic text visualization could pave the way not only for more inclusive analysis but also for leveraging the general public's knowledge to gather richer insights.
Highlight all civic participation approaches
-
-
bit.csc.lsu.edu bit.csc.lsu.edu
-
social dynamics, such as shyness and a tendency to avoid confrontation with dominant personalities, can also hinder opinion sharing in town halls by favoring privileged individuals who are comfortable or trained to take part in contentious public discussions [27, 127].
Highlight all civic participation approaches
-
town halls inadvertently cater to a small number of privileged individuals, and silent participants often become disengaged despite physically attending the meetings [61]. Due to the lack of inclusivity, the outcome of such meetings often tends to feel unjust and opaque for the general public [39, 54].
Highlight all civic participation approaches
-
designing communitysourcing technologies to include marginalized opinions and amplify participation alone may not be enough to solve inequality of sharing opinions in the civic domain [26, 126]. Despite the success of previous works [25, 53, 90], technology is rarely integrated with existing manual practices and follow-ups of engagements between government officials and community members are seldom propagated to the community.
Highlight all civic participation approaches
-
Marginalization can be broadly defined as the exclusion of a population from mainstream social, economic, cultural, or political life [58], which still stands as a barrier to inclusive participation in the civic domain [48, 94]. Researchers in HCI and CSCW have explored various communitysourcing approaches to include marginalized populations in community activities, proceedings, and designs [48, 53, 81, 93, 132].
Highlight all civic participation approaches
-
To increase broader civic participation, researchers in HCI have proposed both online [4, 5, 7, 81, 93] and face-to-face [21, 80, 91, 125] technological interventions that use the communitysourcing approach.
Highlight all civic participation approaches
-
Prior investigations by Bryan [29] and Gastil [56] showed a steady decline in civic participation in town halls due to the growing disconnect between local government and community members and the decline in social capital [43, 111, 113]. Despite the introduction of online methods to increase public engagement in the last decade [4, 5, 7, 37, 81, 93], government officials continue to prefer face-to-face meetings to engage the community in the decision-making process [32, 52, 94].
Highlight all civic participation approaches
-
To reengage disconnected, reticent, or disenfranchised community members, researchers in HCI and digital civics have offered novel strategies and technological interventions to increase engagement [60, 62, 94, 107, 130].
Highlight all civic participation approaches
-
Bryan [29] and Gastil [56] investigated the state of town halls and demonstrated a steady decline in civic participation due to the growing disconnect between local government and the community.
Highlight all civic participation approaches
-
Traditional community consultation methods, such as town halls, public forums, and workshops are the modus operandi for public engagement [52, 94]. For fair and impartial civic decision-making, the inclusivity of community members' feedback is paramount [60, 94, 126]. However, traditional methods rarely provide opportunities for inclusive public participation [30, 87, 95].
Highlight all civic participation approaches
-
Murphy used such systems to promote democracy and community partnerships [103]. Similarly, Boulianne et al. deployed clicker devices in contentious public discussions about climate change to gauge public opinions [25]. Bergstrom et al. used a single button device where the attendees anonymously voted (agree/disagree) on issues during the meeting. They showed that back-channel voting helped underrepresented users get more involved in the meeting [22].
Highlight all civic participation approaches
-
-
ianarawjo.com ianarawjo.com
-
As evidenced by numerous studies on statistical cognition (Kline, 2004; Beyth-Marom et al, 2008), even trained scientists have a hard time interpreting p-values, which frequently leads to misleading or incorrect conclusions.
p-value is misinterpreted and confusing
-
few researchers can resist the temptation to conclude that there is no effect, a common fallacy called "accepting the null" which had frequently led to misleading or wrong scientific conclusions (Dienes, 2014, p.1).
p-value is misinterpreted and confusing
-
Again, p is the probability of seeing results as extreme (or more extreme) as those actually observed if the null hypothesis were true. So p is computed under the assumption that the null hypothesis is true. Yet it is common for researchers, teachers and even textbooks to think of p as the probability of the null hypothesis being true (or equivalently, of the results being due to chance), an error called the "fallacy of the transposed conditional" (Haller and Krauss, 2002; Cohen, 1994, p.999).
p-value is misinterpreted and confusing
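To make the definition above concrete, here is a minimal sketch (not taken from the source) that computes a one-sided exact p-value for a hypothetical coin-flip experiment. The function name `binomial_p_value` and the example numbers are illustrative assumptions.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, null_prob: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes), computed assuming the null is true."""
    return sum(
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 heads in 10 flips of a coin that the null says is fair.
p = binomial_p_value(9, 10)
print(f"p = {p:.4f}")
```

The value returned is the probability of data at least this extreme given the null hypothesis. It is not the probability that the null hypothesis is true given the data, which is the transposed conditional the passage warns about.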
-
Many researchers fail to appreciate that p-values are unreliable and vary widely across replications.
p-value is misinterpreted and confusing
-
Providing non-misleading interpretations of figures with confidence intervals requires judgment, and no mechanical decision procedure can carry out this job better than a thoughtful investigator.
Estimation is necessary but not sufficient
-
Estimation seems much more likely to promote clear statistical thinking.
Need to change our way of thinking
-
Decades spent educating researchers have had little or no influence on beliefs and practice (Schmidt and Hunter, 1997, pp.20–22).
Calls for reform fall on deaf ears
-
NHST has been severely criticized for more than 50 years by end users to whom fair statistical communication matters.
Calls for reform fall on deaf ears
-
-
-
This assessment raises two issues. First, it is arbitrary. If 10 of the 15 CIs included the predicted values, would the results also support the theory, or instead refute it? If one instead used 99% CIs, would positive results for 12 of the 15 predictions be enough to support the theory? This arbitrariness arises because CIs offer no principled method for generating an inference regarding the theory.
Estimation is too messy / complex and not clear enough
-
two out of three necessary conditions for testing theory are missing.
Estimation is too messy / complex and not clear enough
-
-
ianarawjo.com ianarawjo.com
-
To illustrate this point, Oakes posed a series of true/false questions regarding the interpretation of p-values to seventy experienced researchers and discovered that only two had a sound understanding of the underlying concept of significance [25].
Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
-
failure to check assumptions about the data required by particular tests, over-testing and using inappropriate tests
Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
-
abusing statistical tests, making illogical arguments as a result of tests, deriving inappropriate conclusions from nonsignificant results, and confusing the size of p-values with effect sizes.
Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
-
This approach, fiercely promoted by Fisher in the 1930s [9], has become the gold standard in many disciplines, including quantitative evaluations in HCI. However, the approach is rather counter-intuitive; many researchers misinterpret the meaning of the p-value.
Sentences where they say people don't really know the statistics, they just apply tests without thought because it's tradition
-
-
-
We found that using MINE directly gave identical performance when the task was nontrivial, but became very unstable if the target was easy to predict from the context (e.g., when predicting a single step in the future and the target overlaps with the context).
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
We note that better [49, 27] results have been published on these target datasets, by transfer learning from a different source task.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead, the accuracy increased from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
For lasertag_three_opponents_small, contrastive loss neither helps nor hurts. We suspect that this is due to the task design, which does not require memory and thus yields a purely reactive policy.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
Although this is a standard transfer learning benchmark, we found that models that learn better relationships in the children's books did not necessarily perform better on the target tasks (which are very different: movie reviews, etc.).
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
We found that more advanced sentence encoders did not significantly improve the results, which may be due to the simplicity of the transfer tasks (e.g., in MPQA most datapoints consist of one or a few words), and the fact that bag-of-words models usually perform well on many NLP tasks [48].
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
It is important to note that the window size (maximum context size for the GRU) has a big impact on the performance, and longer segments would give better results. Our model had a maximum of 20480 timesteps to process, which is slightly longer than a second.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
Interestingly, CPCs capture both speaker identity and speech contents, as demonstrated by the good accuracies attained with a simple linear classifier, which also gets close to the oracle, fully supervised networks.
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Figure 6 shows that for 4 out of the 5 games performance of the agent improves significantly with the contrastive loss after training on 1 billion frames.
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
CPC 76.9 80.1 91.2 87.7 96.8
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
CPC 73.6
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
CPC 48.7
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Despite being relatively domain agnostic, CPCs improve upon state-of-the-art by 9% absolute in top-1 accuracy, and 4% absolute in top-5 accuracy.
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
We also found that not all the information encoded is linearly accessible. When we used a single hidden layer instead, the accuracy increased from 64.6 to 72.5, which is closer to the accuracy of the fully supervised model.
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
-
arxiv.org arxiv.org
-
Are the following two answers to my question Q semantically equivalent?\n\nQ: ${THE_QUESTION}\nA1: ${GOLD_ANSWER}\nA2: ${PRED_ANSWER}\n\nPlease answer with a single word, either "Yes." or "No.", and explain your reasoning.
please find the barebones practical information i need to implement this system or strategy
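As a rough sketch of how the grading prompt above might be filled in and its reply interpreted, assuming the `${...}` placeholders are plain string substitutions: the function names and the rule "a reply that starts with yes counts as equivalent" are assumptions for illustration, and the call to the grading model itself is deliberately left out.

```python
from string import Template

# Prompt text copied from the quoted passage; ${...} fields are filled per item.
EQUIVALENCE_PROMPT = Template(
    "Are the following two answers to my question Q semantically equivalent?\n\n"
    "Q: ${THE_QUESTION}\nA1: ${GOLD_ANSWER}\nA2: ${PRED_ANSWER}\n\n"
    'Please answer with a single word, either "Yes." or "No.", '
    "and explain your reasoning."
)


def build_equivalence_prompt(question: str, gold: str, pred: str) -> str:
    """Fill the template for one (question, gold answer, predicted answer) triple."""
    return EQUIVALENCE_PROMPT.substitute(
        THE_QUESTION=question, GOLD_ANSWER=gold, PRED_ANSWER=pred
    )


def parse_verdict(reply: str) -> bool:
    """Assumed parsing rule: the reply marks equivalence if it starts with 'yes'."""
    return reply.strip().lower().startswith("yes")


prompt = build_equivalence_prompt("Who wrote Hamlet?", "William Shakespeare", "Shakespeare")
print(prompt)
```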
-
Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\nConfidence: <description of confidence, without any extra commentary whatsoever; just a short phrase!>\n\nThe question is: ${THE_QUESTION}
please find the barebones practical information i need to implement this system or strategy
-
Provide your ${k} best guesses and the probability that each is correct (0.0 to 1.0) for the following question. Give ONLY the guesses and probabilities, no other words or explanation. For example:\n\nG1: <first most likely guess, as short as possible; not a complete sentence, just the guess!>\n\nP1: <the probability between 0.0 and 1.0 that G1 is correct, without any extra commentary whatsoever; just the probability!>
please find the barebones practical information i need to implement this system or strategy
-
Each linguistic likelihood expression is mapped to a probability using responses from a human survey on social media with 123 respondents (Fagen-Ulmschneider, 2023). Ling. 1S-opt. uses a held out set of calibration questions and answers to compute the average accuracy for each likelihood expression, using these 'optimized' values instead.
please find the barebones practical information i need to implement this system or strategy
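The "optimized" mapping described above amounts to averaging correctness per likelihood expression on held-out data. A minimal sketch under that reading follows; the function name, the input format, and the example expressions are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def fit_expression_probabilities(calibration):
    """Map each likelihood expression to its average accuracy on held-out data.

    `calibration` is an iterable of (expression, was_correct) pairs, one per
    held-out question answered by the model.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for expression, was_correct in calibration:
        totals[expression] += 1
        hits[expression] += int(was_correct)
    return {expression: hits[expression] / totals[expression] for expression in totals}


# Hypothetical held-out results: (expression the model chose, answer was correct).
held_out = [
    ("Almost certain", True), ("Almost certain", True),
    ("Almost certain", True), ("Almost certain", False),
    ("Unlikely", False), ("Unlikely", False),
    ("Unlikely", True), ("Unlikely", False),
]
print(fit_expression_probabilities(held_out))
```

At evaluation time, each expression the model emits would be replaced by its fitted value instead of the survey-derived one.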
-
Finally, our study is limited to short-form question-answering; future work should extend this analysis to longer-form generation settings.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
While our work demonstrates a promising new approach to generating calibrated confidences through verbalization, there are limitations that could be addressed in future work. First, our experiments are focused on factual recall-oriented problems, and the extent to which our observations would hold for reasoning-heavy settings is an interesting open question.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
the 1-stage and 2-stage verbalized numerical confidence prompts sometimes differ drastically in the calibration of their confidences. How can we reduce sensitivity of a model's calibration to the prompt?
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
Provide your best guess and the probability that it is correct (0.0 to 1.0) for the following question. Give ONLY the guess and probability, no other words or explanation. For example:\n\nGuess: <most likely guess, as short as possible; not a complete sentence, just the guess!>\n Probability: <the probability between 0.0 and 1.0 that your guess is correct, without any extra commentary whatsoever; just the probability!>\n\nThe question is: ${THE_QUESTION}
please find the barebones practical information i need to implement this system or strategy
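A reply to the prompt above has to be parsed back into a guess and a number. The sketch below shows one way to do that, assuming the model follows the `Guess:` / `Probability:` layout; the regular expressions, the function name, and the choice to return `None` on malformed replies are assumptions, not details from the source.

```python
import re

_GUESS = re.compile(r"^\s*Guess:\s*(.+?)\s*$", re.MULTILINE)
_PROBABILITY = re.compile(r"^\s*Probability:\s*([0-9]*\.?[0-9]+)\s*$", re.MULTILINE)


def parse_guess_and_probability(reply: str):
    """Return (guess, probability), or None if the reply is malformed."""
    guess = _GUESS.search(reply)
    probability = _PROBABILITY.search(reply)
    if guess is None or probability is None:
        return None
    value = float(probability.group(1))
    if not 0.0 <= value <= 1.0:
        return None  # outside the range the prompt asks for
    return guess.group(1), value


print(parse_guess_and_probability("Guess: Paris\nProbability: 0.9"))
```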
-
Provide your ${k} best guesses and the probability that each is correct (0.0 to 1.0) for the following question. Give ONLY the guesses and probabilities, no other words or explanation.
please find the barebones practical information i need to implement this system or strategy
-
Provide your best guess for the following question, and describe how likely it is that your guess is correct as one of the following expressions: ${EXPRESSION_LIST}. Give ONLY the guess and your confidence, no other words or explanation.
please find the barebones practical information i need to implement this system or strategy
-
To fit the temperature that is used to compute ECE-t and BS-t we split our total data into 5 folds. For each fold, we use it once to fit a temperature and evaluate metrics on the remaining folds. We find that fitting the temperature on 20% of the data yields relatively stable temperatures across folds.
please find the barebones practical information i need to implement this system or strategy
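The split described above inverts the usual cross-validation roles: one fold (20%) is used for fitting and the other four for evaluation. A minimal sketch of just the index bookkeeping follows; the shuffling, the seed, and the function name are assumptions, and the temperature fitting itself is not shown because the excerpt does not specify it.

```python
import random


def fit_on_one_fold_splits(n_items: int, n_folds: int = 5, seed: int = 0):
    """Yield (fit_indices, eval_indices): fit on one fold, evaluate on the rest."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # assumed: folds are drawn at random
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, fit_fold in enumerate(folds):
        eval_indices = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield fit_fold, eval_indices


for fit_idx, eval_idx in fit_on_one_fold_splits(1000):
    print(len(fit_idx), len(eval_idx))
```

Comparing the temperatures fitted on each of the five small folds is one way to check the stability the passage reports.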
-
To avoid excessive false negatives in our correctness computation as a result of exact-match evaluation, we use either GPT-4 or GPT-3.5 to evaluate whether a response is essentially equivalent to the ground truth answer.
please find the barebones practical information i need to implement this system or strategy
-
We sample 1000 questions from the validation split of TriviaQA (rc.web.nocontext) and SciQ and all 817 questions from the validation split of TruthfulQA (generation) for our experiments.
please find the barebones practical information i need to implement this system or strategy
-
Ling. 1S-opt. 0.056 0.051 0.088 0.927 0.028 0.052 0.172 0.828 0.082 0.105 0.212 0.632
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Verb. 1S top-4 0.041 0.039 0.081 0.959 0.056 0.059 0.185 0.815 0.198 0.144 0.245 0.619
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Ling. 1S-opt. 0.058 0.066 0.135 0.878 0.064 0.068 0.220 0.674 0.125 0.165 0.270 0.492
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Verb. 1S top-4 0.054 0.057 0.144 0.896 0.065 0.051 0.209 0.763 0.203 0.189 0.284 0.455
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Additionally, the lack of technical details available for many state-of-the-art closed RLHF-LMs may limit our ability to understand what factors enable a model to verbalize well-calibrated confidences and differences in this ability across different models.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
With Llama2-70B-Chat, verbalized calibration provides improvement over conditional probabilities across some metrics, but the improvement is much less consistent compared to GPT-* and Claude-*.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
The verbal calibration of the open source model Llama-2-70b-chat is generally weaker than that of closed source models but still demonstrates improvement over its conditional probabilities by some metrics, and does so most clearly on TruthfulQA.
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
Chain-of-thought prompting does not improve verbalized calibration
all content that points to important caveats and gotchas that I might consider when leaning too heavily on the results of this paper
-
Among the methods for verbalizing probabilities directly, we observe that generating and evaluating multiple hypotheses improves calibration (see Figure 1), similarly to humans (Lord et al., 1985), and corroborating a similar finding in LMs (Kadavath et al., 2022).
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
Ling. 1S-opt. 0.060 0.070 0.151 0.874 0.049 0.056 0.214 0.738 0.099 0.130 0.266 0.446
please point only to the details of the most successful version of this system, especially in tables when there are many options, and also highlight sections that provide supporting context for these conditions, if appropriate
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
P8 said that "as I get familiar with this system [C3], I feel more skilled" at using the highlighted and grayed phrases.
any sentence that describes a user's emotional (positive or negative) response to any condition in the experiments.
-
It is worth noting that P3 and P8 mentioned feeling more comfortable with the more familiar visualization in C1 and C2 during their first impression of the conditions.
any sentence that describes a user's emotional (positive or negative) response to any condition in the experiments.
-
Specifically, they underscore the need for co-adaptive systems that can evolve along with users' mental models and definitions of labels.
any sentence that describes explicit design implications
-
Most previous research in counterfactual generation has focused on the model side by either generating counterfactuals to improve the model's performance or explaining its behaviors post hoc.
any single sentence that compares and contrasts this work with prior work.
-
Variation Theory provides the conceptual basis for generating structurally consistent differences, while Structural Alignment Theory (SAT) enhances the user's ability in recognizing and processing these differences.
return any single sentence that describes an explicit or implicit connection to theory
-
While SAT-based rendering supported human sensemaking in both Gero et al. [29] and Mocha, we also show that the combination of VT and SAT supports the model's learning.
any single sentence that compares and contrasts this work with prior work.
-
This finding is consistent with previous work that supports users' sense-making of text, e.g., by modulating text saliency. Specifically, Gu et al. [32] and Gero et al. [29] both found improved reading efficiency and comprehension with saliency-modulating text renderings.
any single sentence that compares and contrasts this work with prior work.
-
In decision making, SAT argues that people tend to focus on alignable differences—features that can be directly compared—rather than on differences that cannot be easily aligned.
return any single sentence that describes an explicit or implicit connection to theory
-
Structural Alignment Theory (SAT) [27] is a cognitive theory that explains how people make sense of concepts by comparing relational structures between two items.
return any single sentence that describes an explicit or implicit connection to theory
-
Specifically, we use Variation Theory of learning [44] which states that for learning to occur, some aspects that define the concept being learned must vary while others are held constant.
return any single sentence that describes an explicit or implicit connection to theory
-
According to SAT, humans compare two similar entities by trying to find structural alignments between them, and then comparing corresponding elements, with a special focus on differing aligned elements.
return any single sentence that describes an explicit or implicit connection to theory
-
VT posits that human learning occurs when learners experience variation across critical and superficial aspects of a concept—through exposure to contrasting examples that systematically vary along different critical and superficial feature dimensions.
return any single sentence that describes an explicit or implicit connection to theory
-
To analyze the annotation efficiency, we first conducted a Kruskal-Wallis rank sum test [39] to determine if there were statistically significant differences in annotation time across the three conditions, because our data violated the homogeneity of variances assumption, making non-parametric methods more appropriate.
return any single sentence that describes data analysis done on data collected by the authors when running human subjects experiments.
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
Mohamed et al. (2020) put forth the idea of dismantling power asymmetries to resist data colonialism.
sentence that refers to a theory
-
Couldry and Mejias (2019) propose 'data colonialism' as a new form of colonialism to make sense of the use of large amounts of data by a small group of corporate and government actors.
sentence that refers to a theory
-
-
glassmanlab.seas.harvard.edu glassmanlab.seas.harvard.edu
-
Interviews were video and audio recorded. We transcribed the audio using OpenAI's Whisper automatic speech recognition system and anonymized the transcripts before analysis. We analyzed the interview data using thematic analysis [1]. First, two members of the research team independently coded the data from four randomly chosen participants (25% of the collected data) to generate low-level codes. The inter-coder reliability between the coders was 0.88 using Krippendorff's alpha [37]. The two coders then met to cross-check, resolve coding conflicts, and consolidate the codes into a codebook across two sessions. Using the codebook, the two coders each analyzed the data from six randomly selected participants. The research team then met, discussed the analysis outcomes, and finalized themes over three sessions.
sentence describing how analysis was performed on data collected by the authors of this paper
-