31 Matching Annotations
  1. Apr 2026
    1. Additionally, our tool currently helps users in the reviewing step solely with the alignment functionality. Future work should add additional assistance during this step in the form of suggested improvements to selected unsatisfactory content in the summary, in addition to the alignment feature.

      Please highlight any phrases that describe recommendations made in the paper

    2. Future work should expand the application's capabilities to the multi-document setting, both in terms of the backend models and in terms of accessibility and intuitiveness of the application's frontend design.

      Please highlight any phrases that describe recommendations made in the paper

    3. Additionally, in light of some user feedback, another interesting extension includes developing more abstractive consolidation and fusion models, which would offer control over the level of abstractness in the outputs.

      Please highlight any phrases that describe recommendations made in the paper

    4. highlights are incorporated into the input text with special markups, <extra_id_1> and <extra_id_2>, marking the beginning and end of each highlighted span, respectively. In our configuration, we set the maximum input length to 4096 and the maximum target length to 400. A greedy decoding strategy was used in order to optimize the decoding speed.

      Please highlight any phrases that describe the libraries and tools used to implement the idea
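The markup scheme in the excerpt above can be sketched as a small helper that wraps each highlighted span with the `<extra_id_1>` / `<extra_id_2>` markers. The function name and the `(start, end)` character-span format are assumptions for illustration, not the paper's actual code:

```python
def mark_highlights(text, spans):
    """Wrap each highlighted character span with the special markers
    described in the excerpt: <extra_id_1> opens a span, <extra_id_2>
    closes it. Spans are assumed non-overlapping and sorted."""
    OPEN, CLOSE = "<extra_id_1>", "<extra_id_2>"
    out, prev = [], 0
    for start, end in spans:
        out.append(text[prev:start])            # unhighlighted prefix
        out.append(OPEN + text[start:end] + CLOSE)  # marked span
        prev = end
    out.append(text[prev:])                     # trailing text
    return "".join(out)
```

In a real pipeline the marked string would then be truncated/tokenized to the stated maximum input length of 4096 before decoding.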

    5. Our approach locates the longest common subsequence (LCS) between the lemmas of each input sentence and each summary sentence, followed by several heuristics to filter out irrelevant LCSs

      Please highlight any phrases that describe the libraries and tools used to implement the idea
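The lemma-level alignment described above can be illustrated with the standard dynamic-programming LCS algorithm over two lemma lists. The filtering heuristics are not specified in the excerpt, so only the LCS step itself is sketched here:

```python
def lcs(a, b):
    """Longest common subsequence between two lemma lists,
    via standard O(len(a) * len(b)) dynamic programming."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if a[i] == b[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    # Backtrack to recover one LCS
    seq, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            seq.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return seq[::-1]
```

Running this for every (input sentence, summary sentence) pair, then discarding short or function-word-only matches, would approximate the alignment step the excerpt describes.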

    6. For the initial auto-consolidation, we deploy an available Controlled Text Reduction model (Slobodkin et al., 2023), which is a Flan-T5-large model (Chung et al., 2022), finetuned on the highlights-focused CTR dataset.

      Please highlight any phrases that describe the libraries and tools used to implement the idea

    7. we deploy the ExtractiveSummarizer model from the TransformerSum library. The model, a RoBERTa-base (Liu et al., 2019) trained on the CNN/DailyMail summarization dataset (Hermann et al., 2015), operates as a binary classifier.

      Please highlight any phrases that describe the libraries and tools used to implement the idea
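The binary-classifier framing in the excerpt can be sketched independently of the RoBERTa model itself: each sentence receives an "include in summary" probability, and the extractive summary keeps the top-scoring sentences in document order. The function name and top-k selection rule are assumptions for illustration:

```python
def extract_summary(sent_scores, k=3):
    """Given per-sentence 'include' probabilities from a binary
    classifier, return the indices of the top-k sentences,
    restored to document order."""
    top = sorted(range(len(sent_scores)),
                 key=lambda i: sent_scores[i],
                 reverse=True)[:k]
    return sorted(top)
```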

    8. This step coincides with the recently introduced Controlled Text Reduction task (CTR; Slobodkin et al., 2022), which produces a coherent fused version of the content of marked spans ("highlights") in a source document, as interpreted within the context of the full text.

      Please highlight any phrases that describe the theory behind this work

    9. SUMMHELPER is a modular system consisting of separate components, each performing one subtask, allowing user modifications of that sub-task's output. Such decomposition has been studied before in the context of fully automated summarization, with several works separating the process into salience detection and generation components (Barzilay and McKeown, 2005; Li et al., 2018; Ernst et al., 2022). These works focused on optimizing each component as part of a fully-automatic summarization process in order to improve the overall performance of the model. In contrast, our work uses this modularity to not only improve overall system output, but to also give more control to the user over each step in the summarization process.

      Please highlight any phrases that describe the theory behind this work

    10. Our objective in this paper is to promote such a human-involved approach to summarization, allowing to better tailor the eventual output to real-world user needs, and to synergize the efficiency of the computer with the quality of the human (Hoc, 2000; Pacaux-Lemoine et al., 2017; Flemisch et al., 2019).

      Please highlight any phrases that describe the theory behind this work

  2. Mar 2026
    1. the selection of label options may work better if it is similar to common options for given tasks, such as [positive, neutral, negative] > [super positive, positive, ..., negative] for sentiment classification

      Please highlight any phrases that describe recommendations made in the paper

    2. errors encountered during API calls are handled in two ways: handle within our system or delegate to users. We handle known LLM API errors that can be solved by user-side intervention. This would be in cases such as a Timeout or RateLimitError in OpenAI models

      Please highlight any phrases that describe the libraries and tools used to implement the idea
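The two-way error handling described above can be sketched as a retry wrapper: known transient API errors (timeouts, rate limits) are retried within the system, while anything else is re-raised and delegated to the user. The exception classes and retry parameters here are illustrative, not MEGAnno+'s actual implementation:

```python
import time

def call_with_retry(call, max_retries=3, backoff=2.0):
    """Retry transient LLM API errors inside the system; re-raise
    anything unknown so the user can intervene. TimeoutError stands
    in for provider-specific errors such as OpenAI's Timeout or
    RateLimitError."""
    transient = (TimeoutError,)  # hypothetical set of retryable errors
    for attempt in range(max_retries):
        try:
            return call()
        except transient:
            if attempt == max_retries - 1:
                raise  # exhausted retries: delegate to the user
            time.sleep(backoff * (attempt + 1))  # simple linear backoff
```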

    3. Data Model MEGAnno+ extends MEGAnno's data model where data Record, Label, Annotation, Metadata (e.g., text embedding or confidence score) are persisted in the service database along with the task Schema.

      Please highlight any phrases that describe the libraries and tools used to implement the idea
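A minimal sketch of such a data model, using dataclasses with field names inferred from the excerpt (illustrative only, not MEGAnno+'s actual persisted schema):

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """A data item to be labeled."""
    id: str
    text: str

@dataclass
class Label:
    """One label value under a task schema."""
    name: str
    value: str

@dataclass
class Annotation:
    """Links a record and an annotator (human or LLM) to labels,
    plus metadata such as a confidence score or text embedding."""
    record_id: str
    annotator: str
    labels: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
```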

    4. MEGAnno+ is designed to provide a convenient and robust workflow for users to utilize LLMs in text annotation. To use our tool, users operate within their Jupyter notebook (Kluyver et al., 2016) with the MEGAnno+ client installed.

      Please highlight any phrases that describe the libraries and tools used to implement the idea

    5. LLM annotators and human annotators should not be treated the same, and annotation tools should carefully design their data models and workflows to accommodate both types of annotators.

      Please highlight any phrases that describe the theory behind this work

    6. we go beyond using LLMs to assist annotation for human annotators or to replace human annotators. Rather, MEGAnno+ advocates for a collaboration between humans and LLMs with our dedicated system design and annotation-verification workflows.

      Please highlight any phrases that describe the theory behind this work

    7. Despite these advancements, it is essential to acknowledge that LLMs have limitations, necessitating human intervention in the data annotation process. One challenge is that the performance of LLMs varies extensively across different tasks, datasets, and labels. LLMs often struggle to comprehend subtle nuances or contexts in natural language, making involvement of humans with social and cultural understanding or domain expertise crucial.

      Please highlight any phrases that describe the theory behind this work

    8. Large language models (LLMs) can label data faster and cheaper than humans for various NLP tasks. Despite their prowess, LLMs may fall short in understanding of complex, sociocultural, or domain-specific context, potentially leading to incorrect annotations. Therefore, we advocate a collaborative approach where humans and LLMs work together to produce reliable and high-quality labels.

      Please highlight any phrases that describe the theory behind this work