8 Matching Annotations
  1. Nov 2022
    1. Our annotators achieve the highest precision with OntoNotes, suggesting that most of the entities identified by crowdworkers are correct for this dataset.

      Interesting that the mention detection algorithm gives poor precision on OntoNotes while the annotators get high precision. Does this imply that there are a lot of invalid mentions in this data, and that the OntoNotes guidelines are right not to mark non-referring generic pronouns?

    2. an algorithm with high precision on LitBank or OntoNotes would miss a huge percentage of relevant mentions and entities on other datasets (constraining our analysis)

      These datasets have the most limited/constrained definitions of coreference and of what should be marked up, so it makes sense that precision is poor on these datasets.

    3. Procedure: We first launch an annotation tutorial (paid $4.50) and recruit the annotators on the AMT platform. At the end of the tutorial, each annotator is asked to annotate a short passage (around 150 words). Only annotators with a B3 score (Bagga

      Annotators are asked to complete a quality-control exercise, and only annotators who achieve a B3 score of 0.9 or higher are invited to do further annotation.
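
      To make the 0.9 cutoff concrete, below is a minimal sketch of the B3 metric (Bagga & Baldwin, 1998), which averages a per-mention precision and recall over the document. It assumes the annotator and the reference group the same pre-marked mentions; how ezCoref actually handles missing or extra mentions in its tutorial check is not stated in this excerpt.

      ```python
      # Minimal B3 sketch, assuming identical mention sets on both sides.
      def b_cubed(predicted_clusters, gold_clusters):
          """Each argument is a list of sets of mention ids; returns (precision, recall, f1)."""
          pred_of = {m: c for c in predicted_clusters for m in c}
          gold_of = {m: c for c in gold_clusters for m in c}
          mentions = list(gold_of)

          precision = sum(len(pred_of[m] & gold_of[m]) / len(pred_of[m]) for m in mentions) / len(mentions)
          recall = sum(len(pred_of[m] & gold_of[m]) / len(gold_of[m]) for m in mentions) / len(mentions)
          f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
          return precision, recall, f1

      # Example: the annotator splits one gold entity {1, 2, 3} into two clusters.
      gold = [{1, 2, 3}, {4, 5}]
      pred = [{1, 2}, {3}, {4, 5}]
      print(b_cubed(pred, gold))  # (1.0, ~0.73, ~0.85): this annotator would fall below a 0.9 F1 cutoff
      ```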

    4. Annotation structure: Two annotation approaches are prominent in the literature: (1) a local pairwise approach, annotators are shown a pair of mentions and asked whether they refer to the same entity (Hladká et al., 2009; Chamberlain et al., 2016a; Li et al., 2020; Ravenscroft et al., 2021), which is time-consuming; or (2) a cluster-based approach (Reiter, 2018; Oberle, 2018; Bornstein et al., 2020), in which annotators group all mentions of the same entity into a single cluster. In ezCoref we use the latter approach, which can be faster but requires the UI to support more complex actions for creating and editing cluster structures.

      ezCoref presents all coreference clusters at the same time; this is a nice, efficient way to annotate compared with pairwise annotation (like we did for CD^2CR).
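
      To illustrate why the pairwise scheme is slower, here is a small made-up example (not the actual ezCoref or CD^2CR data model) showing the same coreference information as annotator-built clusters versus the equivalent set of pairwise yes/no judgments, which grows quadratically with the number of mentions.

      ```python
      # Illustrative contrast between cluster-based and pairwise annotation structures.
      from itertools import combinations

      mentions = ["Mary", "she", "the dog", "her", "it"]

      # Cluster-based: the annotator directly groups mentions into entities.
      clusters = [{"Mary", "she", "her"}, {"the dog", "it"}]

      # Pairwise: the same information expressed as one yes/no judgment per mention pair.
      pairwise = {
          (a, b): any(a in c and b in c for c in clusters)
          for a, b in combinations(mentions, 2)
      }

      print(len(clusters), "clusters vs", len(pairwise), "pairwise judgments")  # 2 vs 10
      ```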

    5. However, these datasets vary widely in their definitions of coreference (expressed via annotation guidelines), resulting in inconsistent annotations both within and across domains and languages. For instance, as shown in Figure 1, while ARRAU (Uryupina et al., 2019) treats generic pronouns as non-referring, OntoNotes chooses not to mark them at all

      One of the big issues is that different coreference datasets have significant differences in annotation guidelines, even within the coreference family of tasks. I found this quite shocking, as one might expect coreference to be a fairly well-defined task.

    6. Specifically, our work investigates the quality of crowdsourced coreference annotations when annotators are taught only simple coreference cases that are treated uniformly across existing datasets (e.g., pronouns). By providing only these simple cases, we are able to teach the annotators the concept of coreference, while allowing them to freely interpret cases treated differently across the existing datasets. This setup allows us to identify cases where our annotators disagree among each other, but more importantly cases where they unanimously agree with each other but disagree with the expert, thus suggesting cases that should be revisited by the research community when curating future unified annotation guidelines

      The aim of the work is to examine a simplified subset of coreference phenomena that are generally treated the same across different existing datasets.

      This makes spotting inter-annotator disagreement easier, presumably because simpler cases have fewer modes of failure?

    7. In this work, we develop a crowdsourcing-friendly coreference annotation methodology, ezCoref, consisting of an annotation tool and an interactive tutorial. We use ezCoref to re-annotate 240 passages from seven existing English coreference datasets (spanning fiction, news, and multiple other domains) while teaching annotators only cases that are treated similarly across these datasets

      This paper describes a new, efficient coreference annotation tool that simplifies coreference annotation. The authors use their tool to re-annotate passages from widely used coreference datasets.

  2. Jul 2020
    1. He looked in amazement at two respectable strangers,

      The narrator then explains who these strangers are. An example of deixis/coreference resolution? The style, however, is different: while Betteredge uses it, Miss Clack leaves the strangers unidentified.