Hypothesis

8 Matching Annotations

Jun 2026
Local file

Untitled document

1. jimlilili 21 Jun 2026
  
  in Public
  
  Table 2-2.
  
  Figure 2-2
2. jimlilili 21 Jun 2026
  
  in Public
  
  GPT-4o annotation of 200 randomly sampled unsupported claims (Cohen’s 𝜅=0.657 LLM-LLM IAA, Claude vs ChatGPT; computed offline, annotation script not archived in repository) partitions gaps into four categories: To characterise the nature of LGKC-identified gaps, a random sample of 200 unsupported claims was drawn from the MedChat-QA evaluation set and annotated using GPT-4o with retrieval-augmented evidence from PubMed abstracts. Each claim was assigned to one of four mutually exclusive categories defined by whether the underlying pharmacological relationship exists in the literature and, if so, how its absence from the KG should be interpreted. Inter-annotator agreement was assessed by replicating the annotation using a second LLM (Claude), yielding Cohen’s 𝜅 = 0.657, a level conventionally interpreted as substantial agreement. The resulting c
  
  These two paragraphs repeat each other.
3. jimlilili 21 Jun 2026
  
  in Public
  
  GENE_ASSOCIATED
  
  Not introduced in the abbreviations list; maybe GENE_ASSOCIATED_WITH_DISEASE?
4. jimlilili 21 Jun 2026
  
  in Public
  
  Note. △ Source-scoped: entity vocabulary bounded to source catalog. Fully-supported question rate by relation (all claims KG-confirmed): CF=32.2% lowest, independently confirms CF=0 direct edges finding.
  
  Where are MedicationQA and PubMedQA, which you mentioned earlier?
5. jimlilili 21 Jun 2026
  
  in Public
  
  Figure 2-5 Cross-KG × benchmark LGKC heatmap (OpenBioLLM-8B, K=10). △ = source-scoped. ‡ = text-mined source circularity
  
  The table looks odd, with stripes in the blocks.
6. jimlilili 21 Jun 2026
  
  in Public
  
  nine systems,
  
  According to Table 2-4, this should be eight systems.
7. jimlilili 21 Jun 2026
  
  in Public
  
  𝜑(𝑐, 𝐺)
  
  Not introduced before; maybe K(G,c)?
8. jimlilili 21 Jun 2026
  
  in Public
  
  probabilistic lower-bound metric
  
  Too strong. Suggested replacement: "an operational estimate of KG coverage over schema-compatible pharmacological claims elicited from an LLM."
Annotators

jimlilili
