Hypothesis

3 Matching Annotations

May 2022
github.com

[Bug] Break Line (\n) added in replacement of HTML tags in quoted text · Issue #4532 · hypothesis/client

1
1. kael 25 May 2022
 
 in Public
 
 The new lines you mention really are present in the text content of the element. HTML tags are not being replaced by new lines; they just get omitted entirely. If you look at the textContent property of the element you selected in the browser console, you'll see the same new lines. If you select the text and run window.getSelection().getRangeAt(0).toString() in the browser console, you'll also see the same new lines. In summary, this is working as it is currently expected to.
 
 What I think may have been surprising here is that the captured text is not the same as what would be copied to the clipboard. When copying to the clipboard, new lines in the source get replaced with spaces, and line-breaking tags such as <br> get converted to new lines. Browser specifications distinguish the original text content of HTML "in the source", as returned by element.textContent, from the text content "as rendered", as returned by element.innerText. Hypothesis has always captured quotes from, and searched for quotes in, the "source" text content rather than the "rendered" text. This behavior causes issues with line breaks as well.
 
 It might make sense for us to look at capturing the rendered text (as copied to the clipboard) rather than the source text in the future. We'd need to be careful to handle all the places where this distinction comes up, and also make sure that all existing annotations still anchor properly. We should also talk to other parties interested in the Web Annotations specifications to discuss how this impacts interoperability.
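 
 The source-versus-rendered distinction described above can be sketched without a browser. The snippet below is a toy model, not the DOM API: the node shape ({ tag, children }) and the functions sourceText and renderedText are hypothetical names made up for illustration, and the "rendered" rules are a heavy simplification of what innerText or a clipboard copy actually does.

```javascript
// Toy node model (an assumption for illustration, NOT the DOM):
// a text node is a plain string; an element is { tag, children }.

// "Source" text, in the spirit of textContent: concatenate every text node,
// drop tags entirely, and keep newlines exactly as written in the source.
function sourceText(node) {
  if (typeof node === "string") return node;
  return node.children.map(sourceText).join("");
}

// "Rendered" text, loosely in the spirit of innerText or a clipboard copy:
// runs of whitespace (including source newlines) collapse to one space,
// and a <br> element becomes a newline.
function renderedText(node) {
  if (typeof node === "string") return node.replace(/\s+/g, " ");
  if (node.tag === "br") return "\n";
  return node.children.map(renderedText).join("");
}

// Models the markup: <p>first\nline<br>second <b>bold</b></p>
const paragraph = {
  tag: "p",
  children: [
    "first\nline",
    { tag: "br", children: [] },
    "second ",
    { tag: "b", children: ["bold"] },
  ],
};

console.log(JSON.stringify(sourceText(paragraph)));   // "first\nlinesecond bold"
console.log(JSON.stringify(renderedText(paragraph))); // "first line\nsecond bold"
```

 In the source text the newline inside the first text node survives and the <br> vanishes; in the rendered text the opposite happens. A quote captured one way will not match text extracted the other way, which is the mismatch the comment describes.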
 
 hypothesis annotations js selection js:selection dom:textContent html:innerText

Tags

js

hypothesis

annotations

js:selection

selection

html:innerText

dom:textContent

Annotators

kael

URL

github.com/hypothesis/client/issues/4532
developer.mozilla.org

Node.textContent

2
1. kael 25 May 2022
 
 in Public
 
 Differences from innerHTML
 
 Element.innerHTML returns HTML, as its name indicates. Sometimes people use innerHTML to retrieve or write text inside an element, but textContent has better performance because its value is not parsed as HTML. Moreover, using textContent can prevent XSS attacks.
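 
 A rough way to see why textContent helps against XSS: a string assigned to textContent becomes a single text node, and when a text node is serialized back to HTML its markup characters are escaped. The function below, serializeTextNode, is a hypothetical stand-in that models only that escaping step; it is a sketch for illustration, not a browser API and not a complete sanitizer.

```javascript
// Models how the characters of a text node appear when serialized to HTML:
// "&", "<" and ">" are escaped, so they can never be read back as tags.
// (Hypothetical helper for illustration; not part of the DOM.)
function serializeTextNode(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const userInput = '<img src=x onerror="alert(1)">';

// Assigning userInput to innerHTML would parse it and create an <img> element
// whose onerror handler could run. Assigning it to textContent creates one
// inert text node, which serializes like this:
console.log(serializeTextNode(userInput));
// &lt;img src=x onerror="alert(1)"&gt;

// In a browser, the corresponding safe pattern would be:
//   element.textContent = userInput;
```

 The string is displayed literally instead of being interpreted, and no HTML parsing takes place.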
 
 js dom html dom:textContent html:innerHTML
2. kael 25 May 2022
 
 in Public
 
 Differences from innerText
 
 Don't get confused by the differences between Node.textContent and HTMLElement.innerText. Although the names seem similar, there are important differences:
 
 textContent gets the content of all elements, including <script> and <style> elements. In contrast, innerText only shows "human-readable" elements.
 
 textContent returns every element in the node. In contrast, innerText is aware of styling and won't return the text of "hidden" elements. Moreover, since innerText takes CSS styles into account, reading the value of innerText triggers a reflow to ensure up-to-date computed styles. (Reflows can be computationally expensive, and thus should be avoided when possible.)
 
 Both textContent and innerText remove child nodes when altered, but altering innerText in Internet Explorer (version 11 and below) also permanently destroys all descendant text nodes. It is impossible to insert the nodes again into any other element or into the same element after doing so.
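 
 The first two differences can be illustrated with a small browser-free model. Everything here is an assumption made for the sketch: the node shape ({ tag, hidden, children }), the hidden flag standing in for CSS such as display: none, and the function names allText and visibleText. Real innerText depends on computed styles and layout, which this model ignores.

```javascript
// Toy model, NOT the DOM: a text node is a string; an element is
// { tag, hidden, children }. "hidden" stands in for CSS-based hiding.
const NON_READABLE_TAGS = new Set(["script", "style"]);

// In the spirit of textContent: every descendant text node counts,
// including the contents of <script>, <style>, and hidden elements.
function allText(node) {
  if (typeof node === "string") return node;
  return node.children.map(allText).join("");
}

// In the spirit of innerText: skip <script>/<style> and hidden elements.
function visibleText(node) {
  if (typeof node === "string") return node;
  if (node.hidden || NON_READABLE_TAGS.has(node.tag)) return "";
  return node.children.map(visibleText).join("");
}

// Models: <div>Hello<style>.x{color:red}</style><span hidden> secret</span> world</div>
const card = {
  tag: "div",
  children: [
    "Hello",
    { tag: "style", children: [".x{color:red}"] },
    { tag: "span", hidden: true, children: [" secret"] },
    " world",
  ],
};

console.log(allText(card));     // Hello.x{color:red} secret world
console.log(visibleText(card)); // Hello world
```

 The model cannot show the reflow cost mentioned above; that comes from the browser having to compute styles before it can decide what is visible.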
 
 js dom html dom:textContent html:innerText html:innerHTML

Tags

js

html:innerHTML

html

html:innerText

dom:textContent

dom

Annotators

kael

URL

developer.mozilla.org/en-US/docs/Web/API/Node/textContent
