Opened 11 years ago
Closed 7 years ago
#10291 closed enhancement (fixed)
Provide a cross-browser equivalent for textContent
Reported by: | wolfram | Owned by: | dylan |
---|---|---|---|
Priority: | high | Milestone: | 1.10 |
Component: | HTML | Version: | 1.4.0b |
Keywords: | attr | Cc: | |
Blocked By: | Blocking: |
Description
Make the following work
// dojo.create uses dojo.attr n = dojo.create("h1", {textContent:"Whatever Headline"}, dojo.body()) dojo.attr(n, "textContent");
before the patch it was resulting in
<h1 textcontent="Whatever Headline"></h1>
now after the patch
<h1>Whatever Headline</h1>
Attachments (1)
Change History (34)
Changed 11 years ago by
Attachment: | attr_textContent.diff added |
---|
comment:1 Changed 11 years ago by
Summary: | Make dojo.attr() work with "textContent" properly too → Make dojo.attr() understand "textContent" properly |
---|
comment:2 Changed 11 years ago by
Milestone: | tbd → 1.5 |
---|---|
Owner: | changed from anonymous to Eugene Lazutkin |
Type: | defect → enhancement |
comment:3 Changed 11 years ago by
I think the basic use case is the compliance to DOM Level3, and the code
dojo.create("h1", {innerHTML:"Whatever"})
makes me clearly expect that
dojo.create("h1", {textContent:"Whatever"})
should work too, because it kinda says "use any DOM attribute there, we map it properly for you". And "textContent" might be faster, but I didn't measure it. So it boils down to an API look+feel and maybe speed. My 2 cents.
comment:4 Changed 11 years ago by
textContent is important for escaping of input retrieved from “untrusted sources” like user input or data coming from other servers. Using innerHTML creates the same XSS vulnerabilities in JavaScript? that other languages have when not escaping strings in HTML.
The following example should illustrate the point:
dojo.create("h1", {innerHTML: "<script>do.something.evil()</script>"}); /* vs. */ dojo.create("h1", {textContent: "<script>do.something.evil()</script>"});
comment:5 Changed 11 years ago by
And I guess mentioning that those who want to break in find various ways, so just converting "<" and ">" would not really solve it, but using the native "textContent" attr makes it as secure as the browser allows. my 2 cents ...
comment:6 follow-up: 8 Changed 11 years ago by
Note: dojo.NodeList?-manipulate's text() function uses a recursive function getting node.nodeValue for get operations and createTextNode() for set operations to work better across browsers.
comment:7 Changed 11 years ago by
Component: | General → Core |
---|---|
Status: | new → assigned |
comment:8 Changed 11 years ago by
Replying to jburke:
Note:
dojo.NodeList-manipulate
'stext()
function uses a recursive function getting node.nodeValue for get operations andcreateTextNode()
for set operations to work better across browsers.
I can explain the tricky things that James mentioned.
According to the official MS documentation on innerText
(http://msdn.microsoft.com/en-us/library/ms533899%28v=VS.85%29.aspx) it differs from textContent
in following ways:
- The innerText property is valid for block elements only.
- The innerText property is read-only on the
html
,table
,tBody
,tFoot
,tHead
, andtr
objects.
- You can set this property only after the onload event fires on the window.
The patch should be improved.
comment:9 Changed 11 years ago by
Milestone: | 1.5 → 1.6 |
---|
Rescheduling to 1.6. There is still a possibility to recast this ticket as 1.5 and include in 1.5.x release.
comment:10 Changed 10 years ago by
Milestone: | 1.6 → 1.7 |
---|
Given all troubles with IE, especially that it works only after onload fired, I want to postpone this ticket for now. We sanitize the incoming HTML in templates, so I suspect that this is the right way to do it rather than trying to support disparate IE and non-IE facilities.
I suggest to expose a sanitizing function and reuse it + innerHTML
to set a sanitized text.
comment:12 Changed 10 years ago by
Milestone: | 1.7 → 1.8 |
---|
comment:13 Changed 9 years ago by
Component: | Core → HTML |
---|
comment:14 Changed 9 years ago by
Milestone: | 1.8 → 2.0 |
---|
1.8 is frozen. Move all enhancements to next release. If you need an exemption from the freeze for this ticket, contact me immediately.
comment:15 Changed 8 years ago by
comment:16 Changed 8 years ago by
Guys, this really needs some attention.
Although the title suggests using attr
, and that's probably no longer the right place, there are underlying issues (see above and in #7921) that the toolkit really ought to address.
Basically, we should provide a cross-browser equivalent for textContent
, which all modern browsers now have. That's actually two distinct features: a getter and a setter.
Setter. A convenient and reliable method to set element text is sorely needed. It's too tempting to just say node.innerHTML = myText
and forget that the text is not escaped. We all know what happens when we forget to escape things, right? A cursory check finds some hundred cases in the toolkit where createTextNode
workarounds have been re-implemented in various ways (not to mention all those that should have but didn't).
A suitable setter is pretty straightforward to implement, I think. You can assign to textContent
if it exists, which naturally is the most efficient way, otherwise fall back to using createTextNode
and such.
#8995 brings up the need for a reusable HTML-escape function. One of the best ways of implementing that (see ticket:8995#comment:10) is actually to create a detached div
, add a text node, then get its innerHTML
. Ironically, though, the most common reason people do HTML-escaping is just so they can push the result into innerHTML
, for lack of any convenient direct way to set the text.
Getter. For browsers that lack the standard textContent
, mainly IE8 and earlier, it's tempting to use innerText
, but the behavior is somewhat different (besides the above, see http://clubajax.org/plain-text-vs-innertext-vs-textcontent/). NodeList-manipulate
has a text()
function that may already do the right thing (though I haven't examined all the edge cases) and just needs to be extracted.
I would guess dom-prop
is where this belongs, so that we can do things like domProp.get(node, "textContent")
or domConstruct.create("a", { textContent: "I <3 Dojo" })
.
Once this is done, I certainly suggest updating all the tutorials (e.g. http://dojotoolkit.org/documentation/tutorials/promises/) so that users don't pick up the bad habit of using innerHTML
for setting text.
comment:17 follow-up: 18 Changed 8 years ago by
Milestone: | 2.0 → 1.10 |
---|
Agreed, we should fix this for 1.10.
comment:18 Changed 8 years ago by
Replying to dylan:
Agreed, we should fix this for 1.10.
My company would really be pleased, if you can fix this for 1.10. Thanks!
comment:19 Changed 7 years ago by
Owner: | changed from Eugene Lazutkin to dylan |
---|---|
Summary: | Make dojo.attr() understand "textContent" properly → Provide a cross-browser equivalent for textContent |
comment:20 Changed 7 years ago by
I am working on a patch for this ticket. However I have a question for you guys. According to the tests I made: textContent provides the exact same result on IE10, Chrome 33, Firefox 27.
However according to Mark Wilcox's post (http://clubajax.org/plain-text-vs-innertext-vs-textcontent/) on oldest versions of Chrome and Firefox the result was not consistent. And IE8 also provides another kind of result.
So, my question is: for get('textContent'), should we rely on browser's textContent property when it exists and use a custom function when it does not?
If so, the result may not be consistent across browsers... In other hand, a custom function always used will be consistent across browser but not standards compliant (and far slower...)
My current changes are available here: https://github.com/ben8p/dojo/compare/textContent
Basically, I moved text() from Nodelist-manipulate to dom-attr. And I use it for IE <= 8, for any other browser I use native textContent
comment:21 Changed 7 years ago by
I would not rely on the veracity of a report from 2010, on that site.
comment:22 Changed 7 years ago by
@csnover I agree. But the fact is: there is no way to have the same value between IE8 and other browsers (because IE8 does not return the indentation of the HTML, just the new lines). So I wonder if that's an issue IE8 (and lower) provides a different value than any other browsers... if so, we can only go for a custom getter...
comment:23 Changed 7 years ago by
Yes, old IE has a habit of mangling data passed into the DOM, but it does that with innerHTML too, so if people are OK with how that has worked for years even though it’s not consistent, I would think this to be OK, too, especially if it is in documentation. Use textContent
for all browsers that support it, and traverse elements for old IE.
comment:24 Changed 7 years ago by
I think so too.
Pull request is available here : https://github.com/dojo/dojo/pull/61
(The title needs to be updated with [CLA][PATCH])
comment:25 Changed 7 years ago by
@ido, thanks for tackling this. This issue has been a pain for me in the past.
Regarding the pull request, it seems like this change should be made to dom-prop instead of dom-attr because textContent
is a DOM property, not an attribute. The difficulty here is that dom-construct uses dom-attr which means adding textContent
support to dom-prop won't help dom-construct.
A counterpoint to my concern is that dom-attr already supports innerHTML
.
Any thoughts on this?
comment:26 Changed 7 years ago by
@brandonp, indeed, dom-prop is a better place... I've updated the pull request. The code is now in dom-prop but dom-attr is still mapping it. dom-attr just need a small if() in the getter.
Let me know if you think I should revert and let the code in dom-attr
comment:28 Changed 7 years ago by
@brandonp I still have an little worry... Let's consider the following content:
<div>first line <div>second line</div> <div>third line</div> fourth line </div>
If you do a console.log(domProp.get(node, 'textContent'));
- New browsers will spit:
first line second line third line fourth line
And that's what we expect.
- IE8- will return:
first line second line third line fourth line
This consistent with the current value returned by Nodelist-manipulate::text() However, we could (IMO we should) consider using IE innerText property.
- innerText will return:
first line second line third line fourth line
It's still not the same behavior as other browser but, it's closer to what the user may expect.
innerText cannot be used for setting the value because of some restrictions. However, it can be used for reading values and I think we should use it.
The drawback of using innerText is the backward compatibility. The result of Nodelist-manipulate::text() will be slightly different in Dojo 1.9- compared to Dojo 1.10+. Is that acceptable ?
Note: the value returned by innerText change according to the node. For instance:
<div> first line second line third line fourth line </div>
return
first line second line third line fourth line
While
<pre> first line second line third line fourth line </pre>
return
first line second line third line fourth line
comment:29 follow-up: 30 Changed 7 years ago by
@ido, given the choice between different whitespace (textContent) and different content (innerText), I'd prefer different whitespace. Do others have thoughts on this?
comment:30 Changed 7 years ago by
@ido, after talking about this with others, I think we should continue with your current implementation and not use innerText.
comment:31 Changed 7 years ago by
@brandonp: thanks, the pull request is ready. I applied latest comments.
comment:33 Changed 7 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Committed to master, thanks for contributing an updated patch for this long-requested enhancement.
not sure what this gains over doing your example with innerHTML?
should generate
is there a valid use case for setting textContent over innerHTML (and couldn't you just .replace("<", "<") before setting innerHTML to get the escaped version?
either way, too late for 1.4. Will decide for 1.5