ReplaceLineBreaksForHtml does not HTML encode the text
Both Umbraco 7 and 8 have methods to replace line breaks with <br />. This can come in handy when rendering Umbraco.Textarea properties, but the current implementations have several awkward details regarding naming and HTML encoding: the input is not encoded, which is a potential security issue.
Umbraco 7
First off, the V7 version returns a string:
https://github.com/umbraco/Umbraco-CMS/blob/54a2aa00a78caa4e6fe7b3b3cb9ff418fd1f408d/src/Umbraco.Web/HtmlStringUtilities.cs#L21-L29
Both have parameters named text, and as the method name/description contains HTML, you would assume the text is correctly encoded, so you can use the return value as plain HTML. That assumption is wrong:
@Html.Raw(Html.ReplaceLineBreaksForHtml("This is the first line\r\nThe second\r\n<script>document.write('And the third!');</script>"))
@* Becomes (note the script tag isn't HTML encoded): *@
This is the first line<br />The second<br /><script>document.write('And the third!');</script>
As you’re explicitly using @Html.Raw(), you could argue correctly encoding the input is your own responsibility. So to correctly use this method, you would need to write the following in your views:
@Html.Raw(Html.ReplaceLineBreaksForHtml(Html.Encode("This is the first line\r\nThe second\r\n<script>document.write('And the third!');</script>")))
Not a very useful/easy to use method if you ask me…
Umbraco 8
So V8 tried to make it easier to work with this method by returning IHtmlString:
And this actually makes it worse, as you would expect the text to be correctly HTML encoded before replacing/adding the <br />s. Everyone using this method as-is (@Html.ReplaceLineBreaksForHtml()) would actually be vulnerable to XSS attacks, especially if user-entered data, like a member bio, is rendered this way. So to correctly use this, you would need to write:
@Html.ReplaceLineBreaksForHtml(Html.Encode("This is the first line\r\nThe second\r\n<script>document.write('And the third!');</script>"))
Expected result
Correctly HTML encode the input text before replacing/adding the <br />s.
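The expected behaviour can be sketched roughly as follows. This is a minimal illustration and not the Umbraco implementation: the class name LineBreakSketch and the method name EncodeThenReplaceLineBreaks are hypothetical, and it returns a plain string and uses System.Net.WebUtility.HtmlEncode so it stays self-contained outside an MVC view (a real helper would presumably return IHtmlString).

```csharp
using System;
using System.Net;

// Hypothetical sketch, not the actual Umbraco code.
public static class LineBreakSketch
{
    // Encodes the input first, then inserts <br /> tags, so the only
    // markup in the result is the markup this method adds itself.
    public static string EncodeThenReplaceLineBreaks(string text)
    {
        if (string.IsNullOrEmpty(text))
            return string.Empty;

        // Step 1: neutralise any HTML in the input (< becomes &lt; etc.).
        string encoded = WebUtility.HtmlEncode(text);

        // Step 2: replace line breaks; "\r\n" goes first so a Windows
        // line ending yields one <br /> instead of two.
        return encoded
            .Replace("\r\n", "<br />")
            .Replace("\n", "<br />")
            .Replace("\r", "<br />");
    }
}
```

The order matters: encoding after the replacement would also encode the inserted <br /> tags, while skipping the encoding leaves the script tag from the example above intact in the output.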
Actual result
See above 🔝
Fixing this will cause a breaking change, as you might already encode the input and that might cause double-encoded output.
So the correct fix will probably be to obsolete these methods (with a nice warning) and introduce new ones with the fix (just called ReplaceLineBreaks(), at least for the HtmlHelper extension method, so that just becomes @Html.ReplaceLineBreaks()).
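A rough sketch of what that could look like, under some loud assumptions: the class name HtmlHelperSketch and the obsolete message text are made up for illustration, the old method body only approximates the current behaviour, and both methods are plain static methods on strings rather than HtmlHelper extension methods returning IHtmlString.

```csharp
using System;
using System.Net;

// Hypothetical sketch of the obsolete-and-replace approach.
public static class HtmlHelperSketch
{
    // Old behaviour kept as-is for backwards compatibility: no encoding.
    // The message text is an assumption, not the wording used by Umbraco.
    [Obsolete("This method does not HTML encode the input. Use ReplaceLineBreaks() instead.")]
    public static string ReplaceLineBreaksForHtml(string text)
    {
        return (text ?? string.Empty)
            .Replace("\r\n", "<br />")
            .Replace("\n", "<br />")
            .Replace("\r", "<br />");
    }

    // New method: HTML encodes the input before adding the <br /> tags.
    public static string ReplaceLineBreaks(string text)
    {
        string encoded = WebUtility.HtmlEncode(text ?? string.Empty);

        return encoded
            .Replace("\r\n", "<br />")
            .Replace("\n", "<br />")
            .Replace("\r", "<br />");
    }
}
```

Keeping the old method untouched avoids the double-encoding problem for existing callers: passing already encoded input such as &lt;b&gt; to the new method would turn it into &amp;lt;b&amp;gt;, which is why changing the existing method in place would be a breaking change.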
@glombek That’s already included in PR https://github.com/umbraco/Umbraco-CMS/pull/6545 (commit https://github.com/ronaldbarendse/Umbraco-CMS/commit/8ac35df6e9266a79911ea8347da0ff8c01e61d7a). This is still open, as I’ve found another issue within Html.Wrap() that needs some further investigation…
Sure thing, happy to look at a contribution following that suggestion.