Default call should disallow all tags and attributes
The problem to solve
Looking at a call like sanitizeHtml(userInput), it appears that all HTML should be sanitized and nothing allowed through (hence, sanitize-html). When the library has implicit defaults, it makes me very unsure as a user as to what it’ll do.
For example, dev 1 wrote:
sanitizeHtml(variant.product.title, {
allowedTags: ['br'],
});
Dev 2 could come along later, when we disallow br tags, and remove the allowedTags option altogether, which would make all the default tags allowed.
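The fallback described above can be reproduced with a toy model. This is a minimal sketch, assuming the caller's options are merged shallowly over a defaults object; `TOY_DEFAULTS` and `resolveOptions` are hypothetical names for illustration, and the tag list is made up rather than the library's real default list.

```javascript
// Hypothetical defaults, for illustration only (not sanitize-html's real list).
const TOY_DEFAULTS = {
  allowedTags: ['p', 'a', 'b', 'i'],
  allowedAttributes: { a: ['href'] },
};

// Shallow merge: any key the caller omits silently takes its default value.
function resolveOptions(options = {}) {
  return Object.assign({}, TOY_DEFAULTS, options);
}

// Dev 1: explicit allow-list, so only <br> gets through.
const before = resolveOptions({ allowedTags: ['br'] });

// Dev 2: removes the option, intending "no tags", but gets the default list.
const after = resolveOptions({});

console.log(before.allowedTags); // [ 'br' ]
console.log(after.allowedTags);  // [ 'p', 'a', 'b', 'i' ]
```

Under this assumption, deleting a line from the options object widens what is allowed, which is the opposite of what a reviewer would expect.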
The other issue, now that I know the library’s “style”, is with this code:
sanitizeHtml(variant.product.title, {
allowedTags: ['br'],
});
It begs the question: what attributes can br have that might screw me over? Are all of br’s attributes allowed? What if the user can set an attribute I don’t know about and do something dangerous?
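One way to take the guesswork out of code review is to state both lists on every call. A minimal sketch follows; `explicitOptions` is a hypothetical helper, not part of sanitize-html. It relies only on the `allowedTags` and `allowedAttributes` option names, and it assumes that an empty `allowedAttributes` object means "no attributes on any tag".

```javascript
// Hypothetical helper: build an options object in which nothing is implicit.
// `tags` lists the allowed tag names; `attributes` maps a tag name to its
// allowed attribute names and defaults to an empty map.
function explicitOptions(tags, attributes = {}) {
  return {
    allowedTags: [...tags],
    allowedAttributes: { ...attributes },
  };
}

// Allow <br> only, and say so explicitly for attributes as well.
const brOnly = explicitOptions(['br']);
console.log(brOnly); // { allowedTags: [ 'br' ], allowedAttributes: {} }
```

The result would be passed as the second argument, for example sanitizeHtml(variant.product.title, explicitOptions(['br'])), so neither list is ever left to the defaults.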
Proposed solution
Change the default API so that nothing is allowed through unless explicitly specified.
sanitizeHtml(userInput) would remove every tag. If I specify an allowed tag, it would allow that tag with 0 attributes.
These snippets would behave the same:
sanitizeHtml(variant.product.title, {
allowedTags: [],
});
sanitizeHtml(variant.product.title, {});
I wouldn’t feel like I have to watch my back.
Defaults are great, and those lists could be exported as well, like so:
import sanitizeHtml, {defaultAllowedTags} from 'sanitize-html';
sanitizeHtml(userInput, {allowedTags: defaultAllowedTags});
Alternatives
Instead of completely breaking backward compatibility, a strict export could be introduced:
import {sanitizeHtmlStrict, defaultAllowedTags} from 'sanitize-html';
sanitizeHtmlStrict(userInput);
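Until something like that exists, a strict variant can be approximated in user code. The following is only a sketch of the idea, not the library's API: `makeStrict` is a hypothetical name, and the sanitizer is passed in as a parameter so the wrapper can be exercised here with a stand-in instead of the real package.

```javascript
// Hypothetical sketch of a strict wrapper. `sanitize` is any function with the
// shape (html, options) => string, e.g. the sanitize-html default export.
function makeStrict(sanitize) {
  const strictDefaults = { allowedTags: [], allowedAttributes: {} };
  return function sanitizeHtmlStrict(html, options = {}) {
    // Caller options override the empty lists; omitted keys stay empty.
    return sanitize(html, { ...strictDefaults, ...options });
  };
}

// Stand-in sanitizer for demonstration: it only reports the options it received.
const recordOptions = (html, options) => JSON.stringify(options);
const strict = makeStrict(recordOptions);

console.log(strict('<b>hi</b>'));
// {"allowedTags":[],"allowedAttributes":{}}
console.log(strict('<b>hi</b>', { allowedTags: ['br'] }));
// {"allowedTags":["br"],"allowedAttributes":{}}
```

With this shape, removing allowedTags from a call narrows the result to "no tags" instead of widening it to a default list.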
Additional context
The only context was me doing code review and not being certain as to what will take place when allowedTags ends up changing, etc.
Issue Analytics
- Created 2 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
Stepping back to the big picture, the module is called ‘sanitize-html’, not ‘strip-html’ or ‘remove-html’ … its purpose first is to return a consistent format for HTML. To that end, it’s reasonable for the module to have a default set of tags/attrs/etc that it deems ‘sanitary’. This would be easier to express if HTML5 had a DTD or something official to root the default definition in, but alas.
I see this more as a documentation issue. @hgezim’s use case is directly referenced in the README, but overall the README is verbose and frames a lot of scenarios in a less technical, more conversational way. This makes the README hard to read/scan (it takes almost 3,000 words to get to the first basic usage example!).
The default options are documented in full right in the documentation… but, this doesn’t mean you’re wrong. It would of course have to be a major version bump.
On Fri, Nov 19, 2021 at 2:15 AM Gezim Hoxha @.***> wrote:
–
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his