Remove option breaks main slugify functionality
Reproduction script:
const slugify = require('slugify');
const fileName = decodeURIComponent('a%CC%8Aa%CC%88o%CC%88-123'); // åäö-123 (decomposed: base letters + combining marks)
const fileName2 = decodeURIComponent('%0A%C3%A5%C3%A4%C3%B6-123'); // åäö-123 (precomposed, preceded by a newline from %0A)
const withoutRemove = slugify(fileName);
const withRemove = slugify(fileName, {
  remove: /[*+~.()'"!:@]/g,
});
const withRemoveAlternativeEncoding = slugify(fileName2, {
  remove: /[*+~.()'"!:@]/g,
});
console.log(withoutRemove);
console.log(withRemove);
console.log(withRemoveAlternativeEncoding);
Result
aao-123
åäö-123 # Broken
aao-123
I can’t reproduce this when typing the “åäö” characters on my keyboard. However, if I paste them from my console or a browser URL, the behaviour can be reproduced. I suspect the cause is a difference in code points, similar to this: https://github.com/ipython/ipython/issues/522
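One way to check the code-point suspicion is to list the code points of both inputs. The sketch below uses only built-in string methods; the helper name codePoints is made up for illustration and is not part of slugify.

```javascript
// Hypothetical helper (not part of slugify): list the code points of a string.
const codePoints = (s) =>
  Array.from(s, (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));

// Same visible text "åäö", encoded in two different ways.
const decomposedName = decodeURIComponent('a%CC%8Aa%CC%88o%CC%88');
const precomposedName = decodeURIComponent('%C3%A5%C3%A4%C3%B6');

// Base letters followed by combining marks (U+030A, U+0308).
console.log(codePoints(decomposedName).join(' '));
// One code point per letter (U+00E5, U+00E4, U+00F6).
console.log(codePoints(precomposedName).join(' '));
// The two strings are not equal even though they render identically.
console.log(decomposedName === precomposedName); // false
```

If the pasted text shows the combining marks U+030A and U+0308, it is in the decomposed form used by fileName in the reproduction script.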
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:7 (4 by maintainers)
Top GitHub Comments
👋 @simov
I understand the problem. It appears this is called a “Decomposed character sequence” and is an alternate way of writing certain characters.
There is a good visual example here: https://superpixel.ch/bugreport/safari-diacritic/
I saw some discussion regarding this on SO, pointing out Chrome / Safari as the culprit.
It feels like the current solution will work just fine in a majority of cases, since it separates the main character (which won’t be filtered) from all the diacritics and such (that will be filtered). But it would be nice if it could somehow work with remove.
I noticed there’s a String.normalize function that seems to do what we want. Maybe one option could be to apply it before any other code, but it would be hard to assess the implication of it on existing codebases.
Example:
Before normalization (å character only):
After normalization:
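A minimal sketch of the effect on the å character, using the built-in String.prototype.normalize with the 'NFC' form. This is an illustration only: it is not the snippet from the original comment, and it does not claim to show what the published fix does internally.

```javascript
// Decomposed å: "a" (U+0061) followed by COMBINING RING ABOVE (U+030A).
const before = decodeURIComponent('a%CC%8A');
// Precomposed å: the single code point U+00E5.
const single = decodeURIComponent('%C3%A5');

console.log(before.length);     // 2
console.log(before === single); // false

// NFC composes the base letter and the combining mark into one code point.
const after = before.normalize('NFC');

console.log(after.length);      // 1
console.log(after === single);  // true
```

Normalizing the input yourself before calling slugify is one way to try this idea on an older version, with the caveat raised above that it may change results for existing slugs.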
Published in v1.5.0 https://github.com/simov/slugify/commit/2df63f965aaf63844a51d3c3fab7704d83a31877