
Remove option breaks main slugify functionality


Reproduction script:

const slugify = require('slugify');

const fileName = decodeURIComponent('a%CC%8Aa%CC%88o%CC%88-123'); // åäö-123, decomposed: base letter + combining mark
const fileName2 = decodeURIComponent('%0A%C3%A5%C3%A4%C3%B6-123'); // åäö-123, precomposed, preceded by a line feed (%0A)

const withoutRemove = slugify(fileName);

const withRemove = slugify(fileName, {
  remove: /[*+~.()'"!:@]/g,
});

const withRemoveAlternativeEncoding = slugify(fileName2, {
  remove: /[*+~.()'"!:@]/g,
});

console.log(withoutRemove);
console.log(withRemove);
console.log(withRemoveAlternativeEncoding);

Result

aao-123
åäö-123 # Broken
aao-123

I can’t reproduce this when typing the “åäö” characters on my keyboard. However, if I paste them from my console or a browser URL, the behaviour can be reproduced. I suspect the two inputs use different code points for the same visible characters, similar to this issue: https://github.com/ipython/ipython/issues/522
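
To make that suspicion concrete, here is a minimal sketch in plain JavaScript (no slugify involved) comparing the decomposed string from the reproduction script with a precomposed one. The variable names are made up for illustration, and the precomposed string omits the leading %0A used in the script.

```javascript
// Same visible text, two different code point sequences.
// decomposed: a + U+030A, a + U+0308, o + U+0308, then "-123"
const decomposed = decodeURIComponent('a%CC%8Aa%CC%88o%CC%88-123');
// precomposed: U+00E5, U+00E4, U+00F6, then "-123"
const precomposed = decodeURIComponent('%C3%A5%C3%A4%C3%B6-123');

// The strings look identical when printed but are not equal.
const differ = decomposed !== precomposed;

// The decomposed form carries three extra combining marks.
const lengths = [decomposed.length, precomposed.length];

// Composing to NFC makes the two strings identical.
const equalAfterNfc = decomposed.normalize('NFC') === precomposed;

console.log(differ, lengths, equalAfterNfc); // true [ 10, 7 ] true
```

This only shows that the two inputs differ at the code point level. It does not by itself prove which part of slugify treats them differently.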

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

khromov commented, Mar 15, 2021 (1 reaction)

👋 @simov

I understand the problem. It appears this is called a “Decomposed character sequence” and is an alternate way of writing certain characters.

There is a good visual example here: https://superpixel.ch/bugreport/safari-diacritic/

I saw some discussion regarding this on SO, pointing out Chrome / Safari as the culprit.

It feels like the current solution will work just fine in a majority of cases, since it separates the main character (which won’t be filtered) from all the diacritics and such (that will be filtered). But it would be nice if it could somehow work with remove.

I noticed there’s a String.prototype.normalize function that seems to do what we want. One option could be to apply it to the input before any other code runs, but it would be hard to assess the implications for existing codebases.

Example:

Before normalization (å character only):

console.log(Array.from(decodeURIComponent('a%CC%8A'))
  .map((v) => v.codePointAt(0).toString(16))
  .map((hex) => "\\u" + "0000".substring(0, 4 - hex.length) + hex));

> Array [ "\\u0061", "\\u030a" ] // a,  ̊

After normalization:

console.log(Array.from(decodeURIComponent('a%CC%8A').normalize())
  .map((v) => v.codePointAt(0).toString(16))
  .map((hex) => "\\u" + "0000".substring(0, 4 - hex.length) + hex));

> Array [ "\\u00e5" ] // å - great success!
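
Until something like this lands in the library, one possible workaround is to normalize on the caller's side. The sketch below is only an illustration: `slugifyNormalized` is a hypothetical helper, not part of slugify, and `fakeSlugify` is a stand-in used so that the snippet runs without the package installed.

```javascript
// Hypothetical helper (not part of slugify): compose the input to NFC first,
// then hand it to whatever slugify function the caller provides.
function slugifyNormalized(slugifyFn, input, options) {
  return slugifyFn(input.normalize('NFC'), options);
}

// Stand-in for the real library so the sketch is self-contained.
// It records what it receives and returns the text unchanged.
const seen = [];
const fakeSlugify = (text, options) => {
  seen.push(text);
  return text;
};

const decomposed = decodeURIComponent('a%CC%8Aa%CC%88o%CC%88-123');
const result = slugifyNormalized(fakeSlugify, decomposed, {
  remove: /[*+~.()'"!:@]/g,
});

// The wrapped function now sees precomposed code points (e5, e4, f6, ...).
console.log(Array.from(result).map((ch) => ch.codePointAt(0).toString(16)));
```

With the real package, the call would presumably look like `slugifyNormalized(require('slugify'), fileName, { remove: /[*+~.()'"!:@]/g })`. Whether that yields `aao-123` depends on how the installed version maps precomposed characters, which this sketch does not verify.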

