StringHtmlContent omitting characters when encoding

Describe the bug

When a StringHtmlContent writes a string containing a surrogate pair, it omits the single non-supplementary character that immediately follows the pair.
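
For context, a surrogate pair is how UTF-16 represents a code point above U+FFFF as two `char` values. The following is a minimal sketch, not part of the original report (the class name `SurrogateDemo` and the helper `CodeUnits` are hypothetical), showing that the emoji plus one digit is three UTF-16 code units:

```csharp
using System;

static class SurrogateDemo
{
    // Returns the UTF-16 code units of a string as hex, separated by spaces.
    public static string CodeUnits(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            parts[i] = ((int)s[i]).ToString("X4");
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // U+1F602 (written as its surrogate pair) followed by the digit '2'.
        var value = "\uD83D\uDE02" + "2";

        Console.WriteLine(value.Length);                   // 3
        Console.WriteLine(char.IsSurrogatePair(value, 0)); // True
        Console.WriteLine(CodeUnits(value));               // D83D DE02 0032
    }
}
```

The character the report describes as dropped is the third code unit here, the one right after the low surrogate.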

To Reproduce

Steps to reproduce the behavior:

  1. Use ASP.NET Core 2.2.5.
  2. Run this code:
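    // Requires: using System.IO; using System.Text.Encodings.Web; using Microsoft.AspNetCore.Html;
    var tearsOfJoy2 = new StringHtmlContent("😂2");
    using (var stringWriter = new StringWriter())
    {
        // Encode the content with the default HTML encoder and write it to the writer.
        tearsOfJoy2.WriteTo(stringWriter, HtmlEncoder.Default);
        var encoded = stringWriter.ToString(); // inspect this value
    }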

Expected behaviour

encoded == "&#x1F602;2"

Actual behaviour

encoded == "&#x1F602;"

Additional context

I have a feeling that the issue is within System.Text.Encodings.Web.TextEncoder. If the emoji is followed by two characters, as in new StringHtmlContent("😂22"), it seems to work correctly.
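
One way to test that hunch is to call the encoder directly, without StringHtmlContent, and compare the string overload with the TextWriter overload. This is a minimal sketch, assuming the System.Text.Encodings.Web package is referenced; the class name `EncoderCheck` and its helpers are hypothetical, and whether the two results differ depends on the package version in use:

```csharp
using System;
using System.IO;
using System.Text.Encodings.Web;

static class EncoderCheck
{
    // Encodes through the string-returning overload.
    public static string ViaString(string value)
    {
        return HtmlEncoder.Default.Encode(value);
    }

    // Encodes through the TextWriter overload.
    public static string ViaWriter(string value)
    {
        using (var writer = new StringWriter())
        {
            HtmlEncoder.Default.Encode(writer, value);
            return writer.ToString();
        }
    }

    static void Main()
    {
        // The emoji followed by one and by two trailing characters.
        var inputs = new[] { "\uD83D\uDE02" + "2", "\uD83D\uDE02" + "22" };
        foreach (var input in inputs)
        {
            Console.WriteLine("string overload: " + ViaString(input));
            Console.WriteLine("writer overload: " + ViaWriter(input));
        }
    }
}
```

If the writer overload drops the trailing character while the string overload keeps it, that would point at the encoder rather than at StringHtmlContent.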

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

AbhiAgarwal192 commented, Jun 14, 2019 (1 reaction)

Submitted a PR for this. This will get fixed in the next release.

AbhiAgarwal192 commented, Jun 4, 2019 (1 reaction)

@ryanbrandenburg I have found the actual issue. Will fix it.
