
Response is returned with a byte order mark (BOM) and some XML serializers fail to parse it


We used SoapCore to convert some of our legacy WCF (HTTP/SOAP) projects to .NET Core. After some regression tests we found that the SOAP responses from our .NET Core services failed to deserialize with this error: content-is-not-allowed-in-prolog-when-parsing-perfectly-valid-xm

After some digging and inspecting the hidden characters, we confirmed the cause: the response is encoded with a leading BOM (byte order mark), and that is what breaks parsing.
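To make the hidden character visible, here is a minimal sketch that is not taken from SoapCore itself. It assumes only the standard `StreamWriter` and `Encoding.UTF8` types; `WriteWith` is a hypothetical helper name. It shows that writing text through `Encoding.UTF8` puts three extra bytes in front of the XML declaration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: write text to an in-memory stream with the given encoding
// and return the raw bytes that would go over the wire.
static byte[] WriteWith(Encoding encoding, string text)
{
    using var buffer = new MemoryStream();
    using (var writer = new StreamWriter(buffer, encoding))
    {
        writer.Write(text);
    }
    // ToArray still works after the writer has closed the stream.
    return buffer.ToArray();
}

const string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><ping/>";
byte[] withBom = WriteWith(Encoding.UTF8, xml);

// The first three bytes are the UTF-8 BOM, not part of the XML.
Console.WriteLine(BitConverter.ToString(withBom, 0, 3)); // EF-BB-BF
Console.WriteLine((char)withBom[3]);                     // <
```

A strict parser that expects `<` as the very first byte sees `EF BB BF` instead, which is consistent with the "content is not allowed in prolog" error above.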

We worked around this issue by injecting a MemoryStream into the response and “hijacking” it: after the response comes back from the SoapCore middleware, we write it out again with our own encoding. This is the middleware we used:

using System.IO;
using System.Text;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

public static class BomKillerMiddleware
{
    /// <summary>
    /// Please read this page: https://www.freecodecamp.org/news/a-quick-tale-about-feff-the-invisible-character-cd25cd4630e7/
    /// SoapCore adds a hidden character to the response: U+FEFF, an invisible byte order mark that ruins XML parsing. It causes several errors,
    /// and the only way we found to solve it is this: rewriting the response to the stream with UTF8 encoding. This omits the hidden character
    /// from the response.
    /// </summary>
    public static IApplicationBuilder UseBomKiller(this IApplicationBuilder app)
    {
        return app.Use(async (context, _next) =>
        {
            using (var inMemoryResponse = new MemoryStream())
            {
                var originalResponseStream = context.Response.Body;
                context.Response.Body = inMemoryResponse;

                await _next.Invoke();

                inMemoryResponse.Seek(0, SeekOrigin.Begin);

                using (var streamReader = new StreamReader(inMemoryResponse))
                {
                    var bodyAsText = await streamReader.ReadToEndAsync();

                    context.Response.Body = originalResponseStream;
                    await context.Response.WriteAsync(bodyAsText, Encoding.UTF8);
                }
            }
        });
    }
}
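When the server cannot be changed, the same cleanup can be done on raw bytes before they reach a parser. The following is a sketch under that assumption, not part of the original workaround; `StripUtf8Bom` is a hypothetical helper name, and it only handles the UTF-8 BOM.

```csharp
using System;
using System.Text;

// Hypothetical helper: return the payload without a leading UTF-8 BOM, if one is present.
static byte[] StripUtf8Bom(byte[] payload)
{
    byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF
    bool hasBom = payload.Length >= bom.Length
        && payload.AsSpan(0, bom.Length).SequenceEqual(bom);
    return hasBom ? payload[bom.Length..] : payload;
}

byte[] raw = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };
byte[] clean = StripUtf8Bom(raw);

Console.WriteLine(clean.Length);                    // 4
Console.WriteLine(Encoding.UTF8.GetString(clean));  // <a/>
```

Unlike the middleware, this never decodes the body to a string, so the payload bytes are passed through unchanged apart from the removed prefix.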

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 4
  • Comments: 7

Top GitHub Comments

13 reactions
RafVandesande commented, Mar 11, 2020

In my case, I found a better solution: passing in a binding with a specific TextEncoding:

new BasicHttpBinding
{
   TextEncoding = new UTF8Encoding(false)
}

The boolean passed to the UTF8Encoding’s constructor indicates that it should not write a BOM. See this article: UTF8Encoding constructor

SoapCore uses this as a WriteEncoding: SoapEndpointExtensions.cs
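The effect of that constructor argument can be checked in isolation. This is a small sketch using only `System.Text`, with `PreambleLength` as a hypothetical helper name; it compares the preamble (the BOM bytes an encoding offers to writers) for the relevant encodings.

```csharp
using System;
using System.Text;

// Hypothetical helper: number of BOM bytes an encoding asks writers to emit.
static int PreambleLength(Encoding encoding) => encoding.GetPreamble().Length;

Console.WriteLine(PreambleLength(Encoding.UTF8));           // 3
Console.WriteLine(PreambleLength(new UTF8Encoding(true)));  // 3
Console.WriteLine(PreambleLength(new UTF8Encoding(false))); // 0

// GetBytes never includes the preamble, whichever instance is used.
Console.WriteLine(Encoding.UTF8.GetBytes("<a/>").Length);   // 4
```

The static `Encoding.UTF8` instance behaves like `new UTF8Encoding(true)`, which is why a BOM-free instance has to be constructed explicitly.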

2 reactions
l3ender commented, Dec 29, 2021

It can also be solved by specifying the encoding on the SOAP encoder options object:

var encoder = new SoapEncoderOptions();
encoder.WriteEncoding = new UTF8Encoding(false);

app.UseEndpoints(endpoints => {
    endpoints.UseSoapEndpoint<IPingService>("/Service.svc", encoder, SoapSerializer.DataContractSerializer);
});

