Response is returned with a byte order mark (BOM) and some XML serializers fail to parse it
We used SoapCore to convert some of our legacy WCF (HTTP/SOAP) projects to .NET Core. After some regression tests we found that the SOAP responses from our .NET Core services failed to parse, with this error: content-is-not-allowed-in-prolog-when-parsing-perfectly-valid-xm
After some digging and viewing the hidden characters, we confirmed that this is indeed the case: the encoded response starts with a BOM (byte order mark), which causes the issue.
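To make the hidden character visible, the first bytes of a response body can be dumped as hex. The following is only a minimal sketch using the .NET base class library. The `BomInspector` class and its helper names are made up for illustration, and the sample body is built by hand rather than captured from SoapCore.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomInspector
{
    // Returns true when the buffer starts with the UTF-8 BOM bytes EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] body)
    {
        return body.Length >= 3
            && body[0] == 0xEF && body[1] == 0xBB && body[2] == 0xBF;
    }

    // Formats the first few bytes as hex so invisible characters show up.
    public static string HexPrefix(byte[] body, int count)
    {
        return BitConverter.ToString(body, 0, Math.Min(count, body.Length));
    }

    public static void Main()
    {
        // Hand-built sample: the UTF-8 preamble followed by a small XML document.
        string xml = "<?xml version=\"1.0\"?><Envelope/>";
        byte[] body = Encoding.UTF8.GetPreamble()
            .Concat(Encoding.UTF8.GetBytes(xml))
            .ToArray();

        Console.WriteLine(HexPrefix(body, 4));       // EF-BB-BF-3C
        Console.WriteLine(StartsWithUtf8Bom(body));  // True
    }
}
```

The three bytes before `3C` (the `<` character) are the content that a strict XML parser rejects ahead of the XML declaration.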
We worked around this issue by injecting a MemoryStream into the response and "hijacking" it: after the response is returned from the SoapCore middleware, we write it out again with our own encoding. This is the middleware we used:
using System.IO;
using System.Text;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

public static class BomKillerMiddleware
{
    /// <summary>
    /// Please read this page: https://www.freecodecamp.org/news/a-quick-tale-about-feff-the-invisible-character-cd25cd4630e7/
    /// SoapCore adds a hidden character to the start of the response: U+FEFF, the byte order mark
    /// (bytes EF BB BF in UTF-8), which breaks XML parsing and causes several errors.
    /// The only way we found to solve it is to buffer the response and rewrite it to the stream
    /// with UTF-8 encoding. This omits the hidden character from the response.
    /// </summary>
    public static IApplicationBuilder UseBomKiller(this IApplicationBuilder app)
    {
        return app.Use(async (context, _next) =>
        {
            using (var inMemoryResponse = new MemoryStream())
            {
                // Swap the response body for a buffer so downstream middleware writes into memory.
                var originalResponseStream = context.Response.Body;
                context.Response.Body = inMemoryResponse;
                await _next.Invoke();
                inMemoryResponse.Seek(0, SeekOrigin.Begin);
                // StreamReader detects a leading BOM and drops it while decoding.
                using (var streamReader = new StreamReader(inMemoryResponse))
                {
                    var bodyAsText = await streamReader.ReadToEndAsync();
                    // Restore the real stream, then write the text back without a BOM.
                    context.Response.Body = originalResponseStream;
                    await context.Response.WriteAsync(bodyAsText, Encoding.UTF8);
                }
            }
        });
    }
}
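The middleware above relies on two base class library behaviors: `StreamReader` consumes a leading BOM when it decodes a stream, and encoding a string to bytes does not add one back. The sketch below isolates that round trip outside of ASP.NET Core. The `BomRoundTrip` class and its `StripBom` helper are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomRoundTrip
{
    // Decodes the bytes (dropping any leading BOM) and re-encodes them as BOM-less UTF-8.
    public static byte[] StripBom(byte[] responseBytes)
    {
        using (var buffer = new MemoryStream(responseBytes))
        using (var reader = new StreamReader(buffer))
        {
            string text = reader.ReadToEnd();      // the BOM is consumed here
            return Encoding.UTF8.GetBytes(text);   // GetBytes never emits a preamble
        }
    }

    public static void Main()
    {
        string xml = "<Envelope/>";
        byte[] withBom = Encoding.UTF8.GetPreamble()
            .Concat(Encoding.UTF8.GetBytes(xml))
            .ToArray();

        byte[] cleaned = StripBom(withBom);
        Console.WriteLine(withBom.Length - cleaned.Length);  // 3
        Console.WriteLine(Encoding.UTF8.GetString(cleaned)); // <Envelope/>
    }
}
```

Buffering the whole body in memory is simple but costs an extra copy per response, which may matter for large SOAP payloads.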
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 4
- Comments: 7
Top GitHub Comments
In my case I found a better solution: passing in a binding with a specific TextEncoding, namely a UTF8Encoding constructed with false.
The boolean passed to the UTF8Encoding constructor indicates whether it should write a BOM, so false means no BOM is written. See this article: UTF8Encoding constructor
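The effect of that constructor argument can be checked with the base class library alone. This is a small sketch, not the commenter's original snippet. The `Utf8PreambleDemo` class and its `Serialize` helper are illustrative names, and no SoapCore API is involved.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8PreambleDemo
{
    // Writes text through a StreamWriter and returns the raw bytes it produced.
    public static byte[] Serialize(string text, Encoding encoding)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, encoding))
        {
            writer.Write(text);
        }
        return stream.ToArray();
    }

    public static void Main()
    {
        var withBom = new UTF8Encoding(true);     // emits EF BB BF as a preamble
        var withoutBom = new UTF8Encoding(false); // emits no preamble

        Console.WriteLine(withBom.GetPreamble().Length);         // 3
        Console.WriteLine(withoutBom.GetPreamble().Length);      // 0
        Console.WriteLine(Serialize("<a/>", withBom).Length);    // 7
        Console.WriteLine(Serialize("<a/>", withoutBom).Length); // 4
    }
}
```

Note that the static `Encoding.UTF8` instance behaves like `new UTF8Encoding(true)`, which is why writers configured with it emit the BOM.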
SoapCore uses this as a WriteEncoding: SoapEndpointExtensions.cs
It can also be solved by specifying a BOM-less encoding on the SOAP options object.