NativeVersionInfo emits RC script in incorrect encoding
Attempting to use Nerdbank.GitVersioning’s NativeVersionInfo to generate version information for native Windows DLLs can result in malformed version resources when special characters such as © and ® are used. Generally, the special character is preceded by an unprintable or garbage character.
Crude diagnosis: File.WriteAllText() writes text files as UTF-8 without a BOM, which is .NET’s default. I think this makes the output file essentially identical to the ANSI text that RC expects, except for the occasional multi-byte special character, which RC reads as two separate ANSI characters rather than as a single 2-byte UTF-8 character.
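The symptom can be reproduced outside of .NET. The following is a minimal Python sketch, assuming the reader of the file uses the Windows-1252 ("ANSI") code page; the real code page depends on the system locale, and the helper name and sample string are made up for illustration.

```python
# Sketch: what a single-byte ANSI reader sees when handed BOM-less UTF-8.
# Assumption: the reader uses Windows-1252; the actual code page is locale-dependent.

def misread_utf8_as_ansi(text: str, ansi_codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the same bytes as a single-byte code page."""
    # errors="replace" keeps the sketch from raising on bytes cp1252 leaves undefined.
    return text.encode("utf-8").decode(ansi_codepage, errors="replace")

copyright_line = "Copyright © Contoso"  # hypothetical sample string

print("©".encode("utf-8"))                   # b'\xc2\xa9' (two bytes for one character)
print(misread_utf8_as_ansi(copyright_line))  # Copyright Â© Contoso
```

The stray "Â" in front of the symbol is the first UTF-8 byte (0xC2) read as its own character, which matches the garbage character reported above.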
I think there are at least three possible improvements:
- Change
Utilities.FileOperationWithRetry(() => File.WriteAllText(this.OutputFile, this.generator.GetCode()));
to explicitly specify System.Text.Encoding.ASCII, which looks to be the closest match. This risks losing characters that are representable in UTF-8 but not in ASCII, but would improve handling of common special characters like ©.
- Use escape sequences to insert any character that takes more than one byte. I believe RC.exe supports some manner of escaping Unicode characters.
- Write the RC script in UCS-2 encoding with a BOM. My understanding is that Encoding.Unicode (UTF-16LE) is a superset of UCS-2, so every character that fits in a single 16-bit code unit would render correctly, and only characters that need a surrogate pair (two code units) would have to be escaped.
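To compare the first and third options at the byte level, here is a minimal Python sketch. It rests on two assumptions about .NET that should be checked against its documentation: that Encoding.ASCII substitutes "?" for characters outside the 7-bit range, and that Encoding.Unicode passed to File.WriteAllText produces UTF-16LE preceded by the FF FE byte-order mark. The helper names are hypothetical and only imitate that behavior.

```python
import codecs

def as_ascii(text: str) -> bytes:
    """Imitates an ASCII writer that replaces unencodable characters with '?' (assumption)."""
    return text.encode("ascii", errors="replace")

def as_utf16le_with_bom(text: str) -> bytes:
    """Imitates a UTF-16LE writer that emits the FF FE byte-order mark first (assumption)."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(as_ascii("©"))                      # b'?' (the symbol is lost)
print(as_utf16le_with_bom("©").hex(" "))  # ff fe a9 00 (BOM, then one 16-bit code unit)

# A character outside the Basic Multilingual Plane needs a surrogate pair,
# i.e. two 16-bit code units (4 bytes) in UTF-16.
print(len("\U0001F600".encode("utf-16-le")))  # 4
```

Under these assumptions, © is itself outside 7-bit ASCII and would come out as "?", whereas the UTF-16LE output keeps it in a single code unit.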
Side note: Is there any reason for the strings being in Unicode to be conditional? Even if the application was compiled without Unicode support, I’d think there’d be no disadvantage to generating Unicode resources.
#if defined(_UNICODE)
#define NBGV_VERSION_STRING(x) L ##x
#else
#define NBGV_VERSION_STRING(x) x
#endif
Assuming option 1 works I can probably make and contribute the fix. Should only be one line.
The Unicode option would probably be best. According to the VC++ team, RC files in the modern era are supposed to be Unicode; I learned this recently while trying to get VSCode to diff them correctly (which turned out to be more a problem with git itself).