SQL Server UTF8 collations
This does not behave as documented and expected.
If I have an entity for which I use the Fluent API to define properties…
The SQL field (in my example) is varchar(255) using collation Latin1_General_100_BIN2_UTF8,
defined in EF as
p.Property(prop => prop.Param).IsUnicode(false).UseCollation("Latin1_General_100_BIN2_UTF8").HasMaxLength(255);
However, Unicode characters get corrupted anyway on SQL Server, both on Azure and on 2019 Express.
Document Details
⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.
- ID: b27be469-3758-941c-0c5d-3a54b1545493
- Version Independent ID: d89924aa-668e-229c-8170-5731c3fff4c1
- Content: Collations and case sensitivity - EF Core
- Content Source: entity-framework/core/miscellaneous/collations-and-case-sensitivity.md
- Product: entity-framework
- Technology: entity-framework-core
- GitHub Login: @roji
- Microsoft Alias: avickers
Issue Analytics
- State:
- Created 2 years ago
- Comments:25 (13 by maintainers)
Design proposal:
tl;dr allow users to configure UTF8 by explicitly setting both the column type to char/varchar and IsUnicode to true.
We can add a sugar method which does the above.
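The snippets that originally accompanied these two sentences did not survive on this page. As a rough sketch of what the tl;dr describes, the configuration might look like the following. The Blog entity, its Name property, and BlogContext are hypothetical names chosen for illustration. HasColumnType, IsUnicode, and UseCollation are existing EF Core APIs, but treating the varchar plus IsUnicode(true) combination as a UTF8 opt-in is the behavior being proposed here, so it should not be assumed of any particular EF Core version. UseUTF8 is only the name floated in this proposal and appears as a comment.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity, used only for illustration.
public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical context; provider setup (OnConfiguring) is omitted.
public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Blog>()
            .Property(b => b.Name)
            // Column type stays varchar, so migrations create the right column.
            .HasColumnType("varchar(max)")
            // Under the proposal, this would make EF send DbType.String
            // instead of DbType.AnsiString for the parameter.
            .IsUnicode(true)
            // Collation taken from the original report.
            .UseCollation("Latin1_General_100_BIN2_UTF8");

        // Proposed sugar (assumption: does not exist as written here):
        // modelBuilder.Entity<Blog>().Property(b => b.Name).UseUTF8();
    }
}
```

The point of the design is that the column type alone no longer decides how the parameter is sent: the store type is read from HasColumnType, and the parameter encoding from IsUnicode.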
Notes:
- […] char/varchar also sets DbType=AnsiString (note that Unicode still remains true in the type mapping - not ideal).
- […] char/varchar, i.e. DbType is still AnsiString.
- […] varchar(max) and IsUnicode=true - this would opt into UTF8. The column type is exactly what it should be in migrations (and also in the query pipeline etc.), and IsUnicode tells us to send DbType.String instead of DbType.AnsiString.
- […] _UTF8).
- […] char/varchar property with a collation ending with UTF8 (including at the database level) should lead to the correct UTF8 property being scaffolded (i.e. with either UseUTF8 or IsUnicode(true)).
Global model configuration
The default database collation can already be set via modelBuilder.UseCollation().
All string properties can be configured to be UTF8 by default via pre-convention model configuration.
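A minimal sketch of those two global settings, assuming EF Core 6.0 or later (pre-convention model configuration is not available before that). AppDbContext is a hypothetical name. The APIs used (UseCollation, ConfigureConventions, Properties<string>(), HaveColumnType, AreUnicode) exist today, but whether combining a varchar column type with AreUnicode(true) actually results in UTF8 handling depends on this proposal being implemented in the EF Core version in use.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context; provider setup (OnConfiguring) is omitted.
public class AppDbContext : DbContext
{
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Default collation for the whole database.
        modelBuilder.UseCollation("Latin1_General_100_BIN2_UTF8");
    }

    protected override void ConfigureConventions(ModelConfigurationBuilder configurationBuilder)
    {
        // Applies to every string property unless a property overrides it.
        configurationBuilder.Properties<string>()
            .HaveColumnType("varchar(max)")
            // Under the proposal, this is what opts the properties into UTF8.
            .AreUnicode(true);
    }
}
```

Individual properties can still be configured differently in OnModelCreating, since explicit configuration takes precedence over pre-convention defaults.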
We could also add a ConfigureUTF8() extension method to do the above.
@clement911
The collation isn’t something that gets specified on a parameter, but rather on the column (or at the database level for all columns); see our docs for more info on this.
Aside from that, as @egbertn wrote above, a workaround exists, but it requires editing the migration to change the type to varchar (doing something better is what this issue tracks). There's no reason to avoid editing the migration file - it's perfectly fine (and frequently recommended) to customize migration code after generating it, see our docs. I definitely wouldn't avoid UTF8 just because it requires a one-time edit to migration code.
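As an illustration of that one-time edit, here is a sketch of what a hand-edited migration could look like. The details are assumptions rather than the exact workaround from the thread: the property is assumed to be left as Unicode in the model (no IsUnicode(false)), so EF scaffolds an nvarchar column, and the type argument is then changed by hand. The table name Settings and the migration name AddSettingParam are hypothetical; Param and the collation come from the original report.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

// Assumed model configuration (note: no IsUnicode(false)):
//   p.Property(prop => prop.Param)
//       .UseCollation("Latin1_General_100_BIN2_UTF8")
//       .HasMaxLength(255);

// Hypothetical migration; the [DbContext]/[Migration] attributes normally
// live in the generated designer file for this partial class.
public partial class AddSettingParam : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.AddColumn<string>(
            name: "Param",
            table: "Settings",
            // Hand-edited: the scaffolded value is assumed to have been "nvarchar(255)".
            type: "varchar(255)",
            maxLength: 255,
            nullable: true,
            collation: "Latin1_General_100_BIN2_UTF8");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropColumn(name: "Param", table: "Settings");
    }
}
```

With this arrangement the parameter is still sent as Unicode text, and SQL Server stores it in the varchar column using the UTF8 collation.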