[FEATURE REQ] Disable \u002B encoding of + in base64-encoded byte[] by Azure.Data.Tables
Library or service name. Azure.Data.Tables
Is your feature request related to a problem? Please describe. In Azure Table Storage, the way you set binary fields is to base64-encode the bytes and then send them as a JSON string.
When Azure.Data.Tables does this encoding, it uses System.Text.Json, which by default Unicode-escapes some “unsafe” characters. For example, the + character is encoded as \u002B. This is unnecessary and bloats the payload even more.
Here is an example payload:
POST https://REDACTED.table.core.windows.net/mytable?$format=application%2Fjson%3Bodata%3Dminimalmetadata HTTP/1.1
Host: REDACTED.table.core.windows.net
x-ms-version: 2019-02-02
DataServiceVersion: 3.0
Prefer: return-no-content
Accept: application/json;odata=minimalmetadata
x-ms-client-request-id: 60214504-eb92-488c-810c-f4bb5184332e
x-ms-return-client-request-id: true
User-Agent: azsdk-net-Data.Tables/12.0.0-beta.6 (.NET 5.0.4; Microsoft Windows 10.0.19042)
x-ms-date: Thu, 25 Mar 2021 06:24:46 GMT
Authorization: REDACTED
Content-Type: application/json;odata=nometadata
Content-Length: 160

{"PartitionKey":"49530451-b75d-41bb-a246-94b1e1c750fd","RowKey":"rk","Timestamp":null,"Binary":"YP\u002ByFhysX0OJ5z5hGerSew==","Binary@odata.type":"Edm.Binary"}
The binary value at the end, YP\u002ByFhysX0OJ5z5hGerSew==, could just as well be sent as YP+yFhysX0OJ5z5hGerSew==. The number of bytes would be lower, with no negative impact on the client or the service.
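As a quick sanity check of that claim, here is a minimal Python sketch (not part of the original issue, and not how the SDK itself works) that parses the request body from the example both with and without the escape. The variable names are made up for illustration; it only assumes that a standards-compliant JSON parser treats \u002B and + as the same character.

```python
import base64
import json

# Request body copied from the example payload above (escaped plus sign).
escaped_body = (
    r'{"PartitionKey":"49530451-b75d-41bb-a246-94b1e1c750fd","RowKey":"rk",'
    r'"Timestamp":null,"Binary":"YP\u002ByFhysX0OJ5z5hGerSew==",'
    r'"Binary@odata.type":"Edm.Binary"}'
)

# The same body with the plus sign written literally.
plain_body = escaped_body.replace(r"\u002B", "+")

# Both bodies parse to the same entity.
same_entity = json.loads(escaped_body) == json.loads(plain_body)

# Size on the wire for each variant (the body is pure ASCII).
escaped_size = len(escaped_body.encode("utf-8"))
plain_size = len(plain_body.encode("utf-8"))

# The base64 string decodes to the raw bytes of the Binary property.
binary_value = base64.b64decode(json.loads(plain_body)["Binary"])

print(same_entity, escaped_size, plain_size, len(binary_value))
# prints: True 160 155 16
```

The escaped size matches the Content-Length: 160 header in the example; the unescaped body would be 5 bytes shorter for this single + character.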
As far as I can tell there is no way to modify this behavior, but perhaps I am missing something.
Issue Analytics
- Created 2 years ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
I’m going to defer this and link it to https://github.com/Azure/azure-sdk-for-net/issues/15383, which is a backlog item to support custom serialization all up. If we decide to implement that (which I assume we will at some point) this kind of configuration would be a subset of that.
Yes, the + is the source of the bloat.
For JSON serialized by the SDK and sent to Azure Table Storage (i.e. request bodies), I’m curious what threat model necessitates encoding plus signs. More specifically, how does this protect either the customer (data sent by the customer is implicitly trusted by the customer) or the service?
Previous SDK versions (notably ones using Newtonsoft.Json like WindowsAzure.Storage) do no such encoding.
I would hope that Azure Table Storage and Cosmos DB properly sanitize the JSON provided in HTTP request bodies. If they did not, a well-behaved Azure SDK would not be a proper defense mechanism for such services, since a malicious actor need not use the SDK. But if the decision is more of a “code hygiene” thing to make audits simpler, I understand.
It does seem sad to waste so many bytes though. 1 in 64 base64 characters goes from 1 byte (+) to 6 bytes (\u002B).
It’s not a huge blocker though; I simply tweaked my code to start a new HTTP request a bit earlier than before to account for the bloat.
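To put a rough number on that bloat, here is a small Python sketch of the arithmetic. It assumes, as the issue describes, that + is the only base64 character that gets escaped, and that the encoded bytes are uniformly random so each of the 64 symbols is equally likely. The helper names `escaped_length` and `overhead_ratio` are hypothetical and exist only for this illustration.

```python
import base64
import os

ESCAPED_PLUS = r"\u002B"  # six ASCII bytes written in place of one "+"


def escaped_length(b64_text: str) -> int:
    """Byte length of a base64 string after every '+' becomes a 6-byte escape."""
    extra_per_plus = len(ESCAPED_PLUS) - 1  # 5 extra bytes per '+'
    return len(b64_text) + b64_text.count("+") * extra_per_plus


def overhead_ratio(b64_text: str) -> float:
    """Fractional growth of the string caused by escaping '+'."""
    return escaped_length(b64_text) / len(b64_text) - 1.0


# Expected growth for uniformly random data: 5 extra bytes on 1 of 64 symbols.
expected_overhead = 5 / 64  # 0.078125, i.e. about 7.8%

# Measure on random bytes; the result varies slightly from run to run.
sample = base64.b64encode(os.urandom(48_000)).decode("ascii")
measured_overhead = overhead_ratio(sample)

print(f"expected {expected_overhead:.4f}, measured {measured_overhead:.4f}")
```

Under those assumptions the escaping adds about 7.8% on top of the roughly 33% that base64 already costs, which is the kind of margin a caller would need to budget for when splitting data across requests.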