Spatial.Converters throws null reference exception when data serialized with System.Text.Json.JsonSerializer class

Description

When loading documents that contain spatial data from CosmosDB, Microsoft.Azure.Cosmos.Spatial.Converters.GeometryJsonConverter throws a NullReferenceException in ReadJson if the spatial data was stored using System.Text.Json.JsonSerializer.SerializeAsync(Stream, object). This happens because the Microsoft serializer writes spatial data in a different format from the one expected by the Newtonsoft deserializer used in the GeometryJsonConverter class.

This problem arose when attempting to follow the official sample code for bulk inserts provided at https://github.com/Azure-Samples/cosmos-dotnet-bulk-import-throughput-optimizer/blob/main/src/Program.cs

To Reproduce

  1. Start with the official sample code for bulk CosmosDB inserts linked above
  2. In your sample, add a field of type Microsoft.Azure.Cosmos.Spatial.Point called location
  3. Run your sample; your CosmosDB collection should have an object with a location value
  4. Attempt to read your data back using code similar to this:
var queryable = Container
	.GetItemLinqQueryable<MyType>(true)
	.Where(p => p.id == "my id");

var iterator = queryable.ToFeedIterator();

var models = new List<MyType>();
while (iterator.HasMoreResults)
{
	var response = iterator.ReadNextAsync()?.Result;
	if (response is null) break;

	models.AddRange(response);
}

Expected behavior

The query should execute and return your data.

Actual behavior

The method iterator.ReadNextAsync() throws an AggregateException containing a NullReferenceException, terminating at Microsoft.Azure.Cosmos.Spatial.Converters.GeometryJsonConverter.ReadJson(JsonReader reader, Type objectType, Object existingValue, JsonSerializer serializer).

Environment summary

SDK Version: .NET 5.0.3, Microsoft.Azure.Cosmos package 3.16.0
OS Version: Windows 10

Additional context

The cause appears to be a mismatch between the way Newtonsoft and Microsoft serialize Point data. The example code for bulk inserts linked above uses System.Text.Json.JsonSerializer to serialize the data for insertion into CosmosDB, but the Microsoft.Azure.Cosmos.FeedIteratorCore<T> class uses Newtonsoft tools to deserialize.

Newtonsoft’s serialization looks like this:

    "location": {
        "type": "Point",
        "coordinates": [
            -90.740237,
            39.950254
        ]
    },

while Microsoft’s looks like this:

    "location": {
        "Position": {
            "Coordinates": [
                -87.9066,
                41.9795
            ],
            "Longitude": -87.9066,
            "Latitude": 41.9795,
            "Altitude": null
        },
        "Crs": {
            "Type": 0
        },
        "Type": 0,
        "BoundingBox": null,
        "AdditionalProperties": {}
    }

The source code at https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/Spatial/Converters/GeometryJsonConverter.cs has this at lines 63-66:

JToken typeToken = token["type"];
if (typeToken.Type != JTokenType.String)
{
    throw new JsonSerializationException(RMResources.SpatialInvalidGeometryType);
}

My guess is that line 64 throws the NullReferenceException because token["type"] will be null in the Microsoft-serialized example.
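
The guess above can be illustrated with a null-checked version of the same test. This is only a sketch, not the SDK's actual code: GeometryTypeGuard and ReadGeometryType are hypothetical names, the message string is a placeholder for RMResources.SpatialInvalidGeometryType, and the Newtonsoft.Json package is assumed to be referenced.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper for illustration; not part of the Cosmos SDK.
public static class GeometryTypeGuard
{
    // Returns the GeoJSON "type" value, or throws a JsonSerializationException
    // (rather than a NullReferenceException) when it is missing or not a string.
    public static string ReadGeometryType(JObject token)
    {
        // The JObject indexer returns null when the property is absent.
        // The lookup is case-sensitive, so "Type" does not satisfy "type".
        JToken typeToken = token["type"];
        if (typeToken == null || typeToken.Type != JTokenType.String)
        {
            // Placeholder message; the SDK uses RMResources.SpatialInvalidGeometryType.
            throw new JsonSerializationException("Invalid or missing geometry type.");
        }

        return (string)typeToken;
    }
}
```

With a check like this, the System.Text.Json-shaped document shown earlier (which has "Type": 0 but no lowercase "type") would fail with a serialization error instead of a NullReferenceException.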

I tried to fix this by replacing await System.Text.Json.JsonSerializer.SerializeAsync(stream, model) in the sample code with this bulkier Newtonsoft code:

var itemsToInsert = new Dictionary<string, Stream>(models.Count);
var jsonSerializer = new JsonSerializer() { NullValueHandling = NullValueHandling.Ignore };
foreach (var model in models)
{
	var stream = new MemoryStream();
	var streamWriter = new StreamWriter(stream);
	var jsonWriter = new JsonTextWriter(streamWriter);
	jsonSerializer.Serialize(jsonWriter, model);
	await streamWriter.FlushAsync();

	itemsToInsert.Add(model.id, stream);
}

This worked, but I am seriously worried about all the open streams after it finishes. It looks like a potential memory leak to me.
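
One possible way to address the open-stream concern, sketched under assumptions: write through a StreamWriter created with leaveOpen: true so the writer can be disposed without closing the MemoryStream, rewind the stream, and dispose each stream once the upload has finished. BulkSerialization, ToStream, and DisposeAll are hypothetical names. Whether the code consuming the streams expects them rewound, or disposes them itself, is an assumption to verify against the SDK documentation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper for illustration; not part of the sample or the SDK.
public static class BulkSerialization
{
    private static readonly JsonSerializer Serializer =
        new JsonSerializer { NullValueHandling = NullValueHandling.Ignore };

    // Serializes one model with Newtonsoft into a rewound MemoryStream.
    public static MemoryStream ToStream(object model)
    {
        var stream = new MemoryStream();

        // leaveOpen: true lets the writers be disposed here while the
        // MemoryStream stays usable by the caller.
        using (var streamWriter = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
        using (var jsonWriter = new JsonTextWriter(streamWriter))
        {
            Serializer.Serialize(jsonWriter, model);
            jsonWriter.Flush();
        }

        // Assumption: the consumer reads the payload from the beginning.
        stream.Position = 0;
        return stream;
    }

    // Call after all uploads that use the streams have completed.
    public static void DisposeAll(IEnumerable<Stream> streams)
    {
        foreach (var stream in streams)
        {
            stream.Dispose();
        }
    }
}
```

A MemoryStream wraps a managed byte array, so an undisposed instance is eventually reclaimed by the garbage collector; disposing explicitly mainly makes the lifetime obvious and releases the buffers sooner.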

(Note also that the sample uses the model’s PartitionKey as the key in the itemsToInsert dictionary, which means the sample breaks when you attempt to add two items that are in the same partition. My fix to that problem was to use the ID as the dictionary key and get the item’s partition key

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

PunzunLtd commented, Mar 15, 2021

@ealsur, I believe the GeometryJsonConverter throws the NRE at line 64:

63 JToken typeToken = token["type"];
64 if (typeToken.Type != JTokenType.String)
65 {
66    throw new JsonSerializationException(RMResources.SpatialInvalidGeometryType);
67 }

…because I believe token["type"] returns null in that case.

I’m still concerned that the mismatch between Newtonsoft and Microsoft on the Point class means existing data may cause the SDK to fail in future releases.

PunzunLtd commented, Mar 15, 2021

OK. I appreciate what you’re saying. But there’s still a bug here, even if it’s only a usability bug. Specifically, your statement “Spatial types, provided by the SDK, are Newtonsoft.Json compatible” is generally true but does not remain true when you use System.Text.Json.JsonSerializer.SerializeAsync(Stream, object) to serialize a Point object.

I filed this report because tracking down what was actually happening took a couple of hours. I think it would have been easier to track down if the SDK had thrown a SerializationException instead of a NullReferenceException. And I filed it here because the SDK, not the sample code or the JSON libraries, threw the exception, and it took me a couple of hours of digging through the SDK source code to figure out why. I believe I identified exactly why the SDK throws an NRE, and I believe that this is an inappropriate outcome in the general scenario I’ve described (serializing a Point with System.Text.Json and deserializing it with whatever the SDK is using).

Another possibility is for the team responsible for System.Text.Json to make their serialization of Point compatible with naïve serializations such as Newtonsoft’s. That would ensure that when the Azure SDK eventually switches to System.Text.Json, the switch doesn’t break what could be millions of CosmosDB documents that serialized geospatial data using Newtonsoft’s serializer.

Finally, if you provide feedback to the team responsible for the sample code I followed, please also mention that populating the dictionary itemsToInsert using the partition key as the dictionary key will fail if a logical partition contains more than one document.
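
The dictionary-key problem is easy to reproduce in isolation. The sketch below uses a hypothetical Item class (the sample's real model type and property names may differ) to show that Dictionary.Add throws an ArgumentException when two documents share a partition key. Because Cosmos DB only requires ids to be unique within a logical partition, the sketch keys by the (partition key, id) pair rather than by id alone; that choice is an illustration, not the sample's or the reporter's exact fix.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model for illustration only.
public sealed class Item
{
    public string Id { get; set; }
    public string PartitionKey { get; set; }
}

public static class InsertMap
{
    // Mirrors the approach described above: the second item that shares a
    // partition key makes Dictionary.Add throw ArgumentException.
    public static Dictionary<string, Item> ByPartitionKey(IEnumerable<Item> items)
    {
        var map = new Dictionary<string, Item>();
        foreach (var item in items)
        {
            map.Add(item.PartitionKey, item);
        }
        return map;
    }

    // Keys by (partition key, id), so several documents in the same
    // logical partition can coexist in the map.
    public static Dictionary<(string PartitionKey, string Id), Item> ByCompositeKey(IEnumerable<Item> items)
    {
        var map = new Dictionary<(string PartitionKey, string Id), Item>();
        foreach (var item in items)
        {
            map.Add((item.PartitionKey, item.Id), item);
        }
        return map;
    }
}
```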
