NEST removes null value from after_key Dictionary when aggregating
We have code that runs a Composite Aggregation over two fields, both with missing_bucket set to true.
Our issue is that when one of the fields is null in the data, the after_key is serialized incorrectly on the next request.
Note: there is an absolute minimal reproduction at the bottom.
Our code (boiled down):
static void Main(string[] args)
{
    IConnectionPool pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    IConnection connection = new HttpConnection();
    ConnectionSettings connSettings = new ConnectionSettings(pool, connection);
    connSettings.ThrowExceptions();
    connSettings.DisableDirectStreaming();
    ElasticClient client = new ElasticClient(connSettings);

    // Grouping
    SearchRequest<JObject> search = new SearchRequest<JObject>("some_index", "_doc");
    search.Size = 0;
    List<ICompositeAggregationSource> aggregateList = new List<ICompositeAggregationSource>();
    aggregateList.Add(new TermsCompositeAggregationSource("1")
    {
        Field = "PropertyA.keyword",
        MissingBucket = true
    });
    aggregateList.Add(new TermsCompositeAggregationSource("2")
    {
        Field = "PropertyB.keyword",
        MissingBucket = true
    });
    CompositeAggregation compositeAggregation = new CompositeAggregation("composite")
    {
        Sources = aggregateList
    };
    search.Aggregations = compositeAggregation;

    while (true)
    {
        int pageSize = 10; // We use 1000, 10 is for testing
        compositeAggregation.Size = pageSize;
        ISearchResponse<JObject> result = client.Search<JObject>(search);
        BucketAggregate aggA = (BucketAggregate)result.Aggregations["composite"];
        if (!aggA.Items.Any())
            break;

        // Prepare next request
        // This is what fails the next round
        compositeAggregation.After = aggA.AfterKey;

        // .. work with data ..
    }
}
In the above, ES fails our second (or some subsequent) request with:
DebugInformation
# FailureReason: BadResponse while attempting POST on http://localhost:9200/some_index/_doc/_search?typed_keys=true
# Audit trail of this API call:
- [1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.0494512
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: Request failed to execute. Call: Status code 400 from: POST /some_index/_doc/_search?typed_keys=true. ServerError: Type: search_phase_execution_exception Reason: "all shards failed" CausedBy: "Type: illegal_argument_exception Reason: "[after] has 1 value(s) but [sources] has 2" CausedBy: "Type: illegal_argument_exception Reason: "[after] has 1 value(s) but [sources] has 2"""
at Elasticsearch.Net.Transport`1.HandleElasticsearchClientException(RequestData data, Exception clientException, IElasticsearchResponse response)
at Elasticsearch.Net.Transport`1.FinalizeResponse[TResponse](RequestData requestData, IRequestPipeline pipeline, List`1 seenExceptions, TResponse response)
at Elasticsearch.Net.Transport`1.Request[TResponse](HttpMethod method, String path, PostData data, IRequestParameters requestParameters)
at Nest.LowLevelDispatch.SearchDispatch[TResponse](IRequest`1 p, SerializableData`1 body)
at Nest.ElasticClient.Nest.IHighLevelToLowLevelDispatcher.Dispatch[TRequest,TQueryString,TResponse](TRequest request, Func`3 responseGenerator, Func`3 dispatch)
at ConsoleApp10.Program.Main(String[] args) in C:\Users\MichaelBisbjerg\source\repos\ConsoleApp10\ConsoleApp10\Program.cs:line 89
# Request:
{
"aggs": {
"composite": {
"composite": {
"after": {
"1": "value1"
},
"size": 10,
"sources": [{
"1": {
"terms": {
"field": "PropertyA.keyword",
"missing_bucket": true
}
}
}, {
"2": {
"terms": {
"field": "PropertyB.keyword",
"missing_bucket": true
}
}
}
]
}
}
},
"size": 0
}
# Response:
{
"error": {
"root_cause": [{
"type": "illegal_argument_exception",
"reason": "[after] has 1 value(s) but [sources] has 2"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [{
"shard": 0,
"index": "some_index",
"node": "Z7iIXKGMQZSRN6MDZ1h3Jg",
"reason": {
"type": "illegal_argument_exception",
"reason": "[after] has 1 value(s) but [sources] has 2"
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "[after] has 1 value(s) but [sources] has 2",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "[after] has 1 value(s) but [sources] has 2"
}
}
},
"status": 400
}
# Exception:
Elasticsearch.Net.ElasticsearchClientException: Request failed to execute. Call: Status code 400 from: POST /some_index/_doc/_search?typed_keys=true. ServerError: Type: search_phase_execution_exception Reason: "all shards failed" CausedBy: "Type: illegal_argument_exception Reason: "[after] has 1 value(s) but [sources] has 2" CausedBy: "Type: illegal_argument_exception Reason: "[after] has 1 value(s) but [sources] has 2"""
at Elasticsearch.Net.Transport`1.HandleElasticsearchClientException(RequestData data, Exception clientException, IElasticsearchResponse response)
at Elasticsearch.Net.Transport`1.FinalizeResponse[TResponse](RequestData requestData, IRequestPipeline pipeline, List`1 seenExceptions, TResponse response)
at Elasticsearch.Net.Transport`1.Request[TResponse](HttpMethod method, String path, PostData data, IRequestParameters requestParameters)
at Nest.LowLevelDispatch.SearchDispatch[TResponse](IRequest`1 p, SerializableData`1 body)
at Nest.ElasticClient.Nest.IHighLevelToLowLevelDispatcher.Dispatch[TRequest,TQueryString,TResponse](TRequest request, Func`3 responseGenerator, Func`3 dispatch)
at ConsoleApp10.Program.Main(String[] args) in C:\Users\MichaelBisbjerg\source\repos\ConsoleApp10\ConsoleApp10\Program.cs:line 89
When debugging, I can clearly see that aggA.AfterKey is a dictionary containing two entries, but when it’s sent back to ES, only one of them is included.
I’ve narrowed the issue down to the serializer alone, using this code:
void ReproduceSerializer()
{
    IConnectionPool pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    ConnectionSettings connSettings = new ConnectionSettings(pool);
    ElasticClient client = new ElasticClient(connSettings);
    using (MemoryStream ms = new MemoryStream())
    {
        Dictionary<string, object> dictionary = new Dictionary<string, object>
        {
            {"1", "C:\\" },
            {"2", null }
        };
        client.RequestResponseSerializer.Serialize(dictionary, ms);
        byte[] d = ms.ToArray();
        string p = Encoding.UTF8.GetString(d);
        /*
        Issue: "p" is just
        {
            "1": "C:\\"
        }
        Rather than:
        {
            "1": "C:\\",
            "2": null
        }
        */
    }
}
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (5 by maintainers)
Top GitHub Comments
I’ve merged in https://github.com/elastic/elasticsearch-net/pull/3800 to mark AfterKey as obsolete on BucketAggregate to discourage its usage, and introduced a CompositeAfterKey property which is of type CompositeKey and will honour null values when being passed into subsequent composite aggregation calls.

Yes, this will work. The first way should be discouraged because BucketAggregate is an intermediate type used internally to hold the data for a number of different aggregations. At the very least, AfterKey on BucketAggregate should be of type CompositeKey. Will open a PR now to fix this in 7.x, and a PR to obsolete it in 6.x, and use a different property.
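
For illustration only, here is a rough sketch of how the paging loop from the issue might look when it avoids casting to BucketAggregate. It assumes a NEST version that contains the fix, and it assumes the typed accessor Aggregations.Composite("composite") exposes Buckets and an AfterKey of type CompositeKey. The class and method names (CompositePaging, PageThroughComposite) are hypothetical, and the sketch has not been run against a cluster.

```csharp
using System.Linq;
using Nest;
using Newtonsoft.Json.Linq;

static class CompositePaging
{
    // Hypothetical helper: pages through the aggregation named "composite"
    // using the request objects built in the original Main method.
    public static void PageThroughComposite(
        IElasticClient client,
        SearchRequest<JObject> search,
        CompositeAggregation compositeAggregation)
    {
        while (true)
        {
            ISearchResponse<JObject> result = client.Search<JObject>(search);

            // Typed accessor instead of a cast to the intermediate BucketAggregate.
            var composite = result.Aggregations.Composite("composite");
            if (composite == null || !composite.Buckets.Any())
                break;

            // .. work with composite.Buckets ..

            // Assumption: AfterKey is a CompositeKey that keeps null entries,
            // so the next request sends one "after" value per source.
            if (composite.AfterKey == null)
                break;
            compositeAggregation.After = composite.AfterKey;
        }
    }
}
```

If the code must keep using BucketAggregate on 6.x, the comment above suggests reading CompositeAfterKey rather than the obsolete AfterKey.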