Pinot query ArrayIndexOutOfBounds for some multi-value column queries
When we run the query below on our cluster via the Pinot UI, we get "errorCode": 200, "message": "QueryExecutionError:\njava.lang.ArrayIndexOutOfBoundsException"
The query is as simple as:
select account_key from point_entry where account_key = 'xxxxx' and account_fields_values not in ('intercompany_clearing') limit 1 option(timeoutMs=10000)
Both of the columns in the query are multi-value columns.
The server error log is below:
Caught exception while executing operator of index: 4 (query: QueryContext{_tableName='point_entry_OFFLINE', _selectExpressions=[account_key], _aliasList=[null], _filter=(account_key = 'xxxx' AND account_fields_values NOT IN ('intercompany_clearing','cash','merchant_balance','fx_clearing') AND created_at <= '1634813999'), _groupByExpressions=null, _havingFilter=null, _orderByExpressions=null, _limit=100, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=9863}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:point_entry_OFFLINE), pinotQuery:PinotQuery(dataSource:DataSource(tableName:point_entry_OFFLINE), selectList:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_key))], filterExpression:Expression(type:FUNCTION, functionCall:Function(operator:AND, operands:[Expression(type:FUNCTION, functionCall:Function(operator:EQUALS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_key)), Expression(type:LITERAL, literal:<Literal stringValue:xxxxx>)])), Expression(type:FUNCTION, functionCall:Function(operator:NOT_IN, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_fields_values)), Expression(type:LITERAL, literal:<Literal stringValue:intercompany_clearing>), Expression(type:LITERAL, literal:<Literal stringValue:cash>), Expression(type:LITERAL, literal:<Literal stringValue:merchant_balance>), Expression(type:LITERAL, literal:<Literal stringValue:fx_clearing>)])), Expression(type:FUNCTION, functionCall:Function(operator:LESS_THAN_OR_EQUAL, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:created_at)), Expression(type:LITERAL, literal:<Literal stringValue:1634813999>)]))])), orderByList:[], limit:100, queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=9863}))})
Issue Analytics
- Created 2 years ago
- Comments:8 (8 by maintainers)
Top GitHub Comments
For now we have pinpointed the problem to a segment that was generated long ago. The data in it may have been generated incorrectly; for now we can work around this by adding a time filter to skip those segments. Will keep digging.
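A rough sketch of what that workaround could look like, based on the query from the report. It assumes created_at (the column that appears in the server log's filter) holds epoch seconds and that filtering on it lets Pinot skip the old segment; the cutoff value 1600000000 is a made-up placeholder and would need to be set to a time after the bad segment's data ends.

```sql
-- Workaround sketch: restrict the query to rows newer than the suspect segment.
-- 1600000000 is a hypothetical cutoff (epoch seconds), not a value from this issue.
select account_key
from point_entry
where account_key = 'xxxxx'
  and account_fields_values not in ('intercompany_clearing')
  and created_at >= 1600000000
limit 1
option(timeoutMs=10000)
```

This only avoids the bad segment rather than fixing it, so rows in that segment are left out of the results until it is regenerated or removed.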
Forgot to close this. It happens only on one segment, which was probably generated by a version of the library that may have had issues.