Pinot query ArrayIndexOutOfBounds for some multi-value column queries

When we run the query below on our cluster via the Pinot UI, the response contains "errorCode": 200, "message": "QueryExecutionError:\njava.lang.ArrayIndexOutOfBoundsException"
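For anyone scripting the same check instead of using the UI, here is a minimal sketch of pulling the error out of a response body. It assumes the broker reports failures in a top-level "exceptions" list whose entries carry the "errorCode" and "message" fields quoted above; the function name `query_exceptions` and the sample payload are illustrative only.

```python
import json


def query_exceptions(response_text):
    """Return (errorCode, message) pairs found in a query response body.

    Assumption: errors sit in a top-level "exceptions" list; a response
    without that key is treated as having no errors.
    """
    body = json.loads(response_text)
    return [
        (exc.get("errorCode"), exc.get("message", ""))
        for exc in body.get("exceptions", [])
    ]


# Sample payload built from the two fields quoted in this report.
sample = json.dumps({
    "exceptions": [
        {
            "errorCode": 200,
            "message": "QueryExecutionError:\njava.lang.ArrayIndexOutOfBoundsException",
        }
    ]
})

errors = query_exceptions(sample)
```

An empty list means the response carried no reported exceptions.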

The query is as simple as:

select account_key from point_entry where account_key = 'xxxxx' and account_fields_values not in ('intercompany_clearing') limit 1 option(timeoutMs=10000)

Both of the columns in the query are multi-value columns.

The Server error log is below:

Caught exception while executing operator of index: 4 (query: QueryContext{_tableName='point_entry_OFFLINE', _selectExpressions=[account_key], _aliasList=[null], _filter=(account_key = 'xxxx' AND account_fields_values NOT IN ('intercompany_clearing','cash','merchant_balance','fx_clearing') AND created_at <= '1634813999'), _groupByExpressions=null, _havingFilter=null, _orderByExpressions=null, _limit=100, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=9863}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:point_entry_OFFLINE), pinotQuery:PinotQuery(dataSource:DataSource(tableName:point_entry_OFFLINE), selectList:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_key))], filterExpression:Expression(type:FUNCTION, functionCall:Function(operator:AND, operands:[Expression(type:FUNCTION, functionCall:Function(operator:EQUALS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_key)), Expression(type:LITERAL, literal:<Literal stringValue:xxxxx>)])), Expression(type:FUNCTION, functionCall:Function(operator:NOT_IN, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:account_fields_values)), Expression(type:LITERAL, literal:<Literal stringValue:intercompany_clearing>), Expression(type:LITERAL, literal:<Literal stringValue:cash>), Expression(type:LITERAL, literal:<Literal stringValue:merchant_balance>), Expression(type:LITERAL, literal:<Literal stringValue:fx_clearing>)])), Expression(type:FUNCTION, functionCall:Function(operator:LESS_THAN_OR_EQUAL, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:created_at)), Expression(type:LITERAL, literal:<Literal stringValue:1634813999>)]))])), orderByList:[], limit:100, queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=9863}))})

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

3 reactions
dongxiaoman commented, Oct 26, 2021

We have pinpointed the problem to a segment that was generated long ago. The data in it may have been generated incorrectly. For now we can work around this by adding a time filter to skip those segments. Will keep digging.
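The time-filter workaround could look roughly like the sketch below, which appends a lower bound on a time column to the original filter. It assumes `created_at` (the column that appears in the server log) holds epoch seconds and can be compared numerically; the helper name `with_time_floor` and the cutoff `1600000000` are placeholders, not values from the issue.

```python
def with_time_floor(table, select_cols, filters, time_column, min_epoch_seconds, limit=1):
    """Build a query that only touches rows at or after min_epoch_seconds.

    Assumption: time_column stores numeric epoch seconds, so segments that
    contain only older data match nothing under the added predicate.
    """
    clauses = list(filters) + [f"{time_column} >= {int(min_epoch_seconds)}"]
    return (
        f"select {', '.join(select_cols)} from {table} "
        f"where {' and '.join(clauses)} limit {int(limit)}"
    )


query = with_time_floor(
    table="point_entry",
    select_cols=["account_key"],
    filters=[
        "account_key = 'xxxxx'",
        "account_fields_values not in ('intercompany_clearing')",
    ],
    time_column="created_at",
    min_epoch_seconds=1600000000,  # placeholder cutoff, chosen to fall after the bad segment
)
print(query)
```

The cutoff has to be picked so that it falls after the newest data in the suspect segment.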

0 reactions
dongxiaoman commented, Dec 22, 2021

Forgot to close it. It happens with only one segment, which was probably generated by a version of the library that had issues.
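The thread does not say how the bad segment was located. One way to narrow down a single faulty time range is to bisect the query's time window, sketched below. The callable `fails(lo, hi)` is hypothetical: it would run the failing query restricted to `lo <= created_at < hi` and report whether the exception occurs. The sketch also assumes the starting window fails and that there is only one bad region.

```python
def narrow_failing_window(fails, lo, hi, min_width=3600):
    """Shrink [lo, hi) to a window of at most min_width seconds that still fails.

    fails(lo, hi) is a hypothetical callable supplied by the caller.
    Assumes fails(lo, hi) is True for the starting window.
    """
    while hi - lo > min_width:
        mid = (lo + hi) // 2
        if fails(lo, mid):
            hi = mid          # the failure reproduces in the older half
        elif fails(mid, hi):
            lo = mid          # the failure reproduces in the newer half
        else:
            break             # only reproduces across the midpoint; stop narrowing
    return lo, hi


# Stand-in for a real query runner: pretend one bad row sits at this timestamp.
bad_timestamp = 1_500_000
window = narrow_failing_window(
    lambda lo, hi: lo <= bad_timestamp < hi, 0, 10_000_000, min_width=1000
)
```

The resulting window can then be matched against segment time ranges to find the segment to regenerate or exclude.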
