Support no forward index for column

Currently a text column can be created without any forward index, which is useful when using the column only for filtering. In this situation, the raw (original) text data is not needed, only the text index (see https://github.com/apache/incubator-pinot/pull/6284/).

There are other situations for non-text columns where this same functionality is useful to reduce the size of the column. In our particular use case, we’re generating unique terms for a (large) string field, which we save as a multi-value STRING column. We need an inverted index for fast filtering, but we do not need the forward index, which (leaving aside the inverted index, which is built at load time) accounts for more than 80% of the total segment size.

@kishoreg suggested “having a empty forward Index reader impl” as a way of implementing this.
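To make that suggestion concrete, here is a minimal sketch of what an empty forward index reader could look like. The `SimpleForwardIndexReader` interface and its method names are hypothetical and heavily simplified; they are not Pinot's actual reader API, which is richer (multi-value reads, reader contexts, raw value access).

```java
// Hypothetical, simplified reader interface (NOT Pinot's real forward index reader API).
interface SimpleForwardIndexReader {
    // Whether per-document values can be read from this reader.
    boolean isAvailable();

    // Dictionary id stored for the given document.
    int getDictId(int docId);
}

// No-op reader for a column whose forward index was never built.
// Filtering is expected to go through the inverted index instead.
final class EmptyForwardIndexReader implements SimpleForwardIndexReader {
    private final String column;

    EmptyForwardIndexReader(String column) {
        this.column = column;
    }

    @Override
    public boolean isAvailable() {
        return false;
    }

    @Override
    public int getDictId(int docId) {
        // Fail loudly rather than return a fake value that could corrupt results.
        throw new UnsupportedOperationException(
            "Forward index is disabled for column: " + column);
    }
}

final class EmptyReaderDemo {
    public static void main(String[] args) {
        SimpleForwardIndexReader reader = new EmptyForwardIndexReader("terms");
        System.out.println("available = " + reader.isAvailable());
        try {
            reader.getDictId(0);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing on reads, instead of returning a placeholder, means a query that needs the column's values (a selection or GROUP BY, say) fails with a clear message rather than silently returning wrong data.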

We could possibly handle the configuration of this via a new noFwdIndexColumns table config field, similar to the noDictionaryColumns config setting.

There would be situations where specifying no forward index for a column would trigger a table config error, for example doing this for a metrics column (or so I assume).

I’m also not sure whether it would be valid to have a column that has no index/dictionary/forward index; does this mean ignore the field in the input data?
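As a rough illustration of the validation described above, the sketch below checks a proposed `noFwdIndexColumns` list against the table's metric and inverted index columns. Everything here is an assumption for discussion: the class name, the method signature, and both rules (reject metric columns; reject columns left with no inverted index to query through) are hypothetical and do not describe Pinot's actual validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical validator for the proposed noFwdIndexColumns setting.
final class NoFwdIndexConfigValidator {

    // Returns one error message per invalid column; an empty list means the config is valid.
    static List<String> validate(Set<String> noFwdIndexColumns,
                                 Set<String> metricColumns,
                                 Set<String> invertedIndexColumns) {
        List<String> errors = new ArrayList<>();
        // Sort so that error order is deterministic.
        for (String column : new TreeSet<>(noFwdIndexColumns)) {
            if (metricColumns.contains(column)) {
                // Assumed rule: metrics are aggregated, which needs per-document values.
                errors.add("Metric column cannot disable its forward index: " + column);
            } else if (!invertedIndexColumns.contains(column)) {
                // Assumed rule: without any index the column could never be queried.
                errors.add("Column has no forward index and no inverted index: " + column);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> errors = validate(
            Set.of("terms", "clicks", "orphan"), // proposed noFwdIndexColumns
            Set.of("clicks"),                    // metric columns
            Set.of("terms"));                    // inverted index columns
        errors.forEach(System.out::println);
    }
}
```

Under these assumed rules, a column with neither a forward index nor an inverted index is rejected at config time rather than being silently dropped from the input data, which is one possible answer to the question above.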

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

1 reaction
siddharthteotia commented, Dec 12, 2022

@somandal - I think you may want to update user docs and open follow up issues for the pending work and link here.

1 reaction
somandal commented, Oct 13, 2022

Here’s a document which discusses the reload problem and how to solve it for forwardIndexDisabled columns. Please take a look and leave your feedback. cc @Jackie-Jiang @siddharthteotia @vvivekiyer

Just a note that a few details still need to be figured out and I will update the document as and when we figure them out.
