Presto plugin development problems
Hi all, I am trying to build a new Presto plugin in my own independent project (not inside the Presto root project). So I added some Maven dependencies to my pom and built my plugin.
```xml
<dependency>
    <groupId>com.facebook.presto</groupId>
    <artifactId>presto-spi</artifactId>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>io.airlift</groupId>
    <artifactId>slice</artifactId>
    <scope>provided</scope>
</dependency>
```
My first problem: how can I debug my plugin code without deploying the plugin into the “plugins” directory on a Presto cluster? Can I start up a local Presto server in my IDE to debug my code?
My second problem:
I built a Presto plugin to read data from a new file format like “carbondata”, and most columns in the “carbondata” format are encoded with a global dictionary.
Right now I am using the RecordSet interface to return fully decoded records.
However, in some cases, such as some aggregation jobs, the decoding step is not needed.
So is there any optimization in Presto that can delay the decoding process?
Issue Analytics
- Created 7 years ago
- Comments: 11 (6 by maintainers)
You should also delay decoding columns until Presto asks for a column. For example, a common query is:
SELECT * FROM table t WHERE someVeryRareCondition(t.x)
If someVeryRareCondition never returns true, then Presto will only ask for data from column x.
If your data source is column oriented, then you will want to use the PageSource API, which is more efficient for Presto. For lazy decoding in PageSource, we use LazyBlock.
-dain
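To illustrate the lazy-decoding idea described above, here is a minimal, self-contained sketch in plain Java. It does not use Presto's own classes: `LazyColumn` and `dictionaryColumn` are hypothetical names made up for this example, and the "rare" predicate is just a string comparison. It only shows the principle that a dictionary-encoded column is decoded the first time someone reads it, and never if nobody does.

```java
import java.util.function.Supplier;

public class LazyDictionaryColumnSketch {
    /** Hypothetical column wrapper: decodes its values only on first access. */
    public static final class LazyColumn {
        private final Supplier<String[]> loader;
        private String[] decoded; // stays null until a value is requested

        public LazyColumn(Supplier<String[]> loader) {
            this.loader = loader;
        }

        public String get(int row) {
            if (decoded == null) {
                decoded = loader.get(); // decoding happens here, once
            }
            return decoded[row];
        }

        public boolean isLoaded() {
            return decoded != null;
        }
    }

    /** Builds a lazy column from a dictionary and per-row dictionary ids. */
    public static LazyColumn dictionaryColumn(String[] dictionary, int[] ids) {
        return new LazyColumn(() -> {
            String[] values = new String[ids.length];
            for (int i = 0; i < ids.length; i++) {
                values[i] = dictionary[ids[i]];
            }
            return values;
        });
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry"};
        LazyColumn x = dictionaryColumn(dictionary, new int[] {0, 2, 1});
        LazyColumn y = dictionaryColumn(dictionary, new int[] {1, 1, 0});

        // Mimics: SELECT * FROM t WHERE someVeryRareCondition(t.x)
        for (int row = 0; row < 3; row++) {
            if (x.get(row).equals("durian")) { // never true for this data
                y.get(row); // y is only touched for matching rows
            }
        }
        System.out.println("x decoded: " + x.isLoaded()); // x decoded: true
        System.out.println("y decoded: " + y.isLoaded()); // y decoded: false
    }
}
```

Because the filter never matches, only column x pays the decoding cost. A real connector would get the same effect by wrapping its column loading in the SPI's lazy block mechanism rather than in a hand-written class like this one.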
Ad. 1. You can do:
mvn install
and then use the SNAPSHOT version in your local Presto server.
Ad. 2. I haven’t heard of anything like that. However, you will get a list of columns (the projection) that are going to be used. That way you don’t need to decode values for columns that are not going to be used.
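The projection point can be sketched in plain Java as follows. This is only an illustration under assumptions: `decodeProjected` and the array-based data layout are hypothetical and not part of the Presto SPI. The list of column indexes stands in for the column list the engine hands to the connector.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectionDecodeSketch {
    /**
     * Decodes only the requested columns.
     * dictionaries[c] is the dictionary of column c,
     * encodedColumns[c] holds the per-row dictionary ids of column c.
     */
    public static Map<Integer, String[]> decodeProjected(
            String[][] dictionaries,
            int[][] encodedColumns,
            List<Integer> projectedColumns) {
        Map<Integer, String[]> decoded = new LinkedHashMap<>();
        for (int column : projectedColumns) {
            int[] ids = encodedColumns[column];
            String[] values = new String[ids.length];
            for (int row = 0; row < ids.length; row++) {
                values[row] = dictionaries[column][ids[row]];
            }
            decoded.put(column, values);
        }
        return decoded; // columns outside the projection are never touched
    }

    public static void main(String[] args) {
        String[][] dictionaries = {{"red", "green"}, {"DE", "FR", "US"}};
        int[][] encoded = {{0, 1, 1}, {2, 0, 1}};

        // The query only needs column 1, so column 0 is skipped entirely.
        Map<Integer, String[]> result =
                decodeProjected(dictionaries, encoded, Arrays.asList(1));
        System.out.println(result.keySet());                  // [1]
        System.out.println(String.join(",", result.get(1)));  // US,DE,FR
    }
}
```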