Add endpoint for external to get information about UDFs

We have implemented some Pinot UDFs, but they are not transparently pushdown-able from external query engines like Presto.

The proposed change adds a new endpoint in PinotController. The endpoint will expose information about Pinot UDFs, giving external engines a generic way to determine what can be pushed down to Pinot.

In general, I’m thinking the fields we need are:

`function_name`
`function_type`
`arguments_types`
`return_type`

e.g.

[
  {
    "function_name": "segmentPartitionedDistinctCount",
    "function_type": "AGGREGATION",
    "arguments_types": [ "ANY" ],
    "return_type": [ "LONG" ]
  },
  ...
]
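
To make the proposal concrete, here is a minimal sketch of how an external engine might consume such a response. It assumes the payload has exactly the shape shown above; the endpoint path is not specified in this issue, so the sketch parses a literal sample instead of making an HTTP call. The helper names `index_functions` and `can_push_down` are hypothetical, and checking only the argument count is a simplification.

```python
import json

# Sample payload shaped like the example in the proposal (illustrative only).
SAMPLE_RESPONSE = """
[
  {
    "function_name": "segmentPartitionedDistinctCount",
    "function_type": "AGGREGATION",
    "arguments_types": ["ANY"],
    "return_type": ["LONG"]
  }
]
"""


def index_functions(payload: str) -> dict:
    """Index function descriptors by (function_name, function_type)."""
    index = {}
    for entry in json.loads(payload):
        index[(entry["function_name"], entry["function_type"])] = entry
    return index


def can_push_down(index: dict, name: str, function_type: str, arg_count: int) -> bool:
    """Return True if the function is listed and the argument count matches."""
    entry = index.get((name, function_type))
    return entry is not None and len(entry["arguments_types"]) == arg_count


functions = index_functions(SAMPLE_RESPONSE)
print(can_push_down(functions, "segmentPartitionedDistinctCount", "AGGREGATION", 1))
```

A real integration would also need to compare argument and return types, which this sketch leaves out.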

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
Jackie-Jiang commented, Aug 25, 2020

How do you plan to maintain this endpoint? Not sure if this can help with query push down, as the behavior must be identical on the Presto side and the Pinot side. IMO it is safer to manually select functions to push down instead of using name matching. It can cause unexpected results if the behavior is not exactly the same.
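
One way to read the concern above is as an allowlist: push down only functions that Pinot reports and that someone has manually verified to behave identically on both sides. The sketch below illustrates that idea under stated assumptions; the set contents are illustrative, `customRound` is an invented name, and `pushdown_candidates` is a hypothetical helper rather than anything in Pinot or Presto.

```python
# Names an endpoint might report (illustrative; "customRound" is invented).
reported_by_pinot = {"segmentPartitionedDistinctCount", "IdSet", "customRound"}

# Hand-reviewed names whose semantics were checked on both engines (assumption).
verified_identical = {"segmentPartitionedDistinctCount"}


def pushdown_candidates(reported, verified):
    """Keep only functions that are both reported and manually verified."""
    return sorted(set(reported) & set(verified))


candidates = pushdown_candidates(reported_by_pinot, verified_identical)
print(candidates)
```

With this approach a name match alone is never enough to trigger pushdown; the reported list only narrows what the curated list allows.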

0 reactions
yupeng9 commented, Nov 5, 2020

> How do you plan to maintain this endpoint? Not sure if this can help with query push down, as the behavior must be identical on the Presto side and the Pinot side. IMO it is safer to manually select functions to push down instead of using name matching. It can cause unexpected results if the behavior is not exactly the same.

We can use reflection to auto-generate this information, or we can get it from TransformFunctionFactory. I think this is useful when there are many releases of Pinot and Presto and we keep adding new functions.
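
As a rough illustration of the reflection idea, the sketch below derives a descriptor from a function's signature. It is a Python analogy only, not Pinot code: Pinot is written in Java, and nothing here reflects how TransformFunctionFactory works. The `describe` helper, the `add_one` function, and the `"TRANSFORM"` type label are all hypothetical.

```python
import inspect
from typing import get_type_hints


def describe(func, function_type: str) -> dict:
    """Build a descriptor with the proposed fields from a function's signature."""
    hints = get_type_hints(func)
    params = inspect.signature(func).parameters
    return {
        "function_name": func.__name__,
        "function_type": function_type,
        # Parameters without annotations fall back to "OBJECT".
        "arguments_types": [hints.get(p, object).__name__.upper() for p in params],
        "return_type": [hints.get("return", object).__name__.upper()],
    }


def add_one(value: int) -> int:
    """Hypothetical scalar function used only to exercise describe()."""
    return value + 1


descriptor = describe(add_one, "TRANSFORM")
print(descriptor)
```

Generating the list this way keeps it in sync with the code on each release, though it says nothing about whether two engines implement a function identically.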

Also cc: @yupeng9

+1. This helps a lot in minimizing the effort of syncing functions between Pinot and Presto. Moreover, I feel there are Pinot functions for which it may not be possible to introduce a counterpart in Presto, for example the IdSet function introduced by https://github.com/apache/incubator-pinot/pull/5926
