Support parameterized SQL in bigquery integration (bq_solid_for_queries)

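from typing import List

from dagster import solid

@solid
def get_foo_id_list(_) -> List[str]:
    return ['a', 'b', 'c']

# Desired usage: foo_id_list should be the output of get_foo_id_list,
# but it is not available here, where the solid is constructed.
bq_solid_for_queries([
    f"select * from my_table where id in {foo_id_list}"
])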

I’d like to construct a SQL query using the output of an upstream solid. How would I do that?

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
sryza commented, Jun 8, 2020

One way of addressing this would be with a bigquery_solid decorator. Thoughts?

@solid
def get_foo_id_list(_) -> List[str]:
    return ['a', 'b', 'c']

@bigquery_solid(input_defs=[InputDefinition(List[str])])
def my_bq_solid(foo_id_list):
    return [f"select * from my_table where id in {foo_id_list}"]

1 reaction
natekupp commented, Feb 20, 2020

hey @zzztimbo thanks for checking!

Right now, bq_solid_for_queries is a solid factory, and expects the queries to be available at pipeline compilation time, vs. fed in as a Dagster input. If it’s possible to feed ['a', 'b', 'c'] in non-Dagster Python code at pipeline construction time, you can try that.
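As a rough illustration of the "pipeline construction time" option, the query string can be built in plain Python before any solid is defined. The helpers `sql_string_literal` and `build_in_query` below are hypothetical names, not part of dagster_gcp, and the escaping is a naive sketch that assumes BigQuery's backslash escapes inside single-quoted literals. For untrusted input, BigQuery query parameters are the safer route.

```python
from typing import List


def sql_string_literal(value: str) -> str:
    """Render a Python string as a single-quoted SQL string literal (naive sketch)."""
    # Escape backslashes first, then single quotes.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_in_query(table: str, column: str, values: List[str]) -> str:
    """Build a `select ... where column in (...)` query from a list of strings."""
    if not values:
        raise ValueError("values must not be empty")
    literals = ", ".join(sql_string_literal(v) for v in values)
    # table and column are interpolated as-is; they are assumed to be trusted.
    return f"select * from {table} where {column} in ({literals})"


# Plain Python value known when the pipeline is constructed.
foo_id_list = ['a', 'b', 'c']
query = build_in_query("my_table", "id", foo_id_list)
```

The resulting `query` string could then be passed to the factory as `bq_solid_for_queries([query])`. Note that interpolating the list directly with an f-string would render Python's list syntax (`['a', 'b', 'c']`) rather than a parenthesized SQL list.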

If you definitely need SQL to be an input to the solid, the below should work:

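from typing import List

from dagster import solid
from dagster_pandas import DataFrame
from dagster_gcp.bigquery.configs import define_bigquery_query_config
# Note: _preprocess_config is a private helper and may change between releases.
from dagster_gcp.bigquery.solids import _preprocess_config
from google.cloud.bigquery.job import QueryJobConfig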

@solid(
    config=define_bigquery_query_config(), required_resource_keys={'bigquery'},
)
def bq_query_input_solid(context, sql_queries: List[str]) -> List[DataFrame]:
    query_job_config = _preprocess_config(context.solid_config.get('query_job_config', {}))

    results = []
    for sql_query in sql_queries:
        cfg = QueryJobConfig(**query_job_config) if query_job_config else None
        context.log.info(
            'executing query %s with config: %s'
            % (sql_query, cfg.to_api_repr() if cfg else '(no config provided)')
        )
        results.append(context.resources.bigquery.query(sql_query, job_config=cfg).to_dataframe())

    return results

I’ll keep this issue open to track adding the above to the dagster_gcp library, since this seems like it should be functionality we provide out of the box.
