document pandas-gbq vision and roadmap

Both pandas-gbq and google-cloud-bigquery do many of the same things, and increasingly so (e.g. .to_dataframe() in google-cloud-bigquery).

  • Are there different use cases? Can we define those?
  • Should we focus development on one and wrap the other? Even if not wholly, for a subset of functionality?
  • Is there some direction from Google? @tswast spends a lot of time on both libraries, so he is probably best placed to offer guidance.

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 1
  • Comments: 12 (3 by maintainers)

Top GitHub Comments

tswast commented, Jul 19, 2021 (2 reactions)

In the interest of not keeping issues open forever, I’m going to treat this issue as a request to document the project vision/roadmap. That should be useful for contributors and also for understanding the purpose of this project compared to using the pandas connector in google-cloud-bigquery directly.

tswast commented, Jan 4, 2019 (2 reactions)

To make this task more concrete, I’d like to propose the following two sub-tasks:

  • read_gbq calls google-cloud-bigquery’s to_dataframe under the covers. Now that pandas-gbq uses the same logic as pandas for null handling, I don’t expect any change in behavior.
    • I don’t know how we’d implement a progress bar for downloading the dataframe. We may want to upstream the progress bar features (using tqdm) to the google-cloud-bigquery library or add some sort of hook so that we can show a progress bar.
  • to_gbq calls google-cloud-bigquery’s load_table_from_dataframe. load_table_from_dataframe uses Parquet rather than CSV but is otherwise quite similar. It may work better with struct and array columns.

With the exception of schema overriding, I think it should be possible to implement these subtasks without changing the public interface of pandas-gbq.
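
As a rough illustration of the two sub-tasks, here is a minimal sketch of thin wrappers that delegate to a google-cloud-bigquery client. The function names read_gbq_sketch and to_gbq_sketch are hypothetical and this is not the actual pandas-gbq implementation; the sketch assumes the client exposes query(), QueryJob.to_dataframe(), and load_table_from_dataframe() as google-cloud-bigquery does, and it leaves out authentication, schema overriding, and progress reporting.

```python
from typing import Any


def read_gbq_sketch(client: Any, query: str) -> Any:
    """Hypothetical stand-in for read_gbq: run a query, return a DataFrame."""
    # client.query() starts a query job; to_dataframe() on that job waits for
    # it to finish and downloads the rows as a pandas DataFrame.
    return client.query(query).to_dataframe()


def to_gbq_sketch(client: Any, dataframe: Any, destination: str) -> None:
    """Hypothetical stand-in for to_gbq: load a DataFrame into a table."""
    # load_table_from_dataframe() serializes the frame and starts a load job;
    # result() blocks until the job completes and raises if it failed.
    load_job = client.load_table_from_dataframe(dataframe, destination)
    load_job.result()
```

With the real library, client would typically be a google.cloud.bigquery.Client instance and destination a table ID string such as "my_dataset.my_table" (a placeholder name). Passing the client in explicitly keeps the wrappers easy to exercise with a stub.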
