pd.io.gspread.read_frame - from Google Spreadsheet to Pandas DataFrame

See original GitHub issue

Google Spreadsheet is an online spreadsheet. A Google Document can be used to create a survey, and the survey results will be stored as a Google Spreadsheet document.

Maybe Pandas should provide a pd.io.gspread.read_frame function that will read a given Google Spreadsheet document (using email, password, url or name of document and range) and return a DataFrame.

pd.io.gspread.read_frame(email, password, filename, sheet, cell_range)

gspread package could help http://burnash.github.io/gspread/

I wrote a little bit of code for that… but it could probably be improved and added to Pandas.

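import gspread
import numpy as np
import pandas as pd

email = '...@...'
password = '...'
filename = '...'
cell_range = 'A1:R20'

gc = gspread.login(email, password)
wks = gc.open(filename).sheet1

# Fetch the requested range as a flat list of cells, row by row
cell_list = wks.range(cell_range)

# Build a NumPy array (the first row of the range holds the column names)
(row, col) = (cell_list[-1].row, cell_list[-1].col)
data = np.empty((row-1,col), dtype=object)
data[:] = np.nan

k = 0  # index into the flat cell_list
cols = []
for i in range(row):
    for j in range(col):
        val = cell_list[k].value
        if i == 0:
            if val is not None:
                if val not in cols:
                    cols.append(val)
                else: # add a number if the column name already exists
                    ii = 1
                    while True:
                        new_val = val + '_' + str(ii)
                        if new_val not in cols:
                            break
                        ii += 1
                    cols.append(new_val)
            else:
                cols.append('col_'+str(j))
                #cols.append(j)
        else:
            if val is not None:
                data[i-1, j] = val
        k += 1

# Wrap the array in a DataFrame, which is the goal stated above
df = pd.DataFrame(data, columns=cols)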
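
The header handling inside the loop above can also be pulled out into a small standalone function, which makes it easier to test without a spreadsheet. The following is only a sketch: `dedupe_headers` is a hypothetical name (it is not part of gspread or pandas), and treating an empty string the same way as `None` is an assumption that goes slightly beyond the original code.

```python
def dedupe_headers(raw_headers):
    """Return unique column names for a header row.

    Hypothetical helper, not part of gspread or pandas.
    Empty cells (None or '', the latter being an assumption) become
    'col_<index>'; repeated names get a numeric suffix: 'a', 'a_1', 'a_2'.
    """
    cols = []
    seen = set()
    for j, val in enumerate(raw_headers):
        # Fall back to a positional name when the header cell is empty
        name = val if val not in (None, "") else "col_" + str(j)
        candidate = name
        suffix = 1
        # Append _1, _2, ... until the name is unused
        while candidate in seen:
            candidate = "{}_{}".format(name, suffix)
            suffix += 1
        seen.add(candidate)
        cols.append(candidate)
    return cols


headers = dedupe_headers(["a", "b", "a", None, "a"])
print(headers)
```

With a helper like this, the main loop only has to fill the data rows, starting from the second row of the range.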

Issue Analytics

  • State: closed
  • Created 10 years ago
  • Comments: 28 (17 by maintainers)

Top GitHub Comments

3 reactions
jtratner commented, Sep 28, 2013

Interesting. I’m not totally clear on what you’re trying to do above. Why are you special-casing row 0 inside the loop, instead of handling row 0 first and then changing the loop to for i in range(1, row):? It’s also unclear why you’re using both k and j in the for loop, but maybe I’m missing something.

Based on a quick read of gspread, the easiest thing to do would be something like:

values = wks.get_all_values()
header, rows = values[0], values[1:]
df = DataFrame(rows, columns=header)

That said, the above is particularly inefficient because it first stores all of the cells in Cell objects, creates a defaultdict, stores everything, then uses a double list comprehension to get all the data.
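
The header/rows split above can be tried without a Google account by substituting a plain list of lists for the result of `wks.get_all_values()`. This is a minimal sketch under that assumption: the sample rows and the `name` and `score` columns are made up for illustration, and it assumes every cell arrives as a string, so numeric columns need an explicit conversion.

```python
import pandas as pd

# Stand-in for wks.get_all_values(): a list of rows, every cell a string.
# The data here is invented for illustration only.
values = [
    ["name", "score"],
    ["alice", "3"],
    ["bob", "5"],
]

# First row is the header, the remaining rows are the data
header, rows = values[0], values[1:]
df = pd.DataFrame(rows, columns=header)

# Cells are strings at this point, so convert numeric columns explicitly
df["score"] = pd.to_numeric(df["score"])

print(df.dtypes)
```

Whether the conversion step is needed depends on how the values are fetched; it is shown here because string cells are easy to overlook until arithmetic fails.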

One big question: why would you use gspread over the Google Data API? Are there particular advantages?

1 reaction
maybelinot commented, Sep 29, 2015

For those looking for a solution in this issue: here is a library that supports uploading and downloading between spreadsheets and pandas DataFrames.

