Resumable media (chunked upload) slow


Issue

I get at most about 20 Mbps on any file upload that uses resumable=True with chunking, regardless of chunksize, but I get my full upload speed of about 180 Mbps on the same file with a non-resumable upload. The problem is that if the file is too big to fit in memory (or if I run many uploads in threads), the computer runs out of memory, so I have to fall back to the chunked upload, which is just too slow.

Am I doing something wrong? How can I upload a large file without having to worry about running out of memory and not be limited to 20Mbps?

Environment details

  • OS: Windows 10
  • Python version: 3.5
  • pip version: 19.0.1
  • google-api-python-client version: 1.7.8

Steps to reproduce

  1. Use the code below
  2. Monitor traffic output limits
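If no external traffic monitor is at hand, step 2 can be approximated by timing the upload call and converting bytes per second to megabits per second. The following is a minimal sketch: `throughput_mbps` and `timed` are hypothetical helper names, not part of google-api-python-client, and the figure is wall-clock based, so it includes request latency as well as transfer time.

```python
import time


def throughput_mbps(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second (10**6 bits)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes * 8 / seconds / 1000000.0


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the variables from the sample code, this could be used as `_, elapsed = timed(request.execute)` followed by `throughput_mbps(os.path.getsize(filepath), elapsed)`.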

Sample Code

Shared code between both examples:

import os
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload  # MediaFileUpload is defined in googleapiclient.http

def create_drive_service(user_email,SERVICE_ACCOUNT_JSON_FILE,SCOPES=None):
    if SCOPES is None: SCOPES = ['https://www.googleapis.com/auth/drive']
    credentials = service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_JSON_FILE)
    credentials = credentials.with_scopes(SCOPES)
    credentials = credentials.with_subject(user_email)
    return build('drive', 'v3', credentials=credentials)
#
# Fill in the following
#
jsonpath = 'c:\\path\\to\\service_account.json'
service = create_drive_service('user@domain',jsonpath)
filename = 'filename.ext'
filepath = 'c:\\path\\to\\folder\\' + filename
parent_id = ''

Code that is slow:

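file_metadata = {"name":filename,"parents":[parent_id]}
# chunksize=-1 asks the library to send the whole file as a single chunk
media = MediaFileUpload(filepath,chunksize=-1,resumable=True)
request = service.files().create(body=file_metadata,media_body=media,fields='id')
response = None  # must be defined before the loop tests it
while response is None:
    status, response = request.next_chunk()
file = response  # next_chunk() returns the file resource once the upload is complete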
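For reference, the chunk loop can be wrapped in a small function that owns the `response = None` initialization and reports progress between chunks. This is a minimal sketch, not a helper shipped with the library: `run_resumable_upload` and `on_progress` are hypothetical names, and the code assumes only that `request.next_chunk()` returns a `(status, response)` pair in which `status` is either `None` or an object with a `progress()` method, as `MediaUploadProgress` in google-api-python-client provides.

```python
def run_resumable_upload(request, on_progress=None):
    """Drive a resumable upload to completion and return the final response.

    request     -- object exposing next_chunk() -> (status, response)
    on_progress -- optional callable receiving a float between 0.0 and 1.0
    """
    response = None
    while response is None:
        status, response = request.next_chunk()
        # status is None when no progress information is available
        if status is not None and on_progress is not None:
            on_progress(status.progress())
    return response
```

With the `request` built in the sample code, a call could look like `file = run_resumable_upload(request, lambda p: print("{:.0%}".format(p)))`. It does not change the transfer rate; it only makes the per-chunk behavior visible.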

Code that works full-speed but not if the file is too big to fit in memory:

file_metadata = {"name":filename,"parents":[parent_id]}
# resumable=False sends the file in a single non-resumable request
media = MediaFileUpload(filepath,resumable=False)
file = service.files().create(body=file_metadata,media_body=media,fields='id').execute()

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

1 reaction
jfernandez04 commented, Apr 5, 2019

@acidnine I set it like this and have no problem with my upload.

media = MediaFileUpload(archivo, mimetype='text/plain', chunksize=256 * 1024, resumable=True)
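The `256 * 1024` in this comment matches the granularity the Drive resumable upload protocol asks for: every chunk except the last should be a multiple of 256 KiB. A rough sketch for picking a larger aligned chunk size and estimating how many chunks a file splits into follows. The names `CHUNK_GRANULARITY`, `aligned_chunksize` and `chunk_count` are hypothetical, and whether a larger chunk lifts the 20 Mbps ceiling reported in this issue is not established in the thread.

```python
CHUNK_GRANULARITY = 256 * 1024  # 256 KiB, the unit resumable chunks are aligned to


def aligned_chunksize(target_bytes):
    """Round target_bytes down to a multiple of 256 KiB, never below one unit."""
    units = max(1, target_bytes // CHUNK_GRANULARITY)
    return units * CHUNK_GRANULARITY


def chunk_count(file_size, chunksize):
    """Number of chunks a file of file_size bytes splits into (at least one)."""
    return max(1, (file_size + chunksize - 1) // chunksize)
```

For example, `MediaFileUpload(filepath, chunksize=aligned_chunksize(32 * 1024 * 1024), resumable=True)` would send 32 MiB per request. Larger chunks mean fewer round trips, at the cost of more data to resend if a chunk fails.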

0 reactions
shivamsn97 commented, Dec 7, 2022

@LEChaney were you able to increase your upload speed? I am stuck at the same problem.


