Last data chunk does not get uploaded to GCP bucket
I created a dataset following: https://clear.ml/docs/latest/docs/clearml_data/data_management_examples/data_man_simple
When I upload it to my GCP bucket with:
(yolo555) ➜ yolov5 git:(master) ✗ clearml-data close --storage gs://xxx/clearml-test --chunk-size 128 --verbose
The last lines of output are:
Uploading dataset changes (98 files compressed to 94.76 MiB) to gs://icm-data-lake/clearml-test
Uploading dataset changes (98 files compressed to 94.73 MiB) to gs://icm-data-lake/clearml-test
Uploading dataset changes (97 files compressed to 94.54 MiB) to gs://icm-data-lake/clearml-test
Uploading dataset changes (98 files compressed to 94.23 MiB) to gs://icm-data-lake/clearml-test
2022-09-08 09:55:31,054 - clearml.storage - ERROR - Failed uploading: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Read timed out. (read timeout=60)
File compression and upload completed: total size 38.47 GiB, 320 chunk(s) stored (average size 123.11 MiB)
Dataset closed and finalized
Did the last chunk fail to upload, or is it just a false alarm?
Hi @mikel-brostrom,
That… shouldn’t happen 😃 Does this persist, i.e., does it happen every time? Also, it might sound silly, but did you try downloading the dataset and checking that all the files are there? It should be easy to compare the original and downloaded files.
In the meantime I’ll check internally if we defend somehow against partial upload issues.
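For the comparison of original and downloaded files, something like the following might help. It is only a rough sketch using the Python standard library, not anything ClearML provides: `file_digests` and `compare_trees` are made-up helper names, and it assumes you already have the original folder and a local copy of the downloaded dataset on disk.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB blocks so large files are not loaded whole.
                for block in iter(lambda: f.read(1 << 20), b""):
                    h.update(block)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(original_dir, downloaded_dir):
    """Return (missing, extra, changed) lists of relative paths.

    missing: in the original folder but not in the downloaded copy
    extra:   in the downloaded copy but not in the original folder
    changed: present in both, but with different content
    """
    original = file_digests(original_dir)
    downloaded = file_digests(downloaded_dir)
    missing = sorted(set(original) - set(downloaded))
    extra = sorted(set(downloaded) - set(original))
    changed = sorted(
        p for p in set(original) & set(downloaded) if original[p] != downloaded[p]
    )
    return missing, extra, changed
```

If all three lists come back empty, the downloaded copy matches the original folder file for file, which would suggest the timeout did not cost you a chunk. Anything listed under `missing` would point to data that never made it to the bucket.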
I ran the command again. No issue this time.
I guess this is not an issue anymore, @erezalg 😄