embedding upload not working with new_dataset_name
When using lightly-magic or lightly-upload to upload embeddings to a new dataset with `new_dataset_name`, the download of the already existing embeddings (none are there) fails:
(venv) user@guest lightly % lightly-magic input_dir=/Users/datasets/clothing-dataset-small/test/dress trainer.max_epochs=0 token=TOKEN new_dataset_name="blub"
########## Starting to embed your dataset.
Compute efficiency: 0.24: 100%|█████| 1/1 [00:09<00:00, 9.04s/it]
Embeddings are stored at /Users/GitHub/lightly/lightly_outputs/2021-11-15/16-03-22/embeddings.csv
########## Starting to upload your dataset to the Lightly platform.
Uploading images (with 12 workers).
100%|███████▏| 15/15 [00:01<00:00, 13.48imgs/s]
Finished the upload of the dataset.
Starting upload of embeddings.
Traceback (most recent call last):
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/lightly_cli.py", line 87, in lightly_cli
return _lightly_cli(cfg)
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/lightly_cli.py", line 38, in _lightly_cli
_upload_cli(cfg)
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/upload_cli.py", line 101, in _upload_cli
embeddings = api_workflow_client.embeddings_api \
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api/embeddings_api.py", line 55, in get_embeddings_by_dataset_id
(data) = self.get_embeddings_by_dataset_id_with_http_info(dataset_id, **kwargs) # noqa: E501
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api/embeddings_api.py", line 115, in get_embeddings_by_dataset_id_with_http_info
return self.api_client.call_api(
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 326, in call_api
return self.__call_api(resource_path, method,
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 158, in __call_api
response_data = self.request(
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 348, in request
return self.rest_client.GET(url,
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/rest.py", line 234, in GET
return self.request("GET", url,
File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/rest.py", line 228, in request
raise ApiException(http_resp=r)
lightly.openapi_generated.swagger_client.rest.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json; charset=utf-8', 'x-cloud-trace-context': '3aa5b5da3e128a2a549edae34d0156cf/10076201567220533420;o=1', 'Vary': 'Origin, Accept-Encoding', 'Content-Security-Policy': "script-src https:/*.lightly.ai https:/*.youtube.com https:/*.gstatic.com https:/*.google.com https:/*.doubleclick.net https:/*.google-analytics.com https:/*.googletagmanager.com https:/*.fullstory.com;img-src *;default-src 'self';base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;frame-ancestors 'self';object-src 'none';script-src-attr 'none';style-src 'self' https: 'unsafe-inline';upgrade-insecure-requests", 'X-DNS-Prefetch-Control': 'off', 'Expect-CT': 'max-age=0', 'X-Frame-Options': 'SAMEORIGIN', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains', 'X-Download-Options': 'noopen', 'X-Content-Type-Options': 'nosniff', 'X-Permitted-Cross-Domain-Policies': 'none', 'Origin-Agent-Cluster': '?1', 'X-XSS-Protection': '0', 'Cross-Origin-Opener-Policy': 'same-origin-allow-popups', 'Cross-Origin-Resource-Policy': 'same-site', 'ETag': 'W/"1d-RxIXLBAjGcBSG7XheTm9uqfrCcY"', 'Date': 'Mon, 15 Nov 2021 15:03:35 GMT', 'Server': 'Google Frontend', 'Content-Length': '29'})
HTTP response body: {
"code": "BAD_REQUEST"
}
The same error occurs when using a non-existent dataset id.
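The response body in the log above carries a machine-readable `code` field next to the numeric status. As a minimal sketch (the helper name `extract_error_code` is hypothetical and not part of lightly), the code can be pulled out of such a body like this. With the generated swagger client, the raw body is typically available on the caught `ApiException` as its `body` attribute, which is an assumption to verify against the installed version.

```python
import json


def extract_error_code(body):
    """Return the `code` field of a JSON error body, or None if it is missing or unparseable."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # TypeError: body was None or another non-string; ValueError: invalid JSON.
        return None
    if isinstance(payload, dict):
        return payload.get("code")
    return None


# Body copied from the HTTP response shown in the traceback.
body = '{\n"code": "BAD_REQUEST"\n}'
print(extract_error_code(body))
```

Logging this value alongside the status makes it easier to tell a generic bad request from a more specific error code returned by the server.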
Top GitHub Comments
@natejenkins A `Bad Request` only happens when there is indeed a very bad request when trying to route the client request, e.g. the supplied ObjectId is not valid (as in this case, where the dataset_id was empty/`''`). As the HTTP spec does not offer much flexibility via the numeric status code, we send more specific error messages back if we can. Have a look at our more specific error codes. E.g. `MALFORMED_REQUEST` is interesting and will give you additional information about where exactly in the request body/query params the client did something wrong.
I think the problem is on line 102:
https://github.com/lightly-ai/lightly/blob/36e1bf05c78f928b7a75b7e253e51c00e5c79240/lightly/cli/upload_cli.py#L101-L102
If you change this to:
then the request completes successfully.
What looks to be happening is that earlier in the code, during the creation of the new dataset, `api_workflow_client` has its instance variable `dataset_id` set, but the local variable `dataset_id` is not updated: https://github.com/lightly-ai/lightly/blob/36e1bf05c78f928b7a75b7e253e51c00e5c79240/lightly/cli/upload_cli.py#L34-L52
If you pass in the dataset_id via the cli then that local variable is set on line 34.
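The stale local variable described above can be reproduced in isolation. This is only a sketch under stated assumptions: `FakeWorkflowClient`, `get_embeddings`, `upload` and the `use_client_id` flag are hypothetical stand-ins rather than lightly's actual classes or functions, and a `ValueError` stands in for the server's 400 response.

```python
class FakeWorkflowClient:
    """Hypothetical stand-in for the real API client; not lightly's class."""

    def __init__(self, dataset_id=None):
        self.dataset_id = dataset_id

    def create_dataset(self, name):
        # Only the instance attribute is set, mirroring the behaviour described above.
        self.dataset_id = f"id-for-{name}"

    def get_embeddings(self, dataset_id):
        # Stand-in for the server rejecting an empty id with 400 BAD_REQUEST.
        if not dataset_id:
            raise ValueError("BAD_REQUEST: empty dataset_id")
        return []


def upload(dataset_id="", new_dataset_name="", use_client_id=False):
    client = FakeWorkflowClient(dataset_id=dataset_id)
    if new_dataset_name:
        # The local variable `dataset_id` is still "" after this call.
        client.create_dataset(new_dataset_name)
    # Reading the id from the client instead of the local variable avoids the stale value.
    lookup_id = client.dataset_id if use_client_id else dataset_id
    return client.get_embeddings(lookup_id)


try:
    upload(new_dataset_name="blub")
except ValueError as err:
    print("stale local variable:", err)

print("reading from the client:", upload(new_dataset_name="blub", use_client_id=True))
```

Passing an existing id (`upload(dataset_id="abc")`) works in both modes, which matches the observation that only the `new_dataset_name` path fails.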
Personally I think it would make sense, and make it more readable, to have `create_dataset` in `api_workflow_datasets` not only set the instance variable but also return the dataset_id, which can then be used to update the local variable `dataset_id` in `upload_cli.py`.
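A rough sketch of that suggestion, assuming a simplified client: `SketchClient` and `resolve_dataset_id` are hypothetical names, and the id is a random placeholder rather than one issued by the Lightly platform.

```python
import uuid


class SketchClient:
    """Hypothetical simplified client; the real class lives in lightly's API workflow code."""

    def __init__(self, dataset_id=None):
        self.dataset_id = dataset_id

    def create_dataset(self, name):
        # Placeholder id; the real client would receive it from the server.
        self.dataset_id = uuid.uuid4().hex
        # Returning the id lets the caller keep its own variable in sync.
        return self.dataset_id


def resolve_dataset_id(client, dataset_id="", new_dataset_name=""):
    """Return the id to use for later requests, creating a dataset if a name is given."""
    if new_dataset_name:
        dataset_id = client.create_dataset(new_dataset_name)
    return dataset_id


client = SketchClient()
dataset_id = resolve_dataset_id(client, new_dataset_name="blub")
print(dataset_id == client.dataset_id)
```

With the return value assigned back to the local variable, later calls that take `dataset_id` see the same id the client holds, whichever path created it.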