embedding upload not working with new_dataset_name

When using lightly-magic or lightly-upload to upload embeddings to a new dataset via new_dataset_name, the download of the already existing embeddings (there are none) fails:

(venv) user@guest lightly % lightly-magic input_dir=/Users/datasets/clothing-dataset-small/test/dress trainer.max_epochs=0 token=TOKEN new_dataset_name="blub"
########## Starting to embed your dataset.
Compute efficiency: 0.24: 100%|█████| 1/1 [00:09<00:00,  9.04s/it]
Embeddings are stored at /Users/GitHub/lightly/lightly_outputs/2021-11-15/16-03-22/embeddings.csv
########## Starting to upload your dataset to the Lightly platform.
Uploading images (with 12 workers).
 100%|███████▏| 15/15 [00:01<00:00, 13.48imgs/s]
 Finished the upload of the dataset.
Starting upload of embeddings.
Traceback (most recent call last):
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/lightly_cli.py", line 87, in lightly_cli
    return _lightly_cli(cfg)
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/lightly_cli.py", line 38, in _lightly_cli
    _upload_cli(cfg)
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/cli/upload_cli.py", line 101, in _upload_cli
    embeddings = api_workflow_client.embeddings_api \
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api/embeddings_api.py", line 55, in get_embeddings_by_dataset_id
    (data) = self.get_embeddings_by_dataset_id_with_http_info(dataset_id, **kwargs)  # noqa: E501
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api/embeddings_api.py", line 115, in get_embeddings_by_dataset_id_with_http_info
    return self.api_client.call_api(
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 326, in call_api
    return self.__call_api(resource_path, method,
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 158, in __call_api
    response_data = self.request(
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/api_client.py", line 348, in request
    return self.rest_client.GET(url,
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/rest.py", line 234, in GET
    return self.request("GET", url,
  File "/Users/GitHub/lightly/venv/lib/python3.9/site-packages/lightly/openapi_generated/swagger_client/rest.py", line 228, in request
    raise ApiException(http_resp=r)
lightly.openapi_generated.swagger_client.rest.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json; charset=utf-8', 'x-cloud-trace-context': '3aa5b5da3e128a2a549edae34d0156cf/10076201567220533420;o=1', 'Vary': 'Origin, Accept-Encoding', 'Content-Security-Policy': "script-src https:/*.lightly.ai https:/*.youtube.com https:/*.gstatic.com https:/*.google.com https:/*.doubleclick.net https:/*.google-analytics.com https:/*.googletagmanager.com https:/*.fullstory.com;img-src *;default-src 'self';base-uri 'self';block-all-mixed-content;font-src 'self' https: data:;frame-ancestors 'self';object-src 'none';script-src-attr 'none';style-src 'self' https: 'unsafe-inline';upgrade-insecure-requests", 'X-DNS-Prefetch-Control': 'off', 'Expect-CT': 'max-age=0', 'X-Frame-Options': 'SAMEORIGIN', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains', 'X-Download-Options': 'noopen', 'X-Content-Type-Options': 'nosniff', 'X-Permitted-Cross-Domain-Policies': 'none', 'Origin-Agent-Cluster': '?1', 'X-XSS-Protection': '0', 'Cross-Origin-Opener-Policy': 'same-origin-allow-popups', 'Cross-Origin-Resource-Policy': 'same-site', 'ETag': 'W/"1d-RxIXLBAjGcBSG7XheTm9uqfrCcY"', 'Date': 'Mon, 15 Nov 2021 15:03:35 GMT', 'Server': 'Google Frontend', 'Content-Length': '29'})
HTTP response body: {
    "code": "BAD_REQUEST"
}

The same error occurs when using a non-existent dataset id.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
japrescott commented, Nov 16, 2021

@natejenkins A Bad Request is only returned when the request is malformed to the point that the server cannot route it, e.g. when the supplied ObjectId is not valid (as in this case, where the dataset_id was empty, i.e. '').

As the HTTP spec does not offer much flexibility via the numeric status code, we send back more specific error messages where we can. Have a look at our more specific error codes. For example, MALFORMED_REQUEST is interesting: it gives you additional information about where exactly in the request body or query parameters the client did something wrong.
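
As a rough illustration of reading that more specific code on the client side, here is a minimal sketch that pulls the `code` field out of an error response body like the one in the traceback above. The helper name `extract_error_code` is hypothetical and not part of lightly; how the body is obtained from the caught exception is left out, since that wiring is an assumption.

```python
import json
from typing import Optional


def extract_error_code(body: Optional[str]) -> Optional[str]:
    """Return the "code" field of a JSON error body, or None if absent.

    Hypothetical helper for illustration only; not part of lightly.
    """
    if not body:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # The body was not JSON, so there is no code to report.
        return None
    if not isinstance(payload, dict):
        return None
    code = payload.get("code")
    return code if isinstance(code, str) else None


# Body copied from the traceback in the issue description.
body = '{\n    "code": "BAD_REQUEST"\n}'
print(extract_error_code(body))
```

A caller could branch on the returned string (for example, treating BAD_REQUEST differently from MALFORMED_REQUEST) instead of relying on the numeric status alone.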

1 reaction
natejenkins commented, Nov 16, 2021

I think the problem is on line 102:

https://github.com/lightly-ai/lightly/blob/36e1bf05c78f928b7a75b7e253e51c00e5c79240/lightly/cli/upload_cli.py#L101-L102

If you change this to:

            embeddings = api_workflow_client.embeddings_api \
                .get_embeddings_by_dataset_id(dataset_id=api_workflow_client.dataset_id)

then the request completes successfully.

What looks to be happening is that, earlier in the code, during the creation of the new dataset, api_workflow_client has its instance variable dataset_id set, but the local variable dataset_id is not updated:

https://github.com/lightly-ai/lightly/blob/36e1bf05c78f928b7a75b7e253e51c00e5c79240/lightly/cli/upload_cli.py#L34-L52

If you pass in the dataset_id via the CLI, then that local variable is set on line 34.

Personally, I think it would make sense, and be more readable, to have create_dataset in api_workflow_datasets not only set the instance variable but also return the dataset_id, which can then be used to update the local variable dataset_id in upload_cli.py.
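
To make that suggestion concrete, here is a minimal sketch of the idea. It uses a hypothetical stand-in class, `FakeWorkflowClient`, and a hypothetical helper, `resolve_dataset_id`; neither is the real lightly code, and the id format is made up. It only shows the pattern: create_dataset sets the instance variable and also returns the id, and the caller refreshes its local variable from that return value.

```python
from typing import Optional


class FakeWorkflowClient:
    """Hypothetical stand-in for the real client; makes no network calls."""

    def __init__(self) -> None:
        self.dataset_id: Optional[str] = None

    def create_dataset(self, dataset_name: str) -> str:
        # Pretend the server assigned an id to the new dataset (made-up format).
        self.dataset_id = f"id-for-{dataset_name}"
        # Suggested change: return the id in addition to storing it.
        return self.dataset_id


def resolve_dataset_id(
    client: FakeWorkflowClient,
    dataset_id: Optional[str],
    new_dataset_name: Optional[str],
) -> str:
    """Return the id that later API calls should use."""
    if new_dataset_name:
        # Refresh the local variable so it no longer holds the empty CLI default.
        dataset_id = client.create_dataset(new_dataset_name)
    if not dataset_id:
        raise ValueError("either dataset_id or new_dataset_name is required")
    return dataset_id


client = FakeWorkflowClient()
dataset_id = resolve_dataset_id(client, dataset_id="", new_dataset_name="blub")
print(dataset_id)
```

With this shape, the local variable and the instance variable cannot drift apart, and an empty id is caught before any request is sent.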
