[BUG] MLFLOW_S3_ENDPOINT_URL is ignored
Issues Policy acknowledgement
- I have read and agree to submit bug reports in accordance with the issues policy
Willingness to contribute
No. I cannot contribute a bug fix at this time.
MLflow version
mlflow, version 1.29.0
System information
- Arch Linux
- Python 3.10.8
Describe the problem
I’m trying to start the mlflow tracking server with a bucket and a PostgreSQL database attached. The object store is not hosted by AWS but implements the S3 API.
When I add an artifact, mlflow throws an error saying that it cannot connect to the bucket. The error shows it trying to connect to AWS instead of the configured endpoint.
I have set all the necessary environment variables.
Tracking information
No notebook
Code to reproduce issue
MLFLOW_TRACKING_URI=http://0.0.0.0:5000
MLFLOW_BACKEND_STORE_URI=postgresql://<user>:<password>@<host>:<port>
MLFLOW_S3_ENDPOINT_URL=https://<redacted>.<redacted>.cloud
AWS_ACCESS_KEY_ID=<redacted>
AWS_SECRET_ACCESS_KEY=<redacted>
mlflow server --default-artifact-root s3://mlflow/ --host 0.0.0.0
Stack trace
Traceback (most recent call last):
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/httpsession.py", line 455, in send
urllib_response = conn.urlopen(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/util/retry.py", line 525, in increment
raise six.reraise(type(error), error, _stacktrace)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connection.py", line 358, in connect
self.sock = conn = self._new_conn()
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x7f9e5b7b9e70>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
response = self.full_dispatch_request()
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/flask/app.py", line 1822, in full_dispatch_request
rv = self.handle_user_exception(e)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/flask/app.py", line 1820, in full_dispatch_request
rv = self.dispatch_request()
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/flask/app.py", line 1796, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/mlflow/server/handlers.py", line 456, in wrapper
return func(*args, **kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/mlflow/server/handlers.py", line 526, in wrapper
return func(*args, **kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/mlflow/server/handlers.py", line 909, in _list_artifacts
artifact_entities = _get_artifact_repo(run).list_artifacts(path)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/mlflow/store/artifact/s3_artifact_repo.py", line 159, in list_artifacts
for result in results:
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/paginate.py", line 269, in __iter__
response = self._make_request(current_kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/paginate.py", line 357, in _make_request
return self._method(**current_kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/client.py", line 514, in _api_call
return self._make_api_call(operation_name, kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/client.py", line 921, in _make_api_call
http, parsed_response = self._make_request(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/client.py", line 944, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/endpoint.py", line 119, in make_request
return self._send_request(request_dict, operation_model)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/endpoint.py", line 202, in _send_request
while self._needs_retry(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/endpoint.py", line 354, in _needs_retry
responses = self._event_emitter.emit(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/hooks.py", line 412, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/hooks.py", line 256, in emit
return self._emit(event_name, kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/hooks.py", line 239, in _emit
response = handler(**kwargs)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 207, in __call__
if self._checker(**checker_kwargs):
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 284, in __call__
should_retry = self._should_retry(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 320, in _should_retry
return self._checker(attempt_number, response, caught_exception)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 363, in __call__
checker_response = checker(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 247, in __call__
return self._check_caught_exception(
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/retryhandler.py", line 416, in _check_caught_exception
raise caught_exception
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/endpoint.py", line 281, in _do_get_response
http_response = self._send(request)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/endpoint.py", line 377, in _send
return self.http_session.send(request)
File "$HOME/.local/share/virtualenvs/ai-panoptes-O4NUaGdJ/lib/python3.10/site-packages/botocore/httpsession.py", line 484, in send
raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://s3.nl-ams.amazonaws.com/mlflow?list-type=2&prefix=1%2F6c3af048e4a74ded9dd43ddd898c0d3c%2Fartifacts%2F&delimiter=%2F&encoding-type=url"
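The last line of the trace is the telling part: the request went to an amazonaws.com host rather than to the configured endpoint. A minimal sketch for checking that against any such error URL, using only the standard library. The function name `uses_configured_endpoint` and the example endpoint host are hypothetical, not part of mlflow or botocore.

```python
from urllib.parse import urlsplit


def uses_configured_endpoint(request_url, endpoint_url):
    """Return True if request_url targets the endpoint host or a subdomain of it.

    Subdomains are accepted because S3 clients may prefix the bucket name
    to the endpoint host (virtual-hosted-style addressing).
    """
    request_host = (urlsplit(request_url).hostname or "").lower()
    endpoint_host = (urlsplit(endpoint_url).hostname or "").lower()
    if not request_host or not endpoint_host:
        return False
    return request_host == endpoint_host or request_host.endswith("." + endpoint_host)


# URL copied from the EndpointConnectionError above (query string shortened).
failed_url = "https://s3.nl-ams.amazonaws.com/mlflow?list-type=2"
# Placeholder for the redacted MLFLOW_S3_ENDPOINT_URL value.
configured = "https://objects.example.cloud"

print(uses_configured_endpoint(failed_url, configured))
```

If this prints False, the process that made the request did not pick up MLFLOW_S3_ENDPOINT_URL, which matches the symptom described in this issue.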
Other info / logs
No response
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
Issue Analytics
- State:
- Created a year ago
- Comments:19 (10 by maintainers)
I don’t know if this helps, but I had a similar issue with remote tracking of artifacts (with mlflow==1.29.0 and mlflow==1.30.0) on Digital Ocean Spaces, which is S3-compatible, and resolved it with this harupy comment. Before that, I wasn’t setting MLFLOW_S3_ENDPOINT_URL=https://<region-name>.digitaloceanspaces.com (https://github.com/mlflow/mlflow/issues/5439) on the client side as well (the client only had the S3 credentials), and I was getting an error like:
Could not connect to the endpoint URL: "https://<bucket-name>.s3.<region-name>.amazonaws.com/5/190cbf0a10734f308d070059f0dd8698/artifacts/model/model.pkl"
The error came from a mlflow.sklearn.log_model call in a simple JupyterLab session running the mlflow sklearn_logistic_regression example, so the URL was clearly malformed; the correct one would be https://<bucket-name>.<region-name>.digitaloceanspaces.com.
IMHO this issue clashes with the official documentation, where a warning box says:
continuing with
Actually the opposite is true. If this is the intended way to make S3-compatible storage work for remote artifact storage, maybe the official documentation has to be updated (temporarily, if this is a bug).
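A minimal sketch of the client-side check this comment implies: before logging artifacts, confirm that the client process itself has the endpoint URL and the credentials set, not only the server. The variable names come from this issue; the helper `missing_client_vars` is hypothetical and not part of mlflow.

```python
import os

# Variables named in this issue that the client process needs to reach a
# non-AWS, S3-compatible store.
REQUIRED_CLIENT_VARS = (
    "MLFLOW_S3_ENDPOINT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_client_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_CLIENT_VARS if not env.get(name)]


missing = missing_client_vars()
if missing:
    print("Set these in the client environment before logging artifacts:", missing)
```

Running this in the notebook or script that calls mlflow.sklearn.log_model shows whether the client environment is missing the endpoint URL, which was the situation described above.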
Fixed by the following changes:
MLFLOW_DEFAULT_ARTIFACT_ROOT in the client environment.
Todos: