
Failed to launch Katib experiment - 404 page not found

/kind bug

What steps did you take and what happened: Set up Kubeflow and installed Katib manually as described in https://github.com/kubeflow/katib/issues/1415. Started a Katib experiment with Kale from a Jupyter notebook. The experiment was created and the pipeline was uploaded, but the Katib experiment was not launched.

Type: RPC

Method: katib.create_katib_experiment()

Code: 6 (UnhandledError)

Transaction ID: ylpewg72bh

Message: Failed to launch Katib experiment

Details: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'Date': 'Mon, 08 Mar 2021 12:14:43 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

kale.log:

2021-03-08 12:14:42 run:83 [[DEBUG]] [TID=axkqxleth9] [] Decoding ctx of RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:95 [[DEBUG]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Decoding kwargs of RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:104 [[DEBUG]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Importing RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:114 [[INFO]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Executing RPC function 'create_experiment(experiment_name=test-v1ef9)'
2021-03-08 12:14:43 _client:352 [[INFO]] Creating experiment test-v1ef9.
2021-03-08 12:14:43 run:83 [[DEBUG]] [TID=ylpewg72bh] [] Decoding ctx of RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:95 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Decoding kwargs of RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:104 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Importing RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:114 [[INFO]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Executing RPC function 'create_katib_experiment(pipeline_id=832dfc28-61be-4fb5-af12-7877778b26ef, pipeline_metadata={'autosnapshot': True, 'docker_image': 'jupyter-kale:latest', 'experiment': {'id': '7f611f1b-bf8e-4709-80ef-c55d6644931c', 'name': 'test'}, 'experiment_name': 'test-v1ef9', 'katib_metadata': {'parameters': [{'feasibleSpace': {'max': '2000', 'min': '100', 'step': '100'}, 'name': 'N_ESTIMATORS', 'parameterType': 'int'}, {'feasibleSpace': {'list': ['10', '20', '30', '40', '50', '100']}, 'name': 'MAX_DEPTH', 'parameterType': 'categorical'}, {'feasibleSpace': {'max': '4', 'min': '1', 'step': '1'}, 'name': 'MIN_SAMPLES_LEAF', 'parameterType': 'int'}, {'feasibleSpace': {'list': ['2', '5', '10']}, 'name': 'MIN_SAMPLES_SPLIT', 'parameterType': 'categorical'}], 'objective': {'additionalMetricNames': [], 'goal': 0.85, 'objectiveMetricName': 'random-forest-accuracy', 'type': 'maximize'}, 'algorithm': {'algorithmName': 'random', 'algorithmSettings': [{'name': 'random_state', 'value': '10'}, {'name': 'acq_optimizer', 'value': 'auto'}, {'name': 'acq_func', 'value': 'gp_hedge'}, {'name': 'base_estimator', 'value': 'GP'}]}, 'maxTrialCount': 10, 'maxFailedTrialCount': 3, 'parallelTrialCount': 5}, 'katib_run': True, 'pipeline_description': 'Fine tune a RF classifier on the Titanic dataset', 'pipeline_name': 'titanic-hp-tuning', 'snapshot_volumes': True, 'steps_defaults': [], 'volumes': []}, output_path=/home/jovyan/medium/minikf)'
2021-03-08 12:14:43 katib:181 [[INFO]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Saving Katib experiment definition at /home/jovyan/medium/minikf/test-v1ef9.katib.yaml
2021-03-08 12:14:43 katib:91 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Launching Katib Experiment 'test-v1ef9'...
2021-03-08 12:14:43 katib:97 [[ERROR]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Failed to launch Katib experiment
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kale/rpc/katib.py", line 95, in _launch_katib_experiment
    katib_experiment)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 178, in create_namespaced_custom_object
    (data) = self.create_namespaced_custom_object_with_http_info(group, version, namespace, plural, body, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 277, in create_namespaced_custom_object_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 334, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 168, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 377, in request
    body=body)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 265, in POST
    body=body)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 221, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'Date': 'Mon, 08 Mar 2021 12:14:43 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

What did you expect to happen: The katib experiment is launched.

Anything else you would like to add: Can I figure out which DNS address is being requested?
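
One way to approach that question, as a sketch under assumptions: when the Kubernetes Python client is configured from inside a pod (as a notebook server usually is), the API server address is taken from the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables, and `create_namespaced_custom_object` POSTs to `/apis/{group}/{version}/namespaces/{namespace}/{plural}`. The helper names `in_cluster_host` and `custom_object_path` are hypothetical, and the IP, group, version and namespace values are placeholders, not values taken from this issue.

```python
import os


def in_cluster_host(env=os.environ):
    """Return the API server base URL used by in-cluster config, or None outside a pod."""
    host = env.get("KUBERNETES_SERVICE_HOST")
    port = env.get("KUBERNETES_SERVICE_PORT")
    if not host or not port:
        return None
    return f"https://{host}:{port}"


def custom_object_path(group, version, namespace, plural):
    """Return the REST path that a namespaced custom object is created under."""
    return f"/apis/{group}/{version}/namespaces/{namespace}/{plural}"


# Placeholder values for illustration only (assumptions, not from the issue).
example_env = {"KUBERNETES_SERVICE_HOST": "10.96.0.1", "KUBERNETES_SERVICE_PORT": "443"}
example_url = in_cluster_host(example_env) + custom_object_path(
    "kubeflow.org", "v1beta1", "kubeflow-user", "experiments"
)
print(example_url)
```

A plain-text `404 page not found` body from the API server often means the path itself is not served, for example because the requested group/version is not installed in the cluster, so comparing this path with the Katib CRD versions actually installed is a reasonable first check.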

Environment:

  • Kubeflow version: kfctl v1.2.0-0-gbc038f9
  • On-premises Kubernetes cluster
  • Kubernetes version: v1.17.0
  • OS: Ubuntu 18.04.5 LTS

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

andreyvelich commented, Aug 4, 2021 (1 reaction)

@Siddarth-Pattnaik I think you should update your Kubeflow version to at least 1.1 to use the Katib SDK. Kubeflow 1.0.1 shipped Katib v1alpha3, which the SDK doesn’t support.
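
A version mismatch of this kind can be checked against the cluster's API discovery data. The sketch below assumes the JSON printed by `kubectl get --raw /apis/kubeflow.org` (an `APIGroup` document) has been saved or pasted in; the `sample` document is hand-written for illustration, not output from the cluster in this issue, and `served_versions` and `is_served` are hypothetical helper names.

```python
import json


def served_versions(api_group_doc):
    """List the versions an APIGroup discovery document advertises."""
    return [entry["version"] for entry in api_group_doc.get("versions", [])]


def is_served(api_group_doc, version):
    """Return True if the given version is advertised for the API group."""
    return version in served_versions(api_group_doc)


# Hand-written sample shaped like an APIGroup response (illustrative only).
sample = json.loads("""
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "kubeflow.org",
  "versions": [
    {"groupVersion": "kubeflow.org/v1beta1", "version": "v1beta1"}
  ],
  "preferredVersion": {"groupVersion": "kubeflow.org/v1beta1", "version": "v1beta1"}
}
""")

for wanted in ("v1alpha3", "v1beta1"):
    print(wanted, "served" if is_served(sample, wanted) else "not served")
```

If the version a client posts to is missing from that list, creating the custom object fails regardless of how the experiment itself is defined.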

stale[bot] commented, Apr 16, 2022 (0 reactions)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
