[Kubeflow Dex Distribution] KF Pipelines 100% Unusable - MULTIPLE PEOPLE REPORTING
What steps did you take:
KFP in KF 1.2.0 with Dex on K8s 1.18.9 does not work. I receive an error in the KF dashboard when attempting to view pipelines:
Error: failed to retrieve list of pipelines. Click Details for more information. -> An error occured, no healthy upstream
What happened:
Installed Kubeflow 1.2.0 on-prem as per installation instructions. Any attempt to see pipelines or use pipelines fails.
What did you expect to happen:
I expected to be able to use Pipelines
Environment:
- Kubernetes version: 1.18.9
- Kubeflow version: 1.2.0
- Installed with Dex, configured after deploy to use LDAP.
The ml-pipeline pod fails to start completely. Its logs (included below) show repeated MySQL "unexpected EOF" errors.
How did you deploy Kubeflow Pipelines (KFP)?
Installed Kubeflow Pipelines as part of Kubeflow installation for on-prem with dex.
KFP version: 1.0.4
KFP SDK version: I HAVEN’T GOTTEN FAR ENOUGH TO USE THIS!
Anything else you would like to add:
ml-pipeline pod refuses to run:
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-bootstrap-stateful-set-0 1/1 Running 0 4d20h
admission-webhook-deployment-5d9ccb5696-f6zs6 1/1 Running 0 4d20h
application-controller-stateful-set-0 1/1 Running 0 4d21h
argo-ui-684bcb587f-z84nh 1/1 Running 0 4d16h
cache-deployer-deployment-6667847478-7h2w8 2/2 Running 2 4d21h
cache-server-bd9c859db-755zj 2/2 Running 527 4d21h
centraldashboard-895c4c768-46xgc 1/1 Running 0 4d21h
jupyter-web-app-deployment-6588c6f544-c5m45 1/1 Running 0 3d3h
katib-controller-75c8d47f8c-5k2tr 1/1 Running 0 4d21h
katib-db-manager-6c88c68d79-cgxdh 1/1 Running 0 4d16h
katib-mysql-858f68f588-zvhnj 1/1 Running 0 4d21h
katib-ui-68f59498d4-bkscp 1/1 Running 0 4d21h
kfserving-controller-manager-0 2/2 Running 0 36h
kubeflow-pipelines-profile-controller-69c94df75b-xtpfj 1/1 Running 0 4d21h
metacontroller-0 1/1 Running 0 4d21h
metadata-db-757dc9c7b5-pt75k 1/1 Running 0 4d21h
metadata-envoy-deployment-6ff58757f6-57pjc 1/1 Running 0 4d21h
metadata-grpc-deployment-76d69f69c8-xcmjk 1/1 Running 3 4d21h
metadata-writer-6d94ffb7df-mhnxj 2/2 Running 1 4d21h
minio-66c9cd74c9-jrss8 1/1 Running 0 4d21h
ml-pipeline-54989c9946-s2f46 1/2 Running 926 4d21h
ml-pipeline-persistenceagent-7f6bf7646-ldct6 2/2 Running 0 4d21h
ml-pipeline-scheduledworkflow-66db7bcf5d-q244j 2/2 Running 0 4d16h
ml-pipeline-ui-756b58fb-gpwn9 2/2 Running 0 4d21h
ml-pipeline-viewer-crd-58f59f87db-dmj2l 2/2 Running 2 4d21h
ml-pipeline-visualizationserver-6f9ff4974-k4cf9 2/2 Running 0 4d21h
mpi-operator-77bb5d8f4b-w4dhj 1/1 Running 0 4d21h
mxnet-operator-68b688bb69-b5985 1/1 Running 0 4d16h
mysql-7694c6b8b7-jthp6 2/2 Running 0 4d17h
notebook-controller-deployment-58447d4b4c-6ll57 1/1 Running 0 4d21h
profiles-deployment-78d4549cbc-z9lld 2/2 Running 0 4d21h
pytorch-operator-b79799447-f8nnl 1/1 Running 0 4d21h
seldon-controller-manager-5fc5dfc86c-nh2qm 1/1 Running 0 4d21h
spark-operatorsparkoperator-67c6bc65fb-8tgn5 1/1 Running 0 4d21h
tf-job-operator-5c97f4bf7-g5vtw 1/1 Running 0 4d21h
workflow-controller-5c7cc7976d-5n6tb 1/1 Running 0 4d16h
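In a listing this long, the unhealthy pods are easy to miss. The following is a minimal sketch of a filter over the plain-text output of `kubectl get pods`, assuming the default five-column layout shown above. The names `PodStatus`, `parse_pods`, `suspicious` and the `max_restarts` threshold are hypothetical helpers for illustration, not part of kubectl or Kubeflow.

```python
from typing import List, NamedTuple


class PodStatus(NamedTuple):
    name: str
    ready: int     # containers currently ready
    total: int     # containers declared in the pod
    status: str
    restarts: int


def parse_pods(output: str) -> List[PodStatus]:
    """Parse whitespace-separated `kubectl get pods` rows into PodStatus tuples."""
    pods = []
    for line in output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that does not look like a pod row.
        if len(fields) < 5 or fields[0] == "NAME" or "/" not in fields[1]:
            continue
        ready, total = fields[1].split("/")
        pods.append(
            PodStatus(fields[0], int(ready), int(total), fields[2], int(fields[3]))
        )
    return pods


def suspicious(pods: List[PodStatus], max_restarts: int = 100) -> List[PodStatus]:
    """Return pods that are not fully ready or restart more than max_restarts times."""
    return [p for p in pods if p.ready < p.total or p.restarts > max_restarts]


# Three rows copied from the listing above, used as sample input.
SAMPLE = """\
NAME READY STATUS RESTARTS AGE
cache-server-bd9c859db-755zj 2/2 Running 527 4d21h
ml-pipeline-54989c9946-s2f46 1/2 Running 926 4d21h
mysql-7694c6b8b7-jthp6 2/2 Running 0 4d17h
"""

for pod in suspicious(parse_pods(SAMPLE)):
    print(f"{pod.name}: {pod.ready}/{pod.total} ready, {pod.restarts} restarts")
```

On the sample rows this flags cache-server (527 restarts) and ml-pipeline (1/2 ready, 926 restarts), the two pods whose logs follow.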
$ kubectl logs -n kubeflow ml-pipeline-54989c9946-s2f46 ml-pipeline-api-server
I0301 20:22:00.153656 6 client_manager.go:134] Initializing client manager
I0301 20:22:00.153817 6 config.go:50] Config DBConfig.ExtraParams not specified, skipping
[mysql] 2021/03/01 20:22:01 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:02 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:04 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:07 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:10 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:13 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:16 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:23 packets.go:36: unexpected EOF
$ kubectl logs -n kubeflow mysql-7694c6b8b7-jthp6 mysql
...
MySQL init process done. Ready for start up.
2021-02-25 03:04:17 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2021-02-25 03:04:17 0 [Note] mysqld (mysqld 5.6.44) starting as process 1 ...
2021-02-25 03:04:17 1 [Note] Plugin 'FEDERATED' is disabled.
2021-02-25 03:04:17 1 [Note] InnoDB: Using atomics to ref count buffer pool pages
2021-02-25 03:04:17 1 [Note] InnoDB: The InnoDB memory heap is disabled
2021-02-25 03:04:17 1 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-02-25 03:04:17 1 [Note] InnoDB: Memory barrier is not used
2021-02-25 03:04:17 1 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-02-25 03:04:17 1 [Note] InnoDB: Using Linux native AIO
2021-02-25 03:04:17 1 [Note] InnoDB: Using CPU crc32 instructions
2021-02-25 03:04:17 1 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2021-02-25 03:04:17 1 [Note] InnoDB: Completed initialization of buffer pool
2021-02-25 03:04:17 1 [Note] InnoDB: Highest supported file format is Barracuda.
2021-02-25 03:04:17 1 [Note] InnoDB: 128 rollback segment(s) are active.
2021-02-25 03:04:17 1 [Note] InnoDB: Waiting for purge to start
2021-02-25 03:04:17 1 [Note] InnoDB: 5.6.44 started; log sequence number 1625997
2021-02-25 03:04:17 1 [Note] Server hostname (bind-address): '*'; port: 3306
2021-02-25 03:04:17 1 [Note] IPv6 is available.
2021-02-25 03:04:17 1 [Note] - '::' resolves to '::';
2021-02-25 03:04:17 1 [Note] Server socket created on IP: '::'.
2021-02-25 03:04:17 1 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2021-02-25 03:04:17 1 [Warning] 'proxies_priv' entry '@ root@mysql-7694c6b8b7-jthp6' ignored in --skip-name-resolve mode.
2021-02-25 03:04:17 1 [Note] Event Scheduler: Loaded 0 events
2021-02-25 03:04:17 1 [Note] mysqld: ready for connections.
Version: '5.6.44' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server (GPL)
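MySQL reports that it is ready for connections, yet the API server sees "unexpected EOF". One way to narrow this down is to check whether a plain TCP client ever receives the MySQL handshake. A MySQL server sends its greeting packet first (a 4-byte header followed by a protocol version byte), so a connection that closes before those bytes arrive matches the EOF the Go driver logs. The sketch below is a hypothetical probe, not a Kubeflow tool; `probe_mysql_greeting` and its return strings are names made up for illustration, and the host you pass (for example a service name such as `mysql.kubeflow`) is an assumption about your cluster.

```python
import socket
import sys


def _read_exact(sock: socket.socket, n: int):
    """Read exactly n bytes, or return None if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def probe_mysql_greeting(host: str, port: int = 3306, timeout: float = 5.0) -> str:
    """Connect over plain TCP and report whether a MySQL greeting arrives."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # 3-byte payload length + 1-byte sequence id + first payload byte.
            data = _read_exact(sock, 5)
            if data is None:
                return "eof-before-greeting"
            first = data[4]
            if first == 0xFF:
                # 0xFF marks a MySQL error packet (for example, host not allowed).
                return "server-error-packet"
            return f"greeting-received (protocol version {first})"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"socket-error: {exc}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python probe.py <host> [port]
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 3306
    print(probe_mysql_greeting(sys.argv[1], port))
```

Running the probe from inside a pod in the kubeflow namespace and comparing against a run through `kubectl port-forward` may show whether the connection is being closed somewhere between the client and mysqld. An "eof-before-greeting" result would be consistent with, but does not prove, a sidecar or mTLS mismatch of the kind the Istio workarounds below try to address.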
The cache server is also unable to connect to MySQL:
$ kubectl logs -n kubeflow cache-server-bd9c859db-755zj server
2021/03/01 20:19:21 Initing client manager....
[mysql] 2021/03/01 20:19:22 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:24 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:25 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:27 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:30 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:33 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:39 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:46 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:19:55 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:20:07 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:20:26 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:21:02 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:21:40 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:22:35 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:23:58 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:09 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:50 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:51 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:52 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:54 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:56 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:25:59 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:26:02 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:26:06 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:26:15 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:26:20 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:26:34 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:27:03 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:27:45 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:28:11 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:29:39 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:30:12 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:31:32 packets.go:36: unexpected EOF
[mysql] 2021/03/01 20:32:07 packets.go:36: unexpected EOF
F0301 20:32:07.437107 1 error.go:305] invalid connection
goroutine 1 [running]:
github.com/golang/glog.stacks(0xc000786600, 0xc0004790a0, 0x3f, 0x40)
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:769 +0xd4
github.com/golang/glog.(*loggingT).output(0x237c4c0, 0xc000000003, 0xc000479080, 0x20d8f16, 0x8, 0x131, 0x0)
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:720 +0x329
github.com/golang/glog.(*loggingT).printf(0x237c4c0, 0x3, 0x14ca0b3, 0x2, 0xc0006c58f8, 0x1, 0x1)
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:655 +0x14b
github.com/golang/glog.Fatalf(0x14ca0b3, 0x2, 0xc0006c58f8, 0x1, 0x1)
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:1148 +0x67
github.com/kubeflow/pipelines/backend/src/common/util.TerminateIfError(0x1649b00, 0xc0005eca40)
/go/src/github.com/kubeflow/pipelines/backend/src/common/util/error.go:305 +0x79
main.initMysql(0x7ffefc6905bd, 0x5, 0x7ffefc6905cd, 0x5, 0x7ffefc6905dd, 0x4, 0x7ffefc6905ec, 0x7, 0x7ffefc6905fe, 0x4, ...)
/go/src/github.com/kubeflow/pipelines/backend/src/cache/client_manager.go:157 +0x466
main.initDBClient(0x7ffefc6905bd, 0x5, 0x7ffefc6905cd, 0x5, 0x7ffefc6905dd, 0x4, 0x7ffefc6905ec, 0x7, 0x7ffefc6905fe, 0x4, ...)
/go/src/github.com/kubeflow/pipelines/backend/src/cache/client_manager.go:71 +0x599
main.(*ClientManager).init(0xc0006c5db8, 0x7ffefc6905bd, 0x5, 0x7ffefc6905cd, 0x5, 0x7ffefc6905dd, 0x4, 0x7ffefc6905ec, 0x7, 0x7ffefc6905fe, ...)
/go/src/github.com/kubeflow/pipelines/backend/src/cache/client_manager.go:57 +0x80
main.NewClientManager(0x7ffefc6905bd, 0x5, 0x7ffefc6905cd, 0x5, 0x7ffefc6905dd, 0x4, 0x7ffefc6905ec, 0x7, 0x7ffefc6905fe, 0x4, ...)
/go/src/github.com/kubeflow/pipelines/backend/src/cache/client_manager.go:169 +0xab
main.main()
/go/src/github.com/kubeflow/pipelines/backend/src/cache/main.go:71 +0x367
Attempted suggestions for repair (ALL fail - please do not suggest)
- Istio: switch the TLS mode from ISTIO_MUTUAL to DISABLE. This allows the MySQL database to be populated, but the KFP UI will NOT start up.
- Istio: configure mTLS as STRICT or as PERMISSIVE. Pipelines and Jupyter Notebooks will not come up.
The product as advertised online does not work on a vanilla on-prem K8s installation. It appears to work on GCP, Azure, AWS, and possibly IBM.
Provided diagnostic tools are not compatible with an on-prem installation:
$ kfp diagnose_me
Google Cloud SDK is not installed, gcloud, gsutil and kubectl are required for this app to run. Please follow instructions at https://cloud.google.com/sdk/install to install the SDK.
/kind bug
Top GitHub Comments
@Bobgy is there a solution for that?
You are right, I was only referring to the distribution