Pod status becomes CrashLoopBackOff when using file-metrics-collector
/kind bug
What steps did you take and what happened:
Hi, I am trying to run Keras code that saves its metrics to a file, using the file metrics collector, but I am running into the problem below: the worker pods go into CrashLoopBackOff while the trials stay in Running status.
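The trial container runs main.py with a --log_dir argument, and the metrics collector sidecar parses that file with a regex filter (both are visible in the pod description below). For context, here is a minimal sketch of the kind of metric-writing callback such a setup assumes; the names and structure are illustrative, not the actual main.py:

```python
import tensorflow as tf

# Hypothetical sketch, not the actual main.py: a Keras callback that appends
# one metric per line to the log file, in the "{metricName: ..., metricValue: ...}"
# form matched by the collector's -f filter shown in the pod description below:
#   {metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}
class KatibFileLogger(tf.keras.callbacks.Callback):
    def __init__(self, log_path="/var/katib_keras/cifar10.log"):
        super().__init__()
        self.log_path = log_path

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        with open(self.log_path, "a") as f:
            for name in ("accuracy", "loss"):
                if name in logs:
                    f.write("{metricName: %s, metricValue: %s}\n" % (name, logs[name]))
```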
vagrant@minikf:~$ kubectl get pods -n kubeflow-user
NAME                                    READY   STATUS             RESTARTS   AGE
keras-example-2dfmhlcm-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-2vzmffv8-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-jc9sn4nw-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-random-7cf48d4bf7-dmbtb   1/1     Running            0          15m
vagrant@minikf:~$ kubectl get trial -n kubeflow-user
NAME                     TYPE      STATUS   AGE
keras-example-2dfmhlcm   Running   True     15m
keras-example-2vzmffv8   Running   True     15m
keras-example-jc9sn4nw   Running   True     15m
vagrant@minikf:~$ kubectl describe pod keras-example-2dfmhlcm-worker-0 -n kubeflow-user
Name:               keras-example-2dfmhlcm-worker-0
Namespace:          kubeflow-user
Priority:           0
PriorityClassName:  <none>
Node:               minikube/10.10.10.10
Start Time:         Sun, 15 Mar 2020 19:40:58 -0700
Labels:             controller-name=tf-operator
                    group-name=kubeflow.org
                    job-name=keras-example-2dfmhlcm
                    job-role=master
                    tf-job-name=keras-example-2dfmhlcm
                    tf-replica-index=0
                    tf-replica-type=worker
Annotations:        sidecar.istio.io/inject: false
Status:             Running
IP:                 172.17.0.70
Controlled By:      TFJob/keras-example-2dfmhlcm
Init Containers:
  startup-lock-init-container:
    Container ID:  docker://c14afd242090537a5409fc6d684b2dad8fca03221753de28e6a5177bbf5c9f96
    Image:         gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21
    Image ID:      docker-pullable://gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21
    Port:          <none>
    Host Port:     <none>
    Args:
      --host
      $(HOST_IP)
      --port
      10101
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 15 Mar 2020 19:41:01 -0700
      Finished:     Sun, 15 Mar 2020 19:41:01 -0700
    Ready:          True
    Restart Count:  0
    Environment:
      HOST_IP:  (v1:status.hostIP)
    Mounts:     <none>
Containers:
  tensorflow:
    Container ID:  docker://4e208c08d87d5d1c16d6171ce64b9ac0e0fcfeffab3f839417650eae3b4aa85c
    Image:         docker.io/jeun0241/katib_keras_v3
    Image ID:      docker-pullable://jeun0241/katib_keras_v3@sha256:58f172fc6e388cbb6879991305299c1f096205f11991b8590bd022a7b53f1b99
    Port:          2222/TCP
    Host Port:     0/TCP
    Command:
      sh
      -c
    Args:
      python /var/katib_keras/main.py --epochs=2 --log_dir='/var/katib_keras/cifar10.log' --learning_rate=0.017554690620699392 --batch_size=155 && echo completed > /var/katib_keras/$$$$.pid
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Sun, 15 Mar 2020 20:07:25 -0700
      Finished:     Sun, 15 Mar 2020 20:07:25 -0700
    Ready:          False
    Restart Count:  10
    Requests:
      cpu:     1m
      memory:  1M
    Environment:  <none>
    Mounts:
      /var/katib_keras from metrics-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8r72s (ro)
  metrics-logger-and-collector:
    Container ID:  docker://879aa09616d3cafaec341adb859b5c59637fd26bd137b0f6e23e3a84d7ef55a2
    Image:         gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0
    Image ID:      docker-pullable://gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector@sha256:9869c814da18054ee339f25229a72a74c962e077c8fa8b099ecf3f60d495d13d
    Port:          <none>
    Host Port:     <none>
    Args:
      -t
      keras-example-2dfmhlcm
      -m
      accuracy;loss
      -s
      katib-db-manager.kubeflow:6789
      -path
      /var/katib_keras/cifar10.log
      -f
      {metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Sun, 15 Mar 2020 20:07:25 -0700
      Finished:     Sun, 15 Mar 2020 20:07:25 -0700
    Ready:          False
    Restart Count:  10
    Limits:
      cpu:                500m
      ephemeral-storage:  5Gi
      memory:             100Mi
    Requests:
      cpu:                50m
      ephemeral-storage:  500Mi
      memory:             10Mi
    Environment:  <none>
    Mounts:
      /var/katib_keras from metrics-volume (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-8r72s:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-8r72s
    Optional:    false
  metrics-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  28m                    default-scheduler  Successfully assigned kubeflow-user/keras-example-2dfmhlcm-worker-0 to minikube
  Normal   Pulled     28m                    kubelet, minikube  Container image "gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21" already present on machine
  Normal   Created    28m                    kubelet, minikube  Created container startup-lock-init-container
  Normal   Started    28m                    kubelet, minikube  Started container startup-lock-init-container
  Normal   Pulling    28m                    kubelet, minikube  Pulling image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
  Normal   Pulled     28m                    kubelet, minikube  Successfully pulled image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
  Normal   Started    28m (x2 over 28m)      kubelet, minikube  Started container metrics-logger-and-collector
  Warning  BackOff    28m (x2 over 28m)      kubelet, minikube  Back-off restarting failed container
  Normal   Pulled     27m (x3 over 28m)      kubelet, minikube  Container image "docker.io/jeun0241/katib_keras_v3" already present on machine
  Normal   Created    27m (x3 over 28m)      kubelet, minikube  Created container tensorflow
  Normal   Created    27m (x3 over 28m)      kubelet, minikube  Created container metrics-logger-and-collector
  Normal   Pulled     27m (x2 over 28m)      kubelet, minikube  Container image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0" already present on machine
  Normal   Started    27m (x3 over 28m)      kubelet, minikube  Started container tensorflow
  Warning  BackOff    3m22s (x123 over 28m)  kubelet, minikube  Back-off restarting failed container
- docker logs
vagrant@minikf:~$ docker ps
CONTAINER ID   IMAGE                  COMMAND            CREATED          STATUS          PORTS   NAMES
8d94203a1032   k8s.gcr.io/pause:3.1   "/pause"           53 minutes ago   Up 53 minutes           k8s_POD_keras-example-jc9sn4nw-worker-0_kubeflow-user_9179e635-672f-11ea-87be-0800271cfe03_0
0773985eb93b   k8s.gcr.io/pause:3.1   "/pause"           53 minutes ago   Up 53 minutes           k8s_POD_keras-example-2dfmhlcm-worker-0_kubeflow-user_91374b3b-672f-11ea-87be-0800271cfe03_0
51c959588e14   k8s.gcr.io/pause:3.1   "/pause"           53 minutes ago   Up 53 minutes           k8s_POD_keras-example-2vzmffv8-worker-0_kubeflow-user_90fbf8f1-672f-11ea-87be-0800271cfe03_0
3307e1b301d1   56c0051f100c           "python main.py"   54 minutes ago   Up 54 minutes           k8s_suggestion_keras-example-random-7cf48d4bf7-dmbtb_kubeflow-user_83fb88e9-672f-11ea-87be-0800271cfe03_0
c41d550d2dbb   k8s.gcr.io/pause:3.1   "/pause"           54 minutes ago   Up 54 minutes           k8s_POD_keras-example-random-7cf48d4bf7-dmbtb_kubeflow-user_83fb88e9-672f-11ea-87be-0800271cfe03_0
vagrant@minikf:~$ docker logs 3307e1b301d1
INFO:hyperopt.utils:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
INFO:hyperopt.fmin:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
ERROR:grpc._server:Exception calling application: Method not implemented!
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/grpc/_server.py", line 434, in _call_behavior
    response_or_iterator = behavior(argument, context)
  File "/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1alpha3/python/api_pb2_grpc.py", line 135, in ValidateAlgorithmSettings
    raise NotImplementedError('Method not implemented!')
NotImplementedError: Method not implemented!
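For what it's worth, the "Method not implemented!" traceback above is the default behavior of grpc's generated Python servicer for any RPC the suggestion service does not override (here, ValidateAlgorithmSettings). A paraphrased sketch of what the generated api_pb2_grpc.py contains, not the exact file:

```python
import grpc

# Paraphrased from grpc's generated servicer stub: an RPC method that the
# concrete suggestion service does not override falls through to this default,
# which produces the "Method not implemented!" error logged above.
class SuggestionServicer(object):
    def ValidateAlgorithmSettings(self, request, context):
        context.set_code(grpc.StatusCode.UNIMPLEMENTED)
        context.set_details('Method not implemented!')
        raise NotImplementedError('Method not implemented!')
```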
- minikube logs
vagrant@minikf:~$ minikube logs
==> coredns <==
.:53
2020-03-16T01:31:35.034Z [INFO] CoreDNS-1.3.1
2020-03-16T01:31:35.034Z [INFO] linux/amd64, go1.11.4, 6b56a9c
CoreDNS-1.3.1
linux/amd64, go1.11.4, 6b56a9c
2020-03-16T01:31:35.034Z [INFO] plugin/reload: Running configuration MD5 = 599b9eb76b8c147408aed6a0bbe0f669
==> dmesg <==
[ +0.000001] __x64_sys_write+0x1e/0x20
[ +0.000001] do_syscall_64+0x5e/0x140
[ +0.000002] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ +0.000001] RIP: 0033:0x56116f6f7580
[ +0.000001] Code: 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 49 c7 c2 00 00 00 00 49 c7 c0 00 00 00 00 49 c7 c1 00 00 00 00 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[ +0.000001] RSP: 002b:000000c000cc4910 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ +0.000001] RAX: ffffffffffffffda RBX: 000000c00005c000 RCX: 000056116f6f7580
[ +0.000000] RDX: 0000000000000002 RSI: 000000c000cc4afe RDI: 000000000000000a
[ +0.000001] RBP: 000000c000cc4960 R08: 0000000000000000 R09: 0000000000000000
[ +0.000001] R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000c
[ +0.000000] R13: 0000000000000032 R14: 0000561171365e68 R15: 0000000000000000
[ +0.000002] ---[ end trace b0b550b7b8142a1e ]---
[ +39.415462] systemd-fstab-generator[3198]: Mount point ext3 is not a valid path, ignoring.
[ +0.178858] systemd-fstab-generator[3230]: Mount point ext3 is not a valid path, ignoring.
[ +0.196463] systemd-fstab-generator[3283]: Mount point ext3 is not a valid path, ignoring.
[ +0.744680] systemd-fstab-generator[3338]: Mount point ext3 is not a valid path, ignoring.
[ +0.080607] systemd-fstab-generator[3358]: Mount point ext3 is not a valid path, ignoring.
[ +0.187665] systemd-fstab-generator[3402]: Mount point ext3 is not a valid path, ignoring.
[ +0.105639] systemd-fstab-generator[3422]: Mount point ext3 is not a valid path, ignoring.
[ +0.334824] systemd-fstab-generator[3469]: Mount point ext3 is not a valid path, ignoring.
[ +0.184957] systemd-fstab-generator[3489]: Mount point ext3 is not a valid path, ignoring.
[ +0.224083] systemd-fstab-generator[3534]: Mount point ext3 is not a valid path, ignoring.
[ +0.138828] systemd-fstab-generator[3554]: Mount point ext3 is not a valid path, ignoring.
[ +0.236530] systemd-fstab-generator[3594]: Mount point ext3 is not a valid path, ignoring.
[ +0.257815] systemd-fstab-generator[3614]: Mount point ext3 is not a valid path, ignoring.
[ +0.195213] systemd-fstab-generator[3654]: Mount point ext3 is not a valid path, ignoring.
[Mar15 18:15] systemd-fstab-generator[3674]: Mount point ext3 is not a valid path, ignoring.
[ +0.293551] systemd-fstab-generator[3715]: Mount point ext3 is not a valid path, ignoring.
[ +0.110957] systemd-fstab-generator[3735]: Mount point ext3 is not a valid path, ignoring.
[ +8.617339] systemd-fstab-generator[4130]: Mount point ext3 is not a valid path, ignoring.
[ +18.227501] systemd-fstab-generator[4492]: Mount point ext3 is not a valid path, ignoring.
[Mar15 18:16] tee (6915): /proc/6052/oom_adj is deprecated, please use /proc/6052/oom_score_adj instead.
[Mar15 18:20] printk: rsyslogd (20063): Attempt to access syslog with CAP_SYS_ADMIN but no CAP_SYSLOG (deprecated).
[ +7.974861] db_root: cannot open: /etc/target
[Mar15 18:25] EXT4-fs (dm-3): mounting with "discard" option, but the device does not support discard
[Mar15 18:26] overlayfs: upperdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
[ +0.000003] overlayfs: workdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
[ +20.280790] EXT4-fs (dm-6): mounting with "discard" option, but the device does not support discard
[ +25.443567] overlayfs: lowerdir is in-use as upperdir/workdir
[Mar15 18:27] EXT4-fs (dm-13): mounting with "discard" option, but the device does not support discard
[ +12.292494] EXT4-fs (dm-14): mounting with "discard" option, but the device does not support discard
[ +7.222790] EXT4-fs (dm-15): mounting with "discard" option, but the device does not support discard
[Mar15 18:28] show_signal: 5 callbacks suppressed
[ +22.486167] overlayfs: lowerdir is in-use as upperdir/workdir
[Mar15 18:33] hrtimer: interrupt took 7986382 ns
[Mar15 18:36] systemd-fstab-generator[15565]: Mount point ext3 is not a valid path, ignoring.
[ +2.217612] systemd-fstab-generator[15792]: Mount point ext3 is not a valid path, ignoring.
[ +1.140860] systemd-fstab-generator[15863]: Mount point ext3 is not a valid path, ignoring.
[ +0.893805] systemd-fstab-generator[15987]: Mount point ext3 is not a valid path, ignoring.
[ +1.340843] systemd-fstab-generator[16105]: Mount point ext3 is not a valid path, ignoring.
==> kernel <==
20:13:30 up 1:59, 2 users, load average: 3.34, 4.60, 6.45
Linux minikf 5.2.0-050200rc6-generic #201906222033 SMP Sun Jun 23 00:36:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
==> kube-addon-manager <==
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:06:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:07:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:07:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:08:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:08:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:09:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:09:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:10:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:10:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:11:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:11:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:12:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:12:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:13:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:13:09+00:00 ==
==> kube-apiserver <==
E0316 02:59:13.304935 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 02:59:13.304945 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:01:13.301055 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:01:13.303343 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:01:13.303355 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:01:13.303374 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:01:13.305685 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:01:13.305694 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:02:13.303538 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:02:13.305867 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:02:13.305880 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:02:13.305894 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:02:13.307715 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:02:13.307725 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:04:13.306083 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:04:13.308255 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:04:13.308268 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:04:13.308284 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:04:13.309752 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:04:13.309762 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:06:13.310792 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:06:13.312923 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:06:13.312938 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:06:13.312952 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:06:13.314817 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:06:13.314830 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:07:13.313283 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:07:13.315558 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:07:13.315573 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:07:13.315587 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:07:13.317431 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:07:13.317463 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:09:13.315930 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:09:13.318035 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:09:13.318050 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:09:13.318065 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:09:13.320185 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:09:13.320197 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:11:13.317644 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:11:13.320879 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:11:13.320893 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:11:13.320914 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:11:13.323729 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:11:13.323741 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:12:13.323358 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:12:13.355797 1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:12:13.355817 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:12:13.355839 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:12:13.363558 1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:12:13.363575 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
==> kube-proxy <==
I0316 01:16:09.844831 1 config.go:202] Starting service config controller
I0316 01:16:09.844840 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0316 01:16:09.945577 1 controller_utils.go:1034] Caches are synced for service config controller
I0316 01:16:09.945798 1 controller_utils.go:1034] Caches are synced for endpoints config controller
I0316 01:24:40.451118 1 trace.go:81] Trace[1232360537]: "iptables restore" (started: 2020-03-16 01:24:37.169600297 +0000 UTC m=+509.131752355) (total time: 3.281475998s):
Trace[1232360537]: [3.281475998s] [3.281426975s] END
I0316 01:29:08.331863 1 trace.go:81] Trace[902179412]: "iptables restore" (started: 2020-03-16 01:29:05.899390281 +0000 UTC m=+777.861542363) (total time: 2.432444272s):
Trace[902179412]: [2.432444272s] [2.432382409s] END
E0316 01:30:01.341044 1 reflector.go:270] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&resourceVersion=5543&timeout=5m27s&timeoutSeconds=327&watch=true: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:11.636815 1 trace.go:81] Trace[2040050030]: "iptables save" (started: 2020-03-16 01:29:51.576568137 +0000 UTC m=+823.538720222) (total time: 20.060206033s):
Trace[2040050030]: [20.060206033s] [20.060206033s] END
I0316 01:30:14.491502 1 trace.go:81] Trace[193692283]: "iptables save" (started: 2020-03-16 01:30:11.636914345 +0000 UTC m=+843.599066411) (total time: 2.854547766s):
Trace[193692283]: [2.854547766s] [2.854547766s] END
E0316 01:30:19.418829 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:19.418920 1 reflector.go:270] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&resourceVersion=4685&timeout=7m53s&timeoutSeconds=473&watch=true: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:24.827656 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:24.827805 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:26.842614 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:26.842772 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:27.070086 1 trace.go:81] Trace[1540084176]: "iptables restore" (started: 2020-03-16 01:30:15.223623929 +0000 UTC m=+847.185775975) (total time: 11.846428497s):
Trace[1540084176]: [11.846428497s] [11.846385203s] END
E0316 01:30:28.265496 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:28.405258 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:31.218286 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:31.219128 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:34.592543 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:35.985587 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:38.099759 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:39.146485 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:39.146523 1 trace.go:81] Trace[1740911148]: "iptables save" (started: 2020-03-16 01:30:32.037280192 +0000 UTC m=+863.999462098) (total time: 7.0430792s):
Trace[1740911148]: [7.0430792s] [7.0430792s] END
E0316 01:30:39.158178 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:40.428729 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:40.429840 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:41.615398 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:41.616731 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:42.781473 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:42.781581 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:43.797401 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:43.797882 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:58.450512 1 trace.go:81] Trace[1298402344]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:46.476398984 +0000 UTC m=+878.438551034) (total time: 11.974090209s):
Trace[1298402344]: [11.974090209s] [11.974090209s] END
E0316 01:30:58.450535 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:58.450740 1 trace.go:81] Trace[912542047]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.07242909 +0000 UTC m=+877.034581193) (total time: 13.378297644s):
Trace[912542047]: [13.378297644s] [13.378297644s] END
E0316 01:30:58.450749 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:31:19.842368 1 trace.go:81] Trace[1372274666]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:59.526358196 +0000 UTC m=+891.488510248) (total time: 20.315965682s):
Trace[1372274666]: [20.315849461s] [20.315849461s] Objects listed
I0316 01:31:20.151919 1 trace.go:81] Trace[1829181234]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:59.525840543 +0000 UTC m=+891.487992616) (total time: 20.626047204s):
Trace[1829181234]: [20.625887426s] [20.625887426s] Objects listed
==> kube-scheduler <==
Trace[688142464]: [10.012338371s] [10.012338371s] END
E0316 01:30:55.721424 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.StorageClass: Get https://localhost:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721602 1 trace.go:81] Trace[1731965608]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.709600475 +0000 UTC m=+30.409933491) (total time: 10.011985732s):
Trace[1731965608]: [10.011985732s] [10.011985732s] END
E0316 01:30:55.721614 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.PersistentVolumeClaim: Get https://localhost:8443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721784 1 trace.go:81] Trace[189113385]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.708434939 +0000 UTC m=+30.408767988) (total time: 10.013335922s):
Trace[189113385]: [10.013335922s] [10.013335922s] END
E0316 01:30:55.721793 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.StatefulSet: Get https://localhost:8443/apis/apps/v1/statefulsets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721821 1 trace.go:81] Trace[107151580]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.708463688 +0000 UTC m=+30.408796709) (total time: 10.013339543s):
Trace[107151580]: [10.013339543s] [10.013339543s] END
E0316 01:30:55.721834 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.PodDisruptionBudget: Get https://localhost:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.731799 1 trace.go:81] Trace[37345855]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.730270778 +0000 UTC m=+30.430603788) (total time: 10.001461625s):
Trace[37345855]: [10.001461625s] [10.001461625s] END
E0316 01:30:55.731856 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.PersistentVolume: Get https://localhost:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.732081 1 trace.go:81] Trace[1154873847]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.730339884 +0000 UTC m=+30.430672894) (total time: 10.001699551s):
Trace[1154873847]: [10.001699551s] [10.001699551s] END
E0316 01:30:55.732097 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.ReplicationController: Get https://localhost:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.813474 1 trace.go:81] Trace[1448715838]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.809623636 +0000 UTC m=+30.509956673) (total time: 10.003823513s):
Trace[1448715838]: [10.003823513s] [10.003823513s] END
E0316 01:30:55.813530 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.822105 1 trace.go:81] Trace[663455364]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.815380857 +0000 UTC m=+30.515713908) (total time: 10.006697176s):
Trace[663455364]: [10.006697176s] [10.006697176s] END
E0316 01:30:55.822163 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.ReplicaSet: Get https://localhost:8443/apis/apps/v1/replicasets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.864035 1 trace.go:81] Trace[55394677]: "Reflector k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223 ListAndWatch" (started: 2020-03-16 01:30:45.855584604 +0000 UTC m=+30.555917684) (total time: 10.008402184s):
Trace[55394677]: [10.008402184s] [10.008402184s] END
E0316 01:30:55.864061 1 reflector.go:126] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=status.phase%3DFailed%!C(MISSING)status.phase%3DSucceeded&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:31:12.742992 1 trace.go:81] Trace[748316010]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.81370936 +0000 UTC m=+41.514042379) (total time: 15.929244629s):
Trace[748316010]: [15.921722302s] [15.921722302s] Objects listed
I0316 01:31:13.087143 1 trace.go:81] Trace[624725376]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.72423489 +0000 UTC m=+41.424569591) (total time: 16.362840695s):
Trace[624725376]: [16.36276513s] [16.36276513s] Objects listed
I0316 01:31:13.101093 1 trace.go:81] Trace[1856609687]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.724733813 +0000 UTC m=+41.425066834) (total time: 16.37632406s):
Trace[1856609687]: [16.37627825s] [16.37627825s] Objects listed
I0316 01:31:13.101424 1 trace.go:81] Trace[1564355179]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.694817873 +0000 UTC m=+41.395150985) (total time: 16.406584871s):
Trace[1564355179]: [16.406562153s] [16.406562153s] Objects listed
I0316 01:31:13.104366 1 trace.go:81] Trace[1085617361]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.733182796 +0000 UTC m=+41.433515870) (total time: 16.37114514s):
Trace[1085617361]: [16.371089522s] [16.371089522s] Objects listed
I0316 01:31:13.178539 1 trace.go:81] Trace[195603164]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.739070512 +0000 UTC m=+41.439403560) (total time: 16.439387269s):
Trace[195603164]: [16.439279288s] [16.439279288s] Objects listed
I0316 01:31:13.179847 1 trace.go:81] Trace[973338140]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.726034862 +0000 UTC m=+41.426367910) (total time: 16.453782937s):
Trace[973338140]: [16.453684439s] [16.453684439s] Objects listed
I0316 01:31:13.374846 1 trace.go:81] Trace[25725130]: "Reflector k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223 ListAndWatch" (started: 2020-03-16 01:30:56.865061138 +0000 UTC m=+41.565394187) (total time: 16.509741041s):
Trace[25725130]: [16.341250117s] [16.341250117s] Objects listed
I0316 01:31:13.385107 1 trace.go:81] Trace[324039104]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.822673273 +0000 UTC m=+41.523006291) (total time: 16.562396125s):
Trace[324039104]: [16.562198342s] [16.562198342s] Objects listed
I0316 01:31:13.491892 1 trace.go:81] Trace[811062640]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.734207097 +0000 UTC m=+41.434540118) (total time: 16.757303463s):
Trace[811062640]: [16.75682728s] [16.75682728s] Objects listed
I0316 01:31:14.312649 1 controller_utils.go:1027] Waiting for caches to sync for scheduler controller
I0316 01:31:14.414804 1 controller_utils.go:1034] Caches are synced for scheduler controller
I0316 01:31:14.414908 1 leaderelection.go:217] attempting to acquire leader lease kube-system/kube-scheduler...
I0316 01:31:41.677546 1 leaderelection.go:227] successfully acquired lease kube-system/kube-scheduler
==> kubelet <==
-- Logs begin at Sun 2020-03-15 18:14:05 PDT, end at Sun 2020-03-15 20:13:30 PDT. --
Mar 15 20:12:46 minikf kubelet[4552]: E0315 20:12:46.856668 4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: E0315 20:12:47.857242 4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: ]
Mar 15 20:12:48 minikf kubelet[4552]: I0315 20:12:48.389137 4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:12:52 minikf kubelet[4552]: E0315 20:12:52.878461 4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:52 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:52 minikf kubelet[4552]: ]
Mar 15 20:12:52 minikf kubelet[4552]: W0315 20:12:52.901225 4552 reflector.go:289] object-"default"/"admission-webhook-config-f8bhm66bg2": watch of *v1.ConfigMap ended with: too old resource version: 52075 (55931)
Mar 15 20:12:53 minikf kubelet[4552]: I0315 20:12:53.389364 4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:12:53 minikf kubelet[4552]: I0315 20:12:53.389418 4552 kubelet.go:1963] SyncLoop (container unhealthy): "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:57 minikf kubelet[4552]: W0315 20:12:57.606129 4552 reflector.go:289] object-"kubeflow"/"jupyter-web-app-parameters": watch of *v1.ConfigMap ended with: too old resource version: 51879 (55978)
Mar 15 20:12:57 minikf kubelet[4552]: E0315 20:12:57.857035 4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: E0315 20:12:58.863881 4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: ]
Mar 15 20:13:01 minikf kubelet[4552]: E0315 20:13:01.857039 4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:01 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:01 minikf kubelet[4552]: ]
Mar 15 20:13:03 minikf kubelet[4552]: I0315 20:13:03.388729 4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:04 minikf kubelet[4552]: E0315 20:13:04.860442 4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:04 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:04 minikf kubelet[4552]: ]
Mar 15 20:13:08 minikf kubelet[4552]: I0315 20:13:08.388879 4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:12 minikf kubelet[4552]: E0315 20:13:12.856976 4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: I0315 20:13:13.389021 4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:13 minikf kubelet[4552]: I0315 20:13:13.389075 4552 kubelet.go:1963] SyncLoop (container unhealthy): "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: E0315 20:13:13.842313 4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: E0315 20:13:13.860068 4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: ]
Mar 15 20:13:14 minikf kubelet[4552]: E0315 20:13:14.629564 4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: E0315 20:13:14.856604 4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: ]
Mar 15 20:13:17 minikf kubelet[4552]: E0315 20:13:17.856361 4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:17 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:17 minikf kubelet[4552]: ]
Mar 15 20:13:24 minikf kubelet[4552]: E0315 20:13:24.855963 4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: E0315 20:13:25.856404 4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: E0315 20:13:25.859177 4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: ]
Mar 15 20:13:28 minikf kubelet[4552]: E0315 20:13:28.860201 4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:28 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:28 minikf kubelet[4552]: ]
Mar 15 20:13:29 minikf kubelet[4552]: W0315 20:13:29.021076 4552 reflector.go:289] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMap ended with: too old resource version: 53788 (56278)
Mar 15 20:13:29 minikf kubelet[4552]: E0315 20:13:29.855845 4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:29 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:29 minikf kubelet[4552]: ]
==> storage-provisioner <==
E0316 01:29:58.667283 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:29:58.900926 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:01.069924 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
[... the same three "connection refused" list errors repeat every few seconds through 01:30:43 ...]
E0316 01:30:58.390643 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: net/http: TLS handshake timeout
E0316 01:30:58.466686 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: net/http: TLS handshake timeout
E0316 01:30:58.564805 1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: net/http: TLS handshake timeout
What did you expect to happen: I expected the trials to run without problems, since the log file is created successfully and the file-metric-collector is configured to read that generated log file. I'd appreciate it if you could let me know what's wrong.
- The Keras code and YAML file that I was trying to run
import tensorflow as tf
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import ResNet50
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.inception_v3 import InceptionV3
import logging
import argparse

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
(x_train, x_valid) = x_train[100:1100], x_train[:100]
(y_train, y_valid) = y_train[100:1100], y_train[:100]

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--num_classes', type=int, default=10,
                        help='the number of classes')
    parser.add_argument('--batch_size', type=int, default=256,
                        help='Number of samples per gradient update.')
    parser.add_argument('--epochs', type=int, default=5,
                        help='Number of epochs to run trainer.')
    parser.add_argument('--learning_rate', type=float, default=0.001,
                        help='Initial learning rate')
    parser.add_argument('--network', type=str, default='vgg16',
                        help='Open source deep learning model')
    parser.add_argument('--image_shape', default='32, 32, 3',
                        help='Shape of training images')
    parser.add_argument('--log_dir', type=str, default='./cifar10.log',
                        help='Summaries log directory')
    args = parser.parse_args()

    logging.basicConfig(filename=args.log_dir, level=logging.DEBUG)

    img_shape = args.image_shape.split(',')
    img_w = int(img_shape[0].strip())
    img_h = int(img_shape[1].strip())
    img_c = int(img_shape[2].strip())

    if args.network == 'resnet50':
        model = ResNet50(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
    elif args.network == 'vgg16':
        model = VGG16(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
    elif args.network == 'vgg19':
        model = VGG19(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
    elif args.network == 'inceptionv3':
        model = InceptionV3(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))

    opt = Adam(lr=args.learning_rate)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])

    datagen = ImageDataGenerator()
    print(">>> Data Loaded. Training starts.")
    for e in range(args.epochs):
        print("\nTotal Epoch {}/{}".format(e + 1, args.epochs))
        history = model.fit_generator(generator=datagen.flow(x_train, y_train, batch_size=args.batch_size),
                                      steps_per_epoch=int(len(x_train) / args.batch_size) + 1,
                                      epochs=1,
                                      verbose=1,
                                      validation_data=(x_valid, y_valid))
        logging.info('\n{{metricName: accuracy, metricValue: {:.4f}}};{{metricName: loss, metricValue: {:.4f}}}\n'.format(
            history.history['val_accuracy'][-1], history.history['val_loss'][-1]))
        print("Training-Accuracy={}".format(history.history['accuracy'][-1]))
        print("Training-Loss={}".format(history.history['loss'][-1]))
        print("Validation-Accuracy={}".format(history.history['val_accuracy'][-1]))
        print("Validation-Loss={}".format(history.history['val_loss'][-1]))
apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
metadata:
namespace: kubeflow-user
name: keras-example
spec:
objective:
type: maximize
goal: 0.99
objectiveMetricName: accuracy
additionalMetricNames:
- loss
metricsCollectorSpec:
source:
filter:
metricsFormat:
- "{metricName: ([\\w|-]+), metricValue: ((-?\\d+)(\\.\\d+)?)}"
fileSystemPath:
path: "/var/katib_keras/cifar10.log"
kind: File
collector:
kind: File
algorithm:
algorithmName: random
parallelTrialCount: 3
maxTrialCount: 3
maxFailedTrialCount: 3
parameters:
- name: --learning_rate
parameterType: double
feasibleSpace:
min: "0.01"
max: "0.05"
- name: --batch_size
parameterType: int
feasibleSpace:
min: "100"
max: "200"
trialTemplate:
goTemplate:
rawTemplate: |-
apiVersion: "kubeflow.org/v1"
kind: TFJob
metadata:
name: {{.Trial}}
namespace: {{.NameSpace}}
spec:
tfReplicaSpecs:
Worker:
replicas: 1
restartPolicy: OnFailure
template:
spec:
containers:
- name: tensorflow
image: docker.io/jeun0241/katib_keras_v3
imagePullPolicy: Always
command:
- "python"
- "/var/katib_keras/main.py"
- "--epochs=2"
- "--log_dir='/var/katib_keras/cifar10.log'"
{{- with .HyperParameters}}
{{- range .}}
- "{{.Name}}={{.Value}}"
{{- end}}
{{- end}}
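Before submitting the Experiment, the metricsFormat regex can be sanity-checked locally against a line in the format the training script writes (a minimal sketch; the sample line is copied from the generated log file below):

import re

# Same pattern as in metricsCollectorSpec.source.filter.metricsFormat above
pattern = re.compile(r'{metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}')

# Sample line copied from the generated log file below
line = '{metricName: accuracy, metricValue: 0.0972};{metricName: loss, metricValue: 2.3030}'
for name, value, *_ in pattern.findall(line):
    print(name, value)  # prints: accuracy 0.0972, then loss 2.3030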
- The generated log file
WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:4070: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.
WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
INFO:root:
{metricName: accuracy, metricValue: 0.0972};{metricName: loss, metricValue: 2.3030}
INFO:root:
{metricName: accuracy, metricValue: 0.0976};{metricName: loss, metricValue: 2.3027}
INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}
INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}
INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}
Anything else you would like to add: In addition, I pulled the latest version of the file-metrics-collector image and tried again, but the pod still automatically pulls the v0.8.0 image, as shown below.
Normal Pulling 28m kubelet, minikube Pulling image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
Normal Pulled 28m kubelet, minikube Successfully pulled image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
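If I understand the v0.8.0 deployment correctly, the sidecar image is injected by the Katib controller from the katib-config ConfigMap in the kubeflow namespace, not from anything in the trial template, so pulling a newer image locally has no effect. Assuming the standard layout of that ConfigMap (unverified here), overriding the collector image would look roughly like this:

vagrant@minikf:~$ kubectl edit configmap katib-config -n kubeflow
# ...then, in the metrics-collector-sidecar entry, point the "File" collector
# at the desired tag, e.g. (hypothetical value):
#   "File": { "image": "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:<desired-tag>" }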
Environment:
- Katib version: v1alpha3 (all images at v0.8.0)
  gcr.io/kubeflow-images-public/katib/v1alpha3/katib-ui                 v0.8.0  540d9308c9f6  4 weeks ago  54.4MB
  gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector   v0.8.0  ff9ce96cf37c  4 weeks ago  25.2MB
  gcr.io/kubeflow-images-public/katib/v1alpha3/katib-controller         v0.8.0  7c5162abd775  4 weeks ago  53.8MB
  gcr.io/kubeflow-images-public/katib/v1alpha3/suggestion-hyperopt      v0.8.0  56c0051f100c  4 weeks ago  1.23GB
  gcr.io/kubeflow-images-public/katib/v1alpha3/katib-db-manager         v0.8.0  32229959fe81  4 weeks ago  28.5MB
- MiniKF box version: arrikto/minikf (virtualbox, 20200305.1.0)
- Kubeflow version:
- Minikube version: v1.2.0
- Kubernetes version (from kubectl version):
  Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
- OS (from /etc/os-release):
Top GitHub Comments
In my case, the log file was not created because I used the wrong key for validation accuracy. After changing the key from 'val_accuracy' to 'val_acc', it ran normally. This seems to depend on the Keras version. Thanks for your help!
You should check the logs from the metrics-logger-and-collector container. Maybe your training container doesn't have permission to create files under the /var/ folder, so you may have to change the folder to /train/ or something similar?
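For reference, these are the standard commands for pulling those logs from one of the failing pods (the pod name is taken from the kubelet output above; --previous shows the logs of the last crashed container):

vagrant@minikf:~$ kubectl logs keras-example-2dfmhlcm-worker-0 -c metrics-logger-and-collector -n kubeflow-user
vagrant@minikf:~$ kubectl logs keras-example-2dfmhlcm-worker-0 -c tensorflow -n kubeflow-user --previous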