
Pod status becomes CrashLoopBackOff when using file-metric-collector

/kind bug

What steps did you take and what happened: Hi, I am trying to run Keras code that saves its metrics to a file, using the file-metric-collector, but I run into the problem below: the worker pods go into CrashLoopBackOff while the Trials still show Running status.

vagrant@minikf:~$ kubectl get pods -n kubeflow-user
NAME                                    READY   STATUS             RESTARTS   AGE
keras-example-2dfmhlcm-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-2vzmffv8-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-jc9sn4nw-worker-0         0/2     CrashLoopBackOff   14         15m
keras-example-random-7cf48d4bf7-dmbtb   1/1     Running            0          15m

vagrant@minikf:~$ kubectl get trial -n kubeflow-user
NAME                     TYPE      STATUS   AGE
keras-example-2dfmhlcm   Running   True     15m
keras-example-2vzmffv8   Running   True     15m
keras-example-jc9sn4nw   Running   True     15m
vagrant@minikf:~$ kubectl describe pod keras-example-2dfmhlcm-worker-0 -n kubeflow-user
Name:               keras-example-2dfmhlcm-worker-0
Namespace:          kubeflow-user
Priority:           0
PriorityClassName:  <none>
Node:               minikube/10.10.10.10
Start Time:         Sun, 15 Mar 2020 19:40:58 -0700
Labels:             controller-name=tf-operator
                    group-name=kubeflow.org
                    job-name=keras-example-2dfmhlcm
                    job-role=master
                    tf-job-name=keras-example-2dfmhlcm
                    tf-replica-index=0
                    tf-replica-type=worker
Annotations:        sidecar.istio.io/inject: false
Status:             Running
IP:                 172.17.0.70
Controlled By:      TFJob/keras-example-2dfmhlcm
Init Containers:
  startup-lock-init-container:
    Container ID:  docker://c14afd242090537a5409fc6d684b2dad8fca03221753de28e6a5177bbf5c9f96
    Image:         gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21
    Image ID:      docker-pullable://gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21
    Port:          <none>
    Host Port:     <none>
    Args:
      --host
      $(HOST_IP)
      --port
      10101
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 15 Mar 2020 19:41:01 -0700
      Finished:     Sun, 15 Mar 2020 19:41:01 -0700
    Ready:          True
    Restart Count:  0
    Environment:
      HOST_IP:   (v1:status.hostIP)
    Mounts:     <none>
Containers:
  tensorflow:
    Container ID:  docker://4e208c08d87d5d1c16d6171ce64b9ac0e0fcfeffab3f839417650eae3b4aa85c
    Image:         docker.io/jeun0241/katib_keras_v3
    Image ID:      docker-pullable://jeun0241/katib_keras_v3@sha256:58f172fc6e388cbb6879991305299c1f096205f11991b8590bd022a7b53f1b99
    Port:          2222/TCP
    Host Port:     0/TCP
    Command:
      sh
      -c
    Args:
      python /var/katib_keras/main.py --epochs=2 --log_dir='/var/katib_keras/cifar10.log' --learning_rate=0.017554690620699392 --batch_size=155 && echo completed > /var/katib_keras/$$$$.pid
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Sun, 15 Mar 2020 20:07:25 -0700
      Finished:     Sun, 15 Mar 2020 20:07:25 -0700
    Ready:          False
    Restart Count:  10
    Requests:
      cpu:        1m
      memory:     1M
    Environment:  <none>
    Mounts:
      /var/katib_keras from metrics-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-8r72s (ro)
  metrics-logger-and-collector:
    Container ID:  docker://879aa09616d3cafaec341adb859b5c59637fd26bd137b0f6e23e3a84d7ef55a2
    Image:         gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0
    Image ID:      docker-pullable://gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector@sha256:9869c814da18054ee339f25229a72a74c962e077c8fa8b099ecf3f60d495d13d
    Port:          <none>
    Host Port:     <none>
    Args:
      -t
      keras-example-2dfmhlcm
      -m
      accuracy;loss
      -s
      katib-db-manager.kubeflow:6789
      -path
      /var/katib_keras/cifar10.log
      -f
      {metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Sun, 15 Mar 2020 20:07:25 -0700
      Finished:     Sun, 15 Mar 2020 20:07:25 -0700
    Ready:          False
    Restart Count:  10
    Limits:
      cpu:                500m
      ephemeral-storage:  5Gi
      memory:             100Mi
    Requests:
      cpu:                50m
      ephemeral-storage:  500Mi
      memory:             10Mi
    Environment:          <none>
    Mounts:
      /var/katib_keras from metrics-volume (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-8r72s:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-8r72s
    Optional:    false
  metrics-volume:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  28m                    default-scheduler  Successfully assigned kubeflow-user/keras-example-2dfmhlcm-worker-0 to minikube
  Normal   Pulled     28m                    kubelet, minikube  Container image "gcr.io/arrikto-public/startup-lock-init@sha256:0fbe996a2f6b380d7c566ba16255ec034faec983c2661da778fe09b3e744ad21" already present on machine
  Normal   Created    28m                    kubelet, minikube  Created container startup-lock-init-container
  Normal   Started    28m                    kubelet, minikube  Started container startup-lock-init-container
  Normal   Pulling    28m                    kubelet, minikube  Pulling image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
  Normal   Pulled     28m                    kubelet, minikube  Successfully pulled image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
  Normal   Started    28m (x2 over 28m)      kubelet, minikube  Started container metrics-logger-and-collector
  Warning  BackOff    28m (x2 over 28m)      kubelet, minikube  Back-off restarting failed container
  Normal   Pulled     27m (x3 over 28m)      kubelet, minikube  Container image "docker.io/jeun0241/katib_keras_v3" already present on machine
  Normal   Created    27m (x3 over 28m)      kubelet, minikube  Created container tensorflow
  Normal   Created    27m (x3 over 28m)      kubelet, minikube  Created container metrics-logger-and-collector
  Normal   Pulled     27m (x2 over 28m)      kubelet, minikube  Container image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0" already present on machine
  Normal   Started    27m (x3 over 28m)      kubelet, minikube  Started container tensorflow
  Warning  BackOff    3m22s (x123 over 28m)  kubelet, minikube  Back-off restarting failed container
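
For reference, the metrics-logger-and-collector container above is started with -path /var/katib_keras/cifar10.log and the -f filter regex {metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}. Assuming the standard Katib v1alpha3 file collector behaviour, the training script would need to append lines to that file in roughly this shape for the filter to match (the metric names and values below are only illustrative, not taken from the actual run):

{metricName: accuracy, metricValue: 0.8534}
{metricName: loss, metricValue: 0.4312}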
  • docker logs
vagrant@minikf:~$ docker ps  
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
8d94203a1032        k8s.gcr.io/pause:3.1   "/pause"                 53 minutes ago      Up 53 minutes                           k8s_POD_keras-example-jc9sn4nw-worker-0_kubeflow-user_9179e635-672f-11ea-87be-0800271cfe03_0
0773985eb93b        k8s.gcr.io/pause:3.1   "/pause"                 53 minutes ago      Up 53 minutes                           k8s_POD_keras-example-2dfmhlcm-worker-0_kubeflow-user_91374b3b-672f-11ea-87be-0800271cfe03_0
51c959588e14        k8s.gcr.io/pause:3.1   "/pause"                 53 minutes ago      Up 53 minutes                           k8s_POD_keras-example-2vzmffv8-worker-0_kubeflow-user_90fbf8f1-672f-11ea-87be-0800271cfe03_0
3307e1b301d1        56c0051f100c           "python main.py"         54 minutes ago      Up 54 minutes                           k8s_suggestion_keras-example-random-7cf48d4bf7-dmbtb_kubeflow-user_83fb88e9-672f-11ea-87be-0800271cfe03_0
c41d550d2dbb        k8s.gcr.io/pause:3.1   "/pause"                 54 minutes ago      Up 54 minutes                           k8s_POD_keras-example-random-7cf48d4bf7-dmbtb_kubeflow-user_83fb88e9-672f-11ea-87be-0800271cfe03_0
vagrant@minikf:~$ docker logs 3307e1b301d1
INFO:hyperopt.utils:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
INFO:hyperopt.fmin:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
ERROR:grpc._server:Exception calling application: Method not implemented!
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/grpc/_server.py", line 434, in _call_behavior
    response_or_iterator = behavior(argument, context)
  File "/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1alpha3/python/api_pb2_grpc.py", line 135, in ValidateAlgorithmSettings
    raise NotImplementedError('Method not implemented!')
NotImplementedError: Method not implemented!
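
The docker logs above only cover the suggestion container; the actual exit reason for the crashing worker containers should be visible in their own logs. One possible way to pull them (pod and container names taken from the describe output above; --previous shows the last terminated attempt):

kubectl logs keras-example-2dfmhlcm-worker-0 -c tensorflow -n kubeflow-user --previous
kubectl logs keras-example-2dfmhlcm-worker-0 -c metrics-logger-and-collector -n kubeflow-user --previous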
  • minikube logs
vagrant@minikf:~$ minikube logs
==> coredns <==
.:53
2020-03-16T01:31:35.034Z [INFO] CoreDNS-1.3.1
2020-03-16T01:31:35.034Z [INFO] linux/amd64, go1.11.4, 6b56a9c
CoreDNS-1.3.1
linux/amd64, go1.11.4, 6b56a9c
2020-03-16T01:31:35.034Z [INFO] plugin/reload: Running configuration MD5 = 599b9eb76b8c147408aed6a0bbe0f669

==> dmesg <==
[  +0.000001]  __x64_sys_write+0x1e/0x20
[  +0.000001]  do_syscall_64+0x5e/0x140
[  +0.000002]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  +0.000001] RIP: 0033:0x56116f6f7580
[  +0.000001] Code: 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 49 c7 c2 00 00 00 00 49 c7 c0 00 00 00 00 49 c7 c1 00 00 00 00 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[  +0.000001] RSP: 002b:000000c000cc4910 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  +0.000001] RAX: ffffffffffffffda RBX: 000000c00005c000 RCX: 000056116f6f7580
[  +0.000000] RDX: 0000000000000002 RSI: 000000c000cc4afe RDI: 000000000000000a
[  +0.000001] RBP: 000000c000cc4960 R08: 0000000000000000 R09: 0000000000000000
[  +0.000001] R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000c
[  +0.000000] R13: 0000000000000032 R14: 0000561171365e68 R15: 0000000000000000
[  +0.000002] ---[ end trace b0b550b7b8142a1e ]---
[ +39.415462] systemd-fstab-generator[3198]: Mount point ext3 is not a valid path, ignoring.
[  +0.178858] systemd-fstab-generator[3230]: Mount point ext3 is not a valid path, ignoring.
[  +0.196463] systemd-fstab-generator[3283]: Mount point ext3 is not a valid path, ignoring.
[  +0.744680] systemd-fstab-generator[3338]: Mount point ext3 is not a valid path, ignoring.
[  +0.080607] systemd-fstab-generator[3358]: Mount point ext3 is not a valid path, ignoring.
[  +0.187665] systemd-fstab-generator[3402]: Mount point ext3 is not a valid path, ignoring.
[  +0.105639] systemd-fstab-generator[3422]: Mount point ext3 is not a valid path, ignoring.
[  +0.334824] systemd-fstab-generator[3469]: Mount point ext3 is not a valid path, ignoring.
[  +0.184957] systemd-fstab-generator[3489]: Mount point ext3 is not a valid path, ignoring.
[  +0.224083] systemd-fstab-generator[3534]: Mount point ext3 is not a valid path, ignoring.
[  +0.138828] systemd-fstab-generator[3554]: Mount point ext3 is not a valid path, ignoring.
[  +0.236530] systemd-fstab-generator[3594]: Mount point ext3 is not a valid path, ignoring.
[  +0.257815] systemd-fstab-generator[3614]: Mount point ext3 is not a valid path, ignoring.
[  +0.195213] systemd-fstab-generator[3654]: Mount point ext3 is not a valid path, ignoring.
[Mar15 18:15] systemd-fstab-generator[3674]: Mount point ext3 is not a valid path, ignoring.
[  +0.293551] systemd-fstab-generator[3715]: Mount point ext3 is not a valid path, ignoring.
[  +0.110957] systemd-fstab-generator[3735]: Mount point ext3 is not a valid path, ignoring.
[  +8.617339] systemd-fstab-generator[4130]: Mount point ext3 is not a valid path, ignoring.
[ +18.227501] systemd-fstab-generator[4492]: Mount point ext3 is not a valid path, ignoring.
[Mar15 18:16] tee (6915): /proc/6052/oom_adj is deprecated, please use /proc/6052/oom_score_adj instead.
[Mar15 18:20] printk: rsyslogd (20063): Attempt to access syslog with CAP_SYS_ADMIN but no CAP_SYSLOG (deprecated).
[  +7.974861] db_root: cannot open: /etc/target
[Mar15 18:25] EXT4-fs (dm-3): mounting with "discard" option, but the device does not support discard
[Mar15 18:26] overlayfs: upperdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
[  +0.000003] overlayfs: workdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
[ +20.280790] EXT4-fs (dm-6): mounting with "discard" option, but the device does not support discard
[ +25.443567] overlayfs: lowerdir is in-use as upperdir/workdir
[Mar15 18:27] EXT4-fs (dm-13): mounting with "discard" option, but the device does not support discard
[ +12.292494] EXT4-fs (dm-14): mounting with "discard" option, but the device does not support discard
[  +7.222790] EXT4-fs (dm-15): mounting with "discard" option, but the device does not support discard
[Mar15 18:28] show_signal: 5 callbacks suppressed
[ +22.486167] overlayfs: lowerdir is in-use as upperdir/workdir
[Mar15 18:33] hrtimer: interrupt took 7986382 ns
[Mar15 18:36] systemd-fstab-generator[15565]: Mount point ext3 is not a valid path, ignoring.
[  +2.217612] systemd-fstab-generator[15792]: Mount point ext3 is not a valid path, ignoring.
[  +1.140860] systemd-fstab-generator[15863]: Mount point ext3 is not a valid path, ignoring.
[  +0.893805] systemd-fstab-generator[15987]: Mount point ext3 is not a valid path, ignoring.
[  +1.340843] systemd-fstab-generator[16105]: Mount point ext3 is not a valid path, ignoring.

==> kernel <==
 20:13:30 up  1:59,  2 users,  load average: 3.34, 4.60, 6.45
Linux minikf 5.2.0-050200rc6-generic #201906222033 SMP Sun Jun 23 00:36:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

==> kube-addon-manager <==
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:06:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:07:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:07:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:08:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:08:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:09:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:09:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:10:08+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:10:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:11:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:11:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:12:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:12:09+00:00 ==
INFO: Leader is minikf
INFO: == Kubernetes addon ensure completed at 2020-03-16T03:13:07+00:00 ==
INFO: == Reconciling with deprecated label ==
error: no objects passed to apply
INFO: == Reconciling with addon-manager label ==
serviceaccount/storage-provisioner unchanged
INFO: == Kubernetes addon reconcile completed at 2020-03-16T03:13:09+00:00 ==

==> kube-apiserver <==
E0316 02:59:13.304935       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 02:59:13.304945       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:01:13.301055       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:01:13.303343       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:01:13.303355       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:01:13.303374       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:01:13.305685       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:01:13.305694       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:02:13.303538       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:02:13.305867       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:02:13.305880       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:02:13.305894       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:02:13.307715       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:02:13.307725       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:04:13.306083       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:04:13.308255       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:04:13.308268       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:04:13.308284       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:04:13.309752       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:04:13.309762       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:06:13.310792       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:06:13.312923       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:06:13.312938       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:06:13.312952       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:06:13.314817       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:06:13.314830       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:07:13.313283       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:07:13.315558       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:07:13.315573       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:07:13.315587       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:07:13.317431       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:07:13.317463       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:09:13.315930       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:09:13.318035       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:09:13.318050       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:09:13.318065       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:09:13.320185       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:09:13.320197       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:11:13.317644       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:11:13.320879       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:11:13.320893       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:11:13.320914       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:11:13.323729       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:11:13.323741       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.
I0316 03:12:13.323358       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.webhook.cert-manager.io
E0316 03:12:13.355797       1 controller.go:114] loading OpenAPI spec for "v1beta1.webhook.cert-manager.io" failed with: OpenAPI spec does not exist
I0316 03:12:13.355817       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.webhook.cert-manager.io: Rate Limited Requeue.
I0316 03:12:13.355839       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.custom.metrics.k8s.io
E0316 03:12:13.363558       1 controller.go:114] loading OpenAPI spec for "v1beta1.custom.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0316 03:12:13.363575       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.custom.metrics.k8s.io: Rate Limited Requeue.

==> kube-proxy <==
I0316 01:16:09.844831       1 config.go:202] Starting service config controller
I0316 01:16:09.844840       1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0316 01:16:09.945577       1 controller_utils.go:1034] Caches are synced for service config controller
I0316 01:16:09.945798       1 controller_utils.go:1034] Caches are synced for endpoints config controller
I0316 01:24:40.451118       1 trace.go:81] Trace[1232360537]: "iptables restore" (started: 2020-03-16 01:24:37.169600297 +0000 UTC m=+509.131752355) (total time: 3.281475998s):
Trace[1232360537]: [3.281475998s] [3.281426975s] END
I0316 01:29:08.331863       1 trace.go:81] Trace[902179412]: "iptables restore" (started: 2020-03-16 01:29:05.899390281 +0000 UTC m=+777.861542363) (total time: 2.432444272s):
Trace[902179412]: [2.432444272s] [2.432382409s] END
E0316 01:30:01.341044       1 reflector.go:270] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&resourceVersion=5543&timeout=5m27s&timeoutSeconds=327&watch=true: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:11.636815       1 trace.go:81] Trace[2040050030]: "iptables save" (started: 2020-03-16 01:29:51.576568137 +0000 UTC m=+823.538720222) (total time: 20.060206033s):
Trace[2040050030]: [20.060206033s] [20.060206033s] END
I0316 01:30:14.491502       1 trace.go:81] Trace[193692283]: "iptables save" (started: 2020-03-16 01:30:11.636914345 +0000 UTC m=+843.599066411) (total time: 2.854547766s):
Trace[193692283]: [2.854547766s] [2.854547766s] END
E0316 01:30:19.418829       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:19.418920       1 reflector.go:270] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&resourceVersion=4685&timeout=7m53s&timeoutSeconds=473&watch=true: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:24.827656       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:24.827805       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:26.842614       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:26.842772       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:27.070086       1 trace.go:81] Trace[1540084176]: "iptables restore" (started: 2020-03-16 01:30:15.223623929 +0000 UTC m=+847.185775975) (total time: 11.846428497s):
Trace[1540084176]: [11.846428497s] [11.846385203s] END
E0316 01:30:28.265496       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:28.405258       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:31.218286       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:31.219128       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:34.592543       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:35.985587       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:38.099759       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:39.146485       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:39.146523       1 trace.go:81] Trace[1740911148]: "iptables save" (started: 2020-03-16 01:30:32.037280192 +0000 UTC m=+863.999462098) (total time: 7.0430792s):
Trace[1740911148]: [7.0430792s] [7.0430792s] END
E0316 01:30:39.158178       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:40.428729       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:40.429840       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:41.615398       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:41.616731       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:42.781473       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:42.781581       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:43.797401       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
E0316 01:30:43.797882       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: connect: connection refused
I0316 01:30:58.450512       1 trace.go:81] Trace[1298402344]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:46.476398984 +0000 UTC m=+878.438551034) (total time: 11.974090209s):
Trace[1298402344]: [11.974090209s] [11.974090209s] END
E0316 01:30:58.450535       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:58.450740       1 trace.go:81] Trace[912542047]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.07242909 +0000 UTC m=+877.034581193) (total time: 13.378297644s):
Trace[912542047]: [13.378297644s] [13.378297644s] END
E0316 01:30:58.450749       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://localhost:8443/api/v1/endpoints?labelSelector=%!s(MISSING)ervice.kubernetes.io%!F(MISSING)service-proxy-name&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:31:19.842368       1 trace.go:81] Trace[1372274666]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:59.526358196 +0000 UTC m=+891.488510248) (total time: 20.315965682s):
Trace[1372274666]: [20.315849461s] [20.315849461s] Objects listed
I0316 01:31:20.151919       1 trace.go:81] Trace[1829181234]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:59.525840543 +0000 UTC m=+891.487992616) (total time: 20.626047204s):
Trace[1829181234]: [20.625887426s] [20.625887426s] Objects listed

==> kube-scheduler <==
Trace[688142464]: [10.012338371s] [10.012338371s] END
E0316 01:30:55.721424       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.StorageClass: Get https://localhost:8443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721602       1 trace.go:81] Trace[1731965608]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.709600475 +0000 UTC m=+30.409933491) (total time: 10.011985732s):
Trace[1731965608]: [10.011985732s] [10.011985732s] END
E0316 01:30:55.721614       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.PersistentVolumeClaim: Get https://localhost:8443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721784       1 trace.go:81] Trace[189113385]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.708434939 +0000 UTC m=+30.408767988) (total time: 10.013335922s):
Trace[189113385]: [10.013335922s] [10.013335922s] END
E0316 01:30:55.721793       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.StatefulSet: Get https://localhost:8443/apis/apps/v1/statefulsets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.721821       1 trace.go:81] Trace[107151580]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.708463688 +0000 UTC m=+30.408796709) (total time: 10.013339543s):
Trace[107151580]: [10.013339543s] [10.013339543s] END
E0316 01:30:55.721834       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.PodDisruptionBudget: Get https://localhost:8443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.731799       1 trace.go:81] Trace[37345855]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.730270778 +0000 UTC m=+30.430603788) (total time: 10.001461625s):
Trace[37345855]: [10.001461625s] [10.001461625s] END
E0316 01:30:55.731856       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.PersistentVolume: Get https://localhost:8443/api/v1/persistentvolumes?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.732081       1 trace.go:81] Trace[1154873847]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.730339884 +0000 UTC m=+30.430672894) (total time: 10.001699551s):
Trace[1154873847]: [10.001699551s] [10.001699551s] END
E0316 01:30:55.732097       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.ReplicationController: Get https://localhost:8443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.813474       1 trace.go:81] Trace[1448715838]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.809623636 +0000 UTC m=+30.509956673) (total time: 10.003823513s):
Trace[1448715838]: [10.003823513s] [10.003823513s] END
E0316 01:30:55.813530       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.822105       1 trace.go:81] Trace[663455364]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:45.815380857 +0000 UTC m=+30.515713908) (total time: 10.006697176s):
Trace[663455364]: [10.006697176s] [10.006697176s] END
E0316 01:30:55.822163       1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.ReplicaSet: Get https://localhost:8443/apis/apps/v1/replicasets?limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:30:55.864035       1 trace.go:81] Trace[55394677]: "Reflector k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223 ListAndWatch" (started: 2020-03-16 01:30:45.855584604 +0000 UTC m=+30.555917684) (total time: 10.008402184s):
Trace[55394677]: [10.008402184s] [10.008402184s] END
E0316 01:30:55.864061       1 reflector.go:126] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=status.phase%3DFailed%!C(MISSING)status.phase%3DSucceeded&limit=500&resourceVersion=0: net/http: TLS handshake timeout
I0316 01:31:12.742992       1 trace.go:81] Trace[748316010]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.81370936 +0000 UTC m=+41.514042379) (total time: 15.929244629s):
Trace[748316010]: [15.921722302s] [15.921722302s] Objects listed
I0316 01:31:13.087143       1 trace.go:81] Trace[624725376]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.72423489 +0000 UTC m=+41.424569591) (total time: 16.362840695s):
Trace[624725376]: [16.36276513s] [16.36276513s] Objects listed
I0316 01:31:13.101093       1 trace.go:81] Trace[1856609687]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.724733813 +0000 UTC m=+41.425066834) (total time: 16.37632406s):
Trace[1856609687]: [16.37627825s] [16.37627825s] Objects listed
I0316 01:31:13.101424       1 trace.go:81] Trace[1564355179]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.694817873 +0000 UTC m=+41.395150985) (total time: 16.406584871s):
Trace[1564355179]: [16.406562153s] [16.406562153s] Objects listed
I0316 01:31:13.104366       1 trace.go:81] Trace[1085617361]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.733182796 +0000 UTC m=+41.433515870) (total time: 16.37114514s):
Trace[1085617361]: [16.371089522s] [16.371089522s] Objects listed
I0316 01:31:13.178539       1 trace.go:81] Trace[195603164]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.739070512 +0000 UTC m=+41.439403560) (total time: 16.439387269s):
Trace[195603164]: [16.439279288s] [16.439279288s] Objects listed
I0316 01:31:13.179847       1 trace.go:81] Trace[973338140]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.726034862 +0000 UTC m=+41.426367910) (total time: 16.453782937s):
Trace[973338140]: [16.453684439s] [16.453684439s] Objects listed
I0316 01:31:13.374846       1 trace.go:81] Trace[25725130]: "Reflector k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:223 ListAndWatch" (started: 2020-03-16 01:30:56.865061138 +0000 UTC m=+41.565394187) (total time: 16.509741041s):
Trace[25725130]: [16.341250117s] [16.341250117s] Objects listed
I0316 01:31:13.385107       1 trace.go:81] Trace[324039104]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.822673273 +0000 UTC m=+41.523006291) (total time: 16.562396125s):
Trace[324039104]: [16.562198342s] [16.562198342s] Objects listed
I0316 01:31:13.491892       1 trace.go:81] Trace[811062640]: "Reflector k8s.io/client-go/informers/factory.go:133 ListAndWatch" (started: 2020-03-16 01:30:56.734207097 +0000 UTC m=+41.434540118) (total time: 16.757303463s):
Trace[811062640]: [16.75682728s] [16.75682728s] Objects listed
I0316 01:31:14.312649       1 controller_utils.go:1027] Waiting for caches to sync for scheduler controller
I0316 01:31:14.414804       1 controller_utils.go:1034] Caches are synced for scheduler controller
I0316 01:31:14.414908       1 leaderelection.go:217] attempting to acquire leader lease  kube-system/kube-scheduler...
I0316 01:31:41.677546       1 leaderelection.go:227] successfully acquired lease kube-system/kube-scheduler

==> kubelet <==
-- Logs begin at Sun 2020-03-15 18:14:05 PDT, end at Sun 2020-03-15 20:13:30 PDT. --
Mar 15 20:12:46 minikf kubelet[4552]: E0315 20:12:46.856668    4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: E0315 20:12:47.857242    4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:47 minikf kubelet[4552]: ]
Mar 15 20:12:48 minikf kubelet[4552]: I0315 20:12:48.389137    4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:12:52 minikf kubelet[4552]: E0315 20:12:52.878461    4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:52 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:52 minikf kubelet[4552]: ]
Mar 15 20:12:52 minikf kubelet[4552]: W0315 20:12:52.901225    4552 reflector.go:289] object-"default"/"admission-webhook-config-f8bhm66bg2": watch of *v1.ConfigMap ended with: too old resource version: 52075 (55931)
Mar 15 20:12:53 minikf kubelet[4552]: I0315 20:12:53.389364    4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:12:53 minikf kubelet[4552]: I0315 20:12:53.389418    4552 kubelet.go:1963] SyncLoop (container unhealthy): "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:57 minikf kubelet[4552]: W0315 20:12:57.606129    4552 reflector.go:289] object-"kubeflow"/"jupyter-web-app-parameters": watch of *v1.ConfigMap ended with: too old resource version: 51879 (55978)
Mar 15 20:12:57 minikf kubelet[4552]: E0315 20:12:57.857035    4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: E0315 20:12:58.863881    4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:12:58 minikf kubelet[4552]: ]
Mar 15 20:13:01 minikf kubelet[4552]: E0315 20:13:01.857039    4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:01 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:01 minikf kubelet[4552]: ]
Mar 15 20:13:03 minikf kubelet[4552]: I0315 20:13:03.388729    4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:04 minikf kubelet[4552]: E0315 20:13:04.860442    4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:04 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:04 minikf kubelet[4552]: ]
Mar 15 20:13:08 minikf kubelet[4552]: I0315 20:13:08.388879    4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:12 minikf kubelet[4552]: E0315 20:13:12.856976    4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: I0315 20:13:13.389021    4552 prober.go:112] Liveness probe for "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03):mixer" failed (failure): Get http://172.17.0.20:15014/version: dial tcp 172.17.0.20:15014: connect: connection refused
Mar 15 20:13:13 minikf kubelet[4552]: I0315 20:13:13.389075    4552 kubelet.go:1963] SyncLoop (container unhealthy): "istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: E0315 20:13:13.842313    4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: E0315 20:13:13.860068    4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:13 minikf kubelet[4552]: ]
Mar 15 20:13:14 minikf kubelet[4552]: E0315 20:13:14.629564    4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: E0315 20:13:14.856604    4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:14 minikf kubelet[4552]: ]
Mar 15 20:13:17 minikf kubelet[4552]: E0315 20:13:17.856361    4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:17 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:17 minikf kubelet[4552]: ]
Mar 15 20:13:24 minikf kubelet[4552]: E0315 20:13:24.855963    4552 pod_workers.go:190] Error syncing pod 087cc3ce-6725-11ea-96c3-0800271cfe03 ("istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-telemetry-598cdc58bf-hfdcz_istio-system(087cc3ce-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: E0315 20:13:25.856404    4552 pod_workers.go:190] Error syncing pod 088bbcb2-6725-11ea-96c3-0800271cfe03 ("istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"), skipping: failed to "StartContainer" for "mixer" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=mixer pod=istio-policy-757f4d7856-cpw4k_istio-system(088bbcb2-6725-11ea-96c3-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: E0315 20:13:25.859177    4552 pod_workers.go:190] Error syncing pod 90fbf8f1-672f-11ea-87be-0800271cfe03 ("keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2vzmffv8-worker-0_kubeflow-user(90fbf8f1-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:25 minikf kubelet[4552]: ]
Mar 15 20:13:28 minikf kubelet[4552]: E0315 20:13:28.860201    4552 pod_workers.go:190] Error syncing pod 91374b3b-672f-11ea-87be-0800271cfe03 ("keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:28 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-2dfmhlcm-worker-0_kubeflow-user(91374b3b-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:28 minikf kubelet[4552]: ]
Mar 15 20:13:29 minikf kubelet[4552]: W0315 20:13:29.021076    4552 reflector.go:289] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMap ended with: too old resource version: 53788 (56278)
Mar 15 20:13:29 minikf kubelet[4552]: E0315 20:13:29.855845    4552 pod_workers.go:190] Error syncing pod 9179e635-672f-11ea-87be-0800271cfe03 ("keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"), skipping: [failed to "StartContainer" for "tensorflow" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=tensorflow pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:29 minikf kubelet[4552]: , failed to "StartContainer" for "metrics-logger-and-collector" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=metrics-logger-and-collector pod=keras-example-jc9sn4nw-worker-0_kubeflow-user(9179e635-672f-11ea-87be-0800271cfe03)"
Mar 15 20:13:29 minikf kubelet[4552]: ]

==> storage-provisioner <==
E0316 01:29:58.667283       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:29:58.900926       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:01.069924       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:01.070038       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:01.303386       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:06.045703       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:06.045788       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:06.045863       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:07.249362       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:13.012767       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:13.013229       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:13.013346       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:19.418003       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:19.420404       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:26.843573       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:26.843692       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:26.843759       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:27.911139       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:27.911536       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:27.914175       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:30.434582       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:30.434898       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:30.439168       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:33.714728       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:34.412636       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:34.412699       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:35.806554       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:35.806646       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:35.806767       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:37.905548       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:37.905614       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:37.905726       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:39.174734       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:39.174795       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:39.174947       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:40.706575       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:40.727022       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:40.727380       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:41.712944       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:41.751510       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:41.759825       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:42.782318       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:42.782797       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:42.783161       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:43.807143       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:43.807379       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:43.807597       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 10.96.0.1:443: getsockopt: connection refused
E0316 01:30:58.390643       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: Get https://10.96.0.1:443/api/v1/persistentvolumes?resourceVersion=0: net/http: TLS handshake timeout
E0316 01:30:58.466686       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:379: Failed to list *v1.StorageClass: Get https://10.96.0.1:443/apis/storage.k8s.io/v1/storageclasses?resourceVersion=0: net/http: TLS handshake timeout
E0316 01:30:58.564805       1 reflector.go:205] k8s.io/minikube/vendor/github.com/r2d4/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: Get https://10.96.0.1:443/api/v1/persistentvolumeclaims?resourceVersion=0: net/http: TLS handshake timeout

What did you expect to happen: I expected the trials to complete normally, since the log file is created successfully and the file-metric-collector is configured to read that file. I’d appreciate it if you could let me know what’s wrong.

  • The Keras code and YAML file that I was trying to run (the Experiment spec follows the training script)
import tensorflow as tf
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import ResNet50
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.inception_v3 import InceptionV3

import logging
import argparse

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

(x_train, x_valid) = x_train[100:1100], x_train[:100]
(y_train, y_valid) = y_train[100:1100], y_train[:100]

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--num_classes', type=int, default=10,
                      help='the number of classes')
  parser.add_argument('--batch_size', type=int, default=256,
                      help='Number of samples per gradient update.')
  parser.add_argument('--epochs', type=int, default=5,
                      help='Number of epochs to run trainer.')
  parser.add_argument('--learning_rate', type=float, default=0.001,
                      help='Initial learning rate')
  parser.add_argument('--network', type=str, default='vgg16',
                      help='Open source deep learning model')
  parser.add_argument('--image_shape', default='32, 32, 3',
                      help='Shape of training images')
  parser.add_argument('--log_dir', type=str, default='./cifar10.log',
                      help='Summaries log directory')

  args = parser.parse_args()

  logging.basicConfig(filename=args.log_dir, level=logging.DEBUG)

  img_shape = args.image_shape.split(',')
  img_w = int(img_shape[0].strip())
  img_h = int(img_shape[1].strip())
  img_c = int(img_shape[2].strip())

  if args.network == 'resnet50':
      model = ResNet50(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
  elif args.network == 'vgg16':
      model = VGG16(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
  elif args.network == 'vgg19':
      model = VGG19(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))
  elif args.network == 'inceptionv3':
      model = InceptionV3(weights=None, classes=args.num_classes, input_shape=(img_w, img_h, img_c))

  opt = Adam(lr=args.learning_rate)
  model.compile(loss='sparse_categorical_crossentropy',
                optimizer=opt,
                metrics=['accuracy'])

  datagen = ImageDataGenerator()

  print(">>> Data Loaded. Training starts.")
  for e in range(args.epochs):
      print("\nTotal Epoch {}/{}".format(e + 1, args.epochs))
      history = model.fit_generator(generator=datagen.flow(x_train, y_train, batch_size=args.batch_size),
                                    steps_per_epoch=int(len(x_train)/args.batch_size)+1,
                                    epochs=1,
                                    verbose=1,
                                    validation_data=(x_valid, y_valid))
      logging.info('\n{{metricName: accuracy, metricValue: {:.4f}}};{{metricName: loss, metricValue: {:.4f}}}\n'.format(
          history.history['val_accuracy'][-1], history.history['val_loss'][-1]))
      print("Training-Accuracy={}".format(history.history['accuracy'][-1]))
      print("Training-Loss={}".format(history.history['loss'][-1]))
      print("Validation-Accuracy={}".format(history.history['val_accuracy'][-1]))
      print("Validation-Loss={}".format(history.history['val_loss'][-1]))
  • The Experiment YAML file
apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
metadata:
  namespace: kubeflow-user
  name: keras-example
spec:
  objective:
    type: maximize
    goal: 0.99
    objectiveMetricName: accuracy
    additionalMetricNames:
    - loss
  metricsCollectorSpec:
    source:
      filter:
        metricsFormat:
        - "{metricName: ([\\w|-]+), metricValue: ((-?\\d+)(\\.\\d+)?)}"
      fileSystemPath:
        path: "/var/katib_keras/cifar10.log"
        kind: File
    collector:
      kind: File
  algorithm:
    algorithmName: random
  parallelTrialCount: 3
  maxTrialCount: 3
  maxFailedTrialCount: 3
  parameters:
    - name: --learning_rate
      parameterType: double
      feasibleSpace:
        min: "0.01"
        max: "0.05"
    - name: --batch_size
      parameterType: int
      feasibleSpace:
        min: "100"
        max: "200"
  trialTemplate:
    goTemplate:
        rawTemplate: |-
          apiVersion: "kubeflow.org/v1"
          kind: TFJob
          metadata:
            name: {{.Trial}}
            namespace: {{.NameSpace}}
          spec:
           tfReplicaSpecs:
            Worker:
              replicas: 1 
              restartPolicy: OnFailure
              template:
                spec:
                  containers:
                    - name: tensorflow 
                      image: docker.io/jeun0241/katib_keras_v3
                      imagePullPolicy: Always
                      command:
                        - "python"
                        - "/var/katib_keras/main.py"
                        - "--epochs=2"
                        - "--log_dir='/var/katib_keras/cifar10.log'"
                        {{- with .HyperParameters}}
                        {{- range .}}
                        - "{{.Name}}={{.Value}}"
                        {{- end}}
                        {{- end}}
  • The generated log file (cifar10.log)
WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:4070: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /opt/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

INFO:root:
{metricName: accuracy, metricValue: 0.0972};{metricName: loss, metricValue: 2.3030}

INFO:root:
{metricName: accuracy, metricValue: 0.0976};{metricName: loss, metricValue: 2.3027}

INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}

INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}

INFO:root:
{metricName: accuracy, metricValue: 0.0920};{metricName: loss, metricValue: 2.3027}
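
As a quick local sanity check, the metricsFormat regex from the Experiment spec can be tested against one of these log lines. A minimal sketch in plain Python (pattern and sample line copied from above; this only confirms that the filter matches the logged format, not that the collector itself works):

import re

# metricsFormat entry from the Experiment spec, written as a raw Python string
pattern = r"{metricName: ([\w|-]+), metricValue: ((-?\d+)(\.\d+)?)}"

# one line taken from the generated cifar10.log above
line = "{metricName: accuracy, metricValue: 0.0972};{metricName: loss, metricValue: 2.3030}"

for match in re.finditer(pattern, line):
    print(match.group(1), "=", match.group(2))

# expected output:
# accuracy = 0.0972
# loss = 2.3030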

Anything else you would like to add: I also pulled the latest file-metrics-collector image and tried again, but the trial pod still pulls the v0.8.0 image, as shown below.

  Normal   Pulling    28m                    kubelet, minikube  Pulling image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
  Normal   Pulled     28m                    kubelet, minikube  Successfully pulled image "gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector:v0.8.0"
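
If the intent was to run a newer collector, note that (as far as I understand Katib v1alpha3) the sidecar image is injected from the katib-config ConfigMap in the kubeflow namespace, not taken from the trial spec, so pulling a newer image locally does not change what gets injected. A rough sketch for inspecting that ConfigMap with the Kubernetes Python client (the ConfigMap name and the metrics-collector-sidecar key are my assumption about this Katib version; adjust if your install differs):

import json
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster
core = client.CoreV1Api()

# katib-config is assumed to map each collector kind (File, StdOut, ...) to a sidecar image
cm = core.read_namespaced_config_map("katib-config", "kubeflow")
sidecars = json.loads(cm.data["metrics-collector-sidecar"])
print("File collector image:", sidecars.get("File", {}).get("image"))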


Environment:

  • Katib version: v1alpha3 (all images tagged v0.8.0)
    gcr.io/kubeflow-images-public/katib/v1alpha3/katib-ui                 v0.8.0   540d9308c9f6   4 weeks ago   54.4MB
    gcr.io/kubeflow-images-public/katib/v1alpha3/file-metrics-collector   v0.8.0   ff9ce96cf37c   4 weeks ago   25.2MB
    gcr.io/kubeflow-images-public/katib/v1alpha3/katib-controller         v0.8.0   7c5162abd775   4 weeks ago   53.8MB
    gcr.io/kubeflow-images-public/katib/v1alpha3/suggestion-hyperopt      v0.8.0   56c0051f100c   4 weeks ago   1.23GB
    gcr.io/kubeflow-images-public/katib/v1alpha3/katib-db-manager         v0.8.0   32229959fe81   4 weeks ago   28.5MB
  • MiniKF box version: arrikto/minikf (virtualbox, 20200305.1.0)
  • Kubeflow version:
  • Minikube version: minikube version: v1.2.0
  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
devxoxo commented, Mar 17, 2020

In my case, the log file was not created because I used the wrong key for the validation accuracy metric. After changing the key from ‘val_accuracy’ to ‘val_acc’, the trial ran normally. This behaviour seems to depend on the Keras version. Thanks for your help!
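
For anyone hitting the same mismatch: standalone Keras before 2.3 reports the validation metric as ‘val_acc’, while newer Keras/tf.keras uses ‘val_accuracy’. A small version-agnostic sketch of the logging step from the script above (history is the object returned by model.fit_generator in the training loop; the helper name is just illustrative):

import logging

def last_metric(history, *keys):
    # return the final value for whichever of the candidate keys exists in history.history
    for key in keys:
        if key in history.history:
            return history.history[key][-1]
    raise KeyError("none of {} found in {}".format(keys, list(history.history)))

# inside the training loop, after model.fit_generator(...)
val_acc = last_metric(history, "val_accuracy", "val_acc")
val_loss = last_metric(history, "val_loss")
logging.info("\n{{metricName: accuracy, metricValue: {:.4f}}};"
             "{{metricName: loss, metricValue: {:.4f}}}\n".format(val_acc, val_loss))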

0 reactions
andreyvelich commented, Mar 16, 2020

You should check the logs from the metrics-logger-and-collector container. Maybe your training container doesn’t have permission to create files under the /var/ folder, so you could change the folder to /train/ or something similar.
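
One way to act on that suggestion inside the training script is to check up front whether the log directory is writable and fail loudly (or fall back) instead of crashing later. A minimal sketch (the path is taken from the Experiment above; the /tmp fallback is only an example):

import os
import sys

log_path = "/var/katib_keras/cifar10.log"   # value passed via --log_dir
log_dir = os.path.dirname(log_path) or "."

if not os.access(log_dir, os.W_OK):
    # surface the permission problem early instead of letting logging fail later
    sys.stderr.write("log directory {} is not writable\n".format(log_dir))
    log_path = os.path.join("/tmp", os.path.basename(log_path))  # example fallback

print("writing metrics to", log_path)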


Top Results From Across the Web

  • Understanding Kubernetes CrashLoopBackoff Events
    CrashLoopBackOff is a status message that indicates one of your pods is in a constant state of flux—one or more containers are failing...
  • Kubernetes CrashLoopBackOff: What it is, and how to fix it?
    Learn to visualize, alert, and troubleshoot a Kubernetes CrashLoopBackOff: A pod starting, crashing, starting again, and crashing again.
  • Kubernetes CrashLoopBackOff Error: What It Is and How to Fix It
    CrashLoopBackOff is a common Kubernetes error, which indicates that a pod failed to start, Kubernetes tried to restart it, and it continued to...
  • Troubleshoot and Fix Kubernetes CrashLoopBackoff Status
    The CrashLoopBackoff status is a notification that the pod is being restarted due to an error and is waiting for the specified ‘backoff’...
  • Why the pod status is coming as crashloopbackoff in my case?
    Also, in order to reliably run one Pod to completion you should use Kubernetes Jobs. That will create the pod in Completed status.
