Workspace is not starting, when k8s StorageClass has volumeBindingMode=WaitForFirstConsumer
Description
When the default StorageClass is configured with volumeBindingMode set to WaitForFirstConsumer, workspaces do not start. I guess that’s because Che waits for the PVC to be in the “Bound” state before creating the workspace (or mkdir) pod. With this volumeBindingMode that never happens: the PVC stays in the “Pending” state with the message “waiting for first consumer to be created before binding” until the workspace startup fails.
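The suspected deadlock can be illustrated with a toy model. This is only a sketch of the ordering problem described above, not Che's actual code: the `Claim` class, the `start_workspace` function, and the `create_pod_first` flag are hypothetical names, and binding is treated as instantaneous once it becomes possible.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Toy PVC: only the binding mode and the presence of a consumer pod matter."""
    binding_mode: str  # "Immediate" or "WaitForFirstConsumer"
    has_consumer_pod: bool = False

    @property
    def phase(self) -> str:
        # With WaitForFirstConsumer, binding is delayed until a pod uses the claim.
        if self.binding_mode == "WaitForFirstConsumer" and not self.has_consumer_pod:
            return "Pending"
        return "Bound"


def start_workspace(claim: Claim, create_pod_first: bool, max_polls: int = 3) -> str:
    """Hypothetical startup sequence; returns "started" or "timeout"."""
    if not create_pod_first:
        # Ordering suspected in the issue: wait for "Bound" before any pod exists.
        for _ in range(max_polls):
            if claim.phase == "Bound":
                break
        else:
            return "timeout"
    claim.has_consumer_pod = True  # the workspace pod is created at this point
    return "started" if claim.phase == "Bound" else "timeout"


if __name__ == "__main__":
    print(start_workspace(Claim("WaitForFirstConsumer"), create_pod_first=False))
    print(start_workspace(Claim("Immediate"), create_pod_first=False))
    print(start_workspace(Claim("WaitForFirstConsumer"), create_pod_first=True))
```

In this model the claim can only leave “Pending” after a consumer pod exists, so waiting for “Bound” first can never succeed with WaitForFirstConsumer.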
Reproduction Steps
- Create a default StorageClass with volumeBindingMode set to WaitForFirstConsumer (more info here)
- Try to start a workspace
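For the first step, a StorageClass manifest along these lines might be used. This is a minimal sketch: the class name `wffc-default` and the choice of provisioner are assumptions and depend on the cluster (the issue does not say which provisioner was used). The manifest is built as a Python dict and printed as JSON.

```python
import json


def storage_class_manifest(name, provisioner):
    """Build a default StorageClass that delays binding until a pod uses the claim."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {
            "name": name,  # hypothetical name, pick your own
            "annotations": {
                # Marks this class as the cluster default; the value is a quoted string.
                "storageclass.kubernetes.io/is-default-class": "true",
            },
        },
        "provisioner": provisioner,  # assumption: depends on the cluster
        "volumeBindingMode": "WaitForFirstConsumer",
    }


if __name__ == "__main__":
    manifest = storage_class_manifest("wffc-default", "kubernetes.io/no-provisioner")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could then be piped into `kubectl apply -f -`. If another StorageClass is already marked as default, its annotation would need to be removed first.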
Workspace startup fails with message:
Error: Failed to run the workspace: "Waiting for persistent volume claim 'claim-che-workspace' reached timeout"
And error log in che-master:
2019-03-14 14:06:31,927[aceSharedPool-0] [ERROR] [o.e.c.a.w.s.WorkspaceRuntimes 813] - Waiting for persistent volume claim 'claim-che-workspace' reached timeout
org.eclipse.che.api.workspace.server.spi.InternalInfrastructureException: Waiting for persistent volume claim 'claim-che-workspace' reached timeout
at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesPersistentVolumeClaims.wait(KubernetesPersistentVolumeClaims.java:225)
at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesPersistentVolumeClaims.waitBound(KubernetesPersistentVolumeClaims.java:165)
at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.pvc.CommonPVCStrategy.prepare(CommonPVCStrategy.java:200)
at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:200)
at org.eclipse.che.api.workspace.server.spi.InternalRuntime.start(InternalRuntime.java:141)
at org.eclipse.che.api.workspace.server.WorkspaceRuntimes$StartRuntimeTask.run(WorkspaceRuntimes.java:779)
at org.eclipse.che.commons.lang.concurrent.CopyThreadLocalRunnable.run(CopyThreadLocalRunnable.java:38)
at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
OS and version:
This is reproducible with CodeReady Workspaces on OCP 4 Beta and with the latest Eclipse Che on pure Kubernetes (minikube).
Issue Analytics
- State:
- Created 5 years ago
- Reactions:2
- Comments:11 (10 by maintainers)
Top GitHub Comments
I think that setting CHE_INFRA_KUBERNETES_PVC_WAIT__BOUND=true as suggested by @rhopp is the way to go in the short term. Using an init container looks like overkill. And using the operator to detect the volume binding mode doesn’t look simple either: theoretically the wsmaster and the workspace pods can be bound to PVs with different volume binding modes (che.osio is an example we all know).
Something that I don’t understand is why we can’t just infer that we are in WaitForFirstConsumer mode at runtime if we intercept an event message that says “waiting for first consumer to be created before binding”.
I missed this event. Checking an event message is not reliable, but I like this proposal and think it will improve Che Server behavior 👍
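The event-based detection proposed in the comments could look roughly like the sketch below. It is not Che's implementation: the function name `waits_for_first_consumer` is hypothetical, events are assumed to be plain dicts shaped like Kubernetes Event objects (`involvedObject.kind`, `involvedObject.name`, `message`), and the sample events are made up for illustration. As noted above, matching on message text is fragile because the wording could change between Kubernetes versions.

```python
WFFC_MESSAGE = "waiting for first consumer to be created before binding"


def waits_for_first_consumer(events, claim_name):
    """Return True if any event for the given PVC carries the WaitForFirstConsumer message."""
    for event in events:
        involved = event.get("involvedObject", {})
        if involved.get("kind") != "PersistentVolumeClaim":
            continue
        if involved.get("name") != claim_name:
            continue
        if WFFC_MESSAGE in event.get("message", ""):
            return True
    return False


if __name__ == "__main__":
    # Made-up sample events, shaped like Kubernetes Event objects.
    sample_events = [
        {
            "involvedObject": {"kind": "Pod", "name": "che-master"},
            "message": "Started container",
        },
        {
            "involvedObject": {"kind": "PersistentVolumeClaim", "name": "claim-che-workspace"},
            "message": WFFC_MESSAGE,
        },
    ]
    print(waits_for_first_consumer(sample_events, "claim-che-workspace"))
```

A caller could use such a check to stop waiting for the “Bound” state and go ahead with creating the workspace pod.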