Workspace is not starting, when k8s StorageClass has volumeBindingMode=WaitForFirstConsumer

Description

When the default StorageClass has volumeBindingMode set to WaitForFirstConsumer, workspaces do not start. I guess that’s because Che waits for the PVC to reach the “Bound” state before creating the workspace (or mkdir) pod. With this volumeBindingMode that never happens: the PVC stays in the “Pending” state with the message “waiting for first consumer to be created before binding” until the workspace startup fails.
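The suspected ordering problem can be illustrated with a small simulation. This is only a sketch of the reasoning above, not Che's actual code: the `Claim` class, `start_workspace` function, and `max_polls` parameter are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Toy model of a PVC (hypothetical, for illustration only)."""
    binding_mode: str           # "Immediate" or "WaitForFirstConsumer"
    has_consumer: bool = False  # True once a pod that mounts the claim exists

    @property
    def phase(self) -> str:
        # With WaitForFirstConsumer, binding is delayed until a consumer pod exists.
        if self.binding_mode == "Immediate" or self.has_consumer:
            return "Bound"
        return "Pending"


def start_workspace(claim: Claim, wait_bound: bool, max_polls: int = 5) -> str:
    """Optionally wait for the claim to be Bound, then create the pod."""
    if wait_bound:
        for _ in range(max_polls):
            if claim.phase == "Bound":
                break
        else:
            # The claim never left Pending, so the pod is never created.
            return "timeout"
    claim.has_consumer = True  # the workspace pod is created here
    return "started" if claim.phase == "Bound" else "pending"


print(start_workspace(Claim("Immediate"), wait_bound=True))
print(start_workspace(Claim("WaitForFirstConsumer"), wait_bound=True))
print(start_workspace(Claim("WaitForFirstConsumer"), wait_bound=False))
```

In this model, waiting for “Bound” before creating the pod can only time out under WaitForFirstConsumer, because the pod is the very consumer the binding is waiting for.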

Reproduction Steps

  1. Create a default StorageClass with volumeBindingMode set to WaitForFirstConsumer (more info here)
  2. Try to start a workspace
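For step 1, a manifest along these lines should work. It is a minimal sketch built as a Python dict and printed as JSON. The name `wffc-default` is made up, and the provisioner `k8s.io/minikube-hostpath` is an assumption that only fits minikube; substitute the provisioner your cluster actually uses.

```python
import json

# Hypothetical StorageClass for reproducing the issue.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {
        "name": "wffc-default",  # made-up name
        "annotations": {
            # Marks this class as the cluster default.
            "storageclass.kubernetes.io/is-default-class": "true",
        },
    },
    "provisioner": "k8s.io/minikube-hostpath",  # assumption: minikube
    "volumeBindingMode": "WaitForFirstConsumer",
}

manifest_json = json.dumps(storage_class, indent=2)
print(manifest_json)
```

The printed JSON can be saved to a file and applied with `kubectl apply -f <file>`, since kubectl accepts JSON as well as YAML.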

Workspace startup fails with message:

Error: Failed to run the workspace: "Waiting for persistent volume claim 'claim-che-workspace' reached timeout"

And error log in che-master:

2019-03-14 14:06:31,927[aceSharedPool-0]  [ERROR] [o.e.c.a.w.s.WorkspaceRuntimes 813]   - Waiting for persistent volume claim 'claim-che-workspace' reached timeout
org.eclipse.che.api.workspace.server.spi.InternalInfrastructureException: Waiting for persistent volume claim 'claim-che-workspace' reached timeout
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesPersistentVolumeClaims.wait(KubernetesPersistentVolumeClaims.java:225)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesPersistentVolumeClaims.waitBound(KubernetesPersistentVolumeClaims.java:165)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.pvc.CommonPVCStrategy.prepare(CommonPVCStrategy.java:200)
	at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:200)
	at org.eclipse.che.api.workspace.server.spi.InternalRuntime.start(InternalRuntime.java:141)
	at org.eclipse.che.api.workspace.server.WorkspaceRuntimes$StartRuntimeTask.run(WorkspaceRuntimes.java:779)
	at org.eclipse.che.commons.lang.concurrent.CopyThreadLocalRunnable.run(CopyThreadLocalRunnable.java:38)
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

OS and version:
This is reproducible with CodeReady Workspaces on OCP 4 Beta and with the latest Eclipse Che on plain Kubernetes (minikube).

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 11 (10 by maintainers)

Top GitHub Comments

4 reactions
l0rd commented, May 28, 2019

I think that setting CHE_INFRA_KUBERNETES_PVC_WAIT__BOUND=false (so that Che does not wait for the PVC to be bound before creating the pod) as suggested by @rhopp is the way to go in the short term.
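The double underscore in that variable name is intentional. The sketch below assumes Che's usual environment-variable convention, in which a single underscore stands for a dot in the property key and a double underscore stands for a literal underscore. The helper `env_to_property` is hypothetical and not part of Che.

```python
def env_to_property(env_name: str) -> str:
    """Translate a Che-style environment variable name into a property key.

    Assumed convention: "_" maps to "." and "__" maps to a literal "_".
    """
    placeholder = "\0"  # temporary marker that cannot appear in a variable name
    return (
        env_name.lower()
        .replace("__", placeholder)  # protect literal underscores first
        .replace("_", ".")           # remaining underscores become dots
        .replace(placeholder, "_")   # restore literal underscores
    )


print(env_to_property("CHE_INFRA_KUBERNETES_PVC_WAIT__BOUND"))
# prints: che.infra.kubernetes.pvc.wait_bound
```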

Using an init container looks like overkill. And using the operator to detect the Volume Binding Mode doesn’t look simple either: theoretically the wsmaster and the workspace pods can be bound to PVs with different volume binding modes (che.osio is an example we all know).

Something that I don’t understand is why we can’t just infer that we are in WaitForFirstConsumer mode at runtime if we intercept an event message that says “waiting for first consumer to be created before binding”.

2 reactions
sleshchenko commented, May 29, 2019

Something that I don’t understand is why we can’t just infer that we are in WaitForFirstConsumer mode at runtime if we intercept an event message that says “waiting for first consumer to be created before binding”.

I missed this event. Checking an event message is not reliable, but I like this proposal and think it will improve Che Server behavior 👍
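The runtime inference discussed in this thread could look roughly like the following. It is a sketch only, and as noted above, matching on message text is fragile. The function name and the shape of the event records (plain dicts with a `message` key) are assumptions made for the illustration; real code would read PVC events from the Kubernetes API.

```python
from typing import Iterable, Mapping, Optional

# Message text quoted in the issue description.
WFFC_MESSAGE = "waiting for first consumer to be created before binding"


def is_waiting_for_first_consumer(events: Iterable[Mapping[str, Optional[str]]]) -> bool:
    """Return True if any PVC event carries the WaitForFirstConsumer message."""
    return any(WFFC_MESSAGE in (event.get("message") or "").lower() for event in events)


# Hypothetical event records for a claim that is stuck in Pending.
pending_events = [
    {"message": "waiting for first consumer to be created before binding"},
]
print(is_waiting_for_first_consumer(pending_events))
```

If the check returns True, the server could stop waiting for “Bound” and go ahead with creating the workspace pod, which is what lets the claim bind.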
