Misleading message from CHE server when a container stopped
In some specific cases a container may stop unexpectedly just after start.
In such a case, the workspace pod is considered non-ready and does not serve traffic.
So, after the workspace start timeout (5-8 minutes) we get the exception Server 'theia' in container 'theia-ide3fl' not available,
even though Theia started just fine; the real problem is a user-defined container that has already completed.
We faced this problem with a factory from the Quarkus workshop:
the quay.io/rhdevelopers/tutorial-tools image was stopped with exit code 0,
and we got an unclear message from the Che server: Server 'theia' in container 'theia-ide3fl' not available.
Since quay.io/rhdevelopers/tutorial-tools has been updated and no longer fails, the following devfile can be used to reproduce the case:
apiVersion: 1.0.0
metadata:
  name: quarkus-workshop
projects:
  - name: quarkus-tutorial
    source:
      type: git
      location: 'https://github.com/redhat-developer-demos/quarkus-tutorial.git'
components:
  - id: redhat/vscode-yaml/latest
    type: chePlugin
  - id: redhat/vscode-xml/latest
    type: chePlugin
  - id: redhat/java8/latest
    type: chePlugin
    preferences:
      java.configuration.maven.userSettings: /opt/developer/.m2/settings.xml
  # Tool that allows to build java application including Quarkus
  - alias: tools
    type: kubernetes
    mountSources: true
    referenceContent: |
      apiVersion: v1
      kind: List
      items:
        - apiVersion: v1
          kind: Pod
          metadata:
            name: tutorial-tools
            labels:
              app: tutorial-tools
          spec:
            containers:
              - name: tutorial-tools
                image: quay.io/rhdevelopers/tutorial-tools:0.0.2
                imagePullPolicy: IfNotPresent
                env:
                  - name: MAVEN_MIRROR_URL
                    value: 'http://nexus.rhd-workshop-infra:8081/nexus/content/groups/public'
                workingDir: /projects
                resources:
                  limits:
                    memory: 2Gi
                  requests:
                    memory: 2Gi
    entrypoints:
      - parentName: tutorial-tools
        command: ['echo']
        args: ['hello']
    selector:
      app: tutorial-tools
Top GitHub Comments
@tolusha Thanks for trying one more time and notifying me. A small typo is fixed now and the workspace should not start anymore.
We can see that there are no corresponding Kubernetes events, so this case cannot be handled by the unrecoverable-events mechanism.
But the Pod YAML does contain the corresponding information we could use: the container status shows the container terminated (Completed) with exit code 0.
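For illustration only, here is a minimal, hypothetical sketch (not existing Che code) of how that terminated state is visible through the fabric8 Kubernetes client model, which Che already uses:

```java
// Hypothetical illustration: the "Completed" container is visible in the Pod
// status model even though no Kubernetes event is emitted for it.
import io.fabric8.kubernetes.api.model.ContainerStateTerminated;
import io.fabric8.kubernetes.api.model.ContainerStatus;
import io.fabric8.kubernetes.api.model.Pod;

final class TerminatedContainers {
  /** Logs every container of the given workspace pod that has already terminated. */
  static void logTerminated(Pod pod) {
    for (ContainerStatus status : pod.getStatus().getContainerStatuses()) {
      ContainerStateTerminated terminated = status.getState().getTerminated();
      if (terminated != null) {
        // For the tutorial-tools container this would print reason=Completed, exitCode=0
        System.out.printf(
            "Container '%s' terminated: reason=%s, exitCode=%d%n",
            status.getName(), terminated.getReason(), terminated.getExitCode());
      }
    }
  }
}
```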
So, we should watch workspace pods and make sure every container is ready before proceeding to the servers-check phase. Here is the list of all possible condition types: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions. But instead of just waiting for some conditions, we could fail early when a container status reports a restart count greater than 3…
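Along those lines, a rough sketch of such an early-fail check, again assuming the fabric8 client; the class name WorkspaceContainerReadinessCheck and the restart threshold constant are made up for illustration and are not an existing Che API:

```java
// Sketch of an early-fail check over workspace pod container statuses.
import io.fabric8.kubernetes.api.model.ContainerStatus;
import io.fabric8.kubernetes.api.model.Pod;
import java.util.Optional;

final class WorkspaceContainerReadinessCheck {

  // Hypothetical threshold, following the "restart count more than 3" idea above.
  private static final int FAILURE_RESTART_THRESHOLD = 3;

  /**
   * Returns a human-readable failure reason if any workspace container has already
   * completed or keeps restarting, so workspace start can fail early with a clear
   * message instead of timing out with "Server 'theia' ... not available".
   */
  static Optional<String> findFailedContainer(Pod pod) {
    for (ContainerStatus s : pod.getStatus().getContainerStatuses()) {
      boolean ready = Boolean.TRUE.equals(s.getReady());
      boolean terminated = s.getState() != null && s.getState().getTerminated() != null;
      int restarts = s.getRestartCount() == null ? 0 : s.getRestartCount();

      if (!ready && terminated) {
        return Optional.of(
            String.format(
                "Container '%s' in workspace pod '%s' terminated with exit code %d (%s)",
                s.getName(),
                pod.getMetadata().getName(),
                s.getState().getTerminated().getExitCode(),
                s.getState().getTerminated().getReason()));
      }
      if (restarts > FAILURE_RESTART_THRESHOLD) {
        return Optional.of(
            String.format(
                "Container '%s' in workspace pod '%s' restarted %d times",
                s.getName(), pod.getMetadata().getName(), restarts));
      }
    }
    return Optional.empty();
  }
}
```

Such a check could run from the pod watch before the servers-check phase, so the user would see which container completed or keeps restarting instead of the misleading message about Theia not being available.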