Elasticsearch unable to start because of java.lang.IllegalStateException: Failed to create node environment
Describe the bug
The Elasticsearch image is not able to create the “node environment” in the mounted (persistent) `/usr/share/elasticsearch/data` directory.
This is due to a permission issue related to `fsGroup` (it was set to `0`).
The Java exception is: `java.lang.IllegalStateException: Failed to create node environment`
To Reproduce
Steps to reproduce the behavior:
- Create an Elasticsearch deployment
- See error
Expected behavior
Elasticsearch pod up and running.
Additional context
Tested by adding `fsGroup: 0` to the deployment `securityContext`, and it works as expected.
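As a rough illustration of the workaround described above, the sketch below builds a JSON merge patch that sets the pod-level `fsGroup` on a Deployment. The function name `fsgroup_patch` is hypothetical, and applying the change as a patch is an assumption; the issue only states that `fsGroup: 0` was added to the deployment's `securityContext`.

```python
import json


def fsgroup_patch(fs_group: int = 0) -> str:
    """Return a JSON merge patch that sets the pod-level fsGroup of a Deployment.

    Hypothetical helper for illustration; not part of the original issue.
    """
    patch = {
        "spec": {
            "template": {
                "spec": {
                    # fsGroup is a pod-level (PodSecurityContext) field,
                    # so it lives under the pod template's spec.
                    "securityContext": {"fsGroup": fs_group}
                }
            }
        }
    }
    return json.dumps(patch)


if __name__ == "__main__":
    print(fsgroup_patch())
```

The printed JSON could then be passed to something like `kubectl patch deployment <name> --type merge -p '<json>'`, where `<name>` stands in for the actual Elasticsearch deployment.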
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
I did some more research on this, as I was confused why this has not been a problem earlier.
So it turns out that if you mount a NEW PVC into a container, without any `securityContext.fsGroup` setting in the Pod, this is how the PVC is mounted into the container: see the `drwxr-xr-x` on the main folder.
The container itself is running as user/group `root/root`, so technically it should have write access (user `root` is the owner of the folder and has write access), but the elasticsearch process is started under the user/group `elasticsearch/root`, see here:
which means the elasticsearch user has no write access to the PVC (the user doesn’t match, and the group matches, but the group does not have write access).
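The reasoning above can be sketched as a simplified mode-bit check. This is only a model: it ignores the root/capability bypass, ACLs and supplementary groups, and the UID `1000` for the elasticsearch user is an assumption that the comment does not state.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gid: int) -> bool:
    """Simplified Unix write check: owner bits first, then group bits, then others."""
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if gid == owner_gid:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


ROOT = 0
ES_UID = 1000      # assumed UID of the elasticsearch user
DATA_MODE = 0o755  # drwxr-xr-x, owned by root:root as described above

# elasticsearch/root: the user does not match, the group matches but has no write bit
print(can_write(DATA_MODE, ROOT, ROOT, uid=ES_UID, gid=ROOT))  # False
# root/root: the owner has the write bit
print(can_write(DATA_MODE, ROOT, ROOT, uid=ROOT, gid=ROOT))    # True
```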
Now, if you set `securityContext.fsGroup: 0` inside the pod, it looks like this:
The big difference is the `drwxrwsr-x` on the `.` folder, meaning the group `root` has write access. Therefore elasticsearch will be able to access the `data` folder and do its thing. So it turns out that setting `securityContext.fsGroup: 0` does not only set the filesystem group to `0` (`root`) but also changes the permissions of the filesystem to writable by group.
Now why did this not cause more havoc? After the permissions have been set once to `drwxrwsr-x` on the PVC, they stay that way, so even though we removed `securityContext.fsGroup: 0` recently, all PVCs that were created before the removal had the `drwxrwsr-x` on the `.` folder and everything is fine. Only when a new PVC (like a new project/migration) was added did this cause issues, right at the very beginning.
It also only causes issues with container images that switch the user of the service to something other than `root`, like the elasticsearch or solr images do. I still changed it in the PR for `mariadb-single`, `mongo-single` and `postgres-single`, just to be safe.
Fixed in #2610
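For readers less used to mode strings, the two values quoted in the comment above can be decoded with Python's standard `stat` module. This is a small sketch, not something from the original thread; the octal values are the standard encodings of those two strings, and `describe` is a hypothetical helper.

```python
import stat

NEW_PVC = stat.S_IFDIR | 0o755        # drwxr-xr-x: fresh PVC without fsGroup
WITH_FSGROUP = stat.S_IFDIR | 0o2775  # drwxrwsr-x: after fsGroup: 0 was applied


def describe(mode: int) -> tuple[str, bool, bool]:
    """Return (mode string, group-writable?, setgid set?) for a file mode."""
    return (
        stat.filemode(mode),
        bool(mode & stat.S_IWGRP),
        bool(mode & stat.S_ISGID),
    )


for mode in (NEW_PVC, WITH_FSGROUP):
    print(describe(mode))
```

The `s` in `drwxrwsr-x` is the setgid bit on the directory, which makes newly created files inherit the directory's group; the group write bit is what lets a non-root user in group `root` write to the volume.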