Topic Operator fails to start while trying to clean /tmp
Describe the bug
We are testing the Strimzi upgrade from 0.28.0 to 0.30.0. The Topic Operator pod fails to start and enters a crash loop, producing the following logs on each iteration:
rm: cannot remove '/tmp/hsperfdata_root/75': Permission denied
rm: cannot remove '/tmp/ks-script-31y0onvf': Operation not permitted
rm: cannot remove '/tmp/ks-script-_qd86jm0': Operation not permitted
This seems to happen because the image (quay.io/strimzi/operator:0.30.0) is built in a way that leaves some files in /tmp owned by root. When the pod starts, it fails to clean /tmp because the strimzi user doesn't have permission to delete those files:
$ docker run --rm -ti quay.io/strimzi/operator:0.30.0 bash -c 'id; ls -lha /tmp; rm -rvf /tmp/*'
uid=1001(strimzi) gid=0(root) groups=0(root)
total 20K
drwxrwxrwt 1 root root 4.0K Jul 15 19:08 .
drwxr-xr-x 1 root root 4.0K Aug 3 11:04 ..
drwxr-xr-x 2 root root 4.0K Jul 15 19:08 hsperfdata_root
-rwx------ 1 root root 701 Jun 17 04:13 ks-script-31y0onvf
-rwx------ 1 root root 291 Jun 17 04:13 ks-script-_qd86jm0
rm: cannot remove '/tmp/hsperfdata_root/75': Permission denied
rm: cannot remove '/tmp/ks-script-31y0onvf': Operation not permitted
rm: cannot remove '/tmp/ks-script-_qd86jm0': Operation not permitted
To Reproduce
Steps to reproduce the behavior:
- Deploy the Topic Operator from the 0.30.0 release.
- See that it fails to start.
Expected behavior
The Topic Operator starts successfully.
Environment:
- Strimzi version: 0.30.0
- Installation method: YAML files
- Kubernetes cluster: Kubernetes 1.21
Top GitHub Comments
Great, thanks for confirmation. I will get to it later and fix the YAML files in here as well.
Yes, we are using the standalone Topic Operator deployment. Adding the /tmp volume fixes the problem. Thanks!
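For readers applying the same workaround, the fragment below is a minimal sketch of what adding a /tmp volume to a standalone Topic Operator Deployment might look like. The container name and the volume name are assumptions made for illustration and are not taken from the issue; only the image and the /tmp mount path come from the report. Mounting an emptyDir at /tmp hides the root-owned files baked into the image, so the startup cleanup runs against an empty directory.

```yaml
# Sketch only: the relevant part of a standalone Topic Operator Deployment.
# The container and volume names below are illustrative assumptions.
spec:
  template:
    spec:
      containers:
        - name: strimzi-topic-operator   # assumed container name
          image: quay.io/strimzi/operator:0.30.0
          volumeMounts:
            - name: tmp-volume           # hypothetical volume name
              mountPath: /tmp            # shadows the root-owned files in the image
      volumes:
        - name: tmp-volume
          emptyDir: {}                   # fresh, empty directory for each pod
```

The rest of the Deployment (labels, environment variables, probes) stays as it is in your existing YAML; only the volumeMounts and volumes entries are new.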