No space left on device creates hung EC2 instance
Hello!
First I just want to say thank you for this library, it is truly incredible what I have been able to spin up in such a short timeframe. 🙇 🙇
Onto the error: I was attempting to train a model with a bit more data than my original go and ran into a System.IO.IOException: No space left on device. I should have expected this but did not. With my prior test runs I saw that the EC2 instance was correctly shut down after either an error or a success, but for this one it was not. The associated EC2 instance stayed running until I manually went and terminated it.
My personal desire would be to have it terminate on any error, including Sys, but this one may be tricky to handle, so I understand. It may just be that some documentation should be added on what to clean up manually.
Full log here: https://github.com/evamaxfield/phd-infrastructures/actions/runs/2321863887
Thank you again!
Issue Analytics
- Created a year ago
- Reactions:2
- Comments:17 (11 by maintainers)
Top GitHub Comments
No worries, if it happens again those commands are helpful for us to diagnose the issue. Consider including
--cloud-startup-script=$(echo 'echo "$(curl https://github.com/'"$GITHUB_ACTOR"'.keys)" >> /home/ubuntu/.ssh/authorized_keys' | base64 -w 0)
for easy access to the instance for debugging.
Thanks, just the cml log was enough in this case.
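The quoting in the --cloud-startup-script value suggested above is easy to get wrong, so here is a minimal sketch of how it could be built and checked locally before launching an instance. It assumes GNU coreutils base64 (for the -w 0 flag). The STARTUP_SCRIPT and DECODED variable names and the octocat fallback are illustrative only and do not come from the thread; inside GitHub Actions, GITHUB_ACTOR is set automatically.

```shell
# Hypothetical fallback so the sketch also runs outside GitHub Actions,
# where GITHUB_ACTOR is normally provided by the runner environment.
GITHUB_ACTOR="${GITHUB_ACTOR:-octocat}"

# Same construction as in the comment above: the local shell expands
# $GITHUB_ACTOR now, while the single-quoted $(curl ...) stays literal
# and would only run later, on the instance itself.
STARTUP_SCRIPT=$(echo 'echo "$(curl https://github.com/'"$GITHUB_ACTOR"'.keys)" >> /home/ubuntu/.ssh/authorized_keys' | base64 -w 0)

# Decode the value to inspect the exact script the instance would receive.
DECODED=$(echo "$STARTUP_SCRIPT" | base64 -d)
echo "$DECODED"
```

If the decoded line looks right, the same value can presumably be passed as --cloud-startup-script="$STARTUP_SCRIPT" to the command that provisions the instance.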