Solving the simultaneous singularity build using flock
Further to the discussion in #4635, I’ve been thinking about a more elegant way to solve the awkwardness of running a scatter while using Singularity on HPC. The major issues include:
- We run N `singularity build`s for a scatter over N items, which wastes time and CPU, and writing N large images to the filesystem simultaneously will presumably challenge the filesystem.
- We have to store N `.sif` images, which wastes space while the job is running.
- We have to delete the image after each `singularity build`.
My first proposed solution was #4673, which would solve the problem but requires a pull request to introduce a new hook into Cromwell, and it doesn’t look like the Cromwell team have been able to prioritise this.
My new thought is that we could use file locks (e.g. `flock` on Linux) to deal with this issue: the first worker to run creates a file lock, and all subsequent workers encounter that lock and wait until it’s released before attempting to build or run the image.
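The basic pattern is worth spelling out before wiring it into the config. This is a minimal sketch with hypothetical `/tmp` paths (on the cluster it would be a shared filesystem path): every worker opens the same lock file on file descriptor 200, `flock` blocks until the current holder’s subshell exits, and a marker check ensures the expensive work runs only once.

```shell
# Minimal flock pattern (hypothetical /tmp paths for illustration).
LOCKFILE=/tmp/image_build.lock
MARKER=/tmp/image_build.done
rm -f "$MARKER"

(
    # Blocks here until no other process holds the lock on fd 200
    flock --exclusive 200
    # Only the first worker to acquire the lock does the expensive work;
    # later workers find the marker and skip it
    if [ ! -f "$MARKER" ]; then
        echo "built once" > "$MARKER"
    fi
) 200>"$LOCKFILE"

cat "$MARKER"
```

The lock is released automatically when the subshell closes fd 200, so no explicit unlock step is needed even if the build fails.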
For example, we currently recommend this `submit-docker` configuration:
```
submit-docker = """
  # Ensure Singularity is loaded if it's installed as a module
  module load Singularity/3.0.1

  # Build the Docker image into a Singularity image
  DOCKER_NAME=$(sed -e 's/[^A-Za-z0-9._-]/_/g' <<< ${docker})
  IMAGE=${cwd}/$DOCKER_NAME.sif
  if [ ! -f $IMAGE ]; then
    singularity pull $IMAGE docker://${docker}
  fi

  # Submit the script to SLURM
  sbatch \
    --wait \
    -J ${job_name} \
    -D ${cwd} \
    -o ${cwd}/execution/stdout \
    -e ${cwd}/execution/stderr \
    -t ${runtime_minutes} \
    ${"-c " + cpus} \
    --mem-per-cpu=${requested_memory_mb_per_core} \
    --wrap "singularity exec --bind ${cwd}:${docker_cwd} $IMAGE ${job_shell} ${script}"
"""
```
I’m instead proposing the configuration below. Note the use of a single shared image directory (`/singularity_cache` in this example), and the use of `flock` to ensure the submit scripts aren’t competing with each other:
```
submit-docker = """
  # Ensure Singularity is loaded if it's installed as a module
  module load Singularity/3.0.1

  # Determine the filepath of the image
  DOCKER_NAME=$(sed -e 's/[^A-Za-z0-9._-]/_/g' <<< ${docker})
  IMAGE=/singularity_cache/$DOCKER_NAME.sif

  # Take an exclusive lock, so only the first job builds the image
  (
    flock --exclusive 200
    # Build the image
    if [ ! -f $IMAGE ]; then
      singularity pull $IMAGE docker://${docker}
    fi
  # Lock on the sanitised name: $IMAGE is an absolute path and can't be
  # nested under /var/lock directly
  ) 200>/var/lock/$DOCKER_NAME.lock

  # Submit the script to SLURM
  sbatch \
    --wait \
    -J ${job_name} \
    -D ${cwd} \
    -o ${cwd}/execution/stdout \
    -e ${cwd}/execution/stderr \
    -t ${runtime_minutes} \
    ${"-c " + cpus} \
    --mem-per-cpu=${requested_memory_mb_per_core} \
    --wrap "singularity exec --bind ${cwd}:${docker_cwd} $IMAGE ${job_shell} ${script}"
"""
```
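Since the cluster is down, here’s a hypothetical local smoke test of just the locking logic (the paths, `worker` function, and the `sleep` stand-in for a slow `singularity pull` are all made up for illustration): two background workers race for the lock, and the build step should run exactly once.

```shell
LOCK=/tmp/demo.lock
COUNT=/tmp/demo.count
IMAGE=/tmp/demo.sif
rm -f "$COUNT" "$IMAGE"
touch "$COUNT"

worker() {
    (
        flock --exclusive 200
        if [ ! -f "$IMAGE" ]; then
            sleep 1                     # stand-in for a slow singularity pull
            echo fake-image > "$IMAGE"
            echo built >> "$COUNT"      # record that a build happened
        fi
    ) 200>"$LOCK"
}

# Two workers race for the lock; the loser waits, then sees the image exists
worker & worker &
wait
echo "builds: $(wc -l < "$COUNT" | tr -d ' ')"
```

The second worker blocks on `flock` while the first sleeps, and by the time it acquires the lock the image file exists, so it skips the build.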
I haven’t tested this on our HPC cluster (it’s down for maintenance, sadly!), but I’m interested in whether this makes sense as something we could add to the containers tutorial to recommend to users. @illusional, @vsoch, @geoffjentry
Issue Analytics
- Created: 4 years ago
- Reactions: 1
- Comments: 36 (12 by maintainers)
Top GitHub Comments
I did not know of this thread. At our institute we have solved this differently. We use `singularity exec` and no specific pull command. This will try to locate the image in the cache, which is located in `SINGULARITY_CACHEDIR` (env variable). If it is already there it will use it; if not, it will download it. This will lead to race conditions if it is used in a scatter. We use https://github.com/biowdl/prepull-singularity to pull the images beforehand, so no race conditions occur.
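A minimal pre-pull sketch along those lines (the cache path and image list are hypothetical, and it skips gracefully when Singularity isn’t on `PATH` — this is the idea, not the prepull-singularity tool itself):

```shell
# Pre-pull every image the workflow needs, serially, on the login node,
# so that parallel tasks only ever read from a warm cache.
export SINGULARITY_CACHEDIR=/singularity_cache   # shared cache (example path)

if command -v singularity >/dev/null 2>&1; then
    for img in ubuntu:18.04 python:3.7; do       # hypothetical image list
        singularity exec "docker://$img" true    # populates the cache if missing
    done
    echo "cache warmed"
else
    echo "singularity not found, skipping pre-pull"
fi
```

Running this once before submitting the workflow means every scattered task finds the image already in `SINGULARITY_CACHEDIR` and no two tasks ever pull concurrently.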
I am also thinking of adding a `docker_pull` option to the config, so you can do `singularity exec {image} echo done!` or something similar to make sure the cache is populated at workflow initialization time. I have no ETA on this though; for now the prepull-singularity script works.

I hope you are doing the pull on a login / dev node and not on something running massively in parallel? Or that the shub:// URI is interchangeable with docker:// or library://? Doing exec/run/pull in parallel is what led to devastating events in July that warranted adding extreme limits for all users to the server, and almost was the end of Singularity Hub. Ideally this really needs to be done with just one pull, and done before anything is run in parallel.