Folder deletion in pathTools appears to be accounting for potential parallelism oddly

In the implementation of cleanPath() there is this fun chunk of code:

https://github.com/terrapower/armi/blob/54664d800db9eec1f6dff584d5eda65329048d49/armi/utils/pathTools.py#L266-L277

The only reason I can imagine for attempting the deletion in a loop with delays and a broad try/except is for scenarios in which the folder deletion is being attempted in parallel and/or is targeting a directory structure on a shared network drive. If this is the case, then having every processor try to delete the folder and hoping for the best is a pretty sketchy way to go about it. If this is supposed to be possible in parallel, we should address the complexities of air-traffic control explicitly in MPI.
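
For readers who do not follow the link, the pattern in question has roughly the following shape. This is a hypothetical sketch, not the ARMI source: the function name `remove_with_retries` and the parameters `attempts` and `delay` are made up for illustration.

```python
import os
import shutil
import tempfile
import time


def remove_with_retries(path, attempts=3, delay=0.1):
    """Hypothetical sketch: try to delete a directory a few times, swallowing errors."""
    for _ in range(attempts):
        if not os.path.exists(path):
            break
        try:
            shutil.rmtree(path)
        except Exception:
            # Broad except: any failure (permissions, a race with another
            # process, network file system lag) is silently retried.
            pass
        time.sleep(delay)
    # True only if the directory is really gone.
    return not os.path.exists(path)


if __name__ == "__main__":
    target = tempfile.mkdtemp()
    print(remove_with_retries(target))
```

Note that nothing in a loop like this coordinates between processes. If several ranks run it against the same path, each one simply races the others.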

This would look something like having only one rank responsible for the deletion, while all others wait until the directory is apparently removed, with a barrier at the end to synchronize. The main concerns here are questions like:

  • do we expect all call sites to be collective? if not, any communication that we do may lead to deadlocks if not all processors in a communicator are trying to cleanPath() at the same time
  • are all processors in a communicator attempting to clear the same path? if not, one rank per desired path deletion will need to be responsible

These aren’t simple questions to answer from within the function, so it is likely that decisions like this should be made from the call site. Something like this is easy enough to do:

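import os
from time import sleep
import armi
from armi.utils.pathTools import cleanPath

if armi.MPI_RANK == 0:
    cleanPath(path)  # only rank 0 performs the deletion
while os.path.exists(path):
    sleep(0.1)  # every rank waits until the directory is gone
armi.MPI_COMM.barrier()  # synchronize before continuing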
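
One weakness of the call-site pattern above is that every rank waits forever if rank 0 fails to remove the directory. A slightly more defensive variant could add a timeout. The sketch below is only an illustration under stated assumptions: `collective_remove`, `timeout`, and `poll` are hypothetical names, `shutil.rmtree` stands in for `cleanPath()`, and the rank and barrier are passed in as arguments so the example runs without MPI. In ARMI these would presumably be `armi.MPI_RANK` and `armi.MPI_COMM.barrier`.

```python
import os
import shutil
import tempfile
import time


def collective_remove(path, rank, barrier, timeout=30.0, poll=0.1):
    """Hypothetical sketch: rank 0 deletes, all ranks wait, then synchronize.

    Must be called by every rank in the communicator, otherwise the
    barrier deadlocks.
    """
    if rank == 0:
        # shutil.rmtree stands in for cleanPath() here.
        shutil.rmtree(path, ignore_errors=True)
    deadline = time.monotonic() + timeout
    while os.path.exists(path) and time.monotonic() < deadline:
        time.sleep(poll)
    removed = not os.path.exists(path)
    # Every rank reaches the barrier, even if the wait timed out.
    barrier()
    return removed


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    # Single-process stand-in: rank 0 with a no-op barrier.
    print(collective_remove(scratch, rank=0, barrier=lambda: None))
```

Returning a flag instead of raising before the barrier keeps the call collective: a rank that times out still reaches the barrier, and the caller decides what to do with a `False` result.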

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
jakehader commented, Sep 12, 2021

This may be a separate issue, but I have experienced problems when writing a file to a drive location and then immediately trying to access it, where I get an OSError. I wonder whether deleting these sleep timers and running some cases would uncover other issues.

0 reactions
john-science commented, Dec 8, 2021

I’d like to try adding separate strategies for when MPI is/isn’t present, but it’s taking me a while to update & run test_mpiActions.py outside of tox. If this needs to be fixed in a more timely manner please feel free to take it.

If you are working on it, I will assign it to you. Fair is fair.
