Clean Command Dependency Problems
There were two problems I came across with the clean command:
- ‘-c’ option doesn’t pick up task dependencies where the target of one task is a file dependency of another task.
- The clean execution order can be incorrect if more than one task has a dependency on the same task.
Below is a tarball of a simple example project that demonstrates the issues. doit.tar.gz
There are 4 tasks and 1 group task.
Tasks:
- my_grp: Group task for [‘tst’, ‘t2’]
- tst: Creates a file, task_dep for [‘setup’]
- t2: Creates a file, “inferred” task_dep for [‘t1’] due to target of ‘t1’ used as a file_dep
- t1: Creates a file, task_dep for [‘setup’]
- setup: Creates the output directory where all targets are stored
DOIT_CONFIG is set to {'default_tasks': ['my_grp']}.
After running doit on the dodo.py file all tasks are executed, but when I dry-run or even run doit clean things just seem wrong.
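Since the tarball is not reproduced here, the following is only a hypothetical reconstruction of a dodo.py matching the task list above. The build directory name, the file names, and the helpers make_dir and write_file are assumptions for illustration, not the contents of the original attachment.

```python
# dodo.py: hypothetical reconstruction; directory and file names are assumptions
import os

OUT_DIR = "build"  # assumed output directory

DOIT_CONFIG = {"default_tasks": ["my_grp"]}


def make_dir(path):
    """Create the output directory if it does not exist yet."""
    os.makedirs(path, exist_ok=True)


def write_file(path):
    """Create (or overwrite) a small marker file."""
    with open(path, "w") as handle:
        handle.write("generated\n")


def task_setup():
    # The directory itself is the target, so it can only be removed
    # once every file inside it has been cleaned.
    return {"actions": [(make_dir, [OUT_DIR])], "targets": [OUT_DIR], "clean": True}


def task_t1():
    target = os.path.join(OUT_DIR, "t1.txt")
    return {"actions": [(write_file, [target])], "targets": [target],
            "task_dep": ["setup"], "clean": True}


def task_t2():
    # No explicit task_dep: the dependency on t1 is implied by file_dep.
    target = os.path.join(OUT_DIR, "t2.txt")
    return {"actions": [(write_file, [target])],
            "file_dep": [os.path.join(OUT_DIR, "t1.txt")],
            "targets": [target], "clean": True}


def task_tst():
    target = os.path.join(OUT_DIR, "tst.txt")
    return {"actions": [(write_file, [target])], "targets": [target],
            "task_dep": ["setup"], "clean": True}


def task_my_grp():
    # Group task: no actions of its own, only dependencies.
    return {"actions": None, "task_dep": ["tst", "t2"]}
```

In this sketch the output directory is a target of setup, which is why setup has to be cleaned after every task that writes into it.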
Clean Options:
- doit clean -n [-c]
  - Should clean default tasks and dependencies
  - Cleaned Tasks (in order): [‘setup’, ‘t2’, ‘tst’] WRONG
    - Wrong order: ‘setup’ should have been last
    - Missing task ‘t1’
- doit clean -n tst
  - Should clean ‘tst’ only
  - Cleaned Tasks (in order): [‘tst’] GOOD
- doit clean -n -c tst
  - Should clean ‘tst’ and dependencies
  - Cleaned Tasks (in order): [‘setup’, ‘tst’] WRONG
    - Wrong order: ‘setup’ should have been last
- doit clean -n -c t2
  - Should clean ‘t2’ and dependencies
  - Cleaned Tasks (in order): [‘t2’] WRONG
    - Missing task ‘t1’ and task ‘setup’
- doit clean -n -c t1
  - Should clean ‘t1’ and dependencies
  - Cleaned Tasks (in order): [‘setup’, ‘t1’] WRONG
    - Wrong order: ‘setup’ should have been last
- doit clean -n -a
  - Should clean all the tasks
  - Cleaned Tasks (in order): [‘setup’, ‘t1’, ‘t2’, ‘tst’] WRONG
    - Wrong order: ‘t2’ and ‘tst’ should be first and ‘setup’ should be last
    - Much like the implied ‘-c’ when using default tasks, it seems like -a should also clean based on dependency order, i.e. implied ‘-c’
- doit clean -n -a -c
  - Should clean all the tasks with dependency order
  - Cleaned Tasks (in order): [‘t2’, ‘tst’, ‘setup’, ‘t1’] WRONG
    - Wrong order: ‘setup’ should be last
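The ordering expected in the cases above amounts to cleaning in the reverse of execution order, so that each task is cleaned before anything it depends on. A minimal sketch of that idea in plain Python follows. It is not doit's implementation: the DEPS table, execution_order and clean_order are hypothetical names, the t2 to t1 edge is entered by hand as the inferred dependency, and the graph is assumed to be acyclic.

```python
def execution_order(deps, selected):
    """Depth-first post-order: every task appears after its dependencies."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        order.append(name)

    for name in selected:
        visit(name)
    return order


def clean_order(deps, selected):
    """Reverse of execution order: dependents are cleaned before dependencies."""
    return list(reversed(execution_order(deps, selected)))


# Hypothetical dependency table for the example project.
DEPS = {
    "my_grp": ["tst", "t2"],
    "tst": ["setup"],
    "t2": ["t1"],      # inferred: a target of t1 is a file_dep of t2
    "t1": ["setup"],
    "setup": [],
}

print(clean_order(DEPS, ["tst"]))     # ['tst', 'setup']
print(clean_order(DEPS, ["t2"]))      # ['t2', 't1', 'setup']
print(clean_order(DEPS, ["my_grp"]))  # ['my_grp', 't2', 't1', 'tst', 'setup']
```

Because each task is visited only once, ‘setup’ appears a single time and always ends up last, regardless of how many tasks depend on it.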
Code
I haven’t had a chance to analyze all the code yet, but what I have found revolves around the to_clean.reverse() processing. First, before this line, the to_clean list needs to be filtered to remove duplicate tasks while maintaining the order of first occurrence, e.g. for Python 3.6+, to_clean = list(dict.fromkeys(to_clean)). It may be possible to do the same with an OrderedDict for other versions of Python. This will address attempt (7) above (doit clean -n -a -c). Another problem is that, depending on the path taken to build to_clean, the list is already reversed in some instances. Lastly, tasks_and_deps_iter() doesn’t seem to work correctly to pick up implied task dependencies like the one between ‘t1’ and ‘t2’.
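The two ideas in this paragraph can be sketched in isolation. The snippet below is an illustration under assumptions, not doit's code: the to_clean contents are a made-up list in execution order with a duplicate, and dedupe_keep_first and inferred_task_deps are hypothetical helpers. Note that dict insertion order is a language guarantee from Python 3.7 and a CPython implementation detail in 3.6.

```python
def dedupe_keep_first(names):
    """Drop repeated names, keeping the position of each first occurrence."""
    return list(dict.fromkeys(names))


def inferred_task_deps(tasks):
    """Map each task to the tasks whose targets appear in its file_dep."""
    producer = {}
    for name, spec in tasks.items():
        for target in spec.get("targets", []):
            producer[target] = name
    inferred = {}
    for name, spec in tasks.items():
        found = [producer[path] for path in spec.get("file_dep", [])
                 if path in producer]
        inferred[name] = dedupe_keep_first(found)
    return inferred


# Hypothetical to_clean list in execution order, with 'setup' collected twice.
to_clean = ["setup", "t1", "t2", "setup", "tst"]
to_clean = dedupe_keep_first(to_clean)
to_clean.reverse()
print(to_clean)  # ['tst', 't2', 't1', 'setup']

# Hypothetical task specs: t2 reads the file that t1 produces.
TASKS = {
    "t1": {"targets": ["build/t1.txt"]},
    "t2": {"file_dep": ["build/t1.txt"], "targets": ["build/t2.txt"]},
}
print(inferred_task_deps(TASKS))  # {'t1': [], 't2': ['t1']}
```

Deduplicating before the reverse matters: the first occurrence in execution order is the earliest point a dependency is needed, so after reversing it becomes the latest point at which it is cleaned.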
Issue Analytics
- Created 6 years ago
- Comments:9 (5 by maintainers)
Top GitHub Comments
@leftink I basically re-wrote the clean command. Everything should work now the way you expected 😀 Let me know if I missed anything.
Also note that I fixed the problem with sub-tasks. I removed the attribute Task.is_subtask and added Task.subtask_of, where the value is a string with the name of the task to which this sub-task belongs.

Another example of a clean issue: dodo.py.gz
This example is simple and demonstrates that too much is getting cleaned.
It is creating a directory structure, with a task creating each level of the full path. The intent is to be able to “clean” at any level.
The relevant line is to_clean.extend(subtasks_iter(tasks, task)). subtasks_iter() needs to check the task, since the function purely pulls from task_dep, and a task_dep entry is not necessarily a sub-task, if my understanding is correct.
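The kind of check meant here could look like the sketch below. It uses a hypothetical stand-in Task dataclass rather than doit's own class, borrows only the subtask_of attribute name mentioned in the maintainer's comment, and subtasks_only is an invented name. The point is only to show the difference between "any task_dep" and "a task_dep that is actually a sub-task".

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for illustration, not doit's Task class."""
    name: str
    task_dep: List[str] = field(default_factory=list)
    subtask_of: Optional[str] = None  # name of the parent task, if any


def subtasks_only(tasks: Dict[str, Task], task: Task) -> Iterator[Task]:
    """Yield only the dependencies that are sub-tasks of the given task."""
    for dep_name in task.task_dep:
        dep = tasks[dep_name]
        if dep.subtask_of == task.name:
            yield dep


# Assumed layout: lvl2 depends on lvl1 (a plain task_dep, not a sub-task),
# while gen owns two real sub-tasks.
tasks = {
    "lvl1": Task("lvl1"),
    "lvl2": Task("lvl2", task_dep=["lvl1"]),
    "gen": Task("gen", task_dep=["gen:a", "gen:b"]),
    "gen:a": Task("gen:a", subtask_of="gen"),
    "gen:b": Task("gen:b", subtask_of="gen"),
}

print([t.name for t in subtasks_only(tasks, tasks["lvl2"])])  # []
print([t.name for t in subtasks_only(tasks, tasks["gen"])])   # ['gen:a', 'gen:b']
```

With a filter like this, cleaning ‘lvl2’ alone would not pull in ‘lvl1’, which matches the intent of being able to clean at any level.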