Checkpoints unable to write to local disk
Description
A clear description of the bug
LocalResultHandler isn’t able to write to the folder that’s created when checkpoint=True.
I’m just guessing that Prefect creates the temporary directory with the intention of saving the serialized object there, but perhaps it tries to write the serialized object before the temporary directory exists? Not sure!
Expected Behavior
What did you expect to happen instead?
https://docs.prefect.io/core/concepts/persistence.html#checkpointing
Reproduction
A minimal example that exhibits the behavior.
test_cache.py:
from prefect import task, Flow
from prefect.engine.result_handlers import LocalResultHandler


@task(checkpoint=True, result_handler=LocalResultHandler(dir="~/.prefect"))
def hello():
    return 'hello'


with Flow('test checkpoint') as flow:
    h = hello()

flow.run()
$ python test_cache.py
[2020-01-07 20:20:30,905] INFO - prefect.FlowRunner | Beginning Flow run for 'test checkpoint'
[2020-01-07 20:20:30,908] INFO - prefect.FlowRunner | Starting flow run.
[2020-01-07 20:20:30,916] INFO - prefect.TaskRunner | Task 'print_df': Starting task run...
[2020-01-07 20:20:30,917] ERROR - prefect.TaskRunner | Unexpected error: FileNotFoundError(2, 'No such file or directory')
Traceback (most recent call last):
File "/Users/bryanwhiting/venvs/py37/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
new_state = method(self, state, *args, **kwargs)
File "/Users/bryanwhiting/venvs/py37/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 907, in get_task_run_state
state._result.store_safe_value()
File "/Users/bryanwhiting/venvs/py37/lib/python3.7/site-packages/prefect/engine/result.py", line 90, in store_safe_value
value = self.result_handler.write(self.value)
File "/Users/bryanwhiting/venvs/py37/lib/python3.7/site-packages/prefect/engine/result_handlers/local_result_handler.py", line 58, in write
fd, loc = tempfile.mkstemp(prefix="prefect-", dir=self.dir)
File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/tempfile.py", line 340, in mkstemp
return _mkstemp_inner(dir, prefix, suffix, flags, output_type)
File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/tempfile.py", line 258, in _mkstemp_inner
fd = _os.open(file, flags, 0o600)
FileNotFoundError: [Errno 2] No such file or directory: '~/.prefect/prefect-bdz439pq'
[2020-01-07 20:20:30,946] INFO - prefect.TaskRunner | Task 'print_df': finished task run for task with final state: 'Failed'
[2020-01-07 20:20:30,947] INFO - prefect.FlowRunner | Flow run FAILED: some reference tasks failed.
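One thing the traceback does show: the failing path is the literal string '~/.prefect/prefect-bdz439pq', so the tilde reached tempfile.mkstemp unexpanded. mkstemp does not expand "~" itself; it treats it as an ordinary directory name relative to the working directory. The sketch below is a possible workaround under that reading, not Prefect's own fix. The helper names resolve_result_dir and write_checkpoint are hypothetical, and only the standard library is used.

```python
import os
import tempfile


def resolve_result_dir(path):
    """Expand '~' and make sure the directory exists (hypothetical helper)."""
    expanded = os.path.abspath(os.path.expanduser(path))
    os.makedirs(expanded, exist_ok=True)
    return expanded


def write_checkpoint(data, directory):
    """Write bytes to a uniquely named file, using the same mkstemp call
    that appears in the traceback (hypothetical helper, not Prefect code)."""
    fd, loc = tempfile.mkstemp(prefix="prefect-", dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return loc


if __name__ == "__main__":
    result_dir = resolve_result_dir("~/.prefect")
    print(write_checkpoint(b"hello", result_dir))
```

Applied to the reproduction, this would presumably mean passing an already expanded path, for example LocalResultHandler(dir=os.path.expanduser("~/.prefect")), and making sure that directory exists before the flow runs.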
Environment
Any additional information about your environment
- Python 3.7.5
- prefect 0.8.1
I’m loving the new releases of Prefect! You guys are definitely heading in the right direction. Keep it up!!!
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (2 by maintainers)
Fascinating!! Thanks for the reply. I’m super excited to see what the checkpointing solution looks like. That’s going to be a definite game changer/airflow killer! Great work and thanks for the detailed response.
Hi @mivade - yeah, we’re aiming to release the new Results API later next week (in 0.11.0), which will include a more user-friendly caching mechanism. You’re more than welcome to open an issue in the meantime, but I’ll be most curious to hear whether the new API satisfies your use case, so it might be worthwhile to hold off until it’s released!