Reading backend while it is being written sometimes throws an error
General information:
- emcee version: 3.0.2
- platform: Ubuntu 18.04
- installation method (pip/conda/source/other?): conda
Problem description:
Expected behavior:
The backend (HDF5 file) can be read with no errors while the chain is running and the backend is being written.
Actual behavior:
The process writing to the backend sometimes raises an error when another process is trying to read the HDF5 file. The error, copied from the shell, is:
Traceback (most recent call last):
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/h5py/_hl/files.py", line 202, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 96, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/mazzi/Documenti/DOTTORATO/Progetti/sfhchain/code/mcmc.py", line 741, in <module>
    mcmc(settings)
  File "/home/mazzi/Documenti/DOTTORATO/Progetti/sfhchain/code/mcmc.py", line 296, in mcmc
    for sample in sampler.sample(pos[region_idx, :, :], iterations=STEPS, skip_initial_state_check=True, progress=False):
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/emcee/ensemble.py", line 351, in sample
    self.backend.save_step(state, accepted)
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/emcee/backends/hdf.py", line 206, in save_step
    with self.open("a") as f:
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/emcee/backends/hdf.py", line 67, in open
    f = h5py.File(self.filename, mode)
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/mazzi/miniconda3/envs/pylegal/lib/python3.9/site-packages/h5py/_hl/files.py", line 204, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 116, in h5py.h5f.create
OSError: Unable to create file (unable to open file: name = 'results/DEBUG/chain-allstars_0000.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
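Reading the two tracebacks together: the frames in h5py/_hl/files.py show mode "a" first calling h5f.open with ACC_RDWR and, when that raises, falling back to h5f.create with ACC_EXCL. The first call fails because the file is locked (errno 11), and the fallback then fails because the file already exists (errno 17), which is why the last message is the rather misleading 'File exists'. The sketch below imitates that sequence with the standard library only. It assumes a Linux system and that the reader's lock behaves like an advisory flock; open_append_like and demo are hypothetical names for illustration, not h5py functions.

```python
import fcntl
import os
import tempfile


def open_append_like(path):
    """Imitate the two-step "a" open seen in the traceback (illustrative only)."""
    try:
        # Step 1: open read/write and take a non-blocking exclusive lock.
        fd = os.open(path, os.O_RDWR)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            raise
        return fd
    except OSError:
        # Step 2: fall back to exclusive create, which fails if the file exists.
        return os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL)


def demo():
    """Return (final errno, errno of the error that triggered the fallback)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "backend.h5")
        open(path, "wb").close()
        # Stand-in for the reader: a second descriptor holding a shared lock.
        reader_fd = os.open(path, os.O_RDONLY)
        fcntl.flock(reader_fd, fcntl.LOCK_SH)
        try:
            open_append_like(path)
        except OSError as exc:
            return exc.errno, exc.__context__.errno
        finally:
            os.close(reader_fd)
```

On Linux, demo() returns (17, 11), the same errno pair as in the traceback. The flags O_RDWR | O_CREAT | O_EXCL also add up to 0xc2 there, which matches the o_flags = c2 in the last error message.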
What have you tried so far?:
I tried setting read_only=True when instantiating the HDFBackend in the script that tries to read the backend, but the problem was not solved.
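One more thing that could be tried on the writer side is to retry the failing open a few times, since errno 11 ('Resource temporarily unavailable') suggests the lock is only held briefly. Below is a minimal sketch of such a retry helper; retry_on_oserror and its parameters are hypothetical names, and this has not been tested against emcee itself.

```python
import time


def retry_on_oserror(func, attempts=5, delay=0.1):
    """Call func(); on OSError wait and retry, up to `attempts` calls in total."""
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            # Give up and re-raise after the last allowed attempt.
            if attempt == attempts - 1:
                raise
            # Wait a little longer after each failure.
            time.sleep(delay * (attempt + 1))
```

To use it with emcee, one option (an assumption, not something emcee provides) would be a subclass of HDFBackend whose open method, the one at hdf.py line 67 in the traceback, calls the parent's open through this helper.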
Minimal example:
Run a chain using writer.py and read multiple times with reader.py. After a few tries the error should appear.
- writer.py
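import time
import emcee
import numpy as np
def lnprob(x):
    # Flat log-probability; the sleep slows each call so the chain runs long enough to be read.
    time.sleep(0.01)
    return 0.
nwalkers = 100
nsteps = 10000
backend = emcee.backends.HDFBackend('backend.h5')
# Clear the file and set up storage for nwalkers walkers in 1 dimension.
backend.reset(nwalkers, 1)
sampler = emcee.EnsembleSampler(nwalkers, 1, lnprob, backend=backend)
# Start all walkers close to 1 with a small random scatter.
pos0 = np.ones(nwalkers) + ((np.random.random(nwalkers) - 0.5) * 2e-3)
print(pos0.shape)
sampler.run_mcmc(pos0[:, None], nsteps, progress=True)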
- reader.py
import emcee
backend = emcee.backends.HDFBackend('backend.h5', read_only=True)
chain = backend.get_chain()
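A possible reader-side workaround is to never open the live file through HDF5 at all: copy it with plain file I/O and read the copy. The sketch below assumes that a plain copy does not take the HDF5 lock; snapshot_copy and get_chain_from_snapshot are hypothetical helper names. A copy taken in the middle of a write may be incomplete, so the reader can still fail, but the writer should not be affected.

```python
import os
import shutil
import tempfile


def snapshot_copy(path):
    """Copy `path` to a fresh temporary file and return the copy's path."""
    suffix = os.path.splitext(path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    # Plain byte copy; the HDF5 library is not involved here.
    shutil.copyfile(path, tmp_path)
    return tmp_path


def get_chain_from_snapshot(path):
    """Read the chain from a temporary copy instead of the live backend file."""
    import emcee  # imported here so snapshot_copy works without emcee installed

    tmp_path = snapshot_copy(path)
    try:
        backend = emcee.backends.HDFBackend(tmp_path, read_only=True)
        return backend.get_chain()
    finally:
        os.remove(tmp_path)
```

In reader.py this would replace the last two lines with chain = get_chain_from_snapshot('backend.h5').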
Edit for the sake of completeness: while the example above does not use multiprocessing, in my actual code I do use it. I see the error both with and without multiprocessing.
Issue Analytics
- State:
- Created 2 years ago
- Comments:8 (3 by maintainers)

@Thalos12: great! I’d be happy to review such a PR!
@dfm thank you for pointing this out, I totally missed it… I edited my code accordingly, and it turns out it's just not waiting for the master process to finish. I also have to define the backends and a few other things inside the pool. Now the following code is working:
Just documenting this in case other new users run into the same problem with MPIPool.