SWMR cannot be turned off once set to True; Allow forceful switching off

  • Operating System: import h5py; print(h5py.version.info)
  • Python version: 3.7.5 (default, Oct 25 2019, 15:51:11)
  • Where Python was acquired: Miniconda3
  • h5py version: 2.10.0
  • HDF5 version: 1.10.5
  • numpy: 1.17.3

When using SWMR, are you limited to creating datasets only at file creation time? Or are you supposed to be able to turn off SWMR when the file is not being accessed by any other process?

Turning off SWMR is not possible (Jupyter Notebook, kernel restarted):

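import h5py
import numpy as np

arr = np.array([.4, -.1, -.5, 8])
h5 = h5py.File("swmr_test.h5", 'w', libver='latest')
h5["np"] = arr        # create the dataset before enabling SWMR
h5.swmr_mode = True   # enabling SWMR works
h5.swmr_mode = False  # raises the ValueError shown below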

ValueError:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-22f0396ab8ff> in <module>
----> 1 h5.swmr_mode = False

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

/opt/miniconda3/envs/audio_tester/lib/python3.7/site-packages/h5py/_hl/files.py in swmr_mode(self, value)
    312                 self._swmr_mode = True
    313             else:
--> 314                 raise ValueError("It is not possible to forcibly switch SWMR mode off.")
    315 
    316     def __init__(self, name, mode=None, driver=None,

ValueError: It is not possible to forcibly switch SWMR mode off.

If this is intended, it would be nice to clarify this in the documentation.
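As a point of reference, here is a minimal sketch of the close-and-reopen workaround that comes up later in this thread. It assumes that closing the file is the way to leave SWMR mode, that no other process has the file open while it is reopened, and that readers are able to reopen it afterwards. The helper name add_dataset_by_reopening and the dataset name "more" are made up for illustration; they are not part of h5py.

```python
import os
import tempfile

import h5py
import numpy as np


def add_dataset_by_reopening(path, name, data):
    """Reopen the file without SWMR, add one dataset, then re-enable SWMR."""
    # Mode "a" opens the file read/write without SWMR. Assumption: no other
    # process currently holds the file open.
    with h5py.File(path, "a", libver="latest") as f:
        f[name] = data       # create the new dataset while SWMR is off
        f.swmr_mode = True   # switch SWMR back on for any further appends
        f.flush()


# Use a temporary directory so the sketch does not touch existing files.
path = os.path.join(tempfile.mkdtemp(), "swmr_test.h5")

with h5py.File(path, "w", libver="latest") as h5:
    h5["np"] = np.array([0.4, -0.1, -0.5, 8.0])
    h5.swmr_mode = True
# Leaving the "with" block closes the file, which ends SWMR for this writer.

add_dataset_by_reopening(path, "more", np.arange(3))

with h5py.File(path, "r") as h5:
    names = sorted(h5.keys())
```

This does not switch SWMR off on a live handle; it only shows that a fresh handle opened in mode "a" can create datasets again. Any readers would still need their own logic to close and reopen the file.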


I know you don’t manage HDF5 itself, but could you help me with my thought process?

If with SWMR you’re limited to creating datasets only at file creation time, isn’t it useless for almost any real environment? At least in terms of it being a database. In my case I’m working with Apache Airflow, meaning that independent processes at various times will mostly read, but sometimes write to a database. Since I’m dealing with audio data, columnar storage file formats are not the solution.

At first I thought Parallel HDF5 with MPI would be the solution here. However, you need to execute it with mpiexec, meaning you need to coordinate the processes and you cannot trigger it from something like Airflow. SWMR also seemed good, because I assumed that to create new datasets you only needed to turn off SWMR for short moments (a small risk of blocking readers is acceptable in my case). Then data could be added without interrupting readers, and without coordination. But this doesn’t seem to be how SWMR works.

Is it possible from h5py’s side to implement a forceful method to set the file back to h5.swmr_mode = False? Even if this would block readers, it would make SWMR much more useful.


Related issue: https://github.com/h5py/h5py/issues/712

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

NumesSanguis commented, Dec 5, 2019

Thank you for the detailed response. I have a better understanding of how access to HDF5 files in a multi-process setting works.

Corrupted files are indeed not something to risk. I’m not familiar with the exact implementation, but what I imagined was that with h5.swmr_mode = False, only subsequent read attempts would fail. Perhaps some mechanism in SWMR could check at every read attempt whether the file is still in SWMR mode?

Since it is only readers that will be blocked when it is set to False, those don’t have the power to corrupt anything, right? Although I can imagine that the data being pulled might be incorrect… which is also not desirable.

takluyver commented, Dec 8, 2019

It doesn’t sound like HDF5 is going to work well for what you’re trying to do. SWMR doesn’t really make it easy to have multiple writers, even if only one process needs to write at a time - you’d need some external mechanism to coordinate which process can be the writer, and some clunky closing/reopening of files when the writer changes. SWMR is really meant for scenarios where you have one fixed data producer and one or more plain consumers.

I’m going to close this as I don’t think there’s anything to fix in h5py: you’re just hitting the limitations of the underlying HDF5 feature.
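One possible shape for the "external mechanism" mentioned above is an advisory lock file that a process must hold before it opens the HDF5 file for writing. The sketch below is only an illustration under stated assumptions: it uses the standard-library fcntl module (Unix only), the helper name exclusive_writer and the lock file name are hypothetical, and the actual HDF5 write is left as a placeholder. It coordinates writers only; it does nothing for readers.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_writer(lock_path):
    """Hold an exclusive advisory lock for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock on this file.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical lock file kept next to the HDF5 file.
lock_path = os.path.join(tempfile.mkdtemp(), "swmr_test.h5.lock")
events = []

with exclusive_writer(lock_path):
    # Placeholder for the real work: open the HDF5 file, write, close it.
    events.append("writing")
events.append("released")
```

Because the lock is advisory, it only helps if every process that writes to the file goes through the same helper.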
