
Error trying to do chunked, compressed parallel write with MPI


I am trying to do a chunked, compressed parallel write using h5py. My test script is a slightly modified version of the one from the docs:

from mpi4py import MPI
import h5py
import numpy as np

rank = MPI.COMM_WORLD.rank  # The process ID (integer 0-3 for 4-process run)

f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD)

dset = f.create_dataset('test', (4, 1000), dtype='i',
                        chunks=(1, 1000), compression="gzip")
dset[rank] = np.full(1000, rank)

f.close()

I run this via mpirun -np 4 python basic_hdf_write.py. This produces the error:

_frozen_importlib:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
Traceback (most recent call last):
  File "basic_hdf_write.py", line 11, in <module>
    dset[rank] = np.full(1000, rank)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/rigel/ocp/users/ra2697/conda/envs/hdf5_zarr/lib/python3.6/site-packages/h5py/_hl/dataset.py", line 708, in __setitem__
    self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 221, in h5py.h5d.DatasetID.write
  File "h5py/_proxy.pyx", line 132, in h5py._proxy.dset_rw
  File "h5py/_proxy.pyx", line 93, in h5py._proxy.H5PY_H5Dwrite
OSError: Can't write data (Can't perform independent write with filters in pipeline.
    The following caused a break from collective I/O:
        Local causes: independent I/O was requested; datatype conversions were required
        Global causes: independent I/O was requested; datatype conversions were required)
Traceback (most recent call last):
  File "h5py/_objects.pyx", line 193, in h5py._objects.ObjectID.__dealloc__
RuntimeError: Can't decrement id ref count (can't close file, there are objects still open)
Exception ignored in: 'h5py._objects.ObjectID.__dealloc__'
Traceback (most recent call last):
  File "h5py/_objects.pyx", line 193, in h5py._objects.ObjectID.__dealloc__
RuntimeError: Can't decrement id ref count (can't close file, there are objects still open)
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[25903,1],0]
  Exit code:    1
--------------------------------------------------------------------------

Googling the Can't perform independent write with filters in pipeline error led me to some HDF5 forum threads.

It sounds like compressed, parallel writes should definitely work with HDF5 1.10.4 (the version I am using with h5py). However, the h5py docs don’t give any examples of how to use this feature.

I would appreciate any suggestions you may have.
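
For reference, the pattern the comments below experiment with requests collective I/O through h5py's dset.collective context manager and writes an array whose dtype matches the dataset, so neither cause listed in the error (independent I/O, datatype conversion) applies. A minimal sketch of that pattern follows; note that the first comment below reports the error can still appear even with the collective helper on some builds.

# Sketch of the collective-write variant discussed in the comments below.
# Assumes h5py built against a parallel HDF5; on some builds the collective
# helper alone does not make the error go away (see the first comment).
from mpi4py import MPI
import h5py
import numpy as np

rank = MPI.COMM_WORLD.rank

with h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD) as f:
    dset = f.create_dataset('test', (4, 1000), dtype='i',
                            chunks=(1, 1000), compression="gzip")
    with dset.collective:  # request collective rather than independent I/O
        # dtype='i' matches the dataset dtype, so no datatype conversion is needed
        dset[rank] = np.full(1000, rank, dtype='i')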

My h5py was installed from the new conda-forge build with MPI support. I am on Linux.

h5py    2.9.0
HDF5    1.10.4
Python  3.6.7 | packaged by conda-forge | (default, Nov 21 2018, 02:32:25) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
sys.platform    linux
sys.maxsize     9223372036854775807
numpy   1.15.0

I am using Open MPI version 3.1.2.
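
As a quick sanity check, one can confirm that the h5py build being imported was compiled with MPI support and see which HDF5 it links against (the version block above looks like h5py.version.info output):

# Environment check: was this h5py built with MPI, and which HDF5 does it link?
import h5py
print(h5py.version.info)        # h5py, HDF5, Python, and numpy versions in one block
print(h5py.get_config().mpi)    # True only if h5py was built with MPI support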

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 4
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

2 reactions
a6 commented, Jun 26, 2019

I can confirm that this error is still present, even when using the collective helper, with HDF5 1.10.5, h5py 2.9.0, and OpenMPI 3.1.4.

0 reactions
lsawade commented, Nov 30, 2022

Edit 3 – The original does work as long as the offsets are non-zero

It works when every rank’s block is non-empty, i.e., every worker has to write; a worker cannot skip writing to the array.

import numpy as np
from mpi4py import MPI
import h5py


comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N = np.array([2, 1, 3, 2])
Ntot = np.sum(N)
offsets = np.hstack((np.array([0]), np.cumsum(N)))

if size != 4:
    raise ValueError('This was made to be run with 4 processors.')

precision = np.float16

if rank == 0:
    x = np.random.randn(2, 1000).astype(precision)
elif rank == 1:
    x = np.random.randn(1, 1000).astype(precision)
elif rank == 2:
    x = np.random.randn(3, 1000).astype(precision)
elif rank == 3:
    x = np.random.randn(2, 1000).astype(precision)


# dt = h5py.string_dtype(encoding='utf-8', length=10)
with h5py.File('parallel_test.h5', 'w', driver='mpio', comm=comm) as db:

    dset = db.create_dataset(
        'test', (Ntot, 1000), dtype=precision,
        chunks=(1, 1000), compression="lzf")

    with dset.collective:
        if offsets[rank+1]-offsets[rank] > 0:
            print(rank, np.min(x), np.max(x))
            dset[offsets[rank]:offsets[rank+1], :] = x
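
Run this with exactly four ranks, for example mpirun -np 4 python parallel_test.py (the script name here is just a placeholder; the size check raises otherwise). Since the slab write sits inside the collective context, every rank has to reach it, which is presumably why it matters that every worker has something to write.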

Edit 2 – It works if the dataset is not being sliced heterogeneously:

import numpy as np
from mpi4py import MPI
import h5py


comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

with h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD) as f:

    dset = f.create_dataset('test', (4, 1000), dtype='i',
                            chunks=(1, 1000), compression="gzip")

    with dset.collective:
        dset[rank] = np.full(1000, rank, dtype='i')
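
(This variant addresses both causes listed in the original traceback: dset.collective requests collective rather than independent I/O, and the explicit dtype='i' avoids the datatype conversion that np.full's default integer dtype would otherwise require.)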

Edit – it’s not working… I was being silly…


~Just for posterity, I attached code with an added complexity: writing irregularly sized arrays to the same dataset using offsets. There were a couple of hiccups. This seems to work for me without any problem!~

~Quite necessary was the qualifier~

if offsets[rank]-offsets[rank+1] > 0:

~without which the program would just hang [In my use case, I only need to write from some processors].~

Attached code
import numpy as np
from mpi4py import MPI
import h5py


comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N = np.array([3, 4, 0, 2])
Ntot = np.sum(N)
offsets = np.hstack((np.array([0]), np.cumsum(N)))

if size != 4:
    raise ValueError('This was made to be run with 4 processors.')

if rank == 0:
    x = np.random.randn(3, 1000).astype('f')
elif rank == 1:
    x = np.random.randn(4, 1000).astype('f')
elif rank == 2:
    x = np.random.randn(0, 1000).astype('f')
elif rank == 3:
    x = np.random.randn(2, 1000).astype('f')

comm.Barrier()

with h5py.File('parallel_test.h5', 'w', driver='mpio', comm=comm) as db:

    dset = db.create_dataset('test', (Ntot, 1000), dtype='f',
                             chunks=(1, 1000), compression="gzip")
    with dset.collective:
        if offsets[rank]-offsets[rank+1] > 0:
            dset[offsets[rank]:offsets[rank+1], :] = x

~The comm.Barrier() is not really necessary – a remnant of previous iterations.~

~It is important to note that I compiled HDF5 “by hand” using installed cluster gcc, and openmpi.~

h5pcc -showconfig
           SUMMARY OF THE HDF5 CONFIGURATION
            =================================

General Information:
-------------------
                   HDF5 Version: 1.12.2
                  Configured on: Tue Nov 15 13:47:08 EST 2022
                  Configured by: lsawade@traverse.princeton.edu
                    Host system: powerpc64le-unknown-linux-gnu
              Uname information: Linux traverse.princeton.edu 4.18.0-372.32.1.el8_6.ppc64le #1 SMP Fri Oct 7 11:37:39 EDT 2022 ppc64le ppc64le ppc64le GNU/Linux
                       Byte sex: little-endian
             Installation point: /scratch/gpfs/lsawade/SpecfemMagicGF/packages/hdf5/build

Compiling Options:
------------------
                     Build Mode: production
              Debugging Symbols: no
                        Asserts: no
                      Profiling: no
             Optimization Level: high

Linking Options:
----------------
                      Libraries: static, shared
  Statically Linked Executables: 
                        LDFLAGS: -L/usr/local/cuda-11.7/lib64
                     H5_LDFLAGS: 
                     AM_LDFLAGS: 
                Extra libraries: -lz -ldl -lm 
                       Archiver: ar
                       AR_FLAGS: cr
                         Ranlib: ranlib

Languages:
----------
                              C: yes
                     C Compiler: /usr/local/openmpi/4.0.4/gcc/ppc64le/bin/mpicc ( gcc (GCC) 8.5.0 20210514 )
                       CPPFLAGS: 
                    H5_CPPFLAGS: -D_GNU_SOURCE -D_POSIX_C_SOURCE=200809L   -DNDEBUG -UH5_DEBUG_API
                    AM_CPPFLAGS: 
                        C Flags: 
                     H5 C Flags:  -std=c99  -Wall -Wcast-qual -Wconversion -Wextra -Wfloat-equal -Wformat=2 -Winit-self -Winvalid-pch -Wmissing-include-dirs -Wshadow -Wundef -Wwrite-strings -pedantic -Wno-c++-compat -Wlarger-than=2560 -Wlogical-op -Wframe-larger-than=16384 -Wpacked-bitfield-compat -Wsync-nand -Wstrict-overflow=5 -Wno-unsuffixed-float-constants -Wdouble-promotion -Wtrampolines -Wstack-usage=8192 -Wmaybe-uninitialized -Wdate-time -Warray-bounds=2 -Wc99-c11-compat -Wduplicated-cond -Whsa -Wnormalized -Wnull-dereference -Wunused-const-variable -Walloca -Walloc-zero -Wduplicated-branches -Wformat-overflow=2 -Wformat-truncation=1 -Wrestrict -Wattribute-alias -Wcast-align=strict -Wshift-overflow=2 -fstdarg-opt  -s  -Wbad-function-cast -Wimplicit-function-declaration -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpacked -Wpointer-sign -Wpointer-to-int-cast -Wint-to-pointer-cast -Wredundant-decls -Wstrict-prototypes -Wswitch -Wunused-function -Wunused-variable -Wunused-parameter -Wcast-align -Wunused-but-set-variable -Wformat -Wincompatible-pointer-types -Wint-conversion -Wshadow -Wcast-function-type -Wmaybe-uninitialized -Wno-aggregate-return -Wno-inline -Wno-missing-format-attribute -Wno-missing-noreturn -Wno-overlength-strings -Wno-jump-misses-init -Wno-suggest-attribute=const -Wno-suggest-attribute=noreturn -Wno-suggest-attribute=pure -Wno-suggest-attribute=format -Wno-suggest-attribute=cold -Wno-suggest-attribute=malloc -O3
                     AM C Flags: 
               Shared C Library: yes
               Static C Library: yes


                        Fortran: yes
               Fortran Compiler: /usr/local/openmpi/4.0.4/gcc/ppc64le/bin/mpif90 ( Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-targets=powerpcle-linux --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --disable-libmpx --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 --with-cpu-32=power8 --with-tune-32=power8 --with-cpu-64=power8 --with-tune-64=power8 --build=ppc64le-redhat-linux built with gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC))
                  Fortran Flags: 
               H5 Fortran Flags:  -std=f2008  -Waliasing -Wall -Wcharacter-truncation -Wextra -Wimplicit-interface -Wsurprising -Wunderflow -pedantic -Warray-temporaries -Wintrinsics-std -Wimplicit-procedure -Wreal-q-constant -Wfunction-elimination -Wrealloc-lhs -Wrealloc-lhs-all -Wno-c-binding-type -Winteger-division -Wfrontend-loop-interchange   -s -O3
               AM Fortran Flags: 
         Shared Fortran Library: yes
         Static Fortran Library: yes

                            C++: no

                           Java: no


Features:
---------
                     Parallel HDF5: yes
  Parallel Filtered Dataset Writes: yes
                Large Parallel I/O: yes
                High-level library: yes
Dimension scales w/ new references: no
                  Build HDF5 Tests: yes
                  Build HDF5 Tools: yes
                      Threadsafety: no
               Default API mapping: v112
    With deprecated public symbols: yes
            I/O filters (external): deflate(zlib)
                               MPE: 
                     Map (H5M) API: no
                        Direct VFD: no
                        Mirror VFD: no
                (Read-Only) S3 VFD: no
              (Read-Only) HDFS VFD: no
                           dmalloc: no
    Packages w/ extra debug output: none
                       API tracing: no
              Using memory checker: no
   Memory allocation sanity checks: no
            Function stack tracing: no
                  Use file locking: best-effort
         Strict file format checks: no
      Optimization instrumentation: no

Top Results From Across the Web

Crash when writing parallel compressed chunks - HDF Forum
I'm finding crashes when I try to write compressed datasets in parallel with the MPIO driver. I have produced a (fairly simple) test...

A Case Study on Parallel HDF5 Dataset Concatenation ... - arXiv
For parallel write operations, compressed dataset chunks are assigned to an exclusive owner process, and then partial accesses to the chunk ...

Iterative Data Write — PyNWB documentation
Defining HDF5 Dataset I/O Settings (chunking, compression, etc.); Iterative Data Write; Modular Data Storage using External Files; Parallel I/O using MPI ...

A Brief Introduction to Parallel HDF5
56 GB/s I/O rate in writing 5TB data using 5K ... processes to perform I/O to an HDF5 file at ... stream” and...

MPI-parallel Molecular Dynamics Trajectory Analysis with the ...
calculation on their chunk of data, and then gathered the results back to the ... parallel MPI-IO capable HDF5-based file format trajectory reader....
