[ppc64le] Many tests are failing on linux IBM power9 machines
Hi, I am facing many segfaults when reading HDF5 files with h5py. Here is the outcome of my investigation, based on a freshly built HDF5 (1.10.5). Note that all tests passed on the HDF5 side.
- Operating System: Ubuntu 18.04 (using gcc-8)
- Python 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0] on linux
- Python from the system + virtualenv with Cython and NumPy pip-installed
- h5py version: master and 2.9.0
- HDF5 version: 1.10.5
While trying to read a float64 dataset of shape (4, 16384, 1024), uncompressed and unchunked, I got this exception (from gdb):
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
__memcpy_power7 () at …/sysdeps/powerpc/powerpc64/power7/memcpy.S:392
392 …/sysdeps/powerpc/powerpc64/power7/memcpy.S: No such file or directory.
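For anyone who wants to try the same kind of read, here is a minimal sketch of it. It assumes the file can be produced by h5py itself, which the report does not state; the function name `write_and_read`, the dataset name `data`, and the file name `repro.h5` are all hypothetical. It writes a contiguous, uncompressed float64 dataset and reads it back in full with `[...]`, the same `Dataset.__getitem__` call that appears in the backtrace.

```python
import os
import tempfile

import h5py
import numpy as np


def write_and_read(shape=(4, 16384, 1024), path=None):
    """Write a contiguous, uncompressed float64 dataset and read it back whole."""
    if path is None:
        # Hypothetical scratch location; any writable path works.
        path = os.path.join(tempfile.mkdtemp(), "repro.h5")

    values = np.arange(np.prod(shape), dtype=np.float64).reshape(shape)

    with h5py.File(path, "w") as f:
        # No chunks= or compression= arguments, so the layout stays
        # contiguous and uncompressed, as described in the report.
        f.create_dataset("data", data=values)

    with h5py.File(path, "r") as f:
        # Full read through Dataset.__getitem__ with an Ellipsis.
        return f["data"][...]
```

Calling `write_and_read()` with the default shape needs roughly 512 MiB for the array on each side of the round trip; a small shape such as `(2, 4, 8)` is enough to check that the script itself works. Whether this triggers the segfault on a given machine is not guaranteed, since the original file is not available here.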
Note that POWER7 is a big-endian architecture, while POWER9 is little-endian, so it is no surprise that this ends badly. The full backtrace is:
#0 __memcpy_power7 () at ../sysdeps/powerpc/powerpc64/power7/memcpy.S:392
#1 0x00007ffed99485b8 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=0x7ffeb9512010) at /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:34
#2 H5D__scatter_mem (_buf=0x7ffeb9510010, nelmts=65536, iter=0x10c4cd10, space=0x10b59440, _tscat_buf=<optimized out>) at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:340
#3 H5D__scatter_mem (_tscat_buf=<optimized out>, space=0x10b59440, iter=0x10c4cd10, nelmts=<optimized out>, _buf=0x7ffeb9510010) at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:291
#4 0x00007ffed9948f04 in H5D__scatgath_read (io_info=0x7fffffffd770, type_info=<optimized out>, nelmts=<optimized out>, file_space=0x10b7c160, mem_space=0x10b59440)
at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:567
#5 0x00007ffed9929720 in H5D__contig_read (io_info=<optimized out>, type_info=<optimized out>, nelmts=<optimized out>, file_space=<optimized out>, mem_space=<optimized out>, fm=<optimized out>)
at /home/test/HDF5/hdf5-1.10.5/src/H5Dcontig.c:595
#6 0x00007ffed994356c in H5D__read (dataset=<optimized out>, mem_type_id=<optimized out>, mem_space=0x10b59440, file_space=0x10b7c160, buf=0x7ffeb9510010) at /home/test/HDF5/hdf5-1.10.5/src/H5Dio.c:600
#7 0x00007ffed9943a3c in H5Dread (dset_id=<optimized out>, mem_type_id=216172782113784169, mem_space_id=288230376151711753, file_space_id=288230376151711751, dxpl_id=720575940379279384, buf=0x7ffeb9510010)
at /home/test/HDF5/hdf5-1.10.5/src/H5Dio.c:198
#8 0x00007ffed97051d0 in __pyx_f_4h5py_4defs_H5Dread (__pyx_v_dset_id=360287970189639680, __pyx_v_mem_type_id=216172782113784169, __pyx_v_mem_space_id=288230376151711753,
__pyx_v_file_space_id=288230376151711751, __pyx_v_plist_id=720575940379279384, __pyx_v_buf=0x7ffeb9510010) at /home/test/workspace/h5py/h5py/defs.c:5875
#9 0x00007ffff6404f0c in __pyx_f_4h5py_6_proxy_H5PY_H5Dread (__pyx_v_dset=<optimized out>, __pyx_v_mtype=<optimized out>, __pyx_v_mspace=<optimized out>, __pyx_v_fspace=<optimized out>,
__pyx_v_dxpl=<optimized out>, __pyx_v_buf=<optimized out>) at /home/test/workspace/h5py/h5py/_proxy.c:1881
#10 0x00007ffff6407554 in __pyx_f_4h5py_6_proxy_dset_rw (__pyx_v_dset=360287970189639680, __pyx_v_mtype=216172782113784169, __pyx_v_mspace=288230376151711753, __pyx_v_fspace=288230376151711751,
__pyx_v_dxpl=720575940379279384, __pyx_v_progbuf=0x7ffeb9510010, __pyx_v_read=<optimized out>) at /home/test/workspace/h5py/h5py/_proxy.c:2233
#11 0x00007ffff63d10ec in __pyx_pf_4h5py_3h5d_9DatasetID_read (__pyx_v_self=<optimized out>, __pyx_v_mspace=<optimized out>, __pyx_v_fspace=<optimized out>, __pyx_v_dxpl=<optimized out>,
__pyx_v_mtype=0x7ffff59663b8, __pyx_v_arr_obj=<optimized out>) at /home/test/workspace/h5py/h5py/h5d.c:3789
#12 __pyx_pw_4h5py_3h5d_9DatasetID_1read (__pyx_v_self=<optimized out>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>) at /home/test/workspace/h5py/h5py/h5d.c:3639
#13 0x00000000101b6438 in _PyCFunction_FastCallDict (func_obj=<built-in method read of h5py.h5d.DatasetID object at remote 0x7ffff5966308>, args=<optimized out>, nargs=<optimized out>, kwargs=<optimized out>)
at ../Objects/methodobject.c:231
#14 0x000000001021d950 in _PyObject_FastCallDict (func=<built-in method read of h5py.h5d.DatasetID object at remote 0x7ffff5966308>, args=0x7ffff5966220, nargs=4, kwargs=<optimized out>)
at ../Objects/abstract.c:2313
#15 0x000000001008151c in methoddescr_call (descr=0x7ffff65fc948,
args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), kwds={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>}) at ../Objects/descrobject.c:246
#16 0x00007ffed9754ffc in __Pyx_PyObject_Call (kw={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>},
arg=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), func=<method_descriptor at remote 0x7ffff65fc948>) at /home/test/workspace/h5py/h5py/_objects.c:9664
#17 __pyx_pf_4h5py_8_objects_9with_phil_wrapper (__pyx_v_kwds={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>},
__pyx_v_args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), __pyx_self=<optimized out>) at /home/test/workspace/h5py/h5py/_objects.c:3593
#18 __pyx_pw_4h5py_8_objects_9with_phil_1wrapper (__pyx_self=<optimized out>,
__pyx_args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), __pyx_kwds=<optimized out>) at /home/test/workspace/h5py/h5py/_objects.c:3517
#19 0x00007ffed9749098 in __Pyx_CyFunction_CallMethod (func=<optimized out>, self=<cython_function_or_method at remote 0x7ffff6600100>, arg=<optimized out>, kw=<optimized out>)
at /home/test/workspace/h5py/h5py/_objects.c:10731
#20 0x0000000010222558 in _PyObject_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>, func=<cython_function_or_method at remote 0x7ffff6600100>) at ../Objects/tupleobject.c:131
#21 _PyObject_FastCallKeywords (func=<cython_function_or_method at remote 0x7ffff6600100>, stack=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at ../Objects/abstract.c:2496
#22 0x00000000101111d4 in call_function (pp_stack=0x7fffffffe130, oparg=<optimized out>, kwnames=('dxpl',)) at ../Python/ceval.c:4861
#23 0x0000000010118094 in _PyEval_EvalFrameDefault (
f=Frame 0x10c49d38, for file /home/test/venv-kieffer_system/lib/python3.6/site-packages/h5py/_hl/dataset.py, line 574, in __getitem__ (self=<Dataset(_id=<h5py.h5d.DatasetID at remote 0x7ffff5966308>, _dcpl=<h5py.h5p.PropDCID at remote 0x7ffff75e84a8>, _dxpl=<h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>, _filters={}, _local=<_thread._local at remote 0x7ffff5966360>) at remote 0x7ffff75e9f60>, args=(<ellipsis at remote 0x104ca6b8>,), names=(), new_dtype=<numpy.dtype at remote 0x7ffff59d53f0>, mtype=<h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>, mshape=(4, 16384, 1024), fspace=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, selection=<SimpleSelection(_shape=(4, 16384, 1024), _id=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, _sel=((0, 0, 0), (4, 16384, 1024), (1, 1, 1), (False, False, False)), _mshape=(...)) at remote 0x7ffff75e9f98>, arr=<numpy.ndarray at remote 0x7ffff5967300>, mspace=<h5py.h5s.SpaceID at remote 0x7ffff59672c8>, single_element=False), throwflag=<optimized out>) at ../Python/ceval.c:3351
#24 0x0000000010113cac in PyEval_EvalFrameEx (throwflag=0,
f=Frame 0x10c49d38, for file /home/test/venv-kieffer_system/lib/python3.6/site-packages/h5py/_hl/dataset.py, line 574, in __getitem__ (self=<Dataset(_id=<h5py.h5d.DatasetID at remote 0x7ffff5966308>, _dcpl=<h5py.h5p.PropDCID at remote 0x7ffff75e84a8>, _dxpl=<h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>, _filters={}, _local=<_thread._local at remote 0x7ffff5966360>) at remote 0x7ffff75e9f60>, args=(<ellipsis at remote 0x104ca6b8>,), names=(), new_dtype=<numpy.dtype at remote 0x7ffff59d53f0>, mtype=<h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>, mshape=(4, 16384, 1024), fspace=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, selection=<SimpleSelection(_shape=(4, 16384, 1024), _id=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, _sel=((0, 0, 0), (4, 16384, 1024), (1, 1, 1), (False, False, False)), _mshape=(...)) at remote 0x7ffff75e9f98>, arr=<numpy.ndarray at remote 0x7ffff5967300>, mspace=<h5py.h5s.SpaceID at remote 0x7ffff59672c8>, single_element=False)) at ../Python/ceval.c:4166
#25 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>,
kwcount=<optimized out>, kwstep=2, defs=<optimized out>, defcount=<optimized out>, kwdefs=<optimized out>, closure=<optimized out>, name=<optimized out>, qualname=<optimized out>) at ../Python/ceval.c:4166
#26 0x00000000101ede04 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=<optimized out>, args=<optimized out>, locals=0x0, globals=<optimized out>,
_co=<optimized out>) at ../Python/ceval.c:4187
#27 function_call (func=<function at remote 0x7ffff6160840>,
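Regarding the endianness remark made before the backtrace, a quick way to confirm which byte order the Python process actually runs in is to ask the interpreter directly. This is only a small diagnostic sketch using the standard library; the helper name `describe_platform` is made up for illustration.

```python
import platform
import sys


def describe_platform():
    """Return the machine name and byte order of the running interpreter."""
    return {
        # Machine type as reported by the OS, e.g. "ppc64le" on
        # little-endian POWER Linux.
        "machine": platform.machine(),
        # Native byte order of this process: "little" or "big".
        "byteorder": sys.byteorder,
    }
```

Including the result of `describe_platform()` in a report like this one makes it easier to tell the byte order of the process apart from the name of the `memcpy` variant shown in frame #0.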
Issue Analytics
- State:
- Created 4 years ago
- Comments:26 (25 by maintainers)
Yes, exactly, the test_custom_float_promotion should cover my use case.
Can you make a PR from the patch so we can see what the change is?