[ppc64le] Many tests are failing on linux IBM power9 machines

Hi, I am facing many segfaults when reading HDF5 files with h5py. Here is the outcome of my investigations, based on a freshly built HDF5 (1.10.5). Note that all tests passed on the HDF5 side.

  • Operating System Ubuntu 18.04 (using gcc-8)
  • Python 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0] on linux
  • Python from the system + virtualenv with cython&numpy pip-installed
  • h5py version: master and 2.9.0
  • HDF5 version: 1.10.5
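
For reports on ppc64le it can help to record the machine type and byte order next to the versions listed above. The following is a minimal sketch using only the Python standard library; the function name `describe_host` is made up for illustration and is not part of h5py or HDF5.

```python
import platform
import sys


def describe_host():
    """Collect host details worth attaching to a ppc64le bug report."""
    return {
        # Typically "ppc64le" on a little-endian POWER host running Linux.
        "machine": platform.machine(),
        # "little" or "big", as seen by the running interpreter.
        "byteorder": sys.byteorder,
        "python": platform.python_version(),
        # Pointer width of the interpreter build.
        "pointer_bits": 64 if sys.maxsize > 2**32 else 32,
    }


if __name__ == "__main__":
    for key, value in describe_host().items():
        print("{}: {}".format(key, value))
```

Pasting this output into the report makes it unambiguous which byte order the interpreter is actually running with.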

While trying to read an uncompressed, unchunked float64 dataset of shape (4, 16384, 1024), I got this exception (from gdb):

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
__memcpy_power7 () at …/sysdeps/powerpc/powerpc64/power7/memcpy.S:392
392 …/sysdeps/powerpc/powerpc64/power7/memcpy.S: No such file or directory.

Note that power7 is a big-endian architecture, while power9 is little-endian, so it is no surprise that this ends badly. The full backtrace is:

#0  __memcpy_power7 () at ../sysdeps/powerpc/powerpc64/power7/memcpy.S:392
#1  0x00007ffed99485b8 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=0x7ffeb9512010) at /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:34
#2  H5D__scatter_mem (_buf=0x7ffeb9510010, nelmts=65536, iter=0x10c4cd10, space=0x10b59440, _tscat_buf=<optimized out>) at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:340
#3  H5D__scatter_mem (_tscat_buf=<optimized out>, space=0x10b59440, iter=0x10c4cd10, nelmts=<optimized out>, _buf=0x7ffeb9510010) at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:291
#4  0x00007ffed9948f04 in H5D__scatgath_read (io_info=0x7fffffffd770, type_info=<optimized out>, nelmts=<optimized out>, file_space=0x10b7c160, mem_space=0x10b59440)
    at /home/test/HDF5/hdf5-1.10.5/src/H5Dscatgath.c:567
#5  0x00007ffed9929720 in H5D__contig_read (io_info=<optimized out>, type_info=<optimized out>, nelmts=<optimized out>, file_space=<optimized out>, mem_space=<optimized out>, fm=<optimized out>)
    at /home/test/HDF5/hdf5-1.10.5/src/H5Dcontig.c:595
#6  0x00007ffed994356c in H5D__read (dataset=<optimized out>, mem_type_id=<optimized out>, mem_space=0x10b59440, file_space=0x10b7c160, buf=0x7ffeb9510010) at /home/test/HDF5/hdf5-1.10.5/src/H5Dio.c:600
#7  0x00007ffed9943a3c in H5Dread (dset_id=<optimized out>, mem_type_id=216172782113784169, mem_space_id=288230376151711753, file_space_id=288230376151711751, dxpl_id=720575940379279384, buf=0x7ffeb9510010)
    at /home/test/HDF5/hdf5-1.10.5/src/H5Dio.c:198
#8  0x00007ffed97051d0 in __pyx_f_4h5py_4defs_H5Dread (__pyx_v_dset_id=360287970189639680, __pyx_v_mem_type_id=216172782113784169, __pyx_v_mem_space_id=288230376151711753, 
    __pyx_v_file_space_id=288230376151711751, __pyx_v_plist_id=720575940379279384, __pyx_v_buf=0x7ffeb9510010) at /home/test/workspace/h5py/h5py/defs.c:5875
#9  0x00007ffff6404f0c in __pyx_f_4h5py_6_proxy_H5PY_H5Dread (__pyx_v_dset=<optimized out>, __pyx_v_mtype=<optimized out>, __pyx_v_mspace=<optimized out>, __pyx_v_fspace=<optimized out>, 
    __pyx_v_dxpl=<optimized out>, __pyx_v_buf=<optimized out>) at /home/test/workspace/h5py/h5py/_proxy.c:1881
#10 0x00007ffff6407554 in __pyx_f_4h5py_6_proxy_dset_rw (__pyx_v_dset=360287970189639680, __pyx_v_mtype=216172782113784169, __pyx_v_mspace=288230376151711753, __pyx_v_fspace=288230376151711751, 
    __pyx_v_dxpl=720575940379279384, __pyx_v_progbuf=0x7ffeb9510010, __pyx_v_read=<optimized out>) at /home/test/workspace/h5py/h5py/_proxy.c:2233
#11 0x00007ffff63d10ec in __pyx_pf_4h5py_3h5d_9DatasetID_read (__pyx_v_self=<optimized out>, __pyx_v_mspace=<optimized out>, __pyx_v_fspace=<optimized out>, __pyx_v_dxpl=<optimized out>, 
    __pyx_v_mtype=0x7ffff59663b8, __pyx_v_arr_obj=<optimized out>) at /home/test/workspace/h5py/h5py/h5d.c:3789
#12 __pyx_pw_4h5py_3h5d_9DatasetID_1read (__pyx_v_self=<optimized out>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>) at /home/test/workspace/h5py/h5py/h5d.c:3639
#13 0x00000000101b6438 in _PyCFunction_FastCallDict (func_obj=<built-in method read of h5py.h5d.DatasetID object at remote 0x7ffff5966308>, args=<optimized out>, nargs=<optimized out>, kwargs=<optimized out>)
    at ../Objects/methodobject.c:231
#14 0x000000001021d950 in _PyObject_FastCallDict (func=<built-in method read of h5py.h5d.DatasetID object at remote 0x7ffff5966308>, args=0x7ffff5966220, nargs=4, kwargs=<optimized out>)
    at ../Objects/abstract.c:2313
#15 0x000000001008151c in methoddescr_call (descr=0x7ffff65fc948, 
    args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), kwds={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>}) at ../Objects/descrobject.c:246
#16 0x00007ffed9754ffc in __Pyx_PyObject_Call (kw={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>}, 
    arg=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), func=<method_descriptor at remote 0x7ffff65fc948>) at /home/test/workspace/h5py/h5py/_objects.c:9664
#17 __pyx_pf_4h5py_8_objects_9with_phil_wrapper (__pyx_v_kwds={'dxpl': <h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>}, 
    __pyx_v_args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), __pyx_self=<optimized out>) at /home/test/workspace/h5py/h5py/_objects.c:3593
#18 __pyx_pw_4h5py_8_objects_9with_phil_1wrapper (__pyx_self=<optimized out>, 
    __pyx_args=(<h5py.h5d.DatasetID at remote 0x7ffff5966308>, <h5py.h5s.SpaceID at remote 0x7ffff59672c8>, <h5py.h5s.SpaceID at remote 0x7ffff5967278>, <numpy.ndarray at remote 0x7ffff5967300>, <h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>), __pyx_kwds=<optimized out>) at /home/test/workspace/h5py/h5py/_objects.c:3517
#19 0x00007ffed9749098 in __Pyx_CyFunction_CallMethod (func=<optimized out>, self=<cython_function_or_method at remote 0x7ffff6600100>, arg=<optimized out>, kw=<optimized out>)
    at /home/test/workspace/h5py/h5py/_objects.c:10731
#20 0x0000000010222558 in _PyObject_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>, func=<cython_function_or_method at remote 0x7ffff6600100>) at ../Objects/tupleobject.c:131
#21 _PyObject_FastCallKeywords (func=<cython_function_or_method at remote 0x7ffff6600100>, stack=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at ../Objects/abstract.c:2496
#22 0x00000000101111d4 in call_function (pp_stack=0x7fffffffe130, oparg=<optimized out>, kwnames=('dxpl',)) at ../Python/ceval.c:4861
#23 0x0000000010118094 in _PyEval_EvalFrameDefault (
    f=Frame 0x10c49d38, for file /home/test/venv-kieffer_system/lib/python3.6/site-packages/h5py/_hl/dataset.py, line 574, in __getitem__ (self=<Dataset(_id=<h5py.h5d.DatasetID at remote 0x7ffff5966308>, _dcpl=<h5py.h5p.PropDCID at remote 0x7ffff75e84a8>, _dxpl=<h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>, _filters={}, _local=<_thread._local at remote 0x7ffff5966360>) at remote 0x7ffff75e9f60>, args=(<ellipsis at remote 0x104ca6b8>,), names=(), new_dtype=<numpy.dtype at remote 0x7ffff59d53f0>, mtype=<h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>, mshape=(4, 16384, 1024), fspace=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, selection=<SimpleSelection(_shape=(4, 16384, 1024), _id=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, _sel=((0, 0, 0), (4, 16384, 1024), (1, 1, 1), (False, False, False)), _mshape=(...)) at remote 0x7ffff75e9f98>, arr=<numpy.ndarray at remote 0x7ffff5967300>, mspace=<h5py.h5s.SpaceID at remote 0x7ffff59672c8>, single_element=False), throwflag=<optimized out>) at ../Python/ceval.c:3351
#24 0x0000000010113cac in PyEval_EvalFrameEx (throwflag=0, 
    f=Frame 0x10c49d38, for file /home/test/venv-kieffer_system/lib/python3.6/site-packages/h5py/_hl/dataset.py, line 574, in __getitem__ (self=<Dataset(_id=<h5py.h5d.DatasetID at remote 0x7ffff5966308>, _dcpl=<h5py.h5p.PropDCID at remote 0x7ffff75e84a8>, _dxpl=<h5py.h5p.PropDXID at remote 0x7ffff5a0b6d8>, _filters={}, _local=<_thread._local at remote 0x7ffff5966360>) at remote 0x7ffff75e9f60>, args=(<ellipsis at remote 0x104ca6b8>,), names=(), new_dtype=<numpy.dtype at remote 0x7ffff59d53f0>, mtype=<h5py.h5t.TypeFloatID at remote 0x7ffff59663b8>, mshape=(4, 16384, 1024), fspace=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, selection=<SimpleSelection(_shape=(4, 16384, 1024), _id=<h5py.h5s.SpaceID at remote 0x7ffff5967278>, _sel=((0, 0, 0), (4, 16384, 1024), (1, 1, 1), (False, False, False)), _mshape=(...)) at remote 0x7ffff75e9f98>, arr=<numpy.ndarray at remote 0x7ffff5967300>, mspace=<h5py.h5s.SpaceID at remote 0x7ffff59672c8>, single_element=False)) at ../Python/ceval.c:4166
#25 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>, 
    kwcount=<optimized out>, kwstep=2, defs=<optimized out>, defcount=<optimized out>, kwdefs=<optimized out>, closure=<optimized out>, name=<optimized out>, qualname=<optimized out>) at ../Python/ceval.c:4166
#26 0x00000000101ede04 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=<optimized out>, args=<optimized out>, locals=0x0, globals=<optimized out>, 
    _co=<optimized out>) at ../Python/ceval.c:4187
#27 function_call (func=<function at remote 0x7ffff6160840>, 
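
One sanity check on the backtrace above is to decode the large integer identifiers passed to H5Dread in frames #7 and #8. The sketch below assumes the internal 64-bit `hid_t` layout of HDF5 1.10.x (7 type bits above a 56-bit serial number) and the `H5I_type_t` values of that release series. Both are library internals rather than public API, so treat them as assumptions that may not hold for other versions. The helper name `decode_hid` is hypothetical.

```python
# Assumed hid_t layout for HDF5 1.10.x (library internals, not public API):
# the sign bit, then 7 bits of identifier type, then a 56-bit serial number.
TYPE_BITS = 7
ID_BITS = 64 - (TYPE_BITS + 1)

# Assumed H5I_type_t values for HDF5 1.10.x.
H5I_TYPE_NAMES = {
    1: "H5I_FILE",
    2: "H5I_GROUP",
    3: "H5I_DATATYPE",
    4: "H5I_DATASPACE",
    5: "H5I_DATASET",
    6: "H5I_ATTR",
    7: "H5I_REFERENCE",
    8: "H5I_VFL",
    9: "H5I_GENPROP_CLS",
    10: "H5I_GENPROP_LST",
}

# Identifier values copied from frames #7 and #8 of the backtrace.
BACKTRACE_IDS = {
    "dset_id": 360287970189639680,
    "mem_type_id": 216172782113784169,
    "mem_space_id": 288230376151711753,
    "file_space_id": 288230376151711751,
    "dxpl_id": 720575940379279384,
}


def decode_hid(hid):
    """Split an identifier into (type name, serial number) under the assumed layout."""
    type_code = (hid >> ID_BITS) & ((1 << TYPE_BITS) - 1)
    serial = hid & ((1 << ID_BITS) - 1)
    name = H5I_TYPE_NAMES.get(type_code, "unknown({})".format(type_code))
    return name, serial


if __name__ == "__main__":
    for arg, hid in BACKTRACE_IDS.items():
        print(arg, decode_hid(hid))
```

Under these assumptions each argument decodes to the kind of handle H5Dread expects in that position (a dataset, a datatype, two dataspaces and a property list), which suggests the identifiers themselves were not corrupted on the way into the library.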

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 26 (25 by maintainers)

Top GitHub Comments

mraspaud commented, Jun 17, 2019

Yes, exactly, the test_custom_float_promotion should cover my use case.

takluyver commented, Jun 18, 2019

Can you make a PR from the patch so we can see what the change is?
