question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Object array fill value

See original GitHub issue

If a fill value is provided for an array with object dtype, and the fill value cannot be JSON encoded, an error will occur:

In [5]: zarr.full(10, dtype=bytes, object_codec=numcodecs.VLenBytes(), fill_value=b'foobar')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-fa0fae8e2b58> in <module>()
----> 1 zarr.full(10, dtype=bytes, object_codec=numcodecs.VLenBytes(), fill_value=b'foobar')

~/src/github/alimanfoo/zarr/zarr/creation.py in full(shape, fill_value, **kwargs)
    267     """
    268 
--> 269     return create(shape=shape, fill_value=fill_value, **kwargs)
    270 
    271 

~/src/github/alimanfoo/zarr/zarr/creation.py in create(shape, chunks, dtype, compressor, fill_value, order, store, synchronizer, overwrite, path, chunk_store, filters, cache_metadata, read_only, object_codec, **kwargs)
    112     init_array(store, shape=shape, chunks=chunks, dtype=dtype, compressor=compressor,
    113                fill_value=fill_value, order=order, overwrite=overwrite, path=path,
--> 114                chunk_store=chunk_store, filters=filters, object_codec=object_codec)
    115 
    116     # instantiate array

~/src/github/alimanfoo/zarr/zarr/storage.py in init_array(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec)
    290                          order=order, overwrite=overwrite, path=path,
    291                          chunk_store=chunk_store, filters=filters,
--> 292                          object_codec=object_codec)
    293 
    294 

~/src/github/alimanfoo/zarr/zarr/storage.py in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec)
    364                 order=order, filters=filters_config)
    365     key = _path_to_prefix(path) + array_meta_key
--> 366     store[key] = encode_array_metadata(meta)
    367 
    368 

~/src/github/alimanfoo/zarr/zarr/meta.py in encode_array_metadata(meta)
     66     )
     67     s = json.dumps(meta, indent=4, sort_keys=True, ensure_ascii=True,
---> 68                    separators=(',', ': '))
     69     b = s.encode('ascii')
     70     return b

/usr/lib/python3.6/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, default=default, sort_keys=sort_keys,
--> 238         **kw).encode(obj)
    239 
    240 

/usr/lib/python3.6/json/encoder.py in encode(self, o)
    199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
--> 201             chunks = list(chunks)
    202         return ''.join(chunks)
    203 

/usr/lib/python3.6/json/encoder.py in _iterencode(o, _current_indent_level)
    428             yield from _iterencode_list(o, _current_indent_level)
    429         elif isinstance(o, dict):
--> 430             yield from _iterencode_dict(o, _current_indent_level)
    431         else:
    432             if markers is not None:

/usr/lib/python3.6/json/encoder.py in _iterencode_dict(dct, _current_indent_level)
    402                 else:
    403                     chunks = _iterencode(value, _current_indent_level)
--> 404                 yield from chunks
    405         if newline_indent is not None:
    406             _current_indent_level -= 1

/usr/lib/python3.6/json/encoder.py in _iterencode(o, _current_indent_level)
    435                     raise ValueError("Circular reference detected")
    436                 markers[markerid] = o
--> 437             o = _default(o)
    438             yield from _iterencode(o, _current_indent_level)
    439             if markers is not None:

/usr/lib/python3.6/json/encoder.py in default(self, o)
    178         """
    179         raise TypeError("Object of type '%s' is not JSON serializable" %
--> 180                         o.__class__.__name__)
    181 
    182     def encode(self, o):

TypeError: Object of type 'bytes' is not JSON serializable

Issue Analytics

  • State:open
  • Created 6 years ago
  • Comments:11 (11 by maintainers)

github_iconTop GitHub Comments

1reaction
alimanfoocommented, Dec 7, 2017

Yes I think it would be very reasonable to base64 encode the fill value and put it in the JSON where it currently lives under the ‘fill_value’ key.

In fact for arrays with ‘S’ (fixed length byte string) dtype, I recently made this change to the code and spec, i.e., the bytes fill value is base64 encoded. This didn’t break compatibility because previously it was impossible to store a bytes fill value because it would have raised a JSON encoding error, and so we could be sure there would be no existing data out there that had a fill value without base64 encoding.

For object arrays it’s not quite so simple, because at least in theory it would previously have been possible for someone to create an object array with a unicode string fill value (e.g., ‘’ or ‘foobar’ or ‘whatever’) or any other valid JSON fill value (e.g., 0) and this would have been stored in the JSON. So if we now switched behaviour and started base64 encoding fill values for object arrays, this would break compatibility with previous data, because on read Zarr would wrongly try to base64 decode their fill values. Now it may be that actually there is no data out there like this, but I feel like we need to assume there is and manage any changes that could break compatibility with previous data. By “manage” I think I mean incrementing the storage spec version number, and providing a migration path and/or compatibility layer so that things don’t break or behave unexpectedly. I’m keen to ship 2.2 and so want to avoid this for the moment.

That said there may be a clever way to add support for encoding object fill values without breaking compatibility, but I can’t see it.

On Thu, Dec 7, 2017 at 5:35 PM, jakirkham notifications@github.com wrote:

Yeah that sounds ideal.

An alternative might be to base64 encode it at the end and shove it back in JSON. Can flesh this out a bit more if it is interesting.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/alimanfoo/zarr/issues/216#issuecomment-350039795, or mute the thread https://github.com/notifications/unsubscribe-auth/AAq8QuCAn8fXtsEg_qAA1MCeGek39mUIks5s-CHdgaJpZM4Q5PyW .

– Alistair Miles Head of Epidemiological Informatics Centre for Genomics and Global Health http://cggh.org Big Data Institute Building Old Road Campus Roosevelt Drive Oxford OX3 7LF United Kingdom Phone: +44 (0)1865 743596 Email: alimanfoo@googlemail.com Web: http://a http://purl.org/net/alimanlimanfoo.github.io/ Twitter: https://twitter.com/alimanfoo

0reactions
joshmoorecommented, Sep 22, 2021

slightly related that several fixes to object arrays were in #806 cc: @abergou

Read more comments on GitHub >

github_iconTop Results From Across the Web

Array.prototype.fill() - JavaScript - MDN Web Docs - Mozilla
The fill() method changes all elements in an array to a static value, from a start index (default 0) to an end index...
Read more >
Array.prototype.fill() with object passes reference and not new ...
You can first fill the array with any value (e.g. undefined ), and then you will be able to use map : var...
Read more >
How to populate an array with zeros or objects in JavaScript
In JavaScript, you can use the Array.fill() method to populate an array with a zero or any other value like an object or...
Read more >
4 Ways to Populate an Array in JavaScript - Medium
Using the Array.fill method is an obvious choice — you supply the method with the value you want to populate your array with,...
Read more >
fill an array using JavaScript: Different methods & examples
1. Syntax: array.fill(value, start, end) 2. Using this does not have any specific advantage over the previous method, apart from the ability to...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found