IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 3
I’m getting an unexpected error when reading in a list of strings:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-22-8ddd76dd0792> in <module>()
3 path ='3489803881346866476.parquet'
4 ## need to have fastparquet and python-snappy installed to make this work.
----> 5 df = pd.read_parquet(path,engine='fastparquet')
/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read_parquet(path, engine, columns, **kwargs)
255
256 impl = get_engine(engine)
--> 257 return impl.read(path, columns=columns, **kwargs)
/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read(self, path, columns, **kwargs)
203 path, _, _ = get_filepath_or_buffer(path)
204 parquet_file = self.api.ParquetFile(path)
--> 205 return parquet_file.to_pandas(columns=columns, **kwargs)
206
207
/anaconda2/lib/python2.7/site-packages/fastparquet/api.pyc in to_pandas(self, columns, categories, filters, index)
395 for (name, v) in views.items()}
396 self.read_row_group(rg, columns, categories, infile=f,
--> 397 index=index, assign=parts)
398 start += rg.num_rows
399 else:
/anaconda2/lib/python2.7/site-packages/fastparquet/api.pyc in read_row_group(self, rg, columns, categories, infile, index, assign)
222 infile, rg, columns, categories, self.schema, self.cats,
223 self.selfmade, index=index, assign=assign,
--> 224 scheme=self.file_scheme)
225 if ret:
226 return df
/anaconda2/lib/python2.7/site-packages/fastparquet/core.pyc in read_row_group(file, rg, columns, categories, schema_helper, cats, selfmade, index, assign, scheme)
336 """
337 if assign is None:
--> 338 raise RuntimeError('Going with pre-allocation!')
339 read_row_group_arrays(file, rg, columns, categories, schema_helper,
340 cats, selfmade, assign=assign)
/anaconda2/lib/python2.7/site-packages/fastparquet/core.pyc in read_row_group_arrays(file, rg, columns, categories, schema_helper, cats, selfmade, assign)
313
314 use = name in categories if categories is not None else False
--> 315 read_col(column, schema_helper, file, use_cat=use,
316 selfmade=selfmade, assign=out[name],
317 catdef=out[name+'-catdef'] if use else None)
/anaconda2/lib/python2.7/site-packages/fastparquet/core.pyc in read_col(column, schema_helper, infile, use_cat, grab_dict, selfmade, assign, catdef)
258 max_defi = schema_helper.max_definition_level(cmd.path_in_schema)
259 part = assign[num:num+len(defi)]
--> 260 part[defi != max_defi] = my_nan
261 if d and not use_cat:
262 part[defi == max_defi] = dic[val]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 3
Running: Mac OS X 10.12.6, Python 2.7.14, fastparquet 0.1.5
Here’s the schema I’m attempting to load:
message Msg {
optional binary sn_id (UTF8);
optional binary sn_name (UTF8);
optional binary sn_type (UTF8);
optional binary author_id (UTF8);
optional binary author_name (UTF8);
optional binary sn_msg_id (UTF8);
optional binary sn_msg_type (UTF8);
optional int64 sent_ts;
optional binary text (UTF8);
repeated binary tag_ids (UTF8);
repeated binary tag_names (UTF8);
repeated binary tag_descriptions (UTF8);
}
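For anyone trying to reason about the failing line, here is a minimal NumPy-only sketch of what the traceback shows at core.py lines 259-260. The variable names mirror those in the traceback, but the values are made up for illustration. That the extra definition levels come from the `repeated` columns is an assumption, not something confirmed in this thread. The sketch only shows that slicing a pre-allocated array past its end silently yields a shorter array, and that a longer boolean mask then fails to index it.

```python
import numpy as np

# Hypothetical stand-ins for the variables in the traceback (values invented):
# `assign` is the pre-allocated output column, one slot per row.
assign = np.empty(1, dtype=object)

# `defi` holds one definition level per encoded value. ASSUMPTION: a repeated
# field can contribute several values to a single row, so len(defi) > rows.
defi = np.array([1, 1, 0])
max_defi = 1
num = 0

# Mirrors line 259: slicing past the end does not raise, it just truncates.
part = assign[num:num + len(defi)]
print(part.shape, defi.shape)  # shapes differ: 1 slot vs 3 definition levels

# Mirrors line 260: a length-3 boolean mask cannot index a length-1 array.
try:
    part[defi != max_defi] = None
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message depends on the NumPy version, but the mismatch it reports (1 versus 3 here) is the same kind as in the error above.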
Issue Analytics
- State:
- Created 5 years ago
- Comments:11 (6 by maintainers)
Indeed, the error would suggest that the two are not equivalent. You can inspect it:
That’s a good question. We’re using protobuf 3 to define our Parquet schema, which doesn’t allow required, non-repeated fields, so it’s not easy for me to check this quickly.