Opening dataset without loading any indexes?
Is your feature request related to a problem?
Within pangeo-forge’s internals we would like to call open_dataset, then to_dict(), and end up with a schema-like representation of the contents of the dataset. This works, but it also has the side effect of loading all indexes into memory, even if we are loading the data values “lazily”.
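For illustration, here is a minimal sketch of the kind of schema-like output meant here. It is not pangeo-forge's actual code: the small in-memory dataset is a hypothetical stand-in for whatever open_dataset returns, and passing data=False to to_dict() is an assumption about how the values are left out of the result.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the result of xr.open_dataset(path);
# the variable names and attributes are made up for this example.
ds = xr.Dataset(
    data_vars={"temp": (("time", "lat"), np.zeros((3, 2)), {"units": "K"})},
    coords={"time": np.arange(3), "lat": [10.0, 20.0]},
    attrs={"title": "example"},
)

# data=False omits the array values and keeps only metadata
# (dims, attrs, dtype, shape) for each variable and coordinate.
schema = ds.to_dict(data=False)

temp_schema = schema["data_vars"]["temp"]
print(temp_schema)
```

Note that this only controls what ends up in the dictionary. With a dataset opened from a file, the indexes for dimension coordinates have already been built by the time to_dict() is called, which is the cost this issue is about.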
Describe the solution you’d like
@benbovy do you think it would be possible to (perhaps optionally) also avoid loading indexes upon opening a dataset, so that we actually don’t load anything? The end result would act a bit like ncdump does.
Describe alternatives you’ve considered
Otherwise we might have to try using xarray-schema or something similar, but the suggestion here would be much neater and more flexible.
xref: https://github.com/pangeo-forge/pangeo-forge-recipes/issues/256
Yes, it is definitely a pathological example. 💣 But the fact remains that there are many cases where we just want to discover a dataset’s contents as quickly as possible and want to avoid the cost of loading coordinates and creating indexes.
This would also fix #2233