Add function to retrieve example datasets
See original GitHub issueIt would be nice to have a clean way for gallery examples to get access to the test data files. See https://github.com/pvlib/pvlib-python/pull/860#discussion_r379847062
For a function like load_dataset('greensboro-tmy'):
Pros:
- No need to monkey around with filepaths, especially ones that aren’t really meant to be public
- Associating files with keys means we can move and rename test data files without it being a breaking change
- Simplifies example code
Cons:
- The data files are in several formats (csv, json, nc, h5 etc), so this function would either have to know the appropriate reading method for each file (complicated) or just return a file handle and let the user parse the contents (less useful).
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:5 (5 by maintainers)
Top Results From Across the Web
Processing data in a Dataset - Hugging Face
datasets. Dataset. map() takes a callable accepting a dict as argument (same dict as returned by dataset[i] ) and iterate over the dataset...
Read more >Receive and handle data with custom functions - Office Add-ins
If a custom function retrieves data from an external source such as the web, it must: Return a JavaScript Promise to Excel. Resolve...
Read more >HTMLElement.dataset - Web APIs | MDN
The dataset read-only property of the HTMLElement interface provides read/write access to custom data attributes (data-*) on elements.
Read more >Datasets - Ignition User Manual 8.0
Instead they must be created using the system.dataset.toDataSet function, which also allows you to convert a PyDataset to a Dataset. It requires ...
Read more >Collect() - Retrieve data from Spark RDD/DataFrame
collect() action function is used to retrieve all elements from the dataset (RDD/DataFrame/Dataset) as a Array[Row] to the driver program.
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found

I would be happy if this
https://github.com/pvlib/pvlib-python/blob/c9929e85986a3368ed370203ef8699617c0cdc61/docs/examples/plot_greensboro_kimber_soiling.py#L39-L43
looked like
I don’t think the broader scope is feasible. To be clear, this is just something for the tests/examples - not for anything else.
metpy has a
get_test_datafunction with the same idea, but a different implementation because it uses a caching back end that I think we should avoid.example: https://unidata.github.io/MetPy/latest/examples/XArray_Projections.html#sphx-glr-examples-xarray-projections-py
+1 to having the function. I think a useful scope is to return the full path to the file. Reading file content and providing output in various formats seems like a large bite to chew.