
accessing three months of data using fsspec


The noaa-goes16/ABI-L2 data on S3 is laid out by day and hour, so a single glob only gives me one day of data (around 288 files) with this command:

import fsspec

fs = fsspec.filesystem('s3', anon=True)

urls = ['s3://' + f for f in fs.glob("s3://noaa-goes16/ABI-L2-ACMC/2022/001/*/*.nc")]

or one hour (around 12 files):

urls = ['s3://' + f for f in fs.glob("s3://noaa-goes16/ABI-L2-ACMC/2022/001/04/*.nc")]

Is there any way to work around this and get around three months of data (100 consecutive days)?
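The paths above follow a year/day-of-year/hour layout, so 001 is 1 January rather than a month number. A minimal sketch of turning calendar dates into that prefix, assuming the same layout holds across the whole range; `day_prefix` is a hypothetical helper, not part of fsspec:

```python
from datetime import date, timedelta

def day_prefix(d: date) -> str:
    """Return 'YYYY/DDD', where DDD is the zero-padded day of the year."""
    # %j is the day of the year as a zero-padded three-digit number
    return d.strftime("%Y/%j")

# 100 consecutive days starting on 1 January 2022
start = date(2022, 1, 1)
prefixes = [day_prefix(start + timedelta(days=n)) for n in range(100)]
print(prefixes[0], prefixes[-1])  # 2022/001 2022/100
```

Working from dates rather than raw day numbers also handles a range that crosses a year boundary.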

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 20 (3 by maintainers)

Top GitHub Comments

1 reaction
lsterzinger commented, Jun 8, 2022

Check the last few files in the list. In your original code, only the first day of data (2022/001/*) is repeated 5 times, whereas the new loop should return every day from 2022/001 to 2022/004. The first iteration, with i=0, returns no files because there is no day 000.
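One way to extend this to about 100 days is to start the day counter at 1 and glob one day at a time. This is a sketch rather than the code from the issue: `collect_urls` is a hypothetical helper, and it assumes `fs` is any object with an fsspec-style `glob` method that returns paths without the `s3://` scheme, as in the snippets in this thread.

```python
def collect_urls(fs, product, year, first_day=1, n_days=100):
    """Glob one day-of-year directory at a time and return s3:// URLs."""
    urls = []
    for day in range(first_day, first_day + n_days):
        pattern = f"s3://noaa-goes16/{product}/{year}/{day:03}/*/*.nc"
        # glob results come back without the scheme, so add it again
        urls.extend("s3://" + path for path in fs.glob(pattern))
    return urls

# Usage (needs network access and the s3fs package):
# import fsspec
# fs = fsspec.filesystem("s3", anon=True)
# urls = collect_urls(fs, "ABI-L2-ACMC", 2022)
```

Day numbers past the end of the year are not rolled over here, so a range that crosses 31 December would need separate handling.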

1 reaction
lsterzinger commented, Jun 8, 2022

The code you shared here just takes all the file paths for the first day of 2022 (2022/001/*/*.nc) and appends them to a list 5 times. Did you mean something like this?

urls1 = []
for i in range(5):
    # i = 0 matches nothing (there is no day 000), so this collects days 001-004
    urls = ['s3://' + f for f in fs.glob(f"s3://noaa-goes16/ABI-L2-CODC/2022/{i:03}/*/*.nc")]
    urls1 += urls

The {i:03} format spec takes i, pads it with leading zeros to three digits, and inserts it into the string.
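A quick check of that padding, runnable without any network access (the sample values are arbitrary):

```python
# {i:03} pads the integer with leading zeros to a minimum width of 3
labels = [f"{i:03}" for i in (0, 1, 42, 100)]
print(labels)  # ['000', '001', '042', '100']
```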

It’s hard to debug why this isn’t working for you from the information I have currently. Would you be willing to share your code/notebook to see if I can replicate your issue?


