Add ability for all operators to interact with storages of AWS/GCP/AZURE
Description
Currently every operator does XToY, so if someone wrote MySQLToGCS it doesn't help someone who needs MySQLToS3.
Use case / motivation
It would be great if, when a PR is raised, contributors only needed to handle the X part and provide a list of the Y parts. For example, people would write XToDataframe or XToFile, and a built-in integration in Airflow would handle FileToS3, FileToGCS, etc.
So when a user submits a PR for MySQLToFile, Airflow would use it to automatically create MySQLToGCS and MySQLToS3.
The idea is to build the infrastructure layer once so that it works automatically for all operators.
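A minimal sketch of what this could look like, using plain Python only. Nothing here is Airflow's actual API: `XToFile`, `UPLOADERS`, `FAKE_STORES` and `generate_transfer_operators` are hypothetical names made up for illustration, and the "uploaders" just copy bytes into in-memory dicts instead of calling the real S3/GCS hooks.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical stand-ins for remote storages: destination name -> {key: bytes}.
FAKE_STORES: Dict[str, Dict[str, bytes]] = {"S3": {}, "GCS": {}}


def make_fake_uploader(store_name: str) -> Callable[[Path, str], None]:
    """Build a 'FileToY' function. A real one would call a storage hook."""
    def upload(local_path: Path, remote_key: str) -> None:
        FAKE_STORES[store_name][remote_key] = local_path.read_bytes()
    return upload


# The "list of Y": written once, shared by every source.
UPLOADERS = {name: make_fake_uploader(name) for name in FAKE_STORES}


class XToFile:
    """Hypothetical base class: a contributor implements only the X part."""
    source_name = "X"

    def write_to_file(self, local_path: Path) -> None:
        raise NotImplementedError


def generate_transfer_operators(source_cls):
    """Create one <Source>To<Dest> class per registered uploader."""
    generated = {}
    for dest_name, upload in UPLOADERS.items():
        # Bind the uploader as a default argument so each class keeps its own.
        def execute(self, remote_key, _upload=upload):
            with tempfile.TemporaryDirectory() as tmp:
                local_path = Path(tmp) / "export"
                self.write_to_file(local_path)   # the X part
                _upload(local_path, remote_key)  # the Y part
        class_name = f"{source_cls.source_name}To{dest_name}"
        generated[class_name] = type(class_name, (source_cls,), {"execute": execute})
    return generated


class MySQLToFile(XToFile):
    source_name = "MySQL"

    def write_to_file(self, local_path: Path) -> None:
        # Stand-in for running a query and dumping the rows.
        local_path.write_bytes(b"id,name\n1,alice\n")


operators = generate_transfer_operators(MySQLToFile)
operators["MySQLToS3"]().execute("exports/users.csv")
operators["MySQLToGCS"]().execute("exports/users.csv")
```

With this shape, adding a new destination means registering one more uploader, and every existing XToFile source picks it up without a new hand-written operator.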
Issue Analytics
- State:
- Created 3 years ago
- Comments: 5 (4 by maintainers)
Top GitHub Comments
It would be interesting to implement this using fsspec (https://filesystem-spec.readthedocs.io/en/latest/). Such an implementation would provide an interface to all filesystems supported by fsspec, which includes Azure Blob/GDL2, S3 and GCS. You also get others like FTP and SFTP for free (https://filesystem-spec.readthedocs.io/en/latest/api.html#implementations).
@turbaszek I see how this is related, and I face this question frequently. But I think #8059 depends on having a common FS interface like #3526 suggests.
Right now we have (I might be missing some) the following file providers: S3, GCS, Azure, SFTP/SSH, FTP, Samba. If we want to allow all operations between them, that means N^2 operators, which is 36 operators. And there will be more as other cloud providers are added (Alibaba, DO, …). I'll take a look at the PR @ashb, thanks.
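To make the arithmetic concrete, here is a small sketch that enumerates the pairs for the six providers listed above. The last figure is an assumption about how a common FS interface would scale (one adapter per provider plus a single generic transfer operator); it is not something stated in the thread.

```python
from itertools import product

providers = ["S3", "GCS", "Azure", "SFTP/SSH", "FTP", "Samba"]

# Every ordered (source, destination) pair, including same-provider copies.
all_pairs = list(product(providers, repeat=2))

# Only pairs that actually cross providers.
cross_provider_pairs = [(src, dst) for src, dst in all_pairs if src != dst]

# Assumed alternative: one adapter per provider plus one generic operator.
with_common_interface = len(providers) + 1

print(len(all_pairs))             # 36
print(len(cross_provider_pairs))  # 30
print(with_common_interface)      # 7
```

The pairwise count grows quadratically as providers are added, while the adapter count grows linearly.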