Support for multiple data sources
The idea is to enhance Engine so that it can run on multiple data sources:
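from ignite.engine import Engine

data1 = ...
data2 = ...
data3 = ...
...

def process_function(engine, batches):
    # Proposed behavior: each iteration receives one batch per named source.
    batch1 = batches['data1']
    batch2 = batches['data2']
    batch3 = batches['data3']
    ...

engine = Engine(process_function)
# Proposed API: pass a dict of data sources instead of a single iterable.
engine.run({"data1": data1, "data2": data2, "data3": data3}, max_epochs=10)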
Issue Analytics
- State:
- Created 4 years ago
- Comments: 12 (8 by maintainers)
Top GitHub Comments
I randomly saw this issue while checking in on the 0.4 release. I also feel that this should be left up to the datasets. It’s not immediately clear what the semantics are if the iterators are not the same length.
Perhaps this is not the right venue for this, but I just want to offer my unsolicited $.02 on features such as this one.
IMHO the strength of ignite over other packages such as pytorch-lightning is that the engine and event system are extremely simple. They allow one to inject logic anywhere in the loop as one pleases, and it’s very obvious what will happen. The reason why I don’t like pytorch-lightning as much is exactly because it’s filled with features like this one here. It’s got a huge API, and has a lot of logic along the lines of “if you want to do this, then use this API and use a None for that argument but pass a dict for this other argument, etc. etc.” What I love about ignite is that basically everything except the engine, even very common things like model checkpointing, is a “plugin” to be registered. If I want the model checkpointing to write 10 copies of the same file, I am fully empowered to write a handler to do so. I’m not sure how that would be done in other frameworks. (Obviously a contrived example to make a point, and to avoid the argument of “actually, in lightning you can do xyz”.)
I would personally implore you to ruthlessly keep the Engine from becoming complicated. For instance, the determinism changes added to the engine caused some roadblocks for me (#935, #941).
I absolutely understand the desire to cover more use cases for users, and you have done an amazing job so far. However, when confronted with a design decision exactly like this one, IMHO a solution that can be implemented without additions to the engine should be chosen (e.g. by providing the Dataset and DataLoader implementations under an ignite.data or ignite.contrib). A similar argument can be made about the determinism stuff: subclasses of Engine and optional datasets should be preferred.
Thanks for your continued work on an amazing library 😃
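As a rough illustration of keeping this logic in the data layer rather than in the Engine, here is a minimal sketch in plain Python. The MultiSource class, its strict flag, and the sample data are all hypothetical and not part of ignite; the sketch assumes each source is a sized iterable, and it makes the unequal-length semantics raised above an explicit choice.

```python
from typing import Any, Dict, Iterable, Iterator


class MultiSource:
    """Hypothetical helper (not an ignite API): iterates several named
    sources in lockstep and yields one dict of batches per step."""

    def __init__(self, sources: Dict[str, Iterable[Any]], strict: bool = True) -> None:
        self.sources = sources
        # strict=True: raise if the sources have different lengths.
        # strict=False: stop silently at the shortest source.
        self.strict = strict

    def __len__(self) -> int:
        # Assumes every source supports len(); the shortest one bounds the epoch.
        return min(len(src) for src in self.sources.values())

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        iterators = {name: iter(src) for name, src in self.sources.items()}
        sentinel = object()
        while True:
            batch = {name: next(it, sentinel) for name, it in iterators.items()}
            exhausted = [name for name, value in batch.items() if value is sentinel]
            if exhausted:
                if self.strict and len(exhausted) != len(batch):
                    raise ValueError(f"sources ended early: {exhausted}")
                return
            yield batch


# Illustrative data only; real sources would typically be DataLoader objects.
data = MultiSource({"data1": [1, 2, 3], "data2": ["a", "b", "c"]})
batches = list(data)
```

Since Engine.run takes an iterable as its data argument, an object like this could in principle be passed in place of the dict in the proposal, with process_function reading batches['data1'] and so on, leaving the Engine itself unchanged.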
@vfdev-5 yes, by the looks of it, it does seem to solve our problems. That’s why I was checking up on the official 0.4 release when I came across this in one of the kanban boards 😃
I have yet to actually try it, but if I can find some time next week, I will install the rc and give it a go.