Support for multiple data sources


The idea is to enhance Engine so that it can run on multiple data sources:

data1 = ...
data2 = ...
data3 = ...
...

def process_function(engine, batches):
    batch1 = batches['data1']
    batch2 = batches['data2']
    batch3 = batches['data3']
    ...

engine = Engine(process_function)
engine.run({"data1": data1, "data2": data2, "data3": data3}, max_epochs=10)

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

4 reactions
amatsukawa commented, Jun 17, 2020

I randomly saw this issue while checking in on the 0.4 release. I also feel that this should be left up to the datasets. It’s not immediately clear what the semantics are if the iterators are not the same length.
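
To illustrate why the semantics are unclear, here is a minimal pure-Python sketch of three plausible behaviours when two sources differ in length. The lists `data1` and `data2` are stand-ins for data loaders, and none of these options is claimed to be what ignite does or would do.

```python
from itertools import cycle, zip_longest

# Stand-in sources of different lengths (illustrative data only).
data1 = [1, 2, 3, 4]
data2 = ["a", "b"]

# Option 1: stop as soon as the shortest source is exhausted.
stop_at_shortest = list(zip(data1, data2))

# Option 2: run until the longest source ends, padding the others with None.
pad_with_none = list(zip_longest(data1, data2))

# Option 3: restart the shorter source so it keeps pace with the longer one.
cycle_shorter = list(zip(data1, cycle(data2)))

print(stop_at_shortest)
print(pad_with_none)
print(cycle_shorter)
```

Each option gives a different epoch length and a different set of batches, which is the ambiguity an Engine-level API would have to resolve on the user's behalf.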

Perhaps this is not the right venue for this, but I just want to offer my unsolicited $.02 on features such as this one.

IMHO the strength of ignite over other packages such as pytorch-lightning is that the engine and event system are extremely simple. They allow one to inject logic anywhere in the loop as one pleases, and it’s very obvious what will happen. The reason why I don’t like pytorch-lightning as much is exactly because it’s filled with features like this one here. It’s got a huge API, and has a lot of logic along the lines of “if you want to do this, then use this API and use a None for that argument but pass a dict for this other argument, etc.” What I love about ignite is that basically everything except the engine, even very common things like model checkpointing, is a “plugin” to be registered. If I want the model checkpointing to write 10 copies of the same file, I am fully empowered to write a handler to do so. I’m not sure how that would be done in other frameworks. (Obviously this is a contrived example to make a point, and to avoid the argument of “actually, in lightning you can do xyz”.)

I would personally implore you to ruthlessly keep the Engine from becoming complicated. For instance, the determinism changes added to the engine caused some roadblocks for me (#935, #941).

I absolutely understand the desire to cover more use cases for users, and you have done an amazing job so far. However, when confronted with a design decision exactly like this one, IMHO a solution that can be implemented without additions to the engine should be chosen (e.g. by providing the Dataset and DataLoader implementations under ignite.data or ignite.contrib). A similar argument can be made about the determinism stuff: subclasses of Engine and optional datasets should be preferred.
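
As a rough sketch of what a data-side solution could look like, the wrapper below combines several sources into dict batches without touching the engine. `MultiSourceBatches` is a hypothetical name, not an existing ignite class, and stopping at the shortest source is an assumption made for this example only.

```python
class MultiSourceBatches:
    """Hypothetical helper: iterate several sources in lockstep and
    yield one dict per step, keyed by source name."""

    def __init__(self, sources):
        # Mapping of source name -> sized iterable (e.g. a list or data loader).
        self.sources = dict(sources)

    def __len__(self):
        # Assumption: one epoch ends when the shortest source is exhausted.
        return min(len(source) for source in self.sources.values())

    def __iter__(self):
        names = list(self.sources)
        # zip stops at the shortest source, matching __len__ above.
        for items in zip(*self.sources.values()):
            yield dict(zip(names, items))


def process_function(engine, batches):
    # Same shape as the process function proposed in the issue.
    return batches["data1"], batches["data2"]


combined = MultiSourceBatches({"data1": [1, 2, 3], "data2": ["a", "b", "c", "d"]})
# Called directly here (engine=None) to keep the sketch dependency-free.
results = [process_function(None, batches) for batches in combined]
print(len(combined), results)
```

Because the wrapper is an ordinary sized iterable, it should be usable as the data argument of the existing `engine.run(combined, max_epochs=10)` call, leaving the choice of length semantics to whoever writes the wrapper.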

Thanks for your continued work on an amazing library 😃

2 reactions
amatsukawa commented, Jun 17, 2020

@vfdev-5 yes, by the looks of it, it does seem to solve our problems. That’s why I was checking up on the official 0.4 release when I came across this in one of the kanban boards 😃

I have yet to actually try it, but if I can find some time next week, I will install the rc and give it a go.
