Rethinking generators

In constructing examples for the new docs, I’m coming to the conclusion that our use of generators instead of lists is kind of a UI nightmare. It’s pretty much impossible for a naive user (or even me in some cases) to know what kind of iterable they’re getting back, which makes inspection of the resulting objects much more cumbersome than it should be. Basically, unless you’re planning to go all the way to a pandas DataFrame with pliers code (e.g., in a Graph), things get hairy pretty quickly.

It’s clearly not an optimal solution to just return lists everywhere, since almost anything meaningful one might want to do with, e.g., a large movie, could potentially result in crazy memory use (and this was what prompted me to use generators in the first place). But I think that’s at least an easily understandable problem from the user’s perspective–i.e., it wouldn’t take much for a user to write an outer loop around VideoFrameStims and save the results to file in batches.
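To make the "outer loop" idea concrete, here is a minimal sketch of batching a lazy stream and saving each batch to file. Nothing in it is pliers API: `batched`, `fake_frames`, and `process_in_batches` are hypothetical names, and the dict records are a stand-in for whatever would be extracted from each VideoFrameStim.

```python
import itertools
import json
import tempfile
from pathlib import Path


def batched(iterable, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_frames(n):
    """Hypothetical stand-in for a lazy stream of per-frame results."""
    for i in range(n):
        yield {"frame": i, "brightness": i * 0.1}


def process_in_batches(frames, batch_size, out_dir):
    """Write each batch to its own JSON file and return the file paths."""
    out_dir = Path(out_dir)
    paths = []
    for index, batch in enumerate(batched(frames, batch_size)):
        path = out_dir / f"batch_{index:04d}.json"
        path.write_text(json.dumps(batch))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written = process_in_batches(fake_frames(10), batch_size=4, out_dir=tmp)
        print(f"wrote {len(written)} batch files")
```

Only one batch is held in memory at a time, and every object the user touches is a plain list, which is the point of the trade-off described above.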

Perhaps the right approach is to build batching–and possibly file-writing–functionality into the Graph. That way, when users initialize a Graph, they just have to specify a batch size, file store, etc. We could potentially use HDF5 to store intermediate results if needed.
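As a rough illustration of what that could look like, the sketch below puts the batch size and the storage step into the object itself. `BatchingRunner` is a hypothetical class, not the pliers Graph; the `sink` callable stands in for whatever would write a batch to a file store (HDF5 or otherwise).

```python
import itertools


class BatchingRunner:
    """Hypothetical wrapper: applies func batch by batch and hands each
    batch of results to a sink (e.g. a function that writes to disk)."""

    def __init__(self, func, batch_size, sink):
        self.func = func
        self.batch_size = batch_size
        self.sink = sink

    def run(self, items):
        """Process items lazily; return the number of batches handled."""
        iterator = iter(items)
        n_batches = 0
        while True:
            batch = list(itertools.islice(iterator, self.batch_size))
            if not batch:
                break
            # Results for one batch are materialized as a list, then handed off.
            self.sink([self.func(item) for item in batch])
            n_batches += 1
        return n_batches


if __name__ == "__main__":
    stored = []  # stand-in for a file store
    runner = BatchingRunner(func=lambda x: x * 2, batch_size=3, sink=stored.append)
    print(runner.run(range(7)), stored)
```

With this shape the user only specifies a batch size and a destination up front, and never has to reason about which iterable type comes back.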

In any case, I don’t think we should hold up the former change (dropping generators, at least by default) until we have a scheme along the latter lines figured out. We should probably make it a high priority to just use lists for now (and maybe have a config setting that re-enables generators if the user really knows what they’re doing).

Issue Analytics

Created 6 years ago
Comments: 5

Top GitHub Comments

tyarkoni commented, Dec 30, 2017

From a naive user perspective, the main problem I see is that generators will be returned any time iteration is involved–and that can violate expectations in odd ways. Consider these two calls:

res1 = converter.transform(stim)
res2 = converter.transform([stim])

Out of the box in pliers, the first call returns another Stim instance, while the second returns [<generator>]. I think this is really unintuitive for naive users (and probably also for many non-naive users). But I don’t see any good way to avoid it so long as we want to use generators internally. So probably best to default to always returning lists, while still allowing power users to unmask the generator usage that’s already going on under the hood.
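A minimal sketch of that default, assuming a hypothetical `transform` function and a hypothetical `lazy` flag (neither is the actual pliers signature): iterable input returns a list unless the caller explicitly opts into the generator.

```python
def transform(stim, func, lazy=False):
    """Hypothetical transform: a single item returns a single result;
    a list or tuple returns a list by default, or a generator if lazy=True."""
    if isinstance(stim, (list, tuple)):
        results = (func(s) for s in stim)  # generator used internally
        return results if lazy else list(results)
    return func(stim)


if __name__ == "__main__":
    print(transform("a", str.upper))
    print(transform(["a", "b"], str.upper))
    print(transform(["a", "b"], str.upper, lazy=True))
```

The generator is still what runs under the hood; the default just materializes it before handing it back, so the naive user always gets something they can index and inspect.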

tyarkoni commented, Jan 7, 2018

Yep!
