How to yield result instead of getting the result list?

See original GitHub issue

This is my code

import time
from joblib import Parallel, delayed


def producer():
    # Lazily yield one line at a time so the file is never loaded whole.
    with open("some_large_file") as f:
        for line in f:
            yield line


def func(i):
    time.sleep(1)  # or other time-consuming operations
    return i


# Parallel(...) returns a list holding every result once all jobs finish.
out = Parallel(n_jobs=10, verbose=100)(delayed(func)(i) for i in producer())
print(out)

My goal is to read a large file (about 10 GB) and do some operations on each line. The final result is saved in the out variable, which is a list object held in memory. Can joblib yield results as soon as they are ready, while the jobs are still running? Then I could write each result to a file instead of storing them all in memory.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

GaelVaroquaux commented, Apr 11, 2022

This feature is quite challenging to code (danger of race conditions, in particular when dealing with exceptions).

There is a pull request in progress on this feature: https://github.com/joblib/joblib/pull/588. We hope to merge it soonish, but these things are tricky.

npyoung commented, Apr 10, 2022

Second this feature request. Often my parallel jobs have large outputs that I want to process and write to disk as they become available rather than keep them around in memory until all jobs have completed. I would use something like multiprocessing.Pool.imap for this but I need the advanced pickling and memmap conversion of joblib.
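For comparison, the multiprocessing.Pool.imap approach mentioned above might look like the sketch below. It uses only the standard library, so it does not get joblib's pickling or memmap handling. func, stream_results and main are hypothetical names chosen for illustration.

```python
from multiprocessing import Pool


def func(line):
    # Placeholder for the real per-line work (hypothetical).
    return line.upper()


def stream_results(pool, lines, out, chunksize=16):
    """Write func(line) to `out` as results arrive; return how many were written."""
    written = 0
    # imap returns an iterator that yields results in input order, so each
    # one can be written out and dropped instead of being collected in a list.
    for result in pool.imap(func, lines, chunksize=chunksize):
        out.write(result)
        written += 1
    return written


def main(src_path, dst_path, n_workers=10):
    # Hypothetical entry point; call it under `if __name__ == "__main__":`
    # so worker processes can be started safely on every platform.
    with open(src_path) as src, open(dst_path, "w") as dst:
        with Pool(n_workers) as pool:
            return stream_results(pool, src, dst)
```

stream_results accepts any object with an imap method, so multiprocessing.pool.ThreadPool can be passed in for I/O-bound work or quick experiments.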
