How to yield result instead of getting the result list?

This is my code:

import time
from joblib import Parallel, delayed


def producer():
    # Lazily yield lines so the input file is never fully loaded.
    with open("some_large_file") as f:
        for line in f:
            yield line


def func(i):
    time.sleep(1)  # or some other time-consuming operation
    return i


# Parallel collects every return value into one list before returning.
out = Parallel(n_jobs=10, verbose=100)(delayed(func)(i) for i in producer())
print(out)

My goal is to read a large file (about 10 GB) and run some operation on each line. The final result is saved in the out variable, which is a list held in memory. Can joblib yield results as they become available while the jobs are still running, so that I can write each result to a file instead of keeping them all in memory?
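
One possible workaround, which is not taken from the thread, is to hand the input to the parallel runner in fixed-size batches and write out each batch's results before starting the next. The sketch below is only an illustration: batched, process_in_batches, and the run_batch callable are hypothetical names, and the code uses only the standard library so that it runs without joblib installed.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_in_batches(items, run_batch, size=1000):
    """Yield results one batch at a time.

    `run_batch` is a caller-supplied callable (hypothetical) that takes a
    list of items and returns a list of results, for example by running
    them in parallel. Only one batch of results is held at any moment.
    """
    for batch in batched(items, size):
        yield from run_batch(batch)


if __name__ == "__main__":
    # Stand-in for a parallel runner; replace with a real one as needed.
    def upper_batch(batch):
        return [item.upper() for item in batch]

    for result in process_in_batches(["a\n", "b\n", "c\n"], upper_batch, size=2):
        print(result, end="")
```

With the code from the question, run_batch could be something like lambda batch: Parallel(n_jobs=10)(delayed(func)(line) for line in batch). Memory use is then bounded by the batch size, at the cost of some workers sitting idle while the last jobs of each batch finish.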

Issue Analytics

Created 2 years ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

GaelVaroquaux commented, Apr 11, 2022

This feature is quite challenging to code (danger of race conditions, in particular when dealing with exceptions).

There is a pull request in progress on this feature: https://github.com/joblib/joblib/pull/588. We hope to merge it soonish, but these things are tricky.

npyoung commented, Apr 10, 2022

Second this feature request. Often my parallel jobs have large outputs that I want to process and write to disk as they become available rather than keep them around in memory until all jobs have completed. I would use something like multiprocessing.Pool.imap for this but I need the advanced pickling and memmap conversion of joblib.
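
For comparison, the imap pattern mentioned in this comment can be sketched as follows. This is a minimal illustration, not a replacement for joblib: it has none of joblib's pickling or memmap handling. It uses multiprocessing.pool.ThreadPool, which exposes the same imap method as multiprocessing.Pool, so that the example runs without any pickling concerns. The names func and stream_to_file are hypothetical.

```python
import time
from multiprocessing.pool import ThreadPool  # same imap API as multiprocessing.Pool


def func(line):
    """Stand-in for the per-line work (hypothetical)."""
    time.sleep(0.01)
    return line.upper()


def stream_to_file(lines, out_path, n_workers=4):
    """Write each result to `out_path` as it arrives, in input order.

    Returns the number of results written.
    """
    written = 0
    with ThreadPool(n_workers) as pool, open(out_path, "w") as out:
        # imap returns an iterator over results instead of building a list.
        for result in pool.imap(func, lines, chunksize=16):
            out.write(result)
            written += 1
    return written


if __name__ == "__main__":
    count = stream_to_file(["a\n", "b\n", "c\n"], "results.txt")
    print(count, "lines written")
```

The results are consumed one at a time, so the full output never has to sit in a list. One caveat, as far as I know: imap does not throttle how quickly it pulls from the input iterable, so with a very large input the pending tasks may still be buffered ahead of the workers.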
