Migrate STDOUT/STDIN exchanges from asyncio pipes to queues
Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/2900
Originally created by @aaronsteers on 2021-08-18 21:44:57
As discussed in #2743 and this comment (https://gitlab.com/meltano/meltano/-/issues/2743#note_569087851), asyncio queues can be used in place of pipes to send data between processes.
cc @pandemicsyn
From our code:
async def _read_from_fd(self, read_fd):
    # Since we're redirecting our own stdout and stderr output,
    # the line length limit can be arbitrarily large.
    line_length_limit = 1024 * 1024 * 1024  # 1 GiB
    reader = asyncio.StreamReader(limit=line_length_limit)
    read_protocol = asyncio.StreamReaderProtocol(reader)
    loop = asyncio.get_event_loop()
    read_transport, _ = await loop.connect_read_pipe(  # <<<<
        lambda: read_protocol, os.fdopen(read_fd)
    )
    await capture_subprocess_output(reader, self)
From https://docs.python.org/3/library/asyncio-platforms.html#windows:
SelectorEventLoop has the following limitations:
- …
- Pipes are not supported, so the loop.connect_read_pipe() and loop.connect_write_pipe() methods are not implemented.
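One possible way around this limitation, sketched here under assumptions and not taken from the Meltano codebase: read the file descriptor with blocking I/O in a worker thread and hand each line to an asyncio.Queue, so loop.connect_read_pipe() is never called. The helper names (_pump_fd_to_queue, read_lines_via_queue, demo) and the use of None as an end-of-stream sentinel are hypothetical.

```python
import asyncio
import os
import threading


def _pump_fd_to_queue(read_fd, loop, queue):
    """Blocking reader: runs in a worker thread, not on the event loop."""
    with os.fdopen(read_fd, "r") as stream:
        for line in stream:
            # asyncio.Queue is not thread-safe, so schedule the put on the loop thread.
            loop.call_soon_threadsafe(queue.put_nowait, line)
    # None is a sentinel (an assumption of this sketch) marking end of stream.
    loop.call_soon_threadsafe(queue.put_nowait, None)


async def read_lines_via_queue(read_fd):
    """Collect lines written to the pipe without loop.connect_read_pipe()."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    threading.Thread(
        target=_pump_fd_to_queue, args=(read_fd, loop, queue), daemon=True
    ).start()
    lines = []
    while (line := await queue.get()) is not None:
        lines.append(line)
    return lines


async def demo():
    # Write a small amount of data (it fits in the pipe buffer), close the
    # write end, then read everything back through the queue.
    read_fd, write_fd = os.pipe()
    with os.fdopen(write_fd, "w") as writer:
        writer.write("first\nsecond\n")
    return await read_lines_via_queue(read_fd)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off is one extra thread per redirected descriptor, in exchange for not depending on the event loop's pipe support.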
Alternative implementation using queues from here:
async def first_pipe_cmd(command, queue, cwd="."):
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        # stderr=asyncio.subprocess.PIPE,
        cwd=cwd)
    # await asyncio.wait(_outstream_handler(proc.stderr, "stderr", "first_pipe_cmd"))  # Broken at the moment
    data = "first"
    while data:
        data = await proc.stdout.readline()
        line = data.decode()
        if data:
            await queue.put(line)
        logging.info(f"Queue data for processes, data is {line}")
    logging.info("First piped process has completed")
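To make the idea concrete, here is a minimal, self-contained sketch of the full hand-off: one coroutine reads a producer subprocess's stdout into an asyncio.Queue, and another drains the queue into a consumer subprocess's stdin. The PRODUCER and CONSUMER one-liners are hypothetical stand-ins for a tap and a target, and the None sentinel and the queue size are assumptions of this sketch, not Meltano's design.

```python
import asyncio
import sys

# Hypothetical stand-ins for a tap and a target: the producer prints three
# lines, the consumer upper-cases whatever arrives on its stdin.
PRODUCER = "for i in range(3): print('record', i)"
CONSUMER = "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"


async def produce(queue):
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", PRODUCER, stdout=asyncio.subprocess.PIPE
    )
    # readline() returns b"" at EOF, which ends the loop.
    while line := await proc.stdout.readline():
        await queue.put(line)
    await queue.put(None)  # sentinel: no more data
    await proc.wait()


async def consume(queue):
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CONSUMER,
        stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE,
    )
    while (line := await queue.get()) is not None:
        proc.stdin.write(line)
        await proc.stdin.drain()
    proc.stdin.close()
    # Fine for this tiny demo; with large output, the consumer's stdout
    # should be read concurrently to avoid filling its pipe buffer.
    out = await proc.stdout.read()
    await proc.wait()
    return out.decode()


async def main():
    # A bounded queue means a slow consumer applies backpressure to the producer.
    queue = asyncio.Queue(maxsize=100)
    _, output = await asyncio.gather(produce(queue), consume(queue))
    return output


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that the subprocesses still talk to the parent over OS pipes via asyncio.subprocess.PIPE; the queue only replaces the link between the two reader/writer coroutines.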
Top GitHub Comments
@tayloramurphy - thanks for the ping.
On reviewing this, I don’t think it is a high priority as of now, given that:
[…] meltano run and elt. (Interop was one of the drivers here, if I remember correctly.)
There are still potential benefits, but there are also risks in terms of compatibility and performance. There’s another discussion (I think in the SDK repo) about performance benchmarking, and I think we’d want that process to be in place before making a big change like this.
It’s also arguable that investing in adding BATCH support to stream maps (along with good docs) may be a better investment, since that would bring similar benefits of reducing memory pressure to legacy taps and targets as well as sdk-based ones.
@BuzzCutNorman, @visch - do you see strong value in this as of now, or would you agree it’s ok to deprioritize?
Thankfully we’ve documented it as such in https://docs.meltano.com/guide/installation-guide#windows 😅