open() doesn't play nice with subprocess.run when used for STDIN
What are you trying to achieve?
I want to run a command line program using a file on S3 as STDIN.
What is the expected result?
The file should stream into the command line program as standard input
What are you seeing instead?
UnsupportedOperation: fileno
Steps/code to reproduce the problem
from subprocess import run
import boto3
from smart_open import open as smart_open
s3 = boto3.Session(profile_name="development").client("s3")
with smart_open("s3://bucket-name/path/to/file.gz", transport_params={"client": s3}, buffering=0) as f:
    run(("cat",), stdin=f)
Traceback
The following is the result while running the above in the ipython shell:
---------------------------------------------------------------------------
UnsupportedOperation Traceback (most recent call last)
Cell In [21], line 2
1 with smart_open("REDACTED", transport_params={"client": s3}, buffering=0) as f:
----> 2 run(("cat",), stdin=f)
File /usr/lib/python3.10/subprocess.py:501, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
498 kwargs['stdout'] = PIPE
499 kwargs['stderr'] = PIPE
--> 501 with Popen(*popenargs, **kwargs) as process:
502 try:
503 stdout, stderr = process.communicate(input, timeout=timeout)
File /usr/lib/python3.10/subprocess.py:832, in Popen.__init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask, pipesize)
811 raise SubprocessError('Cannot disambiguate when both text '
812 'and universal_newlines are supplied but '
813 'different. Pass one or the other.')
815 # Input and output objects. The general principle is like
816 # this:
817 #
(...)
827 # are -1 when not using PIPEs. The child objects are -1
828 # when not redirecting.
830 (p2cread, p2cwrite,
831 c2pread, c2pwrite,
--> 832 errread, errwrite) = self._get_handles(stdin, stdout, stderr)
834 # We wrap OS handles *before* launching the child, otherwise a
835 # quickly terminating child could make our fds unwrappable
836 # (see #8458).
838 if _mswindows:
File /usr/lib/python3.10/subprocess.py:1603, in Popen._get_handles(self, stdin, stdout, stderr)
1600 p2cread = stdin
1601 else:
1602 # Assuming file-like object
-> 1603 p2cread = stdin.fileno()
1605 if stdout is None:
1606 pass
File /usr/lib/python3.10/gzip.py:359, in GzipFile.fileno(self)
353 def fileno(self):
354 """Invoke the underlying file object's fileno() method.
355
356 This will raise AttributeError if the underlying file object
357 doesn't support fileno().
358 """
--> 359 return self.fileobj.fileno()
I’ve tried this without the buffering=0 argument as well, with the same results. If this isn’t possible, then I suppose my next best option would be to just pull the entire file down and do everything locally. The problem in my case is that the file is Very Large, so I can’t just do something simple like:
from io import BytesIO
with smart_open("s3://bucket-name/path/to/file.gz") as f:
    run(("cat",), stdin=BytesIO(f.read()))
because I’m assuming that would dump the whole file into RAM first.
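For context, the traceback above comes from subprocess calling stdin.fileno(), which only works for objects backed by an OS-level file descriptor. An in-memory io.BytesIO has no descriptor either, so the BytesIO variant would probably fail the same way before memory use even became a concern. A minimal sketch of a check for this (the helper name has_real_fileno is hypothetical, not part of smart_open or subprocess):

```python
import io
import tempfile


def has_real_fileno(stream):
    """Return True if the stream exposes an OS-level file descriptor.

    Hypothetical helper for illustration only. subprocess needs a real
    descriptor when a file object is passed directly as stdin.
    """
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True


# An in-memory buffer has no descriptor; a temporary file on disk does.
in_memory = io.BytesIO(b"hello")
with tempfile.TemporaryFile() as on_disk:
    print("temporary file:", has_real_fileno(on_disk))
print("BytesIO:", has_real_fileno(in_memory))
```

A stream that fails this check cannot be handed to stdin= as-is; its bytes have to be copied into a pipe instead.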
Versions
Please provide the output of:
import platform, sys, smart_open
print(platform.platform())
print("Python", sys.version)
print("smart_open", smart_open.__version__)
Linux-5.15.0-48-generic-x86_64-with-glibc2.35
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0]
smart_open 6.2.0
Checklist
Before you create the issue, please make sure you have:
- Described the problem clearly
- Provided a minimal reproducible example, including any required data
- Provided the version numbers of the relevant software
There probably is. If the dirty way feels wrong, have a look on SO, e.g. here: https://stackoverflow.com/questions/4846891/python-piping-output-between-two-subprocesses

I tried to do exactly this with subprocess.run(), but it ended up with the same error:

I guess I could run the string with shell=True, but that felt dirty, so I was hoping there was a Better Way:

☝🏻 This works, but I just assumed that there’s a Pythonic way to do it.
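One way to avoid both shell=True and loading the whole file into RAM might be to open the child with stdin=PIPE and copy the stream into that pipe in chunks. The sketch below uses only the standard library; the function name run_with_stream and the 1 MiB chunk size are assumptions made for illustration, not something smart_open or subprocess provides.

```python
import shutil
import subprocess


def run_with_stream(args, stream, chunk_size=1024 * 1024, **popen_kwargs):
    """Feed a binary file-like object to a command's stdin in chunks.

    Hypothetical helper: `stream` only needs a read() method, so it does
    not have to support fileno(). Returns the child's exit code.
    Caveat: if stdout=PIPE is passed here, nothing drains it while we
    write, so a chatty child could block; leave stdout inherited or
    redirect it to a file in that case.
    """
    proc = subprocess.Popen(args, stdin=subprocess.PIPE, **popen_kwargs)
    try:
        # Copies chunk_size bytes at a time, so memory use stays bounded.
        shutil.copyfileobj(stream, proc.stdin, chunk_size)
    except BrokenPipeError:
        pass  # the child exited before reading all of its input
    finally:
        try:
            proc.stdin.close()  # signals end-of-input to the child
        except BrokenPipeError:
            pass
    return proc.wait()


# Intended usage with the stream from this issue (not executed here):
#
#     with smart_open("s3://bucket-name/path/to/file.gz",
#                     transport_params={"client": s3}) as f:
#         run_with_stream(("cat",), f)
```

The parent process does the copying here, so it stays busy until the stream is exhausted; for a long-running consumer, moving the copy loop into a thread would be a possible variation.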