diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst index fe64daa3291d67..929844ae7c3b92 100644 --- a/Doc/library/subprocess.rst +++ b/Doc/library/subprocess.rst @@ -28,9 +28,12 @@ Using the :mod:`!subprocess` Module ----------------------------------- The recommended approach to invoking subprocesses is to use the :func:`run` -function for all use cases it can handle. For more advanced use cases, the -underlying :class:`Popen` interface can be used directly. +function for all use cases it can handle. For pipelines, look to the +:func:`run_pipeline` function. For more advanced use cases, the underlying +:class:`Popen` interface can be used directly. +The :func:`!run` function +^^^^^^^^^^^^^^^^^^^^^^^^^ .. function:: run(args, *, stdin=None, input=None, stdout=None, stderr=None,\ capture_output=False, shell=False, cwd=None, timeout=None, \ @@ -160,6 +163,9 @@ underlying :class:`Popen` interface can be used directly. .. versionadded:: 3.5 +Constants and base exceptions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + .. data:: DEVNULL Special value that can be used as the *stdin*, *stdout* or *stderr* argument @@ -723,6 +729,267 @@ functions. with a non-zero :attr:`~Popen.returncode`. +The :func:`!run_pipeline` function +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. function:: run_pipeline(*commands, stdin=None, input=None, \ + stdout=None, stderr=None, capture_output=False, \ + timeout=None, check=False, encoding=None, \ + errors=None, text=None, env=None, \ + **other_popen_kwargs) + + Run a pipeline of commands connected via pipes, similar to shell pipelines. + Wait for all commands to complete, then return a :class:`CompletedPipeline` + instance. + + Each positional argument should be a command: either a sequence of + program arguments, or a :class:`PipelineCommand` wrapping one with + per-command overrides. Bare sequences are wrapped in a + :class:`PipelineCommand` on entry. 
The standard output of each + command is connected to the standard input of the next command in the + pipeline. + + This function requires at least two commands. For a single command, use + :func:`run` instead. + + If *capture_output* is true, the standard output of the final command and + the standard error of all commands will be captured (see *Standard + error handling* below). The *stdout* and *stderr* arguments may not be + supplied at the same time as *capture_output*. + + A *timeout* may be specified in seconds. If the timeout expires, all + child processes will be killed and waited for, and then a + :exc:`TimeoutExpired` exception will be raised. + + The *input* argument is passed to the first command's stdin. If used, it + must be a byte sequence, or a string if *encoding* or *errors* is specified + or *text* is true. + + If *check* is true, and any process in the pipeline exits with a non-zero + exit code, a :exc:`PipelineError` exception will be raised. This behavior + is similar to the shell's ``pipefail`` option. + + If *encoding* or *errors* are specified, or *text* is true, file objects + are opened in text mode using the specified encoding and errors. + + If *stdin* is specified, it is connected to the first command's standard + input. If *stdout* is specified, it is connected to the last command's + standard output. When *stdout* is :data:`PIPE`, the output is available + in the returned :class:`CompletedPipeline`'s + :attr:`~CompletedPipeline.stdout` attribute. + + Other keyword arguments are passed to every command's :class:`Popen` + call. ``close_fds=False`` is rejected because inherited copies of the + inter-process pipe ends in sibling children would prevent EOF from + being signaled and cause deadlocks. ``shell=True`` and ``executable=`` + are also rejected: the pipeline itself replaces the shell, and + per-command shell interpretation would re-introduce the quoting and + injection surface this function exists to avoid. 
When one command + genuinely needs shell interpretation (a glob, or a shell builtin), + wrap it in a :class:`PipelineCommand` with ``shell=True``. + ``start_new_session`` and ``process_group`` are also rejected: each + command is spawned as a sibling child of the calling process, so + applying these per command does not produce a single process group + spanning the pipeline. ``stderr=STDOUT`` at the pipeline level is + rejected because it would merge each non-final command's stderr into + the next command's stdin; use a :class:`PipelineCommand` with + ``stderr=STDOUT`` for the one command that needs it, or + ``capture_output=True`` to capture stderr from every command. + + .. rubric:: Standard error handling + + When stderr is captured (via ``capture_output=True`` or ``stderr=PIPE``), + every command in the pipeline writes to a single shared pipe, and the + captured :attr:`~CompletedPipeline.stderr` is the interleaved output of + all of them. Two consequences follow from the shared pipe: + + * In text mode the interleaving can split multi-byte characters across + writes from different processes. If that is a concern, capture in + binary mode and decode yourself, or pass ``errors="replace"`` or + ``errors="backslashreplace"``. + + * If any child spawns a grandchild process that keeps the inherited + stderr file descriptor open after the child itself exits, the + parent's read on the stderr pipe will not see EOF and + :func:`run_pipeline` will block. Either do not capture stderr, or + ensure such grandchildren fully detach (closing inherited fds) before + daemonizing. + + To exempt one command from the shared stderr pipe, wrap it in a + :class:`PipelineCommand` with ``stderr=DEVNULL`` (discard) or + ``stderr=STDOUT`` (merge into that command's stdout). + ``stderr=STDOUT`` at the *pipeline* level is rejected. + + Examples:: + + >>> import subprocess + >>> # Equivalent to: echo "hello world" | tr a-z A-Z + >>> result = subprocess.run_pipeline( + ... 
["echo", "hello world"], + ... ["tr", "a-z", "A-Z"], + ... capture_output=True, text=True + ... ) + >>> result.stdout + 'HELLO WORLD\n' + >>> result.returncodes + (0, 0) + + >>> # Pipeline with three commands + >>> result = subprocess.run_pipeline( + ... ["echo", "one\ntwo\nthree"], + ... ["sort"], + ... ["head", "-n", "2"], + ... capture_output=True, text=True + ... ) + >>> result.stdout + 'one\nthree\n' + + >>> # Using input parameter + >>> result = subprocess.run_pipeline( + ... ["cat"], + ... ["wc", "-l"], + ... input="line1\nline2\nline3\n", + ... capture_output=True, text=True + ... ) + >>> result.stdout.strip() + '3' + + >>> # Error handling with check=True + >>> subprocess.run_pipeline( + ... ["echo", "hello"], + ... ["false"], # exits with status 1 + ... check=True + ... ) + Traceback (most recent call last): + ... + subprocess.PipelineError: Pipeline failed: ['false'] (commands[1]) returned 1 + + .. versionadded:: next + + +.. class:: CompletedPipeline + + The return value from :func:`run_pipeline`, representing a pipeline of + processes that have finished. + + .. attribute:: commands + + The commands used to launch the pipeline, as a tuple of + :class:`PipelineCommand` instances. Bare argv sequences passed to + :func:`run_pipeline` are wrapped, so every element has ``.args`` + and the override attributes. + + .. attribute:: returncodes + + Tuple of exit status codes for each command in the pipeline. Typically, + an exit status of 0 indicates that the command ran successfully. + + A negative value ``-N`` indicates that the command was terminated by + signal ``N`` (POSIX only). + + .. attribute:: stdout + + Captured stdout from the final command in the pipeline. A bytes sequence, + or a string if :func:`run_pipeline` was called with an encoding, errors, + or ``text=True``. ``None`` if stdout was not captured. + + .. attribute:: stderr + + Captured stderr from all commands in the pipeline, combined. 
A bytes + sequence, or a string if :func:`run_pipeline` was called with an + encoding, errors, or ``text=True``. ``None`` if stderr was not captured. + + .. method:: check_returncodes() + + If any element of :attr:`returncodes` is non-zero, raise a + :exc:`PipelineError`. + + .. versionadded:: next + + +.. class:: PipelineCommand(args, /, *, stderr=None, env=None, cwd=None, \ + shell=False) + + One command in a :func:`run_pipeline` call. :func:`run_pipeline` + wraps each bare argv sequence it receives in a :class:`PipelineCommand`, + so :attr:`CompletedPipeline.commands` and :attr:`PipelineError.commands` + always hold instances of this class. + + Construct one explicitly when a single command needs different stderr + handling, a different *env* or *cwd*, or shell interpretation. Any + override left at its default means the corresponding + :func:`run_pipeline` keyword applies to this command as it would to a + bare argv sequence. + + *args* is a sequence of program arguments, or a string if *shell* is + true. Passing a string with ``shell=False`` (or a sequence with + ``shell=True``) raises :exc:`TypeError`. + + *stderr* may be ``None`` (use the pipeline's shared stderr handling), + :data:`DEVNULL` (discard this command's stderr), or :data:`STDOUT` + (merge this command's stderr into its stdout stream). Any other value + raises :exc:`ValueError`. + + *env* and *cwd*, if given, replace the pipeline-level *env* and *cwd* + for this command only. + + *shell*, if true, runs this command's *args* through the shell. + + Example -- discard the noisy stderr of one command while the rest + keep the pipeline's stderr handling:: + + >>> from subprocess import run_pipeline, PipelineCommand, DEVNULL + >>> with open("out.gz", "wb") as f: + ... result = run_pipeline( + ... PipelineCommand(["dd", "if=infile", "bs=1M"], stderr=DEVNULL), + ... ["pigz"], + ... stdout=f, check=True, + ... ) + + .. versionadded:: next + + +.. 
exception:: PipelineError + + Subclass of :exc:`SubprocessError`, raised when a pipeline run by + :func:`run_pipeline` (with ``check=True``) contains one or more commands + that returned a non-zero exit status. This is similar to the shell's + ``pipefail`` behavior. + + :exc:`PipelineError` is a sibling of :exc:`CalledProcessError`, not a + subclass. To handle both single-command and pipeline failures with + one ``except`` clause, catch :exc:`SubprocessError` (which is also + the common base of :exc:`TimeoutExpired`). Retry helpers and + decorators that match on :exc:`CalledProcessError` will not catch + pipeline failures by default. + + .. attribute:: commands + + The commands used in the pipeline, as a tuple of + :class:`PipelineCommand` instances. + + .. attribute:: returncodes + + Tuple of exit status codes for each command in the pipeline. + + .. attribute:: stdout + + Output of the final command if it was captured. Otherwise, ``None``. + + .. attribute:: stderr + + Combined stderr output of all commands if it was captured. + Otherwise, ``None``. + + .. attribute:: failed + + Tuple of ``(index, command, returncode)`` triples for each command + that returned a non-zero exit status. The *index* is the position + of the command in the pipeline (0-based). + + .. versionadded:: next + + Exceptions ^^^^^^^^^^ @@ -736,16 +1003,21 @@ will be raised by the child only if the selected shell itself was not found. To determine if the shell failed to find the requested application, it is necessary to check the return code or output from the subprocess. -A :exc:`ValueError` will be raised if :class:`Popen` is called with invalid -arguments. +A :exc:`ValueError` will be raised if :class:`Popen` and related functions are +called with invalid arguments. -:func:`check_call` and :func:`check_output` will raise -:exc:`CalledProcessError` if the called process returns a non-zero return -code. 
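The sibling relationship described above can be seen in the exception hierarchy that already ships today. A short sketch using only current APIs (the :exc:`PipelineError` added by this patch would be caught by the same ``except SubprocessError`` clause):

```python
import subprocess
import sys

# CalledProcessError and TimeoutExpired already share the
# SubprocessError base; this patch adds PipelineError as a third
# sibling, so one except clause covers all of them.
assert issubclass(subprocess.CalledProcessError, subprocess.SubprocessError)
assert issubclass(subprocess.TimeoutExpired, subprocess.SubprocessError)

def describe_failure(argv):
    # Catching SubprocessError handles check=True failures and
    # timeouts alike; with this patch it would also catch
    # PipelineError raised by run_pipeline(..., check=True).
    try:
        subprocess.run(argv, check=True, capture_output=True)
        return None
    except subprocess.SubprocessError as exc:
        return type(exc).__name__

failure = describe_failure([sys.executable, "-c", "raise SystemExit(3)"])
```

``describe_failure`` is a hypothetical helper for illustration; the point is only that one ``except subprocess.SubprocessError`` clause is enough for both single-command and pipeline failures.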
+:func:`check_call`, :func:`check_output`, :func:`run` with ``check=True``, or +:meth:`CompletedProcess.check_returncode` will raise :exc:`CalledProcessError` +if the called process had a non-zero return code. + +:func:`run_pipeline` with ``check=True`` or +:meth:`CompletedPipeline.check_returncodes` will raise :exc:`PipelineError` if +any process in the pipeline had a non-zero return code. All of the functions and methods that accept a *timeout* parameter, such as -:func:`run` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if -the timeout expires before the process exits. +:func:`run`, :func:`run_pipeline`, and :meth:`Popen.communicate` will raise +:exc:`TimeoutExpired` if the timeout expires before the process or processes +exit. Exceptions defined in this module all inherit from :exc:`SubprocessError`. @@ -1393,24 +1665,33 @@ Replacing shell pipeline becomes:: - p1 = Popen(["dmesg"], stdout=PIPE) - p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE) - p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits. - output = p2.communicate()[0] - -The ``p1.stdout.close()`` call after starting the p2 is important in order for -p1 to receive a SIGPIPE if p2 exits before p1. + result = run_pipeline(["dmesg"], ["grep", "hda"], + capture_output=True, check=True) + output = result.stdout -Alternatively, for trusted input, the shell's own pipeline support may still -be used directly: +:func:`run_pipeline` connects the commands, closes the parent's copies of +the intermediate pipe ends, waits for every process, and (with +``check=True``) raises :exc:`PipelineError` if *any* command fails -- +equivalent to the +shell's ``set -o pipefail`` without needing a shell. -.. 
code-block:: bash +If you need to read the final command's output incrementally rather than +waiting for the whole pipeline to finish (for example, streaming a large +decompressed file), chain :class:`Popen` instances directly:: - output=$(dmesg | grep hda) - -becomes:: - - output = check_output("dmesg | grep hda", shell=True) + p1 = Popen(["dmesg"], stdout=PIPE) + p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE) + p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits. + for line in p2.stdout: + ... + p2.wait() + p1.wait() + if p1.returncode or p2.returncode: + ... # handle failure in either command + +The ``p1.stdout.close()`` call after starting p2 is important in order for +p1 to receive a SIGPIPE if p2 exits before p1. Each process must be +waited on and its return code checked to detect failure in any command. Replacing :func:`os.system` diff --git a/Doc/whatsnew/3.15.rst b/Doc/whatsnew/3.15.rst index 405d388af487e8..c9f1fdcf11f44f 100644 --- a/Doc/whatsnew/3.15.rst +++ b/Doc/whatsnew/3.15.rst @@ -1135,6 +1135,20 @@ ssl subprocess ---------- +* Added :func:`subprocess.run_pipeline` for running a sequence of commands + connected stdout-to-stdin, similar to a shell ``a | b | c`` pipeline, + without invoking a shell. It returns a + :class:`~subprocess.CompletedPipeline` carrying the return codes of every + command; passing ``check=True`` raises :exc:`~subprocess.PipelineError` + if any command fails (pipefail semantics). Individual commands may be + wrapped in a :class:`~subprocess.PipelineCommand` to override *stderr*, + *env*, *cwd*, or *shell* for that command only. This replaces the + manual + :class:`~subprocess.Popen`-chaining recipe and the + ``bash -c "set -o pipefail; ..."`` workaround previously needed to detect + failures in non-final commands. + (Contributed by Gregory P. Smith in :gh:`47798`.) 
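For contrast with the entry above, the manual :class:`~subprocess.Popen`-chaining recipe it replaces runs on current Python. A sketch (using ``sys.executable`` so the two commands are portable) showing why pipefail semantics previously required checking every return code by hand:

```python
import subprocess
import sys

# Hand-built equivalent of: printf 'b\na\n' | sort
p1 = subprocess.Popen(
    [sys.executable, "-c", "print('b'); print('a')"],
    stdout=subprocess.PIPE)
p2 = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.writelines(sorted(sys.stdin))"],
    stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
p1.stdout.close()  # allow p1 to receive SIGPIPE if p2 exits first
output, _ = p2.communicate()
p1.wait()

# Manual "pipefail": a failure in p1 is invisible unless every
# return code is inspected -- the step run_pipeline(check=True)
# automates.
returncodes = (p1.returncode, p2.returncode)
```

With ``run_pipeline`` the same pipeline collapses to a single call, and ``check=True`` inspects all return codes for the caller.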
+ * :meth:`subprocess.Popen.wait`: when ``timeout`` is not ``None`` and the platform supports it, an efficient event-driven mechanism is used to wait for process termination: diff --git a/Lib/subprocess.py b/Lib/subprocess.py index 38b655f2f7b9d2..3e333667ae1d84 100644 --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -17,6 +17,10 @@ ======== run(...): Runs a command, waits for it to complete, then returns a CompletedProcess instance. +run_pipeline(...): Runs a pipeline of commands connected via pipes, + waits for all to complete, then returns a CompletedPipeline + instance. Each command may be a bare argv sequence or a + PipelineCommand with per-command overrides. Popen(...): A class for flexibly executing a command in a new process Constants @@ -62,7 +66,9 @@ __all__ = ["Popen", "PIPE", "STDOUT", "call", "check_call", "getstatusoutput", "getoutput", "check_output", "run", "CalledProcessError", "DEVNULL", - "SubprocessError", "TimeoutExpired", "CompletedProcess"] + "SubprocessError", "TimeoutExpired", "CompletedProcess", + "run_pipeline", "CompletedPipeline", "PipelineError", + "PipelineCommand"] # NOTE: We intentionally exclude list2cmdline as it is # considered an internal implementation detail. issue10838. @@ -194,6 +200,51 @@ def stdout(self, value): self.output = value +class PipelineError(SubprocessError): + """Raised when run_pipeline() is called with check=True and one or more + commands in the pipeline return a non-zero exit status. + + Attributes: + commands: Tuple of PipelineCommand instances for each command. + returncodes: Tuple of return codes corresponding to each command. + stdout: Standard output from the final command (if captured). + stderr: Standard error output (if captured). + failed: Tuple of (index, command, returncode) for each failed command. 
+ """ + def __init__(self, commands, returncodes, stdout=None, stderr=None): + commands = tuple(commands) + returncodes = tuple(returncodes) + assert len(commands) == len(returncodes), ( + f"{len(commands)=} != {len(returncodes)=}") + super().__init__(commands, returncodes) + self.commands = commands + self.returncodes = returncodes + self.stdout = stdout + self.stderr = stderr + self.failed = tuple( + (i, cmd, rc) + for i, (cmd, rc) in enumerate(zip(commands, returncodes)) + if rc != 0 + ) + + def __str__(self): + parts = [] + for i, cmd, rc in self.failed: + if rc and rc < 0: + try: + detail = f"died with {signal.Signals(-rc)!r}" + except ValueError: + detail = f"died with unknown signal {-rc}" + else: + detail = f"returned {rc}" + if isinstance(cmd, PipelineCommand) and not cmd._has_overrides(): + cmd_display = cmd.args + else: + cmd_display = cmd + parts.append(f"{cmd_display!r} (commands[{i}]) {detail}") + return f"Pipeline failed: {'; '.join(parts)}" + + if _mswindows: class STARTUPINFO: def __init__(self, *, dwFlags=0, hStdInput=None, hStdOutput=None, @@ -253,10 +304,11 @@ def __repr__(self): def _communicate_io_posix(selector, stdin, input_view, input_offset, output_buffers, endtime, *, close_on_eof=False): """ - Low-level POSIX I/O multiplexing loop used by Popen._communicate. + Low-level POSIX I/O multiplexing loop. - Handles the select loop for reading/writing but does not manage - stream lifecycle or raise timeout exceptions. + This is the common core used by both _communicate_streams() and + Popen._communicate(). It handles the select loop for reading/writing + but does not manage stream lifecycle or raise timeout exceptions. 
Args: selector: A _PopenSelector with streams already registered @@ -404,6 +456,192 @@ def _translate_newlines(data, encoding, errors): return data.replace("\r\n", "\n").replace("\r", "\n") +def _communicate_streams(stdin=None, input_data=None, read_streams=None, + timeout=None, cmd_for_timeout=None, + stdout_stream=None, stderr_stream=None): + """ + Multiplex I/O: write input_data to stdin, read from read_streams. + + All streams must be file objects (not raw file descriptors). + All I/O is done in binary mode; caller handles text encoding. + + Args: + stdin: Writable binary file object for input, or None + input_data: Bytes to write to stdin, or None + read_streams: List of readable binary file objects to read from + timeout: Timeout in seconds, or None for no timeout + cmd_for_timeout: Value to use for TimeoutExpired.cmd + stdout_stream: File object in read_streams that holds stdout data, + or None. Used only to populate TimeoutExpired.output on a + partial timeout. + stderr_stream: File object in read_streams that holds stderr data, + or None. Used only to populate TimeoutExpired.stderr on a + partial timeout. + + Returns: + Dict mapping each file object in read_streams to its bytes data. + All file objects in read_streams will be closed. 
+ + Raises: + TimeoutExpired: If timeout expires (with partial data) + """ + if timeout is not None: + endtime = _time() + timeout + else: + endtime = None + + read_streams = read_streams or [] + + if _mswindows: + return _communicate_streams_windows( + stdin, input_data, read_streams, endtime, timeout, cmd_for_timeout, + stdout_stream, stderr_stream) + else: + return _communicate_streams_posix( + stdin, input_data, read_streams, endtime, timeout, cmd_for_timeout, + stdout_stream, stderr_stream) + + +if _mswindows: + def _reader_thread_func(fh, buffer): + """Thread function to read from a file handle into a buffer list.""" + try: + buffer.append(fh.read()) + except OSError: + buffer.append(b'') + + def _writer_thread_func(fh, data, result): + """Thread function to write data to a file handle and close it.""" + try: + if data: + fh.write(data) + except BrokenPipeError: + pass + except OSError as exc: + if exc.errno != errno.EINVAL: + result.append(exc) + try: + fh.close() + except BrokenPipeError: + pass + except OSError as exc: + if exc.errno != errno.EINVAL and not result: + result.append(exc) + + def _communicate_streams_windows(stdin, input_data, read_streams, + endtime, orig_timeout, cmd_for_timeout, + stdout_stream=None, stderr_stream=None): + """Windows implementation using threads.""" + threads = [] + buffers = {} + writer_thread = None + writer_result = [] + + if stdin and input_data: + writer_thread = threading.Thread( + target=_writer_thread_func, + args=(stdin, input_data, writer_result)) + writer_thread.daemon = True + writer_thread.start() + elif stdin: + try: + stdin.close() + except BrokenPipeError: + pass + except OSError as exc: + if exc.errno != errno.EINVAL: + raise + + for stream in read_streams: + buf = [] + buffers[stream] = buf + t = threading.Thread(target=_reader_thread_func, args=(stream, buf)) + t.daemon = True + t.start() + threads.append((stream, t)) + + def _raise_timeout(): + results = {s: (b[0] if b else b'') for s, b in 
buffers.items()} + raise TimeoutExpired( + cmd_for_timeout, orig_timeout, + output=results.get(stdout_stream), + stderr=results.get(stderr_stream)) + + # Drain the writer before any reader so a stalled write surfaces as + # the timeout source, not a partial read. + if writer_thread is not None: + remaining = _deadline_remaining(endtime) + if remaining is not None and remaining < 0: + remaining = 0 + writer_thread.join(remaining) + if writer_thread.is_alive(): + _raise_timeout() + if writer_result: + raise writer_result[0] + + for stream, t in threads: + remaining = _deadline_remaining(endtime) + if remaining is not None and remaining < 0: + remaining = 0 + t.join(remaining) + if t.is_alive(): + _raise_timeout() + + results = {stream: (buf[0] if buf else b'') + for stream, buf in buffers.items()} + for stream in read_streams: + try: + stream.close() + except OSError: + pass + return results + +else: + def _communicate_streams_posix(stdin, input_data, read_streams, + endtime, orig_timeout, cmd_for_timeout, + stdout_stream=None, stderr_stream=None): + """POSIX implementation using selectors.""" + output_buffers = {stream: [] for stream in read_streams} + + if stdin: + _flush_stdin(stdin) + if not input_data: + try: + stdin.close() + except BrokenPipeError: + pass + stdin = None # don't register with selector + + input_view = _make_input_view(input_data) + + with _PopenSelector() as selector: + if stdin and input_data: + selector.register(stdin, selectors.EVENT_WRITE) + for stream in read_streams: + selector.register(stream, selectors.EVENT_READ) + + _, completed = _communicate_io_posix( + selector, stdin, input_view, 0, output_buffers, endtime) + + if not completed: + results = {stream: b''.join(chunks) + for stream, chunks in output_buffers.items()} + raise TimeoutExpired( + cmd_for_timeout, orig_timeout, + output=results.get(stdout_stream), + stderr=results.get(stderr_stream)) + + results = {} + for stream, chunks in output_buffers.items(): + results[stream] = 
b''.join(chunks) + try: + stream.close() + except OSError: + pass + + return results + + # XXX This function is only used by multiprocessing and the test suite, # but it's here so that it can be imported when Python is compiled without # threads. @@ -624,6 +862,103 @@ def check_returncode(self): self.stderr) +class PipelineCommand: + """One command in a run_pipeline() pipeline. + + run_pipeline() accepts each command either as a bare argv sequence or + as a PipelineCommand; bare sequences are wrapped on entry, so + CompletedPipeline.commands and PipelineError.commands always hold + PipelineCommand instances. + + Construct one explicitly when a single command needs different stderr + handling, a shell, or its own env/cwd. Any override left at its + default of None (or False for shell) means the corresponding + run_pipeline() keyword applies to this command as it would to a bare + argv sequence. + """ + + __slots__ = ("args", "stderr", "env", "cwd", "shell") + + def __init__(self, args, /, *, stderr=None, env=None, cwd=None, + shell=False): + if stderr not in (None, STDOUT, DEVNULL): + raise ValueError( + "PipelineCommand stderr must be None, STDOUT, or DEVNULL") + if shell: + if not isinstance(args, str): + raise TypeError( + "PipelineCommand with shell=True requires a str command") + elif isinstance(args, str): + raise TypeError( + "PipelineCommand args must be a sequence of program " + "arguments, not a str (use shell=True for a shell command)") + self.args = args + self.stderr = stderr + self.env = env + self.cwd = cwd + self.shell = shell + + def _has_overrides(self): + """True if any keyword override differs from its default.""" + return (self.stderr is not None or self.env is not None + or self.cwd is not None or self.shell) + + def __repr__(self): + parts = [f"{self.args!r}"] + if self.stderr is STDOUT: + parts.append("stderr=STDOUT") + elif self.stderr is DEVNULL: + parts.append("stderr=DEVNULL") + if self.env is not None: + # env is commonly large and may 
contain credentials; don't + # dump its contents into tracebacks via PipelineError.__str__. + try: + n = len(self.env) + except TypeError: + n = "?" + parts.append(f"env=<{n} entries>") + if self.cwd is not None: + parts.append(f"cwd={self.cwd!r}") + if self.shell: + parts.append("shell=True") + return f"{type(self).__name__}({', '.join(parts)})" + + +class CompletedPipeline: + """A pipeline of processes that have finished running. + + This is returned by run_pipeline(). + + Attributes: + commands: Tuple of PipelineCommand instances for each command. + returncodes: Tuple of return codes for each command in the pipeline. + stdout: The standard output of the final command (None if not captured). + stderr: The standard error output (None if not captured). + """ + def __init__(self, commands, returncodes, stdout=None, stderr=None): + self.commands = tuple(commands) + self.returncodes = tuple(returncodes) + self.stdout = stdout + self.stderr = stderr + + def __repr__(self): + args = [f"commands={self.commands!r}", + f"returncodes={self.returncodes!r}"] + if self.stdout is not None: + args.append(f"stdout={self.stdout!r}") + if self.stderr is not None: + args.append(f"stderr={self.stderr!r}") + return f"{type(self).__name__}({', '.join(args)})" + + __class_getitem__ = classmethod(types.GenericAlias) + + def check_returncodes(self): + """Raise PipelineError if any command's exit code is non-zero.""" + if any(rc != 0 for rc in self.returncodes): + raise PipelineError(self.commands, self.returncodes, + self.stdout, self.stderr) + + def run(*popenargs, input=None, capture_output=False, timeout=None, check=False, **kwargs): """Run command with arguments and return a CompletedProcess instance. @@ -694,6 +1029,296 @@ def run(*popenargs, return CompletedProcess(process.args, retcode, stdout, stderr) +def run_pipeline(*commands, input=None, capture_output=False, timeout=None, + check=False, **kwargs): + """Run a pipeline of commands connected via pipes. 
+ + Each positional argument should be a command: either a sequence of + program arguments, or a PipelineCommand wrapping one with per-command + overrides. Bare sequences are wrapped in a PipelineCommand on entry. + The stdout of each command is connected to the stdin of the next + command in the pipeline, similar to shell pipelines. + + Returns a CompletedPipeline instance with attributes commands, returncodes, + stdout, and stderr. By default, stdout and stderr are not captured, and + those attributes will be None. Pass capture_output=True to capture both + the final command's stdout and stderr from all commands. + + If check is True and any command's exit code is non-zero, it raises a + PipelineError. This is similar to shell "pipefail" behavior. + + If timeout (seconds) is given and the pipeline takes too long, a + TimeoutExpired exception will be raised and all processes will be killed. + + The optional "input" argument allows passing bytes or a string to the + first command's stdin. If you use this argument, you may not also specify + stdin in kwargs. + + By default, all communication is in bytes. Use text=True, encoding, or + errors to enable text mode, which affects the input argument and stdout/ + stderr outputs. + + .. note:: + When using text=True with capture_output=True or stderr=PIPE, be aware + that stderr output from multiple processes may be interleaved in ways + that produce invalid character sequences when decoded. For reliable + text decoding, avoid text=True when capturing stderr from pipelines, + or handle decoding errors appropriately. + + Other keyword arguments are passed to each Popen call, except for stdin, + stdout, and stderr (when stderr=PIPE or capture_output=True), which are + managed by the pipeline. 
+ + Example: + # Equivalent to: cat file.txt | grep pattern | wc -l + result = run_pipeline( + ["cat", "file.txt"], + ["grep", "pattern"], + ["wc", "-l"], + capture_output=True, text=True + ) + print(result.stdout) # "42\\n" + print(result.returncodes) # (0, 0, 0) + """ + if len(commands) < 2: + raise ValueError("run_pipeline requires at least 2 commands") + + if input is not None and kwargs.get("stdin") is not None: + raise ValueError("stdin and input arguments may not both be used.") + if kwargs.get("stdin") is PIPE: + raise ValueError("stdin=PIPE is not supported by run_pipeline; " + "pass input= instead, or provide a file/fd") + + if capture_output: + if kwargs.get("stdout") is not None or kwargs.get("stderr") is not None: + raise ValueError("stdout and stderr arguments may not be used " + "with capture_output.") + + if not kwargs.get("close_fds", True): + raise ValueError( + "close_fds=False is not supported by run_pipeline; " + "inherited pipe ends would prevent EOF signaling between commands") + + if kwargs.get("shell"): + raise ValueError( + "shell=True is not supported by run_pipeline; the pipeline itself " + "replaces the shell. Use PipelineCommand(cmd, shell=True) for a " + "single command that needs shell interpretation.") + if kwargs.get("executable") is not None: + raise ValueError( + "executable= is not supported by run_pipeline") + + if kwargs.get("stderr") is STDOUT: + raise ValueError( + "stderr=STDOUT at the run_pipeline level would merge each " + "non-final command's stderr into the next command's stdin. " + "Use PipelineCommand(cmd, stderr=STDOUT) for a single command, " + "or capture_output=True to capture stderr from every command.") + + if kwargs.get("start_new_session") or kwargs.get("process_group") is not None: + # run_pipeline spawns each command as a sibling child of this + # process, so a per-command session/group does not give the shell + # "one process group per pipeline" semantic that callers passing + # these almost certainly want. 
Reject for now; a feature that + # places every command in a single new group is a possible + # follow-on. + raise ValueError( + "start_new_session and process_group are not supported by " + "run_pipeline; each command is spawned as a sibling child, " + "so a per-command session or group does not yield a single " + "process group for the pipeline") + + commands = tuple(c if isinstance(c, PipelineCommand) else PipelineCommand(c) + for c in commands) + + stderr_arg = kwargs.pop("stderr", None) + capture_stderr = capture_output or (stderr_arg is PIPE) + + stdin_arg = kwargs.pop("stdin", None) + stdout_arg = kwargs.pop("stdout", None) + + # Load-bearing: pop text=/universal_newlines=/encoding=/errors= so each + # Popen keeps its parent-side pipes binary. _communicate_streams_* relies + # on a bytes-in/bytes-out contract; leaving these in kwargs would wrap the + # pipes in TextIOWrapper and break the threaded Windows backend. + text = kwargs.pop("text", None) + universal_newlines = kwargs.pop("universal_newlines", None) + encoding = kwargs.pop("encoding", None) + errors_param = kwargs.pop("errors", None) + text_mode = bool(text or universal_newlines or encoding or errors_param) + if text_mode and encoding is None: + encoding = locale.getencoding() + + processes = [] + stderr_reader = None # File object for reading shared stderr (for parent) + stderr_write_fd = None # Write end of shared stderr pipe (for children) + + try: + # One shared stderr pipe across all children: lets stderr from any + # command reach the parent through a single read end, which the + # I/O loop multiplexes alongside stdout. 
+        if capture_stderr:
+            stderr_read_fd, stderr_write_fd = os.pipe()
+            stderr_reader = os.fdopen(stderr_read_fd, 'rb')
+
+        for i, cmd in enumerate(commands):
+            is_first = (i == 0)
+            is_last = (i == len(commands) - 1)
+
+            if is_first:
+                if input is not None:
+                    proc_stdin = PIPE
+                else:
+                    # may be None, DEVNULL, fd, or file (PIPE was
+                    # rejected above)
+                    proc_stdin = stdin_arg
+            else:
+                proc_stdin = processes[-1].stdout
+
+            if is_last:
+                if capture_output:
+                    proc_stdout = PIPE
+                else:
+                    proc_stdout = stdout_arg  # may be None, PIPE, fd, or file
+            else:
+                proc_stdout = PIPE
+
+            if cmd.stderr is not None:
+                assert cmd.stderr in (STDOUT, DEVNULL), cmd.stderr
+                proc_stderr = cmd.stderr
+            elif capture_stderr:
+                proc_stderr = stderr_write_fd
+            else:
+                proc_stderr = stderr_arg
+
+            cmd_kwargs = kwargs
+            if cmd.env is not None or cmd.cwd is not None or cmd.shell:
+                cmd_kwargs = dict(kwargs)
+                if cmd.env is not None:
+                    cmd_kwargs["env"] = cmd.env
+                if cmd.cwd is not None:
+                    cmd_kwargs["cwd"] = cmd.cwd
+                if cmd.shell:
+                    cmd_kwargs["shell"] = True
+
+            try:
+                proc = Popen(cmd.args, stdin=proc_stdin, stdout=proc_stdout,
+                             stderr=proc_stderr, **cmd_kwargs)
+            except OSError as e:
+                e.add_note(
+                    f"raised while starting {cmd!r} "
+                    f"(run_pipeline commands[{i}])")
+                raise
+            processes.append(proc)
+
+            # Close the parent's copy of the previous process's stdout
+            # to allow the pipe to signal EOF when the previous process exits.
+            if not is_first and processes[-2].stdout is not None:
+                processes[-2].stdout.close()
+
+        # The parent must drop its write end so children's writes are the
+        # only ones keeping the pipe open; otherwise the reader never
+        # sees EOF after all children exit.
+ if stderr_write_fd is not None: + os.close(stderr_write_fd) + stderr_write_fd = None + + first_proc = processes[0] + last_proc = processes[-1] + + if timeout is not None: + endtime = _time() + timeout + else: + endtime = None + + input_data = input + if input_data is not None and text_mode: + input_data = input_data.encode(encoding, errors_param or "strict") + + read_streams = [] + if last_proc.stdout is not None: + read_streams.append(last_proc.stdout) + if stderr_reader is not None: + read_streams.append(stderr_reader) + + # Drive stdin, stdout, and stderr concurrently: any one of them + # filling its kernel pipe buffer would otherwise block a child + # whose progress depends on another stream draining. + stdin_stream = first_proc.stdin if input is not None else None + + try: + results = _communicate_streams( + stdin=stdin_stream, + input_data=input_data, + read_streams=read_streams, + timeout=_deadline_remaining(endtime), + cmd_for_timeout=commands, + stdout_stream=last_proc.stdout, + stderr_stream=stderr_reader, + ) + except TimeoutExpired: + for p in processes: + if p.poll() is None: + p.kill() + for p in processes: + p.wait() + raise + + stdout = results.get(last_proc.stdout) + stderr = results.get(stderr_reader) + + decode_errors = errors_param or "strict" + if text_mode and stdout is not None: + stdout = _translate_newlines(stdout, encoding, decode_errors) + if text_mode and stderr is not None: + stderr = _translate_newlines(stderr, encoding, decode_errors) + + returncodes = [] + for proc in processes: + try: + remaining = _deadline_remaining(endtime) + proc.wait(timeout=remaining) + except TimeoutExpired: + for p in processes: + if p.poll() is None: + p.kill() + for p in processes: + p.wait() + raise TimeoutExpired(commands, timeout, stdout, stderr) + returncodes.append(proc.returncode) + + result = CompletedPipeline(commands, returncodes, stdout, stderr) + + if check and any(rc != 0 for rc in returncodes): + raise PipelineError(commands, returncodes, 
stdout, stderr) + + return result + + finally: + # Ensure all processes are cleaned up: kill all surviving children + # before waiting on any, so a hung wait() can't leave later + # children un-killed. + for proc in processes: + if proc.poll() is None: + proc.kill() + for proc in processes: + proc.wait() + for proc in processes: + if proc.stdin and not proc.stdin.closed: + proc.stdin.close() + if proc.stdout and not proc.stdout.closed: + proc.stdout.close() + # Close stderr pipe (reader is a file object, writer is a raw fd) + if stderr_reader is not None and not stderr_reader.closed: + try: + stderr_reader.close() + except OSError: + pass + if stderr_write_fd is not None: + try: + os.close(stderr_write_fd) + except OSError: + pass + + def list2cmdline(seq): """ Translate a sequence of arguments into a command line @@ -982,6 +1607,9 @@ class Popen: """ _child_created = False # Set here since __del__ checks it + # When adding a new keyword here, consider whether forwarding it to + # every command in run_pipeline() makes sense; if not, reject or + # special-case it there. 
def __init__(self, args, bufsize=-1, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=True, diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py index 1a3db527d3d5b8..6fa07973dfd2e3 100644 --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -27,6 +27,8 @@ import gc import textwrap import json +import array +import pickle from test.support.os_helper import FakePath try: @@ -2059,6 +2061,1018 @@ def test_encoding_warning(self): self.assertStartsWith(lines[1], b":3: EncodingWarning: ") +class PipelineTestCase(BaseTestCase): + """Tests for subprocess.run_pipeline()""" + + def test_pipeline_basic(self): + """Test basic two-command pipeline""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello world")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"], + capture_output=True, text=True + ) + self.assertEqual(result.stdout.strip(), "HELLO WORLD") + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_three_commands(self): + """Test pipeline with three commands""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("one\\ntwo\\nthree")'], + [sys.executable, "-c", 'import sys; print("".join(sorted(sys.stdin.readlines())))'], + [sys.executable, "-c", "import sys; print(sys.stdin.read().strip().upper())"], + capture_output=True, text=True + ) + self.assertEqual(result.stdout.strip(), "ONE\nTHREE\nTWO") + self.assertEqual(result.returncodes, (0, 0, 0)) + + def test_pipeline_with_input(self): + """Test pipeline with input data""" + result = subprocess.run_pipeline( + [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"], + [sys.executable, "-c", "import sys; print(len(sys.stdin.read().strip()))"], + input="hello", capture_output=True, text=True + ) + self.assertEqual(result.stdout.strip(), "5") + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_memoryview_input(self): + """Test pipeline with 
memoryview input (byte elements)""" + test_data = b"Hello, memoryview pipeline!" + mv = memoryview(test_data) + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"], + [sys.executable, "-c", + "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"], + input=mv, capture_output=True + ) + self.assertEqual(result.stdout, test_data.upper()) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_memoryview_input_nonbyte(self): + """Test pipeline with non-byte memoryview input (e.g., int32). + + This tests the fix for gh-134453 where non-byte memoryviews + had incorrect length tracking on POSIX, causing data truncation. + """ + # Create an array of 32-bit integers large enough to trigger + # chunked writing behavior (> PIPE_BUF) + pipe_buf = getattr(select, "PIPE_BUF", 512) + # Each 'i' element is 4 bytes, need more than pipe_buf bytes total + num_elements = (pipe_buf // 4) + 100 + test_array = array.array("i", [0x41424344 for _ in range(num_elements)]) + expected_bytes = test_array.tobytes() + mv = memoryview(test_array) + + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"], + [sys.executable, "-c", + "import sys; data = sys.stdin.buffer.read(); " + "sys.stdout.buffer.write(data)"], + input=mv, capture_output=True + ) + self.assertEqual(result.stdout, expected_bytes, + msg=f"{len(result.stdout)=} != {len(expected_bytes)=}") + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_bytes_mode(self): + """Test pipeline in binary mode""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'import sys; sys.stdout.buffer.write(b"hello")'], + [sys.executable, "-c", "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"], + capture_output=True + ) + self.assertEqual(result.stdout, b"HELLO") + self.assertEqual(result.returncodes, (0, 0)) + + def 
test_pipeline_error_check(self): + """Test that check=True raises PipelineError on failure""" + with self.assertRaises(subprocess.PipelineError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "import sys; sys.exit(1)"], + capture_output=True, check=True + ) + exc = cm.exception + self.assertEqual(len(exc.failed), 1) + self.assertEqual(exc.failed[0][0], 1) # Second command failed + self.assertEqual(exc.returncodes, (0, 1)) + + def test_pipeline_first_command_fails(self): + """Test pipeline where first command fails""" + result = subprocess.run_pipeline( + [sys.executable, "-c", "import sys; sys.exit(42)"], + [sys.executable, "-c", "import sys; print(sys.stdin.read())"], + capture_output=True + ) + self.assertEqual(result.returncodes[0], 42) + + def test_pipeline_requires_two_commands(self): + """Test that pipeline requires at least 2 commands""" + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello")'], + capture_output=True + ) + self.assertIn("at least 2 commands", str(cm.exception)) + + def test_pipeline_stdin_and_input_conflict(self): + """Test that stdin and input cannot both be specified""" + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + input="data", stdin=subprocess.PIPE + ) + self.assertIn("stdin", str(cm.exception)) + self.assertIn("input", str(cm.exception)) + + def test_pipeline_stdin_pipe_rejected(self): + """Test that stdin=PIPE is rejected (would hang)""" + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + stdin=subprocess.PIPE + ) + self.assertIn("stdin=PIPE", str(cm.exception)) + + def test_pipeline_capture_output_conflict(self): + """Test that capture_output conflicts with stdout/stderr""" + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + 
[sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + capture_output=True, stdout=subprocess.PIPE + ) + self.assertIn("capture_output", str(cm.exception)) + + def test_pipeline_session_group_rejected(self): + """start_new_session= and process_group= are rejected. + + Each command is spawned as a sibling child of this process, so + per-command sessions/groups would not yield a single process + group spanning the pipeline. + """ + for kw in ({"start_new_session": True}, {"process_group": 0}): + with self.subTest(kw=kw): + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + **kw, + ) + self.assertIn("process group", str(cm.exception)) + + def test_pipeline_close_fds_false_rejected(self): + """Any falsy close_fds is rejected (would deadlock).""" + for value in (False, 0, None): + with self.subTest(close_fds=value): + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + close_fds=value, + ) + self.assertIn("close_fds", str(cm.exception)) + + def test_pipeline_universal_newlines(self): + """Test that universal_newlines=True works like text=True""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"], + capture_output=True, universal_newlines=True + ) + self.assertIsInstance(result.stdout, str) + self.assertIn("HELLO", result.stdout) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_completed_repr(self): + """Test CompletedPipeline string representation""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("test")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read())"], + capture_output=True, text=True + ) + repr_str = repr(result) + self.assertIn("CompletedPipeline", repr_str) + self.assertIn("commands=", repr_str) + 
self.assertIn("returncodes=", repr_str) + + def test_pipeline_check_returncodes_method(self): + """Test CompletedPipeline.check_returncodes() method""" + result = subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "import sys; sys.exit(5)"], + capture_output=True + ) + with self.assertRaises(subprocess.PipelineError) as cm: + result.check_returncodes() + self.assertEqual(cm.exception.returncodes[1], 5) + + def test_pipeline_no_capture(self): + """Test pipeline without capturing output""" + result = subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + ) + self.assertEqual(result.stdout, None) + self.assertEqual(result.stderr, None) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_stderr_capture(self): + """Test that stderr is captured from all processes""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'import sys; print("err1", file=sys.stderr); print("out1")'], + [sys.executable, "-c", 'import sys; print("err2", file=sys.stderr); print(sys.stdin.read())'], + capture_output=True, text=True + ) + self.assertIn("err1", result.stderr) + self.assertIn("err2", result.stderr) + + def test_pipeline_timeout(self): + """Pipeline timeout raises TimeoutExpired with bytes-or-None + partial output and stderr (regardless of backend). + """ + try: + subprocess.run_pipeline( + [sys.executable, "-c", + 'import time; time.sleep(10); print("done")'], + [sys.executable, "-c", + "import sys; print(sys.stdin.read())"], + capture_output=True, timeout=0.1, + ) + except subprocess.TimeoutExpired as e: + self.assertTrue(e.output is None or isinstance(e.output, bytes)) + self.assertTrue(e.stderr is None or isinstance(e.stderr, bytes)) + else: + self.fail("TimeoutExpired not raised") + + @unittest.skipIf(mswindows, "POSIX specific test") + def test_pipeline_timeout_stdout_devnull_stderr_pipe(self): + """Timeout when stdout=DEVNULL but stderr=PIPE keeps streams distinct. 
+ + Regression: TimeoutExpired.output used to be populated with stderr + bytes whenever stdout was not captured. + """ + try: + subprocess.run_pipeline( + [sys.executable, "-c", "import time; time.sleep(10)"], + [sys.executable, "-c", "import sys; sys.stdin.read()"], + stdout=subprocess.DEVNULL, + stderr=subprocess.PIPE, + timeout=0.1, + ) + except subprocess.TimeoutExpired as e: + self.assertIsNone(e.output) + self.assertIsInstance(e.stderr, bytes) + else: + self.fail("TimeoutExpired not raised") + + def test_pipeline_error_str(self): + """Test PipelineError string representation""" + try: + subprocess.run_pipeline( + [sys.executable, "-c", "import sys; sys.exit(1)"], + [sys.executable, "-c", "import sys; sys.exit(2)"], + capture_output=True, check=True + ) + except subprocess.PipelineError as e: + error_str = str(e) + self.assertIn("Pipeline failed", error_str) + + @unittest.skipIf(mswindows, "negative returncodes are POSIX signal-deaths") + def test_pipeline_error_str_signal(self): + """PipelineError renders negative returncodes as signal deaths, + matching CalledProcessError.""" + err = subprocess.PipelineError( + [["a"], ["b"]], [0, -signal.SIGTERM]) + msg = str(err) + self.assertIn("died with", msg) + self.assertIn("SIGTERM", msg) + self.assertNotIn("-15", msg) + + def test_pipeline_spawn_failure_cleans_up(self): + """Popen failing mid-pipeline propagates and reaps earlier commands. + + Command 0 starts and would sleep 60s; command 1's executable does + not exist so Popen raises before command 1 ever runs. The finally + block must kill and wait on command 0 so this call returns + promptly rather than hanging until command 0's sleep finishes. 
+ """ + start = time.monotonic() + with self.assertRaises(NONEXISTING_ERRORS) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "import time; time.sleep(60)"], + NONEXISTING_CMD, + capture_output=True, + ) + elapsed = time.monotonic() - start + self.assertLess(elapsed, 30, + "run_pipeline did not promptly clean up the running first " + "command after the second command failed to spawn") + notes = getattr(cm.exception, "__notes__", []) + self.assertTrue( + any("commands[1]" in n for n in notes), + f'expected a which-command note on the OSError; got {notes!r}') + + def test_pipeline_explicit_stdout_pipe(self): + """Test pipeline with explicit stdout=PIPE""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"], + stdout=subprocess.PIPE + ) + self.assertEqual(result.stdout.strip(), b"HELLO") + self.assertIsNone(result.stderr) + + def test_pipeline_stdin_from_file(self): + """Test pipeline with stdin from file""" + with tempfile.NamedTemporaryFile(mode="w", delete=False) as f: + f.write("file content\n") + f.flush() + fname = f.name + try: + with open(fname, "r") as f: + result = subprocess.run_pipeline( + [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"], + [sys.executable, "-c", "import sys; print(len(sys.stdin.read().strip()))"], + stdin=f, capture_output=True, text=True + ) + self.assertEqual(result.stdout.strip(), "12") # "FILE CONTENT" + finally: + os.unlink(fname) + + def test_pipeline_stdout_to_devnull(self): + """Test pipeline with stdout to DEVNULL""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read())"], + stdout=subprocess.DEVNULL + ) + self.assertIsNone(result.stdout) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_large_data_no_deadlock(self): + """Test that large data doesn't cause pipe buffer deadlock. 
+ + This test verifies that the multiplexed I/O implementation properly + handles cases where pipe buffers would fill up. Without proper + multiplexing, this would deadlock because: + 1. First process outputs large data filling stdout pipe buffer + 2. Middle process reads some, processes, writes to its stdout + 3. If stdout pipe buffer fills, middle process blocks on write + 4. But first process is blocked waiting for middle to read more + 5. Classic deadlock + + The test uses data larger than typical pipe buffer size (64KB on Linux) + to ensure the multiplexed I/O is working correctly. + """ + # Generate data larger than typical pipe buffer (64KB) + # Use 256KB to ensure we exceed buffer on most systems + large_data = "x" * (256 * 1024) + + # Pipeline: input -> double the data -> count chars + # The middle process outputs twice as much, increasing buffer pressure + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys; data = sys.stdin.read(); print(data + data)"], + [sys.executable, "-c", + "import sys; print(len(sys.stdin.read().strip()))"], + input=large_data, capture_output=True, text=True, timeout=30 + ) + + # Original data doubled = 512KB = 524288 chars + # Second process strips whitespace (removes trailing newline) then counts + expected_len = 256 * 1024 * 2 # doubled data, newline stripped + self.assertEqual(result.stdout.strip(), str(expected_len)) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_large_data_three_stages(self): + """Test large data through a three-stage pipeline. + + This is a more complex deadlock scenario with three processes, + where buffer pressure can occur at multiple points. 
+ """ + # Use 128KB of data + large_data = "y" * (128 * 1024) + + # Pipeline: input -> uppercase -> add prefix to each line -> count + # We use line-based processing to create more buffer churn + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys; print(sys.stdin.read().upper())"], + [sys.executable, "-c", + 'import sys; print("".join("PREFIX:" + line for line in sys.stdin))'], + [sys.executable, "-c", + "import sys; print(len(sys.stdin.read()))"], + input=large_data, capture_output=True, text=True, timeout=30 + ) + + self.assertEqual(result.returncodes, (0, 0, 0)) + # Just verify we got a reasonable numeric output without deadlock + output_len = int(result.stdout.strip()) + self.assertGreater(output_len, len(large_data)) + + def test_pipeline_large_data_with_stderr(self): + """Test large data with large stderr output from multiple processes. + + Ensures stderr collection doesn't interfere with the main data flow + and doesn't cause deadlocks when multiple processes write large + amounts to stderr concurrently with stdin/stdout data flow. 
+ """ + # 64KB of data through the pipeline + data_size = 64 * 1024 + large_data = "z" * data_size + # Each process writes 64KB to stderr as well + stderr_size = 64 * 1024 + + result = subprocess.run_pipeline( + [sys.executable, "-c", f''' +import sys +# Write large stderr output +sys.stderr.write("E" * {stderr_size}) +sys.stderr.write("\\nstage1 done\\n") +# Pass through stdin to stdout +data = sys.stdin.read() +print(data) +'''], + [sys.executable, "-c", f''' +import sys +# Write large stderr output +sys.stderr.write("F" * {stderr_size}) +sys.stderr.write("\\nstage2 done\\n") +# Count input size +data = sys.stdin.read() +print(len(data.strip())) +'''], + input=large_data, capture_output=True, text=True, timeout=30 + ) + + self.assertEqual(result.stdout.strip(), str(data_size)) + self.assertIn("stage1 done", result.stderr) + self.assertIn("stage2 done", result.stderr) + # > stderr_size (one stage's worth) confirms both stages' bytes + # survived multiplexing through the shared stderr pipe. + self.assertGreater(len(result.stderr), stderr_size) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_timeout_large_input(self): + """Test that timeout is enforced with large input to a slow pipeline. + + This verifies that run_pipeline() doesn't block indefinitely when + writing large input to a pipeline where the first process is slow + to consume stdin. The timeout should be enforced promptly. + + This is particularly important on Windows where stdin writing could + block without proper threading. 
+ """ + # Input larger than typical pipe buffer (64KB) + input_data = "x" * (128 * 1024) + + start = time.monotonic() + with self.assertRaises(subprocess.TimeoutExpired): + subprocess.run_pipeline( + # First process sleeps before reading - simulates slow consumer + [sys.executable, "-c", + "import sys, time; time.sleep(30); print(sys.stdin.read())"], + [sys.executable, "-c", + "import sys; print(len(sys.stdin.read()))"], + input=input_data, capture_output=True, text=True, timeout=0.5 + ) + elapsed = time.monotonic() - start + + # Timeout should occur close to the specified timeout value, + # not after waiting for the subprocess to finish sleeping. + # Allow generous margin for slow CI, but must be well under + # the subprocess sleep time. + self.assertLess(elapsed, 5.0, + f"TimeoutExpired raised after {elapsed:.2f}s; expected ~0.5s. " + "Input writing may have blocked without checking timeout.") + + def test_pipeline_check_true_success(self): + """check=True with all-successful commands returns normally""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("ok")'], + [sys.executable, "-c", "import sys; print(sys.stdin.read().strip())"], + capture_output=True, text=True, check=True + ) + self.assertEqual(result.returncodes, (0, 0)) + self.assertEqual(result.stdout.strip(), "ok") + + def test_pipeline_stderr_to_stdout_rejected(self): + """stderr=STDOUT at the pipeline level is rejected. + + It would merge each non-final command's stderr into the next + command's stdin; PipelineCommand(stderr=STDOUT) is the + per-command spelling. 
+ """ + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + stdout=subprocess.PIPE, stderr=subprocess.STDOUT, + ) + self.assertIn("PipelineCommand", str(cm.exception)) + + def test_pipeline_intermediate_stdout_closed_in_parent(self): + """Parent closes intermediate stdout so an early-exiting consumer + does not leave the producer blocked on a full pipe.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", + 'import sys; sys.stdout.write("x"); sys.stdout.flush(); ' + 'sys.stdout.write("y" * 200000)'], + [sys.executable, "-c", "import sys; sys.stdin.read(1)"], + capture_output=True, timeout=10 + ) + self.assertEqual(result.returncodes[1], 0) + + def test_pipeline_error_pickle(self): + """PipelineError survives a pickle round-trip""" + err = subprocess.PipelineError( + [["echo", "hi"], ["false"]], [0, 1], + stdout=b"hi\n", stderr=b'') + restored = pickle.loads(pickle.dumps(err)) + self.assertEqual(restored.commands, err.commands) + self.assertEqual(restored.returncodes, err.returncodes) + self.assertEqual(restored.stdout, err.stdout) + self.assertEqual(restored.stderr, err.stderr) + self.assertEqual(restored.failed, err.failed) + self.assertEqual(str(restored), str(err)) + + def test_pipeline_error_repr(self): + """repr(PipelineError(...)) is meaningful via Exception.args""" + err = subprocess.PipelineError( + [["echo", "hi"], ["false"]], [0, 1]) + r = repr(err) + self.assertIn("PipelineError", r) + self.assertIn("echo", r) + self.assertIn("false", r) + + def test_pipeline_shell_rejected(self): + """shell=True is rejected; the pipeline replaces the shell.""" + with self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + "echo hello world", + "tr a-z A-Z", + shell=True, capture_output=True, + ) + self.assertIn("shell=True", str(cm.exception)) + + def test_pipeline_executable_rejected(self): + """executable= is rejected (only meaningful with shell=).""" + with 
self.assertRaises(ValueError) as cm: + subprocess.run_pipeline( + [sys.executable, "-c", "pass"], + [sys.executable, "-c", "pass"], + executable=sys.executable, + ) + self.assertIn("executable", str(cm.exception)) + + def test_pipeline_env(self): + """env= is propagated to every command in the pipeline.""" + env = os.environ.copy() + env["MY_TEST_VAR"] = "pipeline_value" + result = subprocess.run_pipeline( + [sys.executable, "-c", + 'import os, sys; assert os.environ["MY_TEST_VAR"] == "pipeline_value"; ' + 'sys.stdout.write("first\\n")'], + [sys.executable, "-c", + 'import os, sys; assert os.environ["MY_TEST_VAR"] == "pipeline_value"; ' + 'sys.stdout.write(sys.stdin.read() + "second\\n")'], + env=env, capture_output=True, text=True, + ) + self.assertEqual(result.returncodes, (0, 0)) + self.assertIn("first", result.stdout) + self.assertIn("second", result.stdout) + + def test_pipeline_cwd(self): + """cwd= is propagated to every command in the pipeline.""" + with tempfile.TemporaryDirectory() as tmpdir: + expected = os.path.realpath(tmpdir) + result = subprocess.run_pipeline( + [sys.executable, "-c", + 'import os, sys; sys.stdout.write(os.getcwd() + "\\n")'], + [sys.executable, "-c", + "import os, sys; sys.stdout.write(sys.stdin.read()); " + 'sys.stdout.write(os.getcwd() + "\\n")'], + cwd=tmpdir, capture_output=True, text=True, + ) + lines = result.stdout.strip().split("\n") + self.assertEqual(len(lines), 2) + for line in lines: + self.assertEqual(os.path.realpath(line), expected) + self.assertEqual(result.returncodes, (0, 0)) + + @unittest.skipIf(mswindows, "pass_fds POSIX-specific") + def test_pipeline_pass_fds(self): + """pass_fds= forwards an inheritable fd to every command.""" + with tempfile.NamedTemporaryFile(mode="wb", delete=False) as f: + f.write(b"shared-content") + fname = f.name + try: + rfd = os.open(fname, os.O_RDONLY) + try: + result = subprocess.run_pipeline( + [sys.executable, "-c", + f'import os, sys; ' + f'data = os.pread({rfd}, 32, 0); ' + 
f'sys.stdout.write(data.decode() + "|")'], + [sys.executable, "-c", + f'import os, sys; ' + f'data = os.pread({rfd}, 32, 0); ' + f'sys.stdout.write(sys.stdin.read() + data.decode())'], + pass_fds=(rfd,), capture_output=True, text=True, + ) + finally: + os.close(rfd) + finally: + os.unlink(fname) + self.assertEqual(result.returncodes, (0, 0)) + self.assertEqual(result.stdout, "shared-content|shared-content") + + def test_pipeline_stderr_pipe_normal_completion(self): + """stderr=PIPE captures stderr without capture_output= on the success path.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", + 'import sys; print("err1", file=sys.stderr); print("out1")'], + [sys.executable, "-c", + 'import sys; print("err2", file=sys.stderr); print(sys.stdin.read())'], + stderr=subprocess.PIPE, + ) + self.assertIsNone(result.stdout) + self.assertIsNotNone(result.stderr) + self.assertIn(b"err1", result.stderr) + self.assertIn(b"err2", result.stderr) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_errors_replace_multibyte_split(self): + """errors='replace' handles multi-byte stderr without raising.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", + r'import sys; sys.stderr.buffer.write("é first ".encode()); ' + r"sys.stderr.flush(); " + r'sys.stdout.write("data")'], + [sys.executable, "-c", + r'import sys; sys.stderr.buffer.write("中 second".encode()); ' + r"sys.stderr.flush(); " + r"sys.stdout.write(sys.stdin.read())"], + capture_output=True, text=True, errors="replace", + ) + self.assertEqual(result.returncodes, (0, 0)) + self.assertIsInstance(result.stderr, str) + self.assertIn("first", result.stderr) + self.assertIn("second", result.stderr) + + def test_pipeline_middle_command_exits_early(self): + """Pipeline completes when a middle command exits without reading all input.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys\n" + "try:\n" + " for i in range(100000):\n" + ' print(f"line{i}")\n' + "except 
BrokenPipeError:\n" + " pass\n"], + [sys.executable, "-c", + "import sys\n" + "print(sys.stdin.readline().strip())\n"], + [sys.executable, "-c", + "import sys\n" + "sys.stdout.write(sys.stdin.read())\n"], + capture_output=True, text=True, timeout=30, + ) + self.assertEqual(result.stdout.strip(), "line0") + self.assertEqual(result.returncodes[1], 0) + self.assertEqual(result.returncodes[2], 0) + + def test_pipeline_brokenpipe_mid_input_write(self): + """The first command exits while input is still being written. + + Exercises the BrokenPipeError handler in the I/O loop's stdin + write path: the input is larger than typical pipe buffers, the + first command reads a single byte then exits, and the + remaining writes must fail gracefully. + """ + big = b"x" * (4 * 1024 * 1024) + result = subprocess.run_pipeline( + [sys.executable, "-c", + "import sys; sys.stdin.buffer.read(1); sys.exit(0)"], + [sys.executable, "-c", "import sys; sys.stdin.read()"], + input=big, capture_output=True, timeout=60, + ) + self.assertEqual(result.returncodes, (0, 0)) + + def test_pipeline_timeout_after_io_completes(self): + """Timeout fires after I/O completes but a process is still running. + + The final command closes its stdout (so the I/O loop sees EOF + and finishes) and then sleeps, so the per-process wait() is + what times out rather than the I/O loop. 
+        """
+        with self.assertRaises(subprocess.TimeoutExpired) as cm:
+            subprocess.run_pipeline(
+                [sys.executable, "-c", "pass"],
+                [sys.executable, "-c",
+                 "import sys, time; "
+                 "sys.stdout.close(); time.sleep(60)"],
+                stdout=subprocess.PIPE, timeout=0.5,
+            )
+        self.assertIsNotNone(cm.exception.output)
+
+
+class PipelineCommandTestCase(BaseTestCase):
+    """Tests for subprocess.PipelineCommand and its run_pipeline integration."""
+
+    def test_command_repr(self):
+        """repr shows args and only the overrides that are set."""
+        s = subprocess.PipelineCommand(["ls"])
+        self.assertEqual(repr(s), "PipelineCommand(['ls'])")
+        s = subprocess.PipelineCommand(["ls"], stderr=subprocess.DEVNULL,
+                                       cwd="/tmp")
+        r = repr(s)
+        self.assertIn("stderr=DEVNULL", r)
+        self.assertIn("cwd='/tmp'", r)
+        self.assertNotIn("env=", r)
+        self.assertNotIn("shell=", r)
+        s = subprocess.PipelineCommand("echo hi", shell=True,
+                                       stderr=subprocess.STDOUT)
+        r = repr(s)
+        self.assertIn("stderr=STDOUT", r)
+        self.assertIn("shell=True", r)
+
+    def test_command_repr_env_not_dumped(self):
+        """repr summarizes env rather than dumping its contents.
+
+        env is commonly a copy of os.environ (large, may contain
+        credentials); a PipelineCommand that ends up in a PipelineError
+        traceback should not echo it verbatim.
+ """ + s = subprocess.PipelineCommand(["ls"], env={"A": "1", "SECRET": "x"}) + r = repr(s) + self.assertIn("env=<2 entries>", r) + self.assertNotIn("SECRET", r) + self.assertNotIn("'A'", r) + + def test_command_stderr_validation(self): + """stderr only accepts None, STDOUT, or DEVNULL.""" + for bad in (subprocess.PIPE, 2, io.BytesIO()): + with self.subTest(stderr=bad): + with self.assertRaises(ValueError): + subprocess.PipelineCommand(["x"], stderr=bad) + + def test_command_args_type_validation(self): + """args must be str iff shell=True.""" + with self.assertRaises(TypeError): + subprocess.PipelineCommand("echo hi") + with self.assertRaises(TypeError): + subprocess.PipelineCommand(["echo", "hi"], shell=True) + subprocess.PipelineCommand("echo hi", shell=True) + subprocess.PipelineCommand(["echo", "hi"]) + subprocess.PipelineCommand(("echo", "hi")) + + def test_command_unknown_kwarg_rejected(self): + """Only the documented overrides are accepted.""" + with self.assertRaises(TypeError): + subprocess.PipelineCommand(["x"], stdout=subprocess.DEVNULL) + with self.assertRaises(TypeError): + subprocess.PipelineCommand(["x"], pass_fds=(3,)) + + def test_command_args_positional_only(self): + """The args parameter is positional-only.""" + with self.assertRaises(TypeError): + subprocess.PipelineCommand(args=["x"]) + + def test_command_pickle(self): + """PipelineCommand survives a pickle round-trip.""" + s = subprocess.PipelineCommand(["ls", "-l"], + stderr=subprocess.DEVNULL, + env={"X": "1"}, cwd="/tmp") + s2 = pickle.loads(pickle.dumps(s)) + self.assertEqual(s2.args, s.args) + self.assertEqual(s2.stderr, s.stderr) + self.assertEqual(s2.env, s.env) + self.assertEqual(s2.cwd, s.cwd) + self.assertEqual(s2.shell, s.shell) + + def test_command_no_override_equals_bare_command(self): + """A PipelineCommand with no overrides behaves like a bare argv.""" + result = subprocess.run_pipeline( + subprocess.PipelineCommand( + [sys.executable, "-c", 'print("hello")']), + 
[sys.executable, "-c", + "import sys; print(sys.stdin.read().upper())"], + capture_output=True, text=True, + ) + self.assertEqual(result.stdout.strip(), "HELLO") + self.assertEqual(result.returncodes, (0, 0)) + + def test_command_stderr_overrides_pipeline_stderr(self): + """A command's stderr= takes precedence over the pipeline-level stderr. + + Command 0 with stderr=DEVNULL discards its stderr even though the + pipeline-level stderr= sends every other command's stderr to a + file. + """ + with tempfile.NamedTemporaryFile(mode="w", delete=False) as f: + fname = f.name + try: + with open(fname, "w") as logf: + subprocess.run_pipeline( + subprocess.PipelineCommand( + [sys.executable, "-c", + 'import sys; print("E0", file=sys.stderr); ' + 'print("out")'], + stderr=subprocess.DEVNULL), + [sys.executable, "-c", + 'import sys; print("E1", file=sys.stderr); ' + "print(sys.stdin.read())"], + stderr=logf, stdout=subprocess.PIPE, text=True, + ) + with open(fname) as logf: + log = logf.read() + finally: + os.unlink(fname) + self.assertNotIn("E0", log) + self.assertIn("E1", log) + + def test_command_stderr_devnull(self): + """stderr=DEVNULL on one command discards only that command's stderr.""" + result = subprocess.run_pipeline( + subprocess.PipelineCommand( + [sys.executable, "-c", + 'import sys; print("E0", file=sys.stderr); print("out")'], + stderr=subprocess.DEVNULL), + [sys.executable, "-c", + 'import sys; print("E1", file=sys.stderr); ' + "print(sys.stdin.read())"], + capture_output=True, text=True, + ) + self.assertNotIn("E0", result.stderr) + self.assertIn("E1", result.stderr) + self.assertIn("out", result.stdout) + + def test_command_stderr_stdout_middle(self): + """stderr=STDOUT on a non-final command merges into the next stdin. + + Both stdout and stderr of that command feed the next command, + and neither appears in the pipeline's captured stderr. 
+ """ + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("a")'], + subprocess.PipelineCommand( + [sys.executable, "-c", + 'import sys; sys.stderr.write("ERR\\n"); ' + "sys.stderr.flush(); " + "sys.stdout.write(sys.stdin.read())"], + stderr=subprocess.STDOUT), + [sys.executable, "-c", + "import sys; sys.stdout.write(sys.stdin.read())"], + capture_output=True, text=True, + ) + self.assertIn("ERR", result.stdout) + self.assertIn("a", result.stdout) + self.assertNotIn("ERR", result.stderr) + self.assertEqual(result.returncodes, (0, 0, 0)) + + def test_command_stderr_stdout_last(self): + """stderr=STDOUT on the final command merges into result.stdout.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("data")'], + subprocess.PipelineCommand( + [sys.executable, "-c", + "import sys; sys.stdout.write(sys.stdin.read()); " + 'sys.stderr.write("ERR\\n")'], + stderr=subprocess.STDOUT), + stdout=subprocess.PIPE, + ) + self.assertIn(b"data", result.stdout) + self.assertIn(b"ERR", result.stdout) + self.assertIsNone(result.stderr) + self.assertEqual(result.returncodes, (0, 0)) + + def test_command_env_override(self): + """A command's env= replaces the pipeline-level env for that command.""" + env_pipe = os.environ | {"MARK": "pipe"} + env_cmd = os.environ | {"MARK": "cmd"} + result = subprocess.run_pipeline( + [sys.executable, "-c", + 'import os; print(os.environ["MARK"])'], + subprocess.PipelineCommand( + [sys.executable, "-c", + "import os, sys; " + 'print(sys.stdin.read().strip(), os.environ["MARK"])'], + env=env_cmd), + env=env_pipe, capture_output=True, text=True, check=True, + ) + self.assertEqual(result.stdout.strip(), "pipe cmd") + + def test_command_cwd_override(self): + """A command's cwd= replaces the pipeline-level cwd for that command.""" + with tempfile.TemporaryDirectory() as d_pipe, \ + tempfile.TemporaryDirectory() as d_cmd: + d_pipe_r = os.path.realpath(d_pipe) + d_cmd_r = os.path.realpath(d_cmd) + result = 
subprocess.run_pipeline( + [sys.executable, "-c", + "import os; print(os.getcwd())"], + subprocess.PipelineCommand( + [sys.executable, "-c", + "import os, sys; " + "print(sys.stdin.read().strip()); print(os.getcwd())"], + cwd=d_cmd), + cwd=d_pipe, capture_output=True, text=True, check=True, + ) + lines = [os.path.realpath(p) for p in result.stdout.strip().split("\n")] + self.assertEqual(lines, [d_pipe_r, d_cmd_r]) + + @unittest.skipIf(mswindows, "POSIX shell-specific") + def test_command_shell_true(self): + """shell=True on a single command runs it through the shell.""" + result = subprocess.run_pipeline( + [sys.executable, "-c", 'print("hello world")'], + subprocess.PipelineCommand("tr a-z A-Z", shell=True), + capture_output=True, text=True, + ) + self.assertEqual(result.stdout.strip(), "HELLO WORLD") + self.assertEqual(result.returncodes, (0, 0)) + + def test_commands_are_normalized(self): + """CompletedPipeline.commands always holds PipelineCommand instances. + + Bare argv sequences are wrapped on entry; an explicit + PipelineCommand passes through by identity. + """ + explicit = subprocess.PipelineCommand( + [sys.executable, "-c", "import sys; print(sys.stdin.read())"]) + bare = [sys.executable, "-c", 'print("x")'] + result = subprocess.run_pipeline(bare, explicit, capture_output=True) + for c in result.commands: + self.assertIsInstance(c, subprocess.PipelineCommand) + self.assertIs(result.commands[1], explicit) + self.assertEqual(result.commands[0].args, bare) + self.assertIn("PipelineCommand", repr(result)) + + def test_bare_str_command_rejected(self): + """A bare str positional is rejected before any process spawns. + + Normalization runs each bare positional through + PipelineCommand(), whose strict args check refuses a str when + shell is not set. 
+ """ + with self.assertRaises(TypeError) as cm: + subprocess.run_pipeline("echo hi", "tr a-z A-Z") + self.assertIn("sequence of program arguments", str(cm.exception)) + + def test_command_in_pipeline_error(self): + """PipelineError.commands and .failed hold PipelineCommand instances.""" + explicit = subprocess.PipelineCommand( + [sys.executable, "-c", "import sys; sys.exit(7)"], + stderr=subprocess.DEVNULL) + try: + subprocess.run_pipeline( + # cmd0 must not write to stdout: cmd1 exits without + # reading, and a flush to a readerless pipe during + # interpreter shutdown can yield exit code 120. + [sys.executable, "-c", "pass"], + explicit, + capture_output=True, check=True, + ) + except subprocess.PipelineError as e: + self.assertEqual(e.returncodes, (0, 7)) + self.assertIsInstance(e.commands, tuple) + self.assertIsInstance(e.returncodes, tuple) + self.assertIsInstance(e.failed, tuple) + for c in e.commands: + self.assertIsInstance(c, subprocess.PipelineCommand) + self.assertEqual(len(e.failed), 1) + idx, cmd, rc = e.failed[0] + self.assertEqual(idx, 1) + self.assertIs(cmd, explicit) + self.assertEqual(rc, 7) + # An override-carrying command shows its full repr in str(e). + self.assertIn("PipelineCommand", str(e)) + self.assertIn("stderr=DEVNULL", str(e)) + else: + self.fail("PipelineError not raised") + + def test_pipeline_error_str_shows_args_when_no_overrides(self): + """str(PipelineError) shows bare args for commands without overrides. + + Avoids the "command 1 PipelineCommand(['false'])" stutter in + tracebacks for the common case where the caller passed plain + argv sequences. 
+ """ + err = subprocess.PipelineError( + [subprocess.PipelineCommand(["true"]), + subprocess.PipelineCommand(["false"])], + [0, 1]) + msg = str(err) + self.assertIn("['false']", msg) + self.assertNotIn("PipelineCommand", msg) + + def _get_test_grp_name(): for name_group in ('staff', 'nogroup', 'grp', 'nobody', 'nfsnobody'): if grp: diff --git a/Misc/NEWS.d/next/Library/2026-04-25-12-00-00.gh-issue-47798.pIpEln.rst b/Misc/NEWS.d/next/Library/2026-04-25-12-00-00.gh-issue-47798.pIpEln.rst new file mode 100644 index 00000000000000..13a3e068ee2e76 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2026-04-25-12-00-00.gh-issue-47798.pIpEln.rst @@ -0,0 +1,6 @@ +Added :func:`subprocess.run_pipeline` for running shell-style pipelines of +commands. The new :class:`subprocess.CompletedPipeline` and +:exc:`subprocess.PipelineError` types describe pipeline outcomes and check +failures, with ``check=True`` providing pipefail-style error handling. +Individual commands may be wrapped in a :class:`subprocess.PipelineCommand` +to override *stderr*, *env*, *cwd*, or *shell* for that command only.
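
For context on what `run_pipeline` automates, here is a sketch of the equivalent hand-wired pipeline using the existing, stable `Popen` API (the variable names `p1`/`p2` are illustrative, not part of this patch): the parent must create the connecting pipe, close its own copy of the pipe's read end, and collect each return code itself, which is exactly the bookkeeping the new function hides.

```python
import subprocess
import sys

# Stage 1 writes to a pipe created by Popen(stdout=PIPE).
p1 = subprocess.Popen(
    [sys.executable, "-c", 'print("hello")'],
    stdout=subprocess.PIPE)
# Stage 2 reads that pipe directly as its stdin.
p2 = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
# Close the parent's copy of the read end so p2 sees EOF when p1
# exits; keeping it open is the classic Popen-pipeline deadlock.
p1.stdout.close()
out, _ = p2.communicate()
# No pipefail here: each return code must be gathered by hand.
returncodes = (p1.wait(), p2.wait())
print(out.strip())  # HELLO
```

Note that per-command `stderr`/`env`/`cwd` overrides and pipefail-style `check=True` all fall out of this manual version only with further code, which is the motivation the tests above exercise.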