Sys, subprocess, and argparse


🔧 Python’s sys Module

The sys module lets you interact with the Python runtime environment (the interpreter, I/O streams, CLI args, exit codes, import paths, etc.).


Command-Line Arguments: sys.argv

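sys.argv is a list of strings holding the arguments passed to your script from the command line. sys.argv[0] is the script name, and every value arrives as a string, even if it looks like a number.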

Example script: script.py

import sys

print("Arguments:", sys.argv)

Run it:

python script.py hello 123 world

Output:

Arguments: ['script.py', 'hello', '123', 'world']

Use case: reading user input to the script

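import sys

# Require at least two arguments after the script name.
if len(sys.argv) < 3:
    # Usage messages go to stderr so they do not pollute piped output.
    print("Usage: python downloader.py <url> <output>", file=sys.stderr)
    sys.exit(1)

url = sys.argv[1]     # first argument
output = sys.argv[2]  # second argument

print(f"Downloading {url} to {output}")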

Exiting the Program: sys.exit()

You can exit with a success code (0) or an error code (non-zero).

Why use this?

  • End execution early

  • Communicate success/failure to the OS

  • CI/CD pipelines depend on exit codes
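The idea can be sketched as follows. The function name `main` and the choice of exit code 2 for usage errors are illustrative assumptions, not something prescribed above. `sys.exit()` works by raising `SystemExit`, which is why this sketch can catch it and print the code.

```python
import sys

def main(argv):
    """Return an exit code instead of calling sys.exit() deep inside the logic."""
    if len(argv) < 2:
        print("usage: tool.py <input>", file=sys.stderr)
        return 2  # non-zero means failure (2 is a common convention for usage errors)
    print(f"processing {argv[1]}")
    return 0  # zero means success

# sys.exit() raises SystemExit; catching it here only makes the code visible.
try:
    sys.exit(main(["tool.py"]))
except SystemExit as exc:
    print("exit code:", exc.code)
```

In a real script the last lines would usually be `if __name__ == "__main__": sys.exit(main(sys.argv))`, so the shell or CI system sees the returned code.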


Stdout / Stderr / Stdin

sys.stdout, sys.stderr, and sys.stdin are file objects for the process's three standard streams:

  • stdout → normal output

  • stderr → error output

  • stdin → input stream

Example: writing to stdout

Example: writing to stderr

This matters because many tools capture only stdout while showing stderr separately.
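A minimal sketch of writing to both streams. The `report` function and its message formats are hypothetical; the only real API used is `print(..., file=...)` with `sys.stdout` and `sys.stderr`.

```python
import sys

def report(row_count, out=None, err=None):
    """Write the result to stdout and diagnostics to stderr."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    print(f"rows={row_count}", file=out)      # normal output
    if row_count == 0:
        print("warning: no rows processed", file=err)  # error output

report(3)
report(0)
```

Running such a script as `python report.py > out.txt` (the file name is hypothetical) would send only the `rows=` lines to the file; the warning would still appear in the terminal.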

Example: reading from stdin
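One way this might look, assuming you want to process piped input line by line. The `shout` helper is hypothetical; `io.StringIO` stands in for `sys.stdin` so the sketch runs without a pipe.

```python
import io
import sys

def shout(stream, out=None):
    """Read lines from a text stream and echo them upper-cased."""
    out = sys.stdout if out is None else out
    count = 0
    for line in stream:  # sys.stdin can be iterated the same way
        out.write(line.rstrip("\n").upper() + "\n")
        count += 1
    return count

# In a script you would call shout(sys.stdin); StringIO simulates piped input here.
shout(io.StringIO("hello\nworld\n"))
```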

Run it:

Output:


System Information

sys.version, sys.platform, and others let you inspect the runtime environment.

Example: checking Python version
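A short sketch of a version check. `sys.version` is a human-readable string, while `sys.version_info` is a tuple that compares cleanly. The minimum of 3.8 is an arbitrary assumption for illustration.

```python
import sys

print(sys.version)       # full version string, meant for humans
print(sys.version_info)  # structured tuple, meant for comparisons

MIN_VERSION = (3, 8)  # hypothetical minimum for this sketch
supported = sys.version_info >= MIN_VERSION
if not supported:
    print("This script needs Python 3.8 or newer", file=sys.stderr)
```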

Example: get OS platform
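A minimal sketch that reads `sys.platform` and maps it to a friendly name. The `describe_platform` helper is hypothetical and only covers the three common values.

```python
import sys

def describe_platform(platform=None):
    """Map a sys.platform value to a readable OS name."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        return "Linux"
    if platform == "darwin":
        return "macOS"
    if platform == "win32":
        return "Windows"
    return f"other ({platform})"

print(sys.platform)
print(describe_platform())
```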

Typical outputs:

  • "darwin" → macOS

  • "linux" → Linux

  • "win32" → Windows

Example: find the Python interpreter path

This helps when debugging virtual environments.
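For instance, the following sketch prints the interpreter path and guesses whether a virtual environment is active. The `sys.prefix != sys.base_prefix` comparison is the usual check for environments created with `venv`; treat it as a heuristic for other tools.

```python
import sys

print(sys.executable)  # path of the interpreter running this code

# Inside a venv, sys.prefix points at the environment while
# sys.base_prefix still points at the base installation.
in_venv = sys.prefix != sys.base_prefix
print("virtual environment active:", in_venv)
```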


Manipulating sys.path (Import Search Path)

sys.path is the list of directories Python searches when importing modules.

Example: print module search paths

Example: add a custom path

Useful when:

  • working with monorepos

  • local development of reusable modules

  • running scripts without packaging them
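Both operations can be sketched together. The `libs` directory is a hypothetical name; it does not need to exist for the code to run.

```python
import os
import sys

# Print the current import search path, one entry per line.
for entry in sys.path:
    print(entry)

# Add a custom directory. insert(0, ...) gives it top priority,
# which can shadow installed packages; append() gives it the lowest.
extra_dir = os.path.abspath("libs")  # hypothetical directory
if extra_dir not in sys.path:
    sys.path.insert(0, extra_dir)
```

The change only lasts for the current process. For anything long-lived, installing the package or setting PYTHONPATH is usually the cleaner option.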


sys.getsizeof() — Memory Size of Objects

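sys.getsizeof() returns the size of an object in bytes. The figure covers only the object itself, not the objects it refers to, so a list's size excludes its elements.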

Example:
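A small sketch comparing the size of a list object with a rough total that also counts its elements. Exact byte counts vary by Python version and platform, so none are shown.

```python
import sys

numbers = list(range(1000))

shallow = sys.getsizeof(numbers)  # the list object only
# Rough "deep" size: add the size of each integer the list refers to.
deep = shallow + sum(sys.getsizeof(n) for n in numbers)

print("list only:", shallow, "bytes")
print("list + elements:", deep, "bytes")
```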

This can help in data engineering work when:

  • optimizing memory

  • checking object sizes in pipelines

  • diagnosing memory leaks


sys.modules — Loaded Module Cache

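sys.modules is a dictionary that maps the names of already-imported modules to their module objects. Python checks it before searching sys.path, so repeated imports are cheap.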

Example:
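A minimal sketch using `json` as an arbitrary standard-library module to look up in the cache.

```python
import json
import sys

print("json" in sys.modules)        # True once the module has been imported
print(sys.modules["json"] is json)  # True: the cache holds the same module object
print(len(sys.modules), "modules loaded")
```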

Useful when debugging imports or circular imports.


🏃 subprocess Module — Run External Commands

subprocess is for running external programs from Python. This is how you run shell commands like ls, grep, ping, ps, etc.

It replaces older interfaces such as os.system and os.spawn*, as well as the commands and popen2 modules that existed in Python 2.

⭐ The Two Most Important Functions

1. subprocess.run()

The simplest and safest API.

  ✔ Runs the command

  ✔ Waits for it to finish

  ✔ Captures output when asked (for example with capture_output=True)

  ✔ Returns a CompletedProcess
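A minimal sketch of `subprocess.run()`. To stay portable it launches the current Python interpreter (`sys.executable`) as the child; any other program given as a list of arguments works the same way.

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode bytes to str
)

print(result.returncode)      # 0 on success
print(result.stdout.strip())  # hello from child
```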


2. subprocess.Popen()

More advanced, for streaming output or interactive programs.

Use Popen when you need:

  • long-running processes

  • live streaming of output

  • sending input to the process (stdin)

  • more control over pipes
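A sketch of streaming output line by line with `Popen`. The child here is a short Python one-liner chosen for portability; a real use case would be a long-running command.

```python
import subprocess
import sys

child_code = "for i in range(3): print('tick', i, flush=True)"

lines = []
with subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,  # read the child's stdout through a pipe
    text=True,
) as proc:
    for line in proc.stdout:  # yields each line as the child produces it
        lines.append(line.rstrip())
        print("got:", lines[-1])

# Leaving the with-block closes the pipe and waits for the child.
print("exit code:", proc.returncode)
```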


🧪 Examples You’ll Actually Use

✔ Example: Run a Shell Command

⚠️ Only use shell=True if necessary (security risk with user input).
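The difference can be sketched like this. The `user_input` value is a made-up hostile string; with the list form it reaches the child as one plain argument and is never interpreted by a shell.

```python
import subprocess
import sys

# Shell form: the whole string is handed to the system shell.
shell_result = subprocess.run("echo hello", shell=True, capture_output=True, text=True)
print(shell_result.stdout.strip())

# List form (no shell): each item is passed to the program as-is.
user_input = "notes.txt; echo injected"  # hypothetical untrusted value
safe_result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
    capture_output=True,
    text=True,
)
print(safe_result.stdout.strip())  # the string arrives intact, nothing extra runs
```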


✔ Example: Check if Command Succeeded
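Two common ways to check, sketched with a child process that deliberately exits with code 3: inspect `returncode` yourself, or pass `check=True` and let `subprocess.run()` raise `CalledProcessError`.

```python
import subprocess
import sys

failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Option 1: inspect the return code manually.
result = subprocess.run(failing)
if result.returncode != 0:
    print("command failed with code", result.returncode)

# Option 2: let subprocess raise on any non-zero exit.
raised_code = None
try:
    subprocess.run(failing, check=True)
except subprocess.CalledProcessError as exc:
    raised_code = exc.returncode
    print("CalledProcessError, code", raised_code)
```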


✔ Example: Capture stdout + stderr
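A sketch showing the two streams captured separately and then merged into one. The child is a Python one-liner that writes one line to each stream.

```python
import subprocess
import sys

child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
cmd = [sys.executable, "-c", child_code]

# Separate streams: result.stdout and result.stderr.
separate = subprocess.run(cmd, capture_output=True, text=True)
print("stdout:", separate.stdout.strip())
print("stderr:", separate.stderr.strip())

# Merged: redirect stderr into stdout, as 2>&1 does in a shell.
merged = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
print("combined:", merged.stdout.split())
```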


✔ Example: Send Input to a Process
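For one-shot input, the `input=` argument of `subprocess.run()` is usually enough. In this sketch the child simply upper-cases whatever arrives on its stdin.

```python
import subprocess
import sys

child_code = "import sys; print(sys.stdin.read().upper(), end='')"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="hello\n",      # written to the child's stdin, then stdin is closed
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # HELLO
```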


Argparse

argparse is the standard Python library for building command-line interfaces (CLIs). Whenever you run a command like:

you’re using the kind of functionality argparse helps you build.

It converts command-line arguments into Python values and supports validation, help messages, types, defaults, and more.

Let’s break it down clearly and practically.


🔧 What argparse Does

  ✔ Defines which arguments your script accepts

  ✔ Parses them from the command line

  ✔ Automatically generates --help text

  ✔ Validates types and allowed values

  ✔ Supports flags (--verbose), options (--output file), and positional arguments


⭐ Minimal Example

script.py
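A minimal sketch of such a script, assuming a single positional argument called `name` (a hypothetical choice). The explicit list passed to `parse_args()` simulates the command line; with no argument it reads `sys.argv[1:]`.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("name")  # one required positional argument

# Equivalent to running: python script.py Alice
args = parser.parse_args(["Alice"])
print(f"Hello, {args.name}!")
```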

Run:

Output:


⭐ Example With Flags and Options
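A sketch combining a positional argument, an option that takes a value, and a boolean flag. All argument names here are illustrative assumptions.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("input")                                # positional
parser.add_argument("-o", "--output", default="out.txt")    # option with a value
parser.add_argument("-v", "--verbose", action="store_true") # boolean flag

# Equivalent to: python script.py data.csv --output result.csv -v
args = parser.parse_args(["data.csv", "--output", "result.csv", "-v"])
print(args.input, args.output, args.verbose)
```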

Run:

Output:


🔥 What You Can Add Using argparse

Positional Arguments

Required by default and matched by position.

Use:


Optional Arguments (“Flags”)

Boolean flag:

Run:

Option with a value:

Run:


Argument Types
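The `type=` parameter takes any callable that converts a string, so built-ins like `int`, `float`, and `pathlib.Path` all work. A minimal sketch with hypothetical option names:

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
parser.add_argument("--count", type=int)
parser.add_argument("--ratio", type=float)
parser.add_argument("--path", type=Path)

args = parser.parse_args(["--count", "5", "--ratio", "0.25", "--path", "data/input.csv"])
print(args.count + 1)    # 6, because count is an int, not a string
print(args.ratio * 2)    # 0.5
print(args.path.suffix)  # .csv
```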


Choices (Restrict allowed values)
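A sketch of `choices=` with a hypothetical `--env` option. An invalid value makes argparse print a usage error to stderr and exit with code 2, which the sketch catches to show the behaviour.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--env", choices=["dev", "staging", "prod"])

args = parser.parse_args(["--env", "dev"])
print(args.env)

# A value outside the list is rejected before your code runs.
rejected_code = None
try:
    parser.parse_args(["--env", "qa"])
except SystemExit as exc:
    rejected_code = exc.code
    print("rejected, exit code", rejected_code)
```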

Run:


Default Values
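A minimal sketch of `default=`, again with made-up option names. Options without an explicit default come back as `None` when omitted.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--retries", type=int, default=3)
parser.add_argument("--tag")  # no default given, so None when omitted

args = parser.parse_args([])  # simulate running with no arguments
print(args.retries)  # 3
print(args.tag)      # None
```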


Lists / Multiple Values

Method 1 — nargs
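A sketch of `nargs="+"`, which collects one or more values after a single flag into a list. The `--files` name is hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--files", nargs="+")  # one or more values

# Equivalent to: python script.py --files a.csv b.csv c.csv
args = parser.parse_args(["--files", "a.csv", "b.csv", "c.csv"])
print(args.files)  # ['a.csv', 'b.csv', 'c.csv']
```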

Run:

Output:

Method 2 — repeat flags
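A sketch of `action="append"`, where the flag is repeated once per value. The `--tag` name is hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--tag", action="append")  # each occurrence adds to a list

# Equivalent to: python script.py --tag red --tag blue
args = parser.parse_args(["--tag", "red", "--tag", "blue"])
print(args.tag)  # ['red', 'blue']
```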

Run:

Output:


Required Flags
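Passing `required=True` makes an option mandatory. In this sketch `--token` is a hypothetical option; leaving it out triggers the same usage error and exit code 2 as any other parse failure.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--token", required=True)

args = parser.parse_args(["--token", "abc123"])
print(args.token)

# Omitting the flag makes argparse print an error and exit.
missing_exit_code = None
try:
    parser.parse_args([])
except SystemExit as exc:
    missing_exit_code = exc.code
    print("missing --token, exit code", missing_exit_code)
```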


Help and Description
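A sketch of adding a description and per-argument help text. The program name `backup` and its arguments are invented for illustration. `format_help()` returns the same text that `--help` would print, which makes it easy to inspect without exiting.

```python
import argparse

parser = argparse.ArgumentParser(
    prog="backup",
    description="Copy files to a backup location.",
)
parser.add_argument("source", help="directory to back up")
parser.add_argument("--dry-run", action="store_true", help="show what would be copied")

help_text = parser.format_help()  # same text that --help prints
print(help_text)
```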

The user gets automatic documentation:

Output:


🧱 Complex Example (Realistic)
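One way the pieces above might fit together in a single tool. Everything about this `etl` command (its name, arguments, and behaviour) is a hypothetical sketch; it only parses and echoes its configuration.

```python
import argparse
import sys

def build_parser():
    """Build the CLI for a hypothetical ETL job runner."""
    parser = argparse.ArgumentParser(prog="etl", description="Hypothetical ETL job runner.")
    parser.add_argument("source", help="input file or URL")
    parser.add_argument("-o", "--output", default="out.parquet", help="where to write results")
    parser.add_argument("--format", choices=["csv", "json", "parquet"], default="parquet")
    parser.add_argument("--workers", type=int, default=4, help="number of parallel workers")
    parser.add_argument("--column", action="append", help="column to keep (repeatable)")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser

def main(argv=None):
    """Parse arguments, validate them, and return an exit code."""
    args = build_parser().parse_args(argv)  # argv=None means sys.argv[1:]
    if args.workers < 1:
        print("error: --workers must be at least 1", file=sys.stderr)
        return 2
    columns = args.column or []  # append leaves None when the flag is absent
    if args.verbose:
        print(f"keeping columns: {columns}")
    print(f"{args.source} -> {args.output} ({args.format}, {args.workers} workers)")
    return 0

# Equivalent to: etl events.csv --format json --workers 2 --column id -v
exit_code = main(["events.csv", "--format", "json", "--workers", "2", "--column", "id", "-v"])
print("exit code:", exit_code)
```

Returning the exit code from `main` keeps the function easy to test; a script entry point would pass that value to `sys.exit()`.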

Run:

