What Happens When You Run python script.py? A Step-by-Step Deep Dive


Introduction

You open a terminal, type python script.py, and press Enter. Your output appears almost instantly.

But between that keypress and the first line of output, Python does a surprising amount of work.

Even a tiny script goes through multiple internal stages:

  • The shell finds the Python interpreter
  • CPython initializes itself
  • Your source code is tokenized
  • Tokens become an AST
  • The AST is compiled into bytecode
  • The Python virtual machine executes the bytecode
  • Output is written and the interpreter cleans up

Most Python developers use these systems every day without ever seeing them directly. The interesting part is that Python exposes many of these internals through standard library modules like tokenize, ast, and dis.

In this article, you will observe each stage yourself using runnable examples and connect them into one complete mental model of how Python executes code.

All examples are tested on Python 3.12.


Step 1 — The Shell Finds Python

Before Python can execute your script, your operating system first needs to locate the Python interpreter itself.

When you type:

python script.py

Your shell does not magically know where Python lives. It searches through directories listed in the PATH environment variable — an ordered list of directories the shell checks whenever you run a command.

Checking Which Python Is Running

macOS / Linux:

which python3

Example output:

/usr/bin/python3

Windows:

where python

Example output:

C:\Python312\python.exe

This tells you exactly which executable the shell will run for that command name.

What Is PATH?

PATH is an environment variable containing a list of directories. When you run a command, the shell checks those directories one by one until it finds a matching executable.

Simplified example:

/usr/local/bin
/usr/bin
/home/user/.local/bin

If Python exists in /usr/bin, the shell launches that interpreter.
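You can reproduce this lookup from inside Python with shutil.which, which walks the same PATH directories in order (a quick sketch; the printed path depends entirely on your machine):

```python
import shutil

# shutil.which searches the directories listed in PATH, in order,
# and returns the first matching executable (or None if none match).
interpreter = shutil.which("python3") or shutil.which("python")
print(interpreter)
```

This is the same resolution the shell performs before it can even start the interpreter.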

Why One Computer May Have Multiple Pythons

Modern development machines often contain several Python installations:

  • System Python (pre-installed by the OS)
  • Homebrew Python (macOS)
  • venv virtual environments
  • Anaconda / Conda environments

This is why running python --version sometimes produces unexpected results — your shell may be finding a different Python executable than you expected.

Historically, python often referred to Python 2 while python3 explicitly meant Python 3. Modern systems increasingly point both commands to Python 3, but the naming convention persists for compatibility.
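From inside a running interpreter, sys.executable shows which binary the shell actually launched, which is handy when several Pythons are installed:

```python
import sys

# sys.executable holds the absolute path of the interpreter binary
# currently running this code, i.e. whatever PATH resolved to.
print(sys.executable)
print(sys.version_info[:2])
```

If this path surprises you, your PATH ordering is the first place to look.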


Step 2 — The Interpreter Starts Up

Once the shell locates Python, the interpreter process starts. This is where CPython initializes itself.

CPython: The Main Python Implementation

When most people say “Python,” they actually mean CPython — the official reference implementation, written mostly in C. Other implementations exist (PyPy, Jython, MicroPython), but CPython dominates normal development.

What Happens During Startup

Before your script runs a single line, CPython initializes several internal systems:

  • Memory management
  • Built-in types (int, str, list, etc.)
  • Exception handling infrastructure
  • Import machinery
  • The sys module
  • Built-in functions like print()

It also constructs Python’s module search path. You can inspect it directly:

python -c "import sys; print(sys.path)"

Example output:

[
 '',
 '/usr/lib/python312.zip',
 '/usr/lib/python3.12',
 '/usr/lib/python3.12/lib-dynload'
]

This list determines where Python searches for modules during imports.

Why Python Startup Sometimes Feels Slow

Even an empty Python process has startup overhead. When you run python, CPython still needs to initialize runtime structures, load built-in modules, configure imports, and allocate internal objects.

This startup cost is one reason Python CLI tools can feel slower than compiled binaries written in C or Rust. For short-lived scripts, startup time can dominate total execution time.
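You can measure that overhead directly by timing a fresh interpreter that does nothing at all (a rough sketch; the number varies by machine and Python build):

```python
import subprocess
import sys
import time

start = time.perf_counter()
# Launch a child interpreter that runs an empty program, so the
# measurement covers only startup and shutdown.
subprocess.run([sys.executable, "-c", "pass"], check=True)
elapsed = time.perf_counter() - start

print(f"startup + shutdown: {elapsed * 1000:.1f} ms")
```

Typical results are tens of milliseconds, which is negligible for a server but noticeable for a CLI tool invoked in a tight loop.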


Step 3 — Source Code Is Read and Tokenized

After initialization, Python reads your .py file into memory. But Python does not process raw text directly. The source code first goes through tokenization (also called lexical analysis) — the process of converting plain text into meaningful language units called tokens.

Observing Tokenization Directly

Python exposes this process through the tokenize module:

import tokenize
import io

source = "x = 1 + 2"

tokens = tokenize.generate_tokens(
    io.StringIO(source).readline
)

for tok in tokens:
    print(tok)

Example output:

TokenInfo(type=1 (NAME), string='x', ...)
TokenInfo(type=54 (OP), string='=', ...)
TokenInfo(type=2 (NUMBER), string='1', ...)
TokenInfo(type=54 (OP), string='+', ...)
TokenInfo(type=2 (NUMBER), string='2', ...)
TokenInfo(type=4 (NEWLINE), string='', ...)
TokenInfo(type=0 (ENDMARKER), string='', ...)

Each token has a type that describes its role:

Token Type   Example   Meaning
─────────────────────────────────────────────
NAME         x         Variable or function name
OP           =, +      Operator or punctuation
NUMBER       1, 2      Numeric literal
NEWLINE                End of logical line

Why Tokenization Exists

The parser cannot work efficiently with raw text. Tokenization gives Python structured building blocks. Instead of seeing "x = 1 + 2" as a string of characters, the parser now sees NAME OP NUMBER OP NUMBER — a structured sequence it can reason about. This makes the next stage, parsing, possible.
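You can see exactly that NAME OP NUMBER OP NUMBER sequence by mapping each numeric token type to its symbolic name with the token module:

```python
import io
import token
import tokenize

source = "x = 1 + 2"

# token.tok_name maps numeric token types to their symbolic names,
# turning the TokenInfo stream into a readable sequence.
names = [
    token.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(names)
```

The result includes the trailing NEWLINE and ENDMARKER tokens that tokenize appends to every logical input.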


Step 4 — Tokens Become an AST

Once tokenization finishes, Python parses the token stream into an AST — Abstract Syntax Tree.

An AST is a tree representation of your program’s structure. It does not care about formatting or whitespace. These two programs:

x=1+2
x = 1 + 2

produce essentially the same AST. The AST represents meaning, not appearance.

Observing the AST

Python exposes this stage through the ast module:

import ast

source = "x = 1 + 2"

tree = ast.parse(source)

print(ast.dump(tree, indent=2))

Expected output:

Module(
  body=[
    Assign(
      targets=[
        Name(id='x', ctx=Store())
      ],
      value=BinOp(
        left=Constant(value=1),
        op=Add(),
        right=Constant(value=2)
      )
    )
  ],
  type_ignores=[]
)

The output shows the full structure of x = 1 + 2:

  • Assign represents the assignment operation (x = ...)
  • BinOp represents the binary operation (1 + 2)
  • Constant represents the literal values 1 and 2
  • Add represents the + operator
  • Name represents the variable x

Why ASTs Matter

The AST is the structured semantic representation of your program — the final stage before compilation. Python tools rely heavily on ASTs: linters, formatters, static analyzers, type checkers, and code transformers like Black and Ruff all work by analyzing or rewriting ASTs.
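A tiny taste of the kind of analysis a linter performs: walking the tree and collecting every variable name (a sketch; real linters inspect far more node types):

```python
import ast

source = "x = 1 + 2\ny = x * 3"
tree = ast.parse(source)

# ast.walk yields every node in the tree; filtering for ast.Name
# collects each variable reference, the way a simple checker might.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
print(names)  # ['x', 'y']
```

Because the AST ignores formatting, this analysis works identically on x=1+2 and x = 1 + 2.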


Step 5 — AST Is Compiled to Bytecode

This is the step many Python developers do not realize exists. Python does compile your code — just not into machine code.

CPython transforms the AST into bytecode: a lower-level instruction format designed for the Python virtual machine. It sits between high-level Python source code and low-level machine instructions.
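You can drive this stage by hand with the built-in compile(), which accepts either source text or an AST and returns a code object:

```python
import ast

source = "x = 1 + 2"

# The same path CPython takes internally: source -> AST -> code object.
tree = ast.parse(source)
code = compile(tree, filename="<example>", mode="exec")

print(type(code).__name__)  # code
print(code.co_consts)       # note: the compiler has already folded 1 + 2 into 3
```

The co_consts tuple shows a small compiler optimization at work: constant expressions like 1 + 2 are folded at compile time, before the VM ever runs.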

Viewing Bytecode with dis

Python exposes bytecode through the dis module:

import dis

def add(x, y):
    return x + y

dis.dis(add)

Expected output:

  3           0 RESUME                   0

  4           2 LOAD_FAST                0 (x)
              4 LOAD_FAST                1 (y)
              6 BINARY_OP                0 (+)
             10 RETURN_VALUE

Each instruction does one small thing:

  • LOAD_FAST 0 (x) — push local variable x onto the evaluation stack
  • LOAD_FAST 1 (y) — push local variable y onto the stack
  • BINARY_OP 0 (+) — pop both values, add them, push the result
  • RETURN_VALUE — return the top value from the stack

Python Is Interpreted — But Also Compiled

This surprises many people. Python execution involves both compilation and interpretation. The compilation step produces bytecode. The Python virtual machine then interprets that bytecode.

What About .pyc Files and __pycache__?

Python caches compiled bytecode inside __pycache__/:

__pycache__/script.cpython-312.pyc

These .pyc files allow Python to skip recompilation when source files have not changed. Without bytecode caching, Python would need to tokenize, parse, and compile every imported module on every single run. Caching saves time during repeated execution — especially in large applications with many imports.
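You can trigger this compilation step explicitly with py_compile (a sketch using a throwaway module; the exact .pyc filename depends on your Python version, per PEP 3147 naming):

```python
import pathlib
import py_compile
import tempfile

# Write a throwaway module and compile it by hand; the resulting
# .pyc lands in a __pycache__ directory next to the source.
with tempfile.TemporaryDirectory() as d:
    src = pathlib.Path(d) / "demo.py"
    src.write_text("x = 1 + 2\n")
    pyc = py_compile.compile(str(src))
    print(pyc)
```

On imports, Python does this automatically and checks the cached file's metadata before deciding whether to recompile.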


Step 6 — Bytecode Runs on the Python VM

Now the actual execution begins. The compiled bytecode is handed to the CPython virtual machine — a stack-based VM where most bytecode instructions manipulate a stack of temporary values.

Tracing x = 1 + 2 Through the VM

Instruction       Stack After
─────────────────────────────
LOAD_CONST 1      [1]
LOAD_CONST 2      [1, 2]
BINARY_OP +       [3]
STORE_NAME x      []

After STORE_NAME, the value 3 is removed from the stack and stored in the current namespace as x. The stack is empty again, ready for the next statement. (In the real bytecode for x = 1 + 2, the compiler's constant folding collapses the addition into a single LOAD_CONST 3; the trace above shows the unoptimized sequence for clarity.)

Why the Stack Model Matters

Once you understand the stack machine model, many Python behaviors become easier to reason about:

  • Why local variables are faster than globals (LOAD_FAST vs LOAD_GLOBAL)
  • Why function calls have overhead (new stack frame allocation)
  • Why some operations require temporary objects

The dis module is one of the most underrated Python debugging and optimization tools.
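To see the LOAD_FAST versus LOAD_GLOBAL difference in practice, disassemble two versions of the same function (a sketch; dis.get_instructions yields one entry per bytecode instruction):

```python
import dis

g = 10

def use_global():
    return g + 1

def use_local():
    h = 10
    return h + 1

# The global lookup resolves the name through namespace dictionaries;
# the local lookup indexes directly into the frame's slot array.
ops_global = {ins.opname for ins in dis.get_instructions(use_global)}
ops_local = {ins.opname for ins in dis.get_instructions(use_local)}

print("LOAD_GLOBAL" in ops_global)  # True
print("LOAD_FAST" in ops_local)     # True
```

Dictionary lookups versus array indexing is exactly why hot loops sometimes get faster when a global is copied into a local variable first.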

A Note on the GIL

The CPython VM contains the famous Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at a time within a single process. This is why CPU-heavy multithreaded Python programs often fail to scale across cores. The GIL deep dive on this site covers how the lock actually works and when it affects performance.


Step 7 — Output and Cleanup

Eventually your program reaches output statements like print("Hello"). But even print() is just another function call.

import sys

sys.stdout.write("Hello without print\n")

Expected output:

Hello without print

print() is essentially a convenience wrapper around writing to sys.stdout. By default, sys.stdout points to the terminal — but shells can redirect it:

python script.py > output.txt

Now stdout points to a file instead. This is why command-line pipelines work.
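The same redirection is possible from inside Python with contextlib.redirect_stdout (a sketch; here the target is an in-memory buffer rather than a file):

```python
import contextlib
import io

buffer = io.StringIO()

# Everything print() writes inside the block goes to the buffer
# instead of the terminal: the in-process analogue of shell redirection.
with contextlib.redirect_stdout(buffer):
    print("Hello")

print(repr(buffer.getvalue()))  # 'Hello\n'
```

Test frameworks use this trick constantly to capture and assert on a program's output.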

Cleanup During Program Exit

When execution finishes, Python performs several cleanup tasks:

  • Flushing stdout buffers
  • Closing open file handles
  • Decrementing object references
  • Running atexit hooks
  • Final garbage collection

Then the interpreter process terminates and control returns to the shell. The garbage collector article covers the full details of how CPython decides when to free objects during both normal execution and program shutdown.
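The atexit hooks mentioned above are easy to observe; this sketch runs a child interpreter so the shutdown-time output can be captured:

```python
import subprocess
import sys

# The registered hook fires during interpreter shutdown,
# after the script's last statement has already run.
child = (
    "import atexit\n"
    "atexit.register(lambda: print('cleaning up'))\n"
    "print('main script done')\n"
)
out = subprocess.run(
    [sys.executable, "-c", child], capture_output=True, text=True
).stdout
print(out)
```

The child prints "main script done" first and "cleaning up" second, confirming that hooks run as part of shutdown, not normal execution.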


Putting It All Together

Here is the complete pipeline from command to output:

python script.py

Shell finds Python interpreter using PATH

CPython interpreter starts
(sys, builtins, imports initialized)

Source file read into memory

Tokenization
(characters → tokens)

Parser builds AST
(tokens → syntax tree)

Compiler generates bytecode
(AST → bytecode instructions)

Bytecode cached in __pycache__

CPython VM executes bytecode
(stack machine, one instruction at a time)

Output written to sys.stdout

Cleanup and interpreter shutdown

This entire pipeline usually completes in milliseconds for small scripts.
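The middle of that pipeline fits in a few lines when driven by hand (a sketch; exec stands in for the VM running the final bytecode):

```python
import ast

source = "x = 1 + 2\nprint(x)"

tree = ast.parse(source)                # tokenize + parse -> AST
code = compile(tree, "<demo>", "exec")  # AST -> bytecode (a code object)
namespace = {}
exec(code, namespace)                   # the VM executes the bytecode: prints 3

print(namespace["x"])  # 3
```

Running python script.py is, at its core, this same sequence with startup, caching, and cleanup wrapped around it.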


Why This Matters in Practice

Understanding Python internals is not just academic trivia. These implementation details explain real-world behavior.

Why Python Startup Feels Slow

Interpreter initialization (Step 2) adds startup overhead before your script runs a single line. This is why Python CLI tools often feel slower than compiled binaries for tiny tasks. The runtime must initialize itself completely regardless of how short your script is.

Why __pycache__ Is Not Junk

Many developers delete __pycache__ folders without understanding them. They are not junk — they speed up repeated execution by caching compiled bytecode. Deleting them occasionally is fine, but Python will simply regenerate them on the next run.

Why Imports Can Be Expensive

Every imported module goes through tokenization, parsing, AST generation, compilation, and execution. Large dependency trees increase startup time significantly. This is one reason massive frameworks can feel slow to import.
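You can put a number on a single import with a quick timing sketch (first-import cost varies with module size and whether a cached .pyc exists):

```python
import sys
import time

start = time.perf_counter()
import json  # first import in this process: locate, load bytecode, execute
elapsed = time.perf_counter() - start

print(f"import json: {elapsed * 1000:.2f} ms")
print("json" in sys.modules)  # True: cached, so later imports are near-free
```

Because imported modules are cached in sys.modules, only the first import in a process pays this cost; the CPython -X importtime flag gives a per-module breakdown for whole programs.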

The Python import system deep dive on this site covers how sys.path, sys.modules, and package loading work in detail.

Why Bytecode Knowledge Helps with Performance

Bytecode inspection reveals what Python is actually doing internally. Sometimes seemingly similar code produces very different bytecode. Understanding this explains why local variables are faster than globals, why attribute access costs more than variable lookup, and why function calls carry overhead.

Understanding how Python stores and reuses objects — covered in the mutable vs immutable guide — connects directly to how the VM manages values on the evaluation stack.

Understanding how Python manages function scope and closures — which is the foundation for how decorators work — also connects directly to how the VM allocates stack frames for each function call.

For the full picture of how CPython allocates and reclaims memory at a lower level — including arenas, pools, and the small object allocator — see the Python memory management deep dive.


Further Reading

Official documentation worth starting with: the tokenize, ast, and dis module references in the standard library docs, and the CPython internals material in the Python Developer's Guide.

Once you understand the execution pipeline, many “mysterious” Python behaviors stop feeling magical and start feeling mechanical. If you have questions or want to suggest a future topic, get in touch via the Contact page.