What Happens When You Run python script.py? A Step-by-Step Deep Dive
Introduction
You open a terminal, type python script.py, and press Enter. Your output appears almost instantly.
But between that keypress and the first line of output, Python does a surprising amount of work.
Even a tiny script goes through multiple internal stages:
- The shell finds the Python interpreter
- CPython initializes itself
- Your source code is tokenized
- Tokens become an AST
- The AST is compiled into bytecode
- The Python virtual machine executes the bytecode
- Output is written and the interpreter cleans up
Most Python developers use these systems every day without ever seeing them directly. The interesting part is that Python exposes many of these internals through standard library modules like tokenize, ast, and dis.
In this article, you will observe each stage yourself using runnable examples and connect them into one complete mental model of how Python executes code.
All examples are tested on Python 3.12.
Step 1 — The Shell Finds Python
Before Python can execute your script, your operating system first needs to locate the Python interpreter itself.
When you type:
python script.py
Your shell does not magically know where Python lives. It searches through directories listed in the PATH environment variable — an ordered list of directories the shell checks whenever you run a command.
Checking Which Python Is Running
macOS / Linux:
which python3
Example output:
/usr/bin/python3
Windows:
where python
Example output:
C:\Python312\python.exe
This tells you exactly which executable runs when you type python.
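You can also confirm this from inside Python itself: sys.executable holds the full path of the interpreter that is currently running (use python instead of python3 on Windows):
python3 -c "import sys; print(sys.executable)"
If the path printed here differs from what which or where reported, your shell and your scripts may be resolving to different installations.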
What Is PATH?
PATH is an environment variable containing a list of directories. When you run a command, the shell checks those directories one by one until it finds a matching executable.
Simplified example:
/usr/local/bin
/usr/bin
/home/user/.local/bin
If Python exists in /usr/bin, the shell launches that interpreter.
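You can reproduce this search from Python. The sketch below splits PATH into its directories and then uses shutil.which, which performs the same first-match lookup the shell does:
import os
import shutil

# PATH is a single string; os.pathsep (":" on Unix, ";" on Windows) separates entries
for directory in os.environ["PATH"].split(os.pathsep):
    print(directory)

# shutil.which walks those directories in order and returns the first executable found
print(shutil.which("python3"))
On Windows, pass "python" rather than "python3".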
Why One Computer May Have Multiple Pythons
Modern development machines often contain several Python installations:
- System Python (pre-installed by the OS)
- Homebrew Python (macOS)
- venv virtual environments
- Anaconda / Conda environments
This is why running python --version sometimes produces unexpected results — your shell may be finding a different Python executable than you expected.
Historically, python often referred to Python 2 while python3 explicitly meant Python 3. Modern systems increasingly point both commands to Python 3, but the naming convention persists for compatibility.
Step 2 — The Interpreter Starts Up
Once the shell locates Python, the interpreter process starts. This is where CPython initializes itself.
CPython: The Main Python Implementation
When most people say “Python,” they actually mean CPython — the official reference implementation, written mostly in C. Other implementations exist (PyPy, Jython, MicroPython), but CPython dominates normal development.
What Happens During Startup
Before your script runs a single line, CPython initializes several internal systems:
- Memory management
- Built-in types (int, str, list, etc.)
- Exception handling infrastructure
- Import machinery
- The sys module
- Built-in functions like print()
It also constructs Python’s module search path. You can inspect it directly:
python -c "import sys; print(sys.path)"
Example output:
[
'',
'/usr/lib/python312.zip',
'/usr/lib/python3.12',
'/usr/lib/python3.12/lib-dynload'
]
This list determines where Python searches for modules during imports.
Why Python Startup Sometimes Feels Slow
Even an empty Python process has startup overhead. When you run python, CPython still needs to initialize runtime structures, load built-in modules, configure imports, and allocate internal objects.
This startup cost is one reason Python CLI tools can feel slower than compiled binaries written in C or Rust. For short-lived scripts, startup time can dominate total execution time.
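You can measure that baseline cost by timing an interpreter that does nothing at all. Exact numbers depend on your machine, but even this typically takes a few tens of milliseconds:
time python3 -c "pass"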
Step 3 — Source Code Is Read and Tokenized
After initialization, Python reads your .py file into memory. But Python does not process raw text directly. The source code first goes through tokenization (also called lexical analysis) — the process of converting plain text into meaningful language units called tokens.
Observing Tokenization Directly
Python exposes this process through the tokenize module:
import tokenize
import io
source = "x = 1 + 2"
tokens = tokenize.generate_tokens(io.StringIO(source).readline)

for tok in tokens:
    print(tok)
Example output:
TokenInfo(type=1 (NAME), string='x', ...)
TokenInfo(type=54 (OP), string='=', ...)
TokenInfo(type=2 (NUMBER), string='1', ...)
TokenInfo(type=54 (OP), string='+', ...)
TokenInfo(type=2 (NUMBER), string='2', ...)
Each token has a type that describes its role:
| Token Type | Example | Meaning |
|---|---|---|
| NAME | x | Variable or function name |
| OP | =, + | Operator or punctuation |
| NUMBER | 1, 2 | Numeric literal |
| NEWLINE | | End of logical line |
Why Tokenization Exists
The parser cannot work efficiently with raw text. Tokenization gives Python structured building blocks. Instead of seeing "x = 1 + 2" as a string of characters, the parser now sees NAME OP NUMBER OP NUMBER — a structured sequence it can reason about. This makes the next stage, parsing, possible.
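If the raw TokenInfo tuples are too noisy, the token module can map each numeric type back to its name. Here is a small variation of the example above:
import io
import token
import tokenize

source = "x = 1 + 2"

for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    # token.tok_name maps the numeric token type back to a readable name
    print(f"{token.tok_name[tok.type]:10} {tok.string!r}")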
Step 4 — Tokens Become an AST
Once tokenization finishes, Python parses the token stream into an AST — Abstract Syntax Tree.
An AST is a tree representation of your program’s structure. It does not care about formatting or whitespace. These two programs:
x=1+2
x = 1 + 2
produce essentially the same AST. The AST represents meaning, not appearance.
Observing the AST
Python exposes this stage through the ast module:
import ast
source = "x = 1 + 2"
tree = ast.parse(source)
print(ast.dump(tree, indent=2))
Expected output:
Module(
  body=[
    Assign(
      targets=[
        Name(id='x', ctx=Store())
      ],
      value=BinOp(
        left=Constant(value=1),
        op=Add(),
        right=Constant(value=2)
      )
    )
  ],
  type_ignores=[]
)
The output shows the full structure of x = 1 + 2:
- Assign represents the assignment operation (x = ...)
- BinOp represents the binary operation (1 + 2)
- Constant represents the literal values 1 and 2
- Add represents the + operator
- Name represents the variable x
Why ASTs Matter
The AST is the structured semantic representation of your program — the final stage before compilation. Python tools rely heavily on ASTs: linters, formatters, static analyzers, type checkers, and code transformers like Black and Ruff all work by analyzing or rewriting ASTs.
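To make that concrete, here is a minimal sketch of the kind of pass a linter might run: an ast.NodeVisitor that reports every assignment it finds (the class name and the snippet are just for illustration):
import ast

source = """
x = 1 + 2
y = x * 10
"""

class AssignmentReporter(ast.NodeVisitor):
    """Report the line number and target names of each assignment."""

    def visit_Assign(self, node):
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        print(f"line {node.lineno}: assigns {', '.join(names)}")
        self.generic_visit(node)

AssignmentReporter().visit(ast.parse(source))
Real tools do the same thing at a much larger scale, walking the tree of every file they check.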
Step 5 — AST Is Compiled to Bytecode
This is the step many Python developers do not realize exists. Python does compile your code — just not into machine code.
CPython transforms the AST into bytecode: a lower-level instruction format designed for the Python virtual machine. It sits between high-level Python source code and low-level machine instructions.
Viewing Bytecode with dis
Python exposes bytecode through the dis module:
import dis
def add(x, y):
    return x + y
dis.dis(add)
Expected output:
  2           0 RESUME                   0
  3           2 LOAD_FAST                0 (x)
              4 LOAD_FAST                1 (y)
              6 BINARY_OP                0 (+)
             10 RETURN_VALUE
Each instruction does one small thing:
- LOAD_FAST 0 (x) — push local variable x onto the evaluation stack
- LOAD_FAST 1 (y) — push local variable y onto the stack
- BINARY_OP 0 (+) — pop both values, add them, push the result
- RETURN_VALUE — return the top value from the stack
Python Is Interpreted — But Also Compiled
This surprises many people. Python execution involves both compilation and interpretation. The compilation step produces bytecode. The Python virtual machine then interprets that bytecode.
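You can run the compilation half yourself with the built-in compile(), which returns a code object without executing anything; dis can then show the bytecode stored inside it:
import dis

source = "x = 1 + 2"

# Tokenize, parse, and compile, but do not execute
code = compile(source, "<example>", "exec")

print(code.co_consts)  # constants embedded in the compiled code
dis.dis(code)          # the bytecode instructions for the module body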
What About .pyc Files and __pycache__?
Python caches compiled bytecode inside __pycache__/:
__pycache__/script.cpython-312.pyc
These .pyc files allow Python to skip recompilation when source files have not changed. Note that the script you pass on the command line is compiled in memory and not cached; the cache is written for imported modules. Without bytecode caching, Python would need to tokenize, parse, and compile every imported module on every single run. Caching saves time during repeated execution — especially in large applications with many imports.
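If you want to create one of these files deliberately, the standard library's py_compile module writes the cache, and importlib.util.cache_from_source shows where the cache for a given source file lives. This sketch assumes a script.py exists in the current directory:
import importlib.util
import py_compile

# Compile script.py to bytecode and write the .pyc under __pycache__/
py_compile.compile("script.py")

# The path where Python expects the cached bytecode for this source file
print(importlib.util.cache_from_source("script.py"))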
Step 6 — Bytecode Runs on the Python VM
Now the actual execution begins. The compiled bytecode is handed to the CPython virtual machine — a stack-based VM where most bytecode instructions manipulate a stack of temporary values.
Tracing x = 1 + 2 Through the VM
| Instruction | Stack after |
|---|---|
| LOAD_CONST 1 | [1] |
| LOAD_CONST 2 | [1, 2] |
| BINARY_OP + | [3] |
| STORE_NAME x | [] |
After STORE_NAME, the value 3 is removed from the stack and stored in the current namespace as x. The stack is empty again, ready for the next statement. (In practice CPython constant-folds 1 + 2 into a single LOAD_CONST 3 at compile time, but the unoptimized sequence above shows how the stack model works.)
Why the Stack Model Matters
Once you understand the stack machine model, many Python behaviors become easier to reason about:
- Why local variables are faster than globals (LOAD_FAST vs LOAD_GLOBAL)
- Why function calls have overhead (new stack frame allocation)
- Why some operations require temporary objects
The dis module is one of the most underrated Python debugging and optimization tools.
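For example, here is a quick sketch that makes the first point above concrete by disassembling a local lookup next to a global one:
import dis

GLOBAL_VALUE = 10

def use_local():
    local_value = 10
    return local_value + 1

def use_global():
    return GLOBAL_VALUE + 1

dis.dis(use_local)   # the name lookup compiles to LOAD_FAST
dis.dis(use_global)  # the name lookup compiles to LOAD_GLOBAL, a slower dictionary lookup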
A Note on the GIL
The CPython VM contains the famous Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at a time within a single process. This is why CPU-heavy multithreaded Python programs often fail to scale across cores. The GIL deep dive on this site covers how the lock actually works and when it affects performance.
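A rough way to observe the effect is to compare a CPU-bound task run twice sequentially against the same work split across two threads. On CPython 3.12 the threaded version usually takes about as long, because the GIL lets only one thread run bytecode at a time (timings vary by machine):
import threading
import time

def count_down(n):
    while n:
        n -= 1

N = 10_000_000

# Two runs back to back on one thread
start = time.perf_counter()
count_down(N)
count_down(N)
print(f"sequential: {time.perf_counter() - start:.2f}s")

# The same total work split across two threads
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"threaded:   {time.perf_counter() - start:.2f}s")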
Step 7 — Output and Cleanup
Eventually your program reaches output statements like print("Hello"). But even print() is just another function call.
print() Uses sys.stdout
import sys
sys.stdout.write("Hello without print\n")
Expected output:
Hello without print
print() is essentially a convenience wrapper around writing to sys.stdout. By default, sys.stdout points to the terminal — but shells can redirect it:
python script.py > output.txt
Now stdout points to a file instead. This is why command-line pipelines work.
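The same redirection is available from inside Python. contextlib.redirect_stdout temporarily swaps sys.stdout for another file-like object, a small sketch of what the shell's > does at the process level:
import contextlib
import io

buffer = io.StringIO()

# While this block runs, print() writes into the buffer instead of the terminal
with contextlib.redirect_stdout(buffer):
    print("Hello, redirected")

print("captured:", buffer.getvalue().strip())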
Cleanup During Program Exit
When execution finishes, Python performs several cleanup tasks:
- Flushing stdout buffers
- Closing open file handles
- Decrementing object references
- Running atexit hooks
- Final garbage collection
Then the interpreter process terminates and control returns to the shell. The garbage collector article covers the full details of how CPython decides when to free objects during both normal execution and program shutdown.
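You can watch the shutdown phase yourself by registering an atexit hook; it runs after the last line of the script body, during interpreter cleanup:
import atexit

@atexit.register
def goodbye():
    # Called during interpreter shutdown, after the script body has finished
    print("atexit hook: interpreter is shutting down")

print("script body finished")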
Putting It All Together
Here is the complete pipeline from command to output:
python script.py
↓
Shell finds Python interpreter using PATH
↓
CPython interpreter starts
(sys, builtins, imports initialized)
↓
Source file read into memory
↓
Tokenization
(characters → tokens)
↓
Parser builds AST
(tokens → syntax tree)
↓
Compiler generates bytecode
(AST → bytecode instructions)
↓
Bytecode cached in __pycache__ (for imported modules)
↓
CPython VM executes bytecode
(stack machine, one instruction at a time)
↓
Output written to sys.stdout
↓
Cleanup and interpreter shutdown
This entire pipeline usually completes in milliseconds for small scripts.
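The whole pipeline can also be replayed in a single script using the modules covered above. Each stage below is run independently on the same one-line program; CPython wires them together internally when you run a file:
import ast
import dis
import io
import tokenize

source = "x = 1 + 2"

# Stage 1: characters -> tokens
print("Tokens:")
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    print("  ", tokenize.tok_name[tok.type], repr(tok.string))

# Stage 2: source -> AST
print("AST:")
print(ast.dump(ast.parse(source), indent=2))

# Stage 3: AST -> bytecode (compile() reruns the earlier stages internally)
code = compile(source, "<pipeline>", "exec")
print("Bytecode:")
dis.dis(code)

# Stage 4: the VM executes the bytecode
namespace = {}
exec(code, namespace)
print("Result:", namespace["x"])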
Why This Matters in Practice
Understanding Python internals is not just academic trivia. These implementation details explain real-world behavior.
Why Python Startup Feels Slow
Interpreter initialization (Step 2) adds startup overhead before your script runs a single line. This is why Python CLI tools often feel slower than compiled binaries for tiny tasks. The runtime must initialize itself completely regardless of how short your script is.
Why __pycache__ Is Not Junk
Many developers delete __pycache__ folders without understanding them. They are not junk — they speed up repeated execution by caching compiled bytecode. Deleting them occasionally is fine, but Python will simply regenerate them on the next run.
Why Imports Can Be Expensive
Every imported module goes through tokenization, parsing, AST generation, compilation, and execution. Large dependency trees increase startup time significantly. This is one reason massive frameworks can feel slow to import.
The Python import system deep dive on this site covers how sys.path, sys.modules, and package loading work in detail.
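The interpreter can break this cost down per module with the -X importtime option, which writes one line per import (with self and cumulative times) to stderr:
python3 -X importtime -c "import json"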
Why Bytecode Knowledge Helps with Performance
Bytecode inspection reveals what Python is actually doing internally. Sometimes seemingly similar code produces very different bytecode. Understanding this explains why local variables are faster than globals, why attribute access costs more than variable lookup, and why function calls carry overhead.
Understanding how Python stores and reuses objects — covered in the mutable vs immutable guide — connects directly to how the VM manages values on the evaluation stack.
Understanding how Python manages function scope and closures — which is the foundation for how decorators work — also connects directly to how the VM allocates stack frames for each function call.
For the full picture of how CPython allocates and reclaims memory at a lower level — including arenas, pools, and the small object allocator — see the Python memory management deep dive.
Further Reading
Official documentation:
- tokenize: https://docs.python.org/3/library/tokenize.html
- ast: https://docs.python.org/3/library/ast.html
- dis: https://docs.python.org/3/library/dis.html
- sys: https://docs.python.org/3/library/sys.html
Once you understand the execution pipeline, many “mysterious” Python behaviors stop feeling magical and start feeling mechanical. If you have questions or want to suggest a future topic, get in touch via the Contact page.