How to Read and Write Excel Files in Python with openpyxl: A Beginner's Guide
Introduction
Excel files are still everywhere in daily work. Finance teams export reports as spreadsheets, operations teams track inventory in .xlsx files, and analysts often receive raw Excel data from clients or coworkers. Instead of copying values manually, Python can automate these repetitive tasks in minutes.
One of the most popular libraries for handling Excel files in Python is openpyxl. It is written in pure Python, supports modern .xlsx files, and works well for both small scripts and larger automation projects.
In this tutorial, you will learn how to:
- Install openpyxl
- Read Excel files
- Write data into spreadsheets
- Handle common real-world automation tasks
All code in this tutorial is tested on Python 3.12 and openpyxl 3.1.5.
Installing openpyxl
Install openpyxl with pip:
pip install openpyxl
You can verify the installation by importing the package in Python:
import openpyxl
print(openpyxl.__version__)
Expected output:
3.1.5
This simple test confirms that Python can successfully import the library and that the installation is working correctly.
Reading an Excel File
Before reading data, create a simple Excel file named sample.xlsx.
Create a Sample Excel File
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws.title = "Sales"
ws.append(["Quarter", "Revenue"])
ws.append(["Q1", 12000])
ws.append(["Q2", 15500])
ws.append(["Q3", 18200])
ws.append(["Q4", 21000])
wb.save("sample.xlsx")
print("sample.xlsx created successfully")
Expected output:
sample.xlsx created successfully
This code creates a workbook with quarterly sales data and saves it as sample.xlsx.
Open an Existing Workbook
from openpyxl import load_workbook
wb = load_workbook("sample.xlsx")
ws = wb.active
print(ws.title)
Expected output:
Sales
This code opens the Excel file and returns the currently active worksheet.
Read a Single Cell
from openpyxl import load_workbook
wb = load_workbook("sample.xlsx")
ws = wb.active
print(ws["A1"].value)
print(ws["B2"].value)
Expected output:
Quarter
12000
ws["A1"] accesses a specific cell directly. The .value attribute returns the actual content stored in the cell.
Iterate Through Rows
from openpyxl import load_workbook
wb = load_workbook("sample.xlsx")
ws = wb.active
for row in ws.iter_rows(values_only=True):
print(row)
Expected output:
('Quarter', 'Revenue')
('Q1', 12000)
('Q2', 15500)
('Q3', 18200)
('Q4', 21000)
This approach is useful when processing tables or exporting spreadsheet data into another system.
Read a Specific Range
from openpyxl import load_workbook
wb = load_workbook("sample.xlsx")
ws = wb.active
for row in ws["A2:B4"]:
values = [cell.value for cell in row]
print(values)
Expected output:
['Q1', 12000]
['Q2', 15500]
['Q3', 18200]
This code reads only a selected range instead of the entire worksheet. That becomes useful when working with large spreadsheets.