Automate Excel Editing in Python with openpyxl and pandas

Last updated: October 07, 2025

Using Python to automatically edit Excel files

Automating Excel edits with Python is reliable and fast for .xlsx files. This guide shows practical patterns with openpyxl (structural edits, styles) and pandas (bulk data updates). You do not need Excel installed.

When to use which library

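| Library  | Needs Excel? | Edits existing .xlsx | Styling | Best for |
|----------|--------------|----------------------|---------|----------|
| openpyxl | No           | Yes                  | Good    | Cell-level edits, formulas, formatting, sheets |
| pandas   | No           | Yes (via openpyxl)   | Limited | Large, table-like data transforms |
| xlwings  | Yes          | Yes                  | Full    | Driving the Excel app (macOS/Windows) |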

Quickstart

  1. Install packages
pip install openpyxl pandas
  2. Know your target
  • .xlsx files are supported directly.
  • .xls (legacy) should be converted to .xlsx first for best results.
  3. Plan the edit
  • Structural/styled changes: use openpyxl.
  • Bulk data transforms: use pandas, then write back with the openpyxl engine.

Minimal working example (create, edit, save)

This script creates a small workbook if missing, then edits it.

from pathlib import Path
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font

xlsx = Path("example.xlsx")

# 1) Create a sample workbook if it doesn't exist
if not xlsx.exists():
    wb = Workbook()
    ws = wb.active
    ws.title = "Data"
    ws.append(["Item", "Qty", "Price"])  # header
    ws.append(["Apples", 10, 1.50])
    ws.append(["Oranges", 8, 2.00])
    wb.save(xlsx)

# 2) Load and edit
wb = load_workbook(xlsx)
ws = wb["Data"]

# Bold header row
for cell in ws[1]:
    cell.font = Font(bold=True)

# Add a Total column with a formula per row
ws["D1"] = "Total"
for r in range(2, ws.max_row + 1):
    ws[f"D{r}"] = f"=B{r}*C{r}"

# Append a new row
ws.append(["Bananas", 5, 1.20])

# Save to a new file to keep the original
wb.save("example_edited.xlsx")
print("Wrote example_edited.xlsx")

Notes:

  • Formulas are written but not evaluated by openpyxl. Excel will compute them when the file is opened.
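To make the note above concrete, here is a minimal sketch of what a freshly written formula looks like when read back. The file name formula_check.xlsx and the temporary directory are hypothetical and used only for this illustration.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

# Hypothetical scratch file; not one of the guide's example workbooks
path = Path(tempfile.mkdtemp()) / "formula_check.xlsx"

wb = Workbook()
ws = wb.active
ws.append(["Item", "Qty", "Price", "Total"])
ws.append(["Apples", 10, 1.50, "=B2*C2"])
wb.save(path)

# Default load: the cell holds the formula text
formula_text = load_workbook(path).active["D2"].value

# data_only=True: the cell holds the cached result, and openpyxl never wrote one
cached_value = load_workbook(path, data_only=True).active["D2"].value

print(formula_text)
print(cached_value)
```

The cached value only becomes available after a spreadsheet application has opened, recalculated, and saved the file.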

Common tasks (recipes)

Update specific cells

from openpyxl import load_workbook
wb = load_workbook("example_edited.xlsx")
ws = wb["Data"]
ws["B2"] = 12  # change Qty for row 2
wb.save("example_edited.xlsx")

Find and update rows by value

from openpyxl import load_workbook
wb = load_workbook("example_edited.xlsx")
ws = wb["Data"]
for row in ws.iter_rows(min_row=2):
    item_cell, qty_cell, price_cell = row[:3]
    if item_cell.value == "Oranges":
        qty_cell.value = (qty_cell.value or 0) + 2
wb.save("example_edited.xlsx")

Insert/delete rows or columns

from openpyxl import load_workbook
wb = load_workbook("example_edited.xlsx")
ws = wb["Data"]
ws.insert_rows(2, amount=1)   # insert one empty row before row 2
ws.delete_cols(3, amount=1)   # delete column C (Price)
# Note: openpyxl does not adjust formula references when rows or columns move.
wb.save("example_edited.xlsx")
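One consequence of inserting rows is worth checking before relying on this recipe: openpyxl moves cells but leaves formula text untouched. The sketch below works on an in-memory workbook and shows one possible fix-up, which is to rewrite the formulas after the structural change. The fix-up loop is an illustration, not a built-in openpyxl feature.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["Item", "Qty", "Price", "Total"])
ws.append(["Apples", 10, 1.50, "=B2*C2"])

ws.insert_rows(2)  # the Apples row moves from row 2 to row 3

# The formula moved with its row, but still points at row 2
moved_formula = ws["D3"].value

# Illustrative fix-up: regenerate the formula for every shifted data row
for r in range(3, ws.max_row + 1):
    ws[f"D{r}"] = f"=B{r}*C{r}"
fixed_formula = ws["D3"].value

print(moved_formula)
print(fixed_formula)
```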

Bulk transform with pandas and write back

import pandas as pd

# Read existing sheet
df = pd.read_excel("example_edited.xlsx", sheet_name="Data")
# Add a discounted price column (10% off); requires a "Price" column in the sheet
df["Discounted"] = df["Price"].astype(float) * 0.9

# Replace the sheet with the updated dataframe
with pd.ExcelWriter(
    "example_edited.xlsx", engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    df.to_excel(writer, sheet_name="Data", index=False)

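Tip: if_sheet_exists="replace" overwrites the entire sheet, which is safest when the shape changes. Because pandas writes plain values, formulas and cell styles that were on the sheet are not carried over. Use "overlay" only if you control ranges precisely.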
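For comparison, here is a rough sketch of the "overlay" mode, which writes into the existing sheet at a position you choose instead of replacing it. It assumes a pandas version recent enough to support if_sheet_exists="overlay" (1.4 or later, as far as I know), and the file name overlay_demo.xlsx is hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd
from openpyxl import Workbook, load_workbook

# Hypothetical scratch workbook with the same layout as the guide's example
path = Path(tempfile.mkdtemp()) / "overlay_demo.xlsx"
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["Item", "Qty", "Price"])
ws.append(["Apples", 10, 1.50])
ws.append(["Oranges", 8, 2.00])
wb.save(path)

# Compute only the new column
df = pd.read_excel(path, sheet_name="Data")
discounted = (df["Price"].astype(float) * 0.9).to_frame("Discounted")

# Overlay it at column D (startcol is zero-based), leaving columns A to C untouched
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    discounted.to_excel(writer, sheet_name="Data", startcol=3, index=False)

check = load_workbook(path)["Data"]
print([cell.value for cell in check[1]])
```

Overlay keeps whatever already sits outside the written range, so stale cells can survive if the new data is smaller than the old.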

Numbered steps: building a robust automation

  1. Choose the tool: openpyxl for structural edits; pandas for large data changes.
  2. Install dependencies: pip install openpyxl pandas.
  3. Load the workbook: load_workbook(path). Avoid data_only=True if you will save the file: formulas are loaded as their cached values, so saving would discard them.
  4. Locate targets: column names, cell coordinates, or iterate rows with iter_rows.
  5. Apply changes: set values, formulas, insert/delete rows/cols, styles.
  6. Save safely: write to a new file first; validate; then replace the original if needed.
  7. Validate output: open with Excel or read back via openpyxl/pandas. Remember that formula cells read back as None (openpyxl with data_only=True) or NaN (pandas) until Excel has recalculated and saved the file.
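Steps 6 and 7 can be sketched as a small helper. The name save_safely and its validation rule (the file must reload without error) are assumptions made for this illustration; a real pipeline would likely check more, such as sheet names or row counts.

```python
import os
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook


def save_safely(wb, target):
    """Save wb to a temp file next to target, validate it, then swap it in."""
    target = Path(target)
    fd, tmp_name = tempfile.mkstemp(suffix=".xlsx", dir=target.parent)
    os.close(fd)  # openpyxl opens the path itself
    try:
        wb.save(tmp_name)
        load_workbook(tmp_name).close()  # minimal validation: the file reloads
        os.replace(tmp_name, target)     # replace the original in one step
    except Exception:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


# Usage with a hypothetical target path
demo_dir = Path(tempfile.mkdtemp())
demo_target = demo_dir / "report.xlsx"
demo_wb = Workbook()
demo_wb.active["A1"] = "ok"
save_safely(demo_wb, demo_target)
```

Writing the temporary file into the same directory as the target keeps the final os.replace on a single filesystem.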

Pitfalls and gotchas

  • Formula calculation: openpyxl does not compute formulas. Results appear after Excel recalculates. Avoid data_only=True if you need to preserve formulas.
  • Dates and times: write Python datetime/date/time objects; avoid strings to prevent locale issues.
  • Types: Excel cells can contain numbers, strings, booleans, dates, or formulas. Explicitly cast when reading to avoid surprises.
  • Pandas formatting: to_excel writes values; most cell-level formatting is lost. Apply styles with openpyxl afterward if required.
  • File locks (Windows): make sure the target file is not open in Excel while your script saves; Excel holds a lock on it and the save typically fails with a PermissionError.
  • Merged cells: only the top-left cell of a merged range holds the value; the other cells are read-only placeholders, and assigning to them raises an error. Unmerge, write, then re-merge if needed.
  • Sheet names: must be <= 31 characters and cannot contain \ / ? * : [ ]. Validate before adding/renaming.
  • Performance traps: avoid cell-by-cell loops for huge ranges; prefer pandas or streaming modes.
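As an example of validating up front, the sheet-name rule can be checked with a small pure-Python helper. The function safe_sheet_name and its replace-then-truncate policy are hypothetical; you may prefer to reject bad names instead of rewriting them.

```python
import re

# Characters Excel does not allow in sheet names: \ / * ? : [ ]
INVALID_SHEET_CHARS = re.compile(r"[\\/*?:\[\]]")
MAX_SHEET_NAME_LEN = 31


def safe_sheet_name(name, replacement="_"):
    """Return a sheet name with invalid characters replaced and length capped."""
    cleaned = INVALID_SHEET_CHARS.sub(replacement, name)
    cleaned = cleaned[:MAX_SHEET_NAME_LEN]
    if not cleaned:
        raise ValueError("sheet name must not be empty")
    return cleaned


print(safe_sheet_name("Q1/Q2: Sales [draft]"))
print(len(safe_sheet_name("x" * 40)))
```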

Performance notes

  • Reading large sheets: open with load_workbook(path, read_only=True) and iterate with ws.iter_rows(values_only=True) to stream plain values without building cell and style objects.
  • Streaming writes: create Workbook(write_only=True) to write many rows efficiently (cannot read existing files with write_only).
  • Minimize style writes: creating a new Font/Fill object per cell is expensive. Define a NamedStyle once (or reuse a single Font/Fill object) and assign it to each cell.
  • Batch with pandas: perform vectorized operations in pandas, then write once.
  • Limit saves: call wb.save once per batch; frequent saves are slow.
  • Memory: for very large files, split work by sheet or chunk with pandas.read_excel(usecols=..., nrows=..., skiprows=...).
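The two streaming modes from the list above can be sketched together: write many rows with a write-only workbook, then read them back in read-only mode. The row count and the file name big.xlsx are arbitrary choices for the demonstration.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

path = Path(tempfile.mkdtemp()) / "big.xlsx"  # hypothetical scratch file

# Streaming write: a write-only workbook has no default sheet, so create one
wb_out = Workbook(write_only=True)
ws_out = wb_out.create_sheet("Data")
ws_out.append(["Row", "Value"])
for i in range(1, 10_001):
    ws_out.append([i, i * 2])
wb_out.save(path)

# Streaming read: rows arrive as plain tuples, one at a time
wb_in = load_workbook(path, read_only=True)
ws_in = wb_in["Data"]
total = 0
for row_number, value in ws_in.iter_rows(min_row=2, values_only=True):
    total += value
wb_in.close()  # read-only workbooks keep the file open until closed

print(total)
```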

Tiny FAQ

Q: Do I need Excel installed? A: No. openpyxl and pandas edit .xlsx files directly. You need Excel only if you want to drive the app (e.g., with xlwings) or have formulas recalculated and their results stored in the file.

Q: Can Python edit .xls files? A: Prefer converting .xls to .xlsx first. Modern libraries focus on .xlsx. Editing .xls reliably often requires platform-specific tools.

Q: How do I get formula results in Python? A: Excel computes them. If you must compute in Python, replace formulas with values you calculate. Otherwise, open and save the file in Excel so the results are stored, then load it with data_only=True.

Q: How do I preserve formatting when updating with pandas? A: You generally cannot. Write values with pandas, then re-open with openpyxl to reapply styles or structure.
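The write-then-restyle pattern from the last answer might look like the sketch below. The specific styles (bold header, grey fill, two-decimal prices, frozen header row) and the file name report.xlsx are arbitrary examples, not a recommended house style.

```python
import tempfile
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill

path = Path(tempfile.mkdtemp()) / "report.xlsx"  # hypothetical output file

# 1) pandas writes the values
df = pd.DataFrame(
    {"Item": ["Apples", "Oranges"], "Qty": [10, 8], "Price": [1.5, 2.0]}
)
df.to_excel(path, sheet_name="Data", index=False, engine="openpyxl")

# 2) openpyxl reopens the file and applies formatting
wb = load_workbook(path)
ws = wb["Data"]
header_font = Font(bold=True)
header_fill = PatternFill(fill_type="solid", start_color="DDDDDD")
for cell in ws[1]:
    cell.font = header_font
    cell.fill = header_fill
for cell in ws["C"][1:]:          # Price column, skipping the header
    cell.number_format = "0.00"
ws.freeze_panes = "A2"            # keep the header visible while scrolling
wb.save(path)

styled = load_workbook(path)["Data"]
print(styled["C2"].number_format)
```

Keeping the styling in one function that runs after every pandas write makes the formatting reproducible even though pandas itself discards it.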

What to automate next

  • Scheduled updates: run your script via cron/Task Scheduler.
  • Data pipelines: combine pandas transforms with openpyxl styling to build polished reports automatically.

Series: Automate boring tasks with Python
