Writing Word documents with Python
Writing Word (.docx) files in Python is easy with python-docx. Here’s a quick guide to your first project.
This tutorial was written using Python 3.7.4 on Windows 10 with python-docx 0.8.10. Other versions and environments, including Anaconda, Linux or Mac, should give similar results as long as they’re relatively up-to-date.
Key concepts
Understanding how Word organizes content makes python-docx much less frustrating to use. Content sits at three levels:
- The document object is the highest level file object. Attributes for the document include the file name and other file-level properties.
- The middle ‘paragraph’ level includes objects like paragraphs, tables, pictures, headings, page breaks, and sections. Using paragraph as an example, attributes include line spacing, space before/after, etc.
- The lowest level is the ‘text’ level, which python-docx calls a run. This is where font attributes (color, typeface, size) are set.
Behind the scenes, python-docx creates a Word (.docx) document using XML that can be interpreted by Word. I think of python-docx as a really powerful and easier-to-master mail merge. The basic file structure is created by python-docx in the background so I can focus on adding content in an automated and repeatable fashion.
Step 1: Install python-docx
From the command line (within the virtual environment if you’re using one), use pip to install.
pip install python-docx
Step 2: Write the code
Create a docx-demo.py file and add the code below.
Here are a few notes on the above code:
- Import docx, not python-docx. The package is installed as python-docx, but the module it provides is named docx.
- The font size is set with Pt, a class in the docx.shared module. Typing “docx.shared.Pt(12)” to set the font size is a little cryptic. Just be aware that this is one way to do it, and there are a lot of other options that are similarly buried in the documentation. Hoping this example helps someone!
Refer to the python-docx docs for full API documentation and available options, for example the paragraph and paragraph format options.
Step 3: Run the code
Run the code from the command line.
python docx-demo.py
Here’s what the output looks like: