Python Split and Merge PDF with PyMuPDF: A Complete Guide

April 15, 2020


Python can split a large pdf file into several smaller ones, and it can also merge several small pdf files into one larger file. In this tutorial, we will show how to split and merge pdf files using the python pymupdf library.


First, install the python pymupdf library.

pip install pymupdf

Open a source pdf file

To split or merge a pdf file, you must first open a source pdf. With pymupdf, we can open a pdf file like this:

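import sys, fitz

file = '231420-digitalimageforensics.pdf'
try:
    # open the source pdf; this raises an exception if the file cannot be opened
    doc = fitz.open(file)
except Exception as e:
    print(e)
    sys.exit(1)

# newer pymupdf releases name this property page_count
page_count  = doc.pageCount
print(page_count)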

Running this code shows that the source document (231420-digitalimageforensics.pdf) has 199 pages in total.

Next, we can copy some pages from the source pdf into a new pdf.

To split or merge pdf files in pymupdf, we can use the Document.insertPDF() function (renamed Document.insert_pdf() in newer pymupdf releases).

insertPDF(docsrc, from_page=-1, to_page=-1, start_at=-1, rotate=-1, links=True, annots=True)

This function selects some pages from docsrc and inserts them into the pdf document on which it is called.

The index of pages in a pdf document

In pymupdf, page indices start at 0, which means a valid page index lies in [0, total_page - 1].

This is very important if you plan to select some pages from a source pdf file.
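Since pages are usually referred to by their 1-based page number (page 1, page 2, ...), a small helper can make the conversion to 0-based indices explicit. The following is only a minimal sketch: page_number_to_index is a hypothetical name, not part of pymupdf, and the page count of 199 is the one reported for the example document.

```python
def page_number_to_index(page_number, page_count):
    """Convert a 1-based page number to a 0-based page index.

    Hypothetical helper for illustration; it is not part of pymupdf.
    """
    if not 1 <= page_number <= page_count:
        raise ValueError(
            f"page number must be in [1, {page_count}], got {page_number}"
        )
    return page_number - 1


# The example document has 199 pages, so valid indices are 0..198.
print(page_number_to_index(1, 199))    # index of the first page
print(page_number_to_index(199, 199))  # index of the last page
```

Validating the number before subtracting 1 catches off-by-one mistakes early, before an invalid index reaches the pdf library.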

Important parameters explained

docsrc: the opened source pdf document, from which we select the pages in [from_page, to_page].

For example, [from_page = 3, to_page = 5] means we will select 3 pages (page 4, page 5, page 6) from the source pdf.

from_page: int, the index of the first page to select in docsrc.

to_page: int, the index of the last page to select in docsrc. Note that this page is also included in the selection.

start_at: int, this parameter determines where the pages from docsrc are inserted in the destination pdf. The default, -1, appends them to the end.

For example, start_at = 1 means we will insert the pages from docsrc between page index 0 and page index 1 of the destination pdf file.

Meanwhile, start_at should be smaller than the total number of pages in the destination pdf file.
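To see how from_page, to_page and start_at interact, it can help to model the pages as plain Python lists. The sketch below is only an illustration of the behaviour described above, not pymupdf's implementation: simulate_insert and the page labels are hypothetical, it assumes that -1 means "use the default" (first page, last page, append at the end), and it only models ranges where from_page <= to_page.

```python
def simulate_insert(dest, src, from_page=-1, to_page=-1, start_at=-1):
    """Model insertPDF() on plain lists of page labels.

    Illustration only, not pymupdf's implementation.
    Assumption: -1 selects the default (first page, last page,
    append at the end); only forward ranges are modelled.
    """
    if from_page < 0:
        from_page = 0
    if to_page < 0:
        to_page = len(src) - 1
    if start_at < 0:
        start_at = len(dest)
    selected = src[from_page:to_page + 1]  # to_page is inclusive
    return dest[:start_at] + selected + dest[start_at:]


src = [f"src-{i}" for i in range(10)]  # a 10-page source
dest = ["dest-0", "dest-1", "dest-2"]  # a 3-page destination

result = simulate_insert(dest, src, from_page=3, to_page=5, start_at=1)
print(result)
# ['dest-0', 'src-3', 'src-4', 'src-5', 'dest-1', 'dest-2']
```

The three selected pages land between index 0 and index 1 of the destination, and the remaining destination pages shift back by three positions.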

For example:

doc2 = fitz.open("new-doc-1.pdf")
doc2.insertPDF(doc, from_page = 3, to_page = 5, start_at = 1)
doc2.save("new-doc-4.pdf")

This code selects 3 pages (page indices 3 to 5) from 231420-digitalimageforensics.pdf, inserts them right after the first page of new-doc-1.pdf, and saves the result as a new pdf document, new-doc-4.pdf.

In this way, the same function both splits pages out of one pdf file and merges them with another pdf file to create a new one.
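To split a whole document into several smaller files, we first need the page ranges for each output file. The following is a minimal sketch of that planning step in plain Python; chunk_ranges and the chunk size of 50 are hypothetical choices for illustration and are not part of pymupdf.

```python
def chunk_ranges(page_count, chunk_size):
    """Return inclusive (from_page, to_page) index pairs covering all pages.

    Hypothetical helper for illustration; it is not part of pymupdf.
    """
    if page_count < 0 or chunk_size < 1:
        raise ValueError("page_count must be >= 0 and chunk_size >= 1")
    ranges = []
    for first in range(0, page_count, chunk_size):
        # the last chunk may be shorter than chunk_size
        last = min(first + chunk_size, page_count) - 1
        ranges.append((first, last))
    return ranges


# The 199-page example document, split into files of at most 50 pages.
for first, last in chunk_ranges(199, 50):
    print(first, last)
```

Each pair can then be passed as from_page and to_page to insertPDF() on a new, empty document (fitz.open() without arguments creates one), which is then saved under its own file name.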