Unlike some other languages, such as Ruby or PHP, Python has relatively few PDF conversion libraries. The lack of a given tool can be a real problem, but it can also bring about change. In fact, we created DocRaptor in 2010 when we couldn’t find any HTML-to-PDF libraries that could meet our requirements (in any language).
Over a decade later, we still think DocRaptor is the best online HTML to PDF API, but that doesn’t mean it’s right for every project. The Python HTML to PDF tools that do exist are well-featured, open-source libraries, and they’re worthy of consideration.
To help you find the right one, we have curated a list of the top Python PDF generation libraries. Based on our PDF conversion experience, we review some of the top benefits of and concerns about each library. We haven’t used all of these in a production environment, so you should do your own testing and research, but we trust this is a good starting point.
We’re biased, but we love DocRaptor. It takes just minutes to start creating documents with our HTML to PDF Python agent, and you have the option to sign up for a free plan or just use our public API key. Our API-based approach eliminates maintenance time and scalability concerns (which are more severe for PDF conversion jobs than most web server tasks).
Our partnership with the Prince commercial PDF library means we have the best support for PDF-specific functionality, such as advanced headers and footers, footnotes, fine-tuned page-break controls, forms, accessible PDFs, printer’s marks, varying page sizes, and much more. DocRaptor also has better CSS and JavaScript support than any non-Chromium-based open-source HTML-to-PDF library.
Python-based WeasyPrint has more PDF-specific features than any other open-source library, regardless of language. It supports more complex PDF functionality, such as varying headers and footers, than most open-source tools. Unfortunately, it lacks any support for JavaScript; WeasyPrint relies solely on HTML and CSS. This blocks most chart libraries, dynamic tables of contents, and many other advanced content options. WeasyPrint also lacks support for PDF forms. We've created an in-depth guide comparing DocRaptor and WeasyPrint.
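To make the "varying headers and footers" point concrete, here is a minimal sketch of a WeasyPrint conversion. It assumes WeasyPrint is installed (`pip install weasyprint`); the helper names `build_html` and `render_pdf`, the stylesheet, and the sample content are our own illustrations, not part of WeasyPrint.

```python
from html import escape

# CSS Paged Media rules: a running header that follows the current chapter
# title, plus a page counter in the footer. Values are illustrative.
PAGE_CSS = """
@page {
  size: A4;
  margin: 2cm;
  @top-center { content: string(chapter); }
  @bottom-right { content: "Page " counter(page) " of " counter(pages); }
}
h1 { string-set: chapter content(); }
section + section { break-before: page; }
"""


def build_html(chapters):
    """Return an HTML document with one <section> per (title, body) pair."""
    sections = "".join(
        f"<section><h1>{escape(title)}</h1><p>{escape(body)}</p></section>"
        for title, body in chapters
    )
    return (
        "<!DOCTYPE html><html><head>"
        f"<style>{PAGE_CSS}</style></head><body>{sections}</body></html>"
    )


def render_pdf(html, path):
    """Write the HTML string to a PDF file with WeasyPrint."""
    # Imported here so build_html() can be used without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=html).write_pdf(path)


if __name__ == "__main__":
    doc = build_html([("Introduction", "Hello."), ("Usage", "More text.")])
    render_pdf(doc, "weasyprint-example.pdf")
```

Because everything is expressed in HTML and CSS, there is no place to run a charting script; any charts would need to be pre-rendered as images or SVG before conversion.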
A number of libraries let you access headless browsers, including their Export to PDF functionality. They support all the latest JavaScript and CSS functionality, but at the cost of PDF-specific features. Browsers are based on the concept of a single continuously scrolling page, not the multiple pages contained in a PDF document. For that reason, they have poor support for headers and footers, page breaks, footnotes, and similar features.
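As one possible illustration, the sketch below drives headless Chromium through Playwright for Python. Playwright is our choice for the example, not a library named above, and it assumes `pip install playwright` followed by `playwright install chromium`. The helper names and footer markup are hypothetical.

```python
from html import escape


def pdf_options(title):
    """Build keyword arguments for Playwright's page.pdf() call."""
    # Chromium fills in elements with the classes pageNumber and totalPages.
    footer = (
        '<div style="font-size:8px; width:100%; text-align:center;">'
        f"{escape(title)} - "
        '<span class="pageNumber"></span>/<span class="totalPages"></span>'
        "</div>"
    )
    return {
        "format": "A4",
        "print_background": True,
        "display_header_footer": True,
        "header_template": "<span></span>",  # empty header
        "footer_template": footer,
        "margin": {"top": "20mm", "bottom": "20mm"},
    }


def html_to_pdf(html, path, title="Report"):
    """Render an HTML string to PDF using headless Chromium."""
    # Imported here so pdf_options() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Wait for network activity to settle so scripts can finish drawing.
        page.set_content(html, wait_until="networkidle")
        page.pdf(path=path, **pdf_options(title))
        browser.close()


if __name__ == "__main__":
    html_to_pdf("<h1>Hello</h1><p>Rendered by Chromium.</p>", "browser.pdf")
```

Note that the footer is a single HTML template applied to every page, which is a fair picture of how limited header and footer control is in this approach.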
Python-PDFKit is an adaption of the Ruby PDFKit library. It uses the wkhtmltopdf library for the HTML-to-PDF conversion. Unfortunately, wkhtmltopdf is based on an ancient version of WebKit and lacks support for many modern CSS and JavaScript features, such as flexbox. For that reason, we don’t recommend Python-PDFKit unless you know you only need to generate extremely simple documents. We have an entire list of wkhtmltopdf alternatives.
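For a very simple document, a Python-PDFKit conversion might look like the sketch below. It assumes both the `pdfkit` package and the wkhtmltopdf binary are installed and on your PATH; the helper names, option values, and sample HTML are illustrative only. The sample deliberately uses a table for layout rather than flexbox.

```python
SIMPLE_HTML = """<!DOCTYPE html>
<html><head><meta charset="utf-8"></head>
<body>
  <h1>Receipt</h1>
  <table border="1" cellpadding="4">
    <tr><th>Item</th><th>Price</th></tr>
    <tr><td>Widget</td><td>9.50</td></tr>
  </table>
</body></html>"""


def wkhtmltopdf_options(page_size="A4", margin="20mm"):
    """Return wkhtmltopdf command-line flags as a pdfkit options dict."""
    # Keys are wkhtmltopdf flag names without the leading dashes.
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def html_to_pdf(html, path):
    """Convert an HTML string to a PDF file via wkhtmltopdf."""
    # Imported here so the options helper works without pdfkit installed.
    import pdfkit

    pdfkit.from_string(html, path, options=wkhtmltopdf_options())


if __name__ == "__main__":
    html_to_pdf(SIMPLE_HTML, "pdfkit-example.pdf")
```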
ReportLab has a lightweight open-source option and a more expansive commercial version—which supports a custom XML-to-PDF-esque conversion option. Their open-source library is very popular, but it doesn’t contain any HTML-to-PDF functionality. Instead, you’re required to build your PDF element-by-element, paragraph-by-paragraph. This generation method allows you to build a PDF of any complexity, but it will require a lot more coding than an HTML-based generator.
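To show what building a PDF element by element looks like, here is a small sketch using the open-source ReportLab library (`pip install reportlab`). The `table_data` and `build_pdf` helpers and the invoice content are hypothetical examples of ours.

```python
def table_data(items):
    """Turn (name, quantity, unit_price) tuples into rows of table cells."""
    rows = [["Item", "Qty", "Unit price", "Total"]]
    grand_total = 0.0
    for name, qty, price in items:
        line_total = qty * price
        grand_total += line_total
        rows.append([name, str(qty), f"{price:.2f}", f"{line_total:.2f}"])
    rows.append(["", "", "Total", f"{grand_total:.2f}"])
    return rows


def build_pdf(path, title, items):
    """Assemble a title, a spacer, and a table, then write the PDF."""
    # Imported here so table_data() works without ReportLab installed.
    from reportlab.lib import colors
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import (
        Paragraph, SimpleDocTemplate, Spacer, Table, TableStyle,
    )

    styles = getSampleStyleSheet()
    table = Table(table_data(items))
    table.setStyle(TableStyle([
        ("BACKGROUND", (0, 0), (-1, 0), colors.lightgrey),  # header row
        ("GRID", (0, 0), (-1, -1), 0.5, colors.grey),       # cell borders
        ("ALIGN", (1, 1), (-1, -1), "RIGHT"),               # numeric columns
    ]))
    # The "story" is the ordered list of elements placed on the pages.
    story = [Paragraph(title, styles["Title"]), Spacer(1, 12), table]
    SimpleDocTemplate(path, pagesize=A4).build(story)


if __name__ == "__main__":
    build_pdf("invoice.pdf", "Invoice", [("Widget", 2, 9.5), ("Gadget", 1, 20.0)])
```

Even this small table needs explicit styling commands for things HTML and CSS would handle in a few lines, which is the trade-off described above.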
xhtml2pdf builds upon ReportLab’s open-source library and several other tools to create an actual HTML-to-PDF solution. While it supports some advanced functionality, such as named pages, xhtml2pdf uses non-standard nomenclature and CSS. This creates a steeper learning curve than with tools that support the CSS Paged Media specifications, such as DocRaptor and WeasyPrint.
xhtml2pdf lacks support for JavaScript, like WeasyPrint. Compared to xhtml2pdf, WeasyPrint appears to have a larger feature set and more expansive documentation. For that reason, we’d generally recommend WeasyPrint over xhtml2pdf.
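The non-standard CSS is easiest to see in an example. The sketch below assumes xhtml2pdf is installed (`pip install xhtml2pdf`) and uses its `@frame` rules and `-pdf-frame-content` property to place a header; the frame sizes, element id, and helper names are illustrative choices of ours.

```python
from html import escape

# xhtml2pdf lays pages out with @frame boxes rather than the margin boxes
# defined by CSS Paged Media. Positions are in points on an A4 page.
FRAME_CSS = """
@page {
  size: a4 portrait;
  @frame header_frame {
    -pdf-frame-content: header_content;
    left: 50pt; width: 495pt; top: 30pt; height: 40pt;
  }
  @frame content_frame {
    left: 50pt; width: 495pt; top: 80pt; height: 680pt;
  }
}
"""


def build_html(header, body):
    """Return a document whose header div is pulled into header_frame."""
    return (
        f"<html><head><style>{FRAME_CSS}</style></head><body>"
        f'<div id="header_content">{escape(header)}</div>'
        f"<p>{escape(body)}</p></body></html>"
    )


def render_pdf(html, path):
    """Convert the HTML string to a PDF; return True when no errors occur."""
    # Imported here so build_html() works without xhtml2pdf installed.
    from xhtml2pdf import pisa

    with open(path, "wb") as out:
        status = pisa.CreatePDF(html, dest=out)
    return not status.err


if __name__ == "__main__":
    ok = render_pdf(build_html("Monthly report", "Body text."), "xhtml2pdf.pdf")
    print("converted" if ok else "conversion reported errors")
```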
Overall, Python has only a few well-supported PDF conversion options, but its open-source libraries tend to have more powerful features than open-source HTML-to-PDF libraries in other languages.
When selecting a PDF generator, consider factors such as:
Use these questions to guide your choice, and try creating your PDF in multiple libraries to compare their output and ease of use. We’d recommend starting with the most complex elements of your PDF, since support for edge cases tends to vary the most between generators.
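One way to run that comparison is a small harness that feeds the same HTML to each candidate and records what happened. This is only a sketch: `compare_renderers` is a hypothetical helper, and it assumes you wrap each library in a function that takes `(html, path)` and writes a PDF to that path.

```python
import time
from pathlib import Path


def compare_renderers(html, renderers, out_dir):
    """Run each renderer on the same HTML and collect timing and file size.

    `renderers` maps a label to a callable taking (html, output_path).
    Failures, including a missing library, are recorded rather than raised.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, render in renderers.items():
        target = out / f"{name}.pdf"
        start = time.perf_counter()
        try:
            render(html, str(target))
            results[name] = {
                "ok": True,
                "seconds": time.perf_counter() - start,
                "bytes": target.stat().st_size,
            }
        except Exception as exc:
            results[name] = {
                "ok": False,
                "seconds": time.perf_counter() - start,
                "error": repr(exc),
            }
    return results


if __name__ == "__main__":
    def placeholder(html, path):
        # Stand-in renderer; swap in a real library wrapper here.
        Path(path).write_bytes(b"%PDF-1.4\n" + html.encode("utf-8"))

    for label, outcome in compare_renderers(
        "<h1>Test</h1>", {"placeholder": placeholder}, "pdf-comparison"
    ).items():
        print(label, outcome)
```

Timing and file size only tell part of the story, so you will still want to open each PDF and check the layout by eye.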
If we can be of any assistance in your research, please reach out at support@docraptor.com. Good luck and happy PDFing!