Puppeteer Tutorial #4 – Create and Generate PDFs From Webpages

The ability to generate PDF documents from web pages has become a crucial feature in various applications, from generating reports to archiving web content.

Puppeteer, a popular Node.js library developed by the Chrome team, provides a simple and effective way to achieve this.

In this tutorial, we will walk through the process of using Puppeteer to generate a PDF from a web page and save it locally.

The final completed code is given at the end of the article.

1. Introduction to Generating PDFs with Puppeteer

Puppeteer’s ability to control headless or full browsers over the DevTools Protocol makes it an ideal choice for automating web interactions, including PDF generation.

Whether you need to generate invoices, capture web content, or save reports, Puppeteer simplifies the process.

2. Setting Up Puppeteer

Before we dive into the code, ensure you have Node.js installed on your system.

You can install Puppeteer using the following command:

npm install puppeteer

3. Generating a PDF

Let’s start by creating a function that takes a URL and an output file name as parameters.

This function will launch a browser, navigate to the specified URL, generate a PDF, and then close the browser.

const puppeteer = require("puppeteer");

async function generatePDF(url, outputfile) {
    let browser;

    try {
        // Launch the browser in headless mode (the default);
        // older Chrome versions can only print to PDF when headless
        browser = await puppeteer.launch();
        const page = await browser.newPage();

        // Navigate to the page
        await page.goto(url);

        // Generate a PDF
        await page.pdf({ path: outputfile, format: "A4" });
    } catch (err) {
        console.error(err);
    } finally {
        // Close the browser even if an earlier step failed
        if (browser) {
            await browser.close();
        }
    }
}
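
The function above starts from a URL. For invoices or reports that you build yourself, a variation is to hand Puppeteer an HTML string through page.setContent() instead of navigating. The following is a minimal sketch under that assumption; renderInvoiceHtml, generatePDFFromHTML, and the shape of the invoice object are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: turns invoice data into a small HTML document.
// Values are inserted as-is, so escape them first if they come from users.
function renderInvoiceHtml(invoice) {
    const rows = invoice.items
        .map((item) => `<tr><td>${item.name}</td><td>${item.price.toFixed(2)}</td></tr>`)
        .join("");

    return `<!DOCTYPE html>
<html>
  <body>
    <h1>Invoice ${invoice.number}</h1>
    <table>${rows}</table>
  </body>
</html>`;
}

// Hypothetical variant of generatePDF that takes markup instead of a URL.
async function generatePDFFromHTML(html, outputfile) {
    // Loaded inside the function so renderInvoiceHtml works without a browser
    const puppeteer = require("puppeteer");
    const browser = await puppeteer.launch();

    try {
        const page = await browser.newPage();

        // Load the markup directly instead of navigating to a URL
        await page.setContent(html, { waitUntil: "load" });

        await page.pdf({ path: outputfile, format: "A4" });
    } finally {
        // Close the browser even if rendering failed
        await browser.close();
    }
}
```

Calling generatePDFFromHTML(renderInvoiceHtml(invoice), "invoice.pdf") would then write the rendered invoice to invoice.pdf, assuming Puppeteer is installed as described in section 2.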

4. Customizing PDF Output

Puppeteer’s page.pdf() function provides various options to customize the PDF output.

In this example, we use the format option to specify the paper format as ‘A4’.
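
As a rough illustration of how the page size can be varied, the option objects below show a named format, a landscape page, and explicit dimensions. The file names are placeholders, and each object would be passed to page.pdf() in place of the one used earlier.

```javascript
// Named paper format in portrait orientation (as in the tutorial)
const a4Portrait = { path: "a4.pdf", format: "A4" };

// A different named format, rotated to landscape
const letterLandscape = { path: "letter.pdf", format: "Letter", landscape: true };

// Explicit dimensions instead of a named format; strings may carry units
const customSize = { path: "label.pdf", width: "100mm", height: "150mm" };
```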

You can explore additional options such as setting margins, headers, footers, and more.
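
To make those options concrete, here is a small sketch that sets margins, a header, a footer with page numbers, and background printing. The helper names buildPdfOptions and generateCustomPDF, along with the specific margin sizes and template markup, are assumptions chosen for illustration rather than recommended values.

```javascript
// Hypothetical helper: builds an options object for page.pdf().
function buildPdfOptions(outputfile) {
    return {
        path: outputfile,
        format: "A4",
        // Include CSS background colors and images in the PDF
        printBackground: true,
        // Leave room for the header and footer
        margin: { top: "20mm", right: "15mm", bottom: "20mm", left: "15mm" },
        displayHeaderFooter: true,
        // Puppeteer fills elements with the classes "title", "pageNumber",
        // and "totalPages" when it prints each page
        headerTemplate:
            '<div style="font-size:9px; width:100%; text-align:center;">' +
            '<span class="title"></span></div>',
        footerTemplate:
            '<div style="font-size:9px; width:100%; text-align:center;">' +
            'Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
    };
}

// Hypothetical variant of generatePDF that applies the options above.
async function generateCustomPDF(url, outputfile) {
    // Loaded inside the function so buildPdfOptions can be used on its own
    const puppeteer = require("puppeteer");
    const browser = await puppeteer.launch();

    try {
        const page = await browser.newPage();
        await page.goto(url);
        await page.pdf(buildPdfOptions(outputfile));
    } finally {
        // Close the browser even if PDF generation failed
        await browser.close();
    }
}
```

Keeping the options in a separate function makes it easy to adjust the layout in one place and reuse it across several documents.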

5. Final Complete Code

The final completed code is shared below for your reference.

const puppeteer = require("puppeteer");

async function generatePDF(url, outputfile) {
    let browser;

    try {
        // Launch the browser in headless mode (the default);
        // older Chrome versions can only print to PDF when headless
        browser = await puppeteer.launch();
        const page = await browser.newPage();

        // Navigate to the page
        await page.goto(url);

        // Generate a PDF
        await page.pdf({ path: outputfile, format: "A4" });
    } catch (err) {
        console.error(err);
    } finally {
        // Close the browser even if an earlier step failed
        if (browser) {
            await browser.close();
        }
    }
}

const url = "http://google.com";
const outputfile = "output.pdf";

generatePDF(url, outputfile);

Generating PDFs from web pages with Puppeteer lets you automate document creation for purposes such as reports, invoices, and archived web content.

In this tutorial, we learned how to set up Puppeteer, generate a PDF from a web page, and customize the output format.

Whether you’re building a web application or need to automate document generation, Puppeteer’s capabilities offer a reliable and efficient solution.

As you continue to explore Puppeteer’s features, you’ll discover even more ways to automate web interactions, data extraction, and document creation.

Happy coding!