Heroku: Use Selenium to run Google Chrome in a Python Script

Author: Jacek Trociński, Last Revision: 2020.11.17, Topic area: Heroku

Introduction

You can run Google Chrome on Heroku using Python and Selenium to automate browser tasks, take screenshots and much more. Here’s how to set up a Heroku app and a Python script to run Google Chrome.

Add Google Chrome Buildpacks to your App

In your Heroku app, go to Settings and add the following buildpacks:

https://github.com/heroku/heroku-buildpack-chromedriver

https://github.com/heroku/heroku-buildpack-google-chrome

Create a Python Script

Create a Python script that will use Selenium to run Chrome:

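# foo.py
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

gChromeOptions = webdriver.ChromeOptions()
# Chrome expects WIDTH,HEIGHT (comma-separated) for the window size.
gChromeOptions.add_argument("--window-size=1920,1480")
# Keep Chrome's shared memory out of /dev/shm, which is too small on Heroku.
gChromeOptions.add_argument("--disable-dev-shm-usage")
# Selenium 4 takes options= and a Service object; the older chrome_options=
# and executable_path= keyword arguments have been removed.
gDriver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()), options=gChromeOptions
)
gDriver.get("https://www.python.org/")
time.sleep(3)  # crude fixed wait so the page can finish rendering
gDriver.save_screenshot("my_screenshot.png")
gDriver.quit()  # quit() also shuts down the chromedriver process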

In the script above, the Chrome option disable-dev-shm-usage is added to avoid the error "session deleted because of page crash". This error occurs because /dev/shm is too small on Heroku to run Chrome; with the option set, Chrome writes its shared memory files to /tmp instead.
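A page crash, or any other exception raised partway through the script, would skip the final cleanup call and leave a Chrome process running on the dyno. One way to guard against that is sketched below. The helper name managed_driver and its create_driver parameter are illustrative assumptions, not part of Selenium; the helper only relies on the driver having a quit() method.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(create_driver):
    """Yield the driver returned by create_driver() and always quit it on exit.

    create_driver is a hypothetical zero-argument callable, for example a
    lambda wrapping the webdriver.Chrome(...) call from foo.py.
    """
    driver = create_driver()
    try:
        yield driver
    finally:
        # Runs on success and on error, so Chrome never outlives the script.
        driver.quit()


# Usage sketch (assumes the gChromeOptions object built in foo.py):
# with managed_driver(lambda: webdriver.Chrome(options=gChromeOptions)) as gDriver:
#     gDriver.get("https://www.python.org/")
#     gDriver.save_screenshot("my_screenshot.png")
```

Passing a factory instead of a ready-made driver keeps the helper independent of how the driver is constructed.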

The window-size option is added so that screenshots can be saved while running Chrome in --headless mode; it sets the size of the browser window, and therefore of the screenshot.

One thing to note is that Heroku’s filesystem is ephemeral: files written by the script exist only for the lifetime of the dyno and are discarded when it stops or restarts. To save files permanently I recommend using AWS S3, because the boto3 library for it is well documented.
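As a rough illustration of the S3 suggestion, the sketch below uploads the screenshot with the boto3 S3 client's upload_file call. The environment variable S3_BUCKET_NAME, the screenshots/ key prefix and the helper names are assumptions made for this example; boto3 is assumed to pick up AWS credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY config vars.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def build_object_key(path, now=None):
    """Prefix the file name with a UTC timestamp so runs do not overwrite each other."""
    now = now or datetime.now(timezone.utc)
    return f"screenshots/{now:%Y%m%dT%H%M%SZ}-{Path(path).name}"


def upload_screenshot(s3_client, path, bucket, key=None):
    """Upload a local file to S3 and return the object key that was used."""
    key = key or build_object_key(path)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client's upload call.
    s3_client.upload_file(str(path), bucket, key)
    return key


def main():
    # Imported here so the helpers above can be exercised without boto3 or AWS.
    import boto3

    bucket = os.environ["S3_BUCKET_NAME"]  # hypothetical Heroku config var
    key = upload_screenshot(boto3.client("s3"), "my_screenshot.png", bucket)
    print(f"Uploaded to s3://{bucket}/{key}")
```

Calling main() at the end of foo.py, after the screenshot has been saved, would copy the file to S3 before the dyno's filesystem is discarded.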

Create a Procfile

For a Python script to run on Heroku, a Procfile is needed. The Procfile defines the process type of the dyno you wish to run. There is a common misconception that a script must run in a web dyno for Python to have access to the internet. A web dyno is only needed for apps that have a web interface, for example a Flask app. Since the Python script above has no web interface, we will define the process type as worker. Running it as a web dyno would result in a Boot Timeout error (R10), because Heroku expects a web process to bind to $PORT within 60 seconds of starting.

worker: python foo.py

Conclusion

Heroku can be used as a platform to run automated tasks in Google Chrome via the Selenium library in Python. The steps described in this post should help get you started building a Python app using Selenium that runs without error on Heroku.