Heroku: Use Selenium to run Google Chrome in a Python Script
Author: Jacek Trociński, Last Revision: 2020.11.17, Topic area: Heroku
Introduction
You can run Google Chrome on Heroku using Python and Selenium in order to automate browser tasks, take screenshots and much more. Here’s how to set up a Heroku app and a Python script to run Google Chrome.
Add Google Chrome Buildpacks to your App
In your Heroku app go to Settings and add the following Buildpacks:
https://github.com/heroku/heroku-buildpack-chromedriver
https://github.com/heroku/heroku-buildpack-google-chrome
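Once both buildpacks are installed, it can help to confirm at startup that the binaries are actually visible to Python. The sketch below is a minimal example under two assumptions that this post does not confirm: that the chromedriver buildpack places a chromedriver executable on PATH, and that the Chrome buildpack exports the browser location in an environment variable named GOOGLE_CHROME_BIN. The function name locate_browser_binaries is hypothetical.

```python
import os
import shutil


def locate_browser_binaries(env=None, search_path=None):
    """Return (chrome_path, chromedriver_path); either may be None if not found.

    Assumptions (not confirmed by this post): the Chrome buildpack exports
    GOOGLE_CHROME_BIN, and the chromedriver buildpack adds chromedriver to PATH.
    """
    env = os.environ if env is None else env
    chrome_path = env.get("GOOGLE_CHROME_BIN")
    # shutil.which searches search_path, or the PATH variable when it is None.
    driver_path = shutil.which("chromedriver", path=search_path)
    return chrome_path, driver_path


if __name__ == "__main__":
    chrome, driver = locate_browser_binaries()
    print("Chrome binary:", chrome or "not found")
    print("chromedriver:", driver or "not found")
```

If both values come back, the chromedriver path could probably be handed to Selenium directly instead of downloading a driver at run time, though that depends on the installed Chrome and chromedriver versions matching.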
Create a Python Script
Create a Python script that will use Selenium to run Chrome:
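# foo.py
import time

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

# Chrome command-line options (explained below the script)
gChromeOptions = webdriver.ChromeOptions()
gChromeOptions.add_argument("window-size=1920x1480")
gChromeOptions.add_argument("disable-dev-shm-usage")

# chrome_options= and executable_path= are Selenium 3 keyword arguments;
# Selenium 4 replaces them with options= and a Service object.
gDriver = webdriver.Chrome(
    chrome_options=gChromeOptions,
    executable_path=ChromeDriverManager().install(),
)

# Load the page, give it a few seconds to render, then save a screenshot
gDriver.get("https://www.python.org/")
time.sleep(3)
gDriver.save_screenshot("my_screenshot.png")
gDriver.close()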
In the script above, the Chrome option disable-dev-shm-usage is added in order to avoid the error "session deleted because of page crash". This error occurs because /dev/shm is too small on Heroku to run Chrome.
The window-size option is added so that screenshots can be saved while running Chrome in --headless mode.
One thing to note is that Heroku’s filesystem is ephemeral, which means files are saved only for the duration of a dyno run and are cleaned up soon after. In order to save files permanently, I recommend using AWS S3 because of its well-documented boto3 library.
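To make the S3 suggestion above concrete, here is a minimal sketch of uploading the screenshot with boto3 before the dyno shuts down. The bucket name, the screenshots/ key prefix, and the helper names build_s3_key and upload_screenshot are hypothetical, and the sketch assumes AWS credentials are available to boto3, for example through Heroku config vars.

```python
import os


def build_s3_key(local_path, prefix="screenshots"):
    """Return an object key such as 'screenshots/my_screenshot.png'."""
    return f"{prefix.strip('/')}/{os.path.basename(local_path)}"


def upload_screenshot(local_path, bucket, s3_client=None, prefix="screenshots"):
    """Upload a local file to S3 and return the key it was stored under.

    bucket is a hypothetical bucket name that you must create beforehand.
    """
    if s3_client is None:
        # Imported lazily so the helper can be exercised without boto3 installed.
        import boto3

        s3_client = boto3.client("s3")
    key = build_s3_key(local_path, prefix)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call for local files.
    s3_client.upload_file(local_path, bucket, key)
    return key
```

In foo.py, this could be called right after save_screenshot, for example upload_screenshot("my_screenshot.png", "example-bucket"), so the file is copied out before the dyno's filesystem is discarded.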
Create a Procfile
In order for a Python script to run on Heroku, a Procfile is needed. The Procfile defines the process type of the dyno you wish to run. There is a misconception that a script must run in a web dyno in order for Python to have access to the internet. A web dyno is only needed for apps that have a web interface, for example a Flask app. Since the Python script above has no web interface, we will define the process type as worker. A web dyno would result in a Heroku Boot Timeout (Error R10), because a web process is expected to bind to the port Heroku assigns within 60 seconds of starting, which this script never does.
worker: python foo.py
Conclusion
Heroku can be used as a platform to run automated tasks in Google Chrome via the Selenium library in Python. The steps described in this post should help get you started building a Python app using Selenium that runs without error on Heroku.