Headless Chrome on Heroku

I’ve been experimenting with headless Chrome for a Link Unshortener tool I’ve built to take screenshots of websites. I’ve been using Browsershot, which is great. It’s a PHP wrapper around Puppeteer that makes it simple to use in Laravel. To experiment more with Puppeteer, I wanted to get a Node app running on Heroku. Overall it’s pretty straightforward, but there are a few gotchas.

Here is a sample project that should get you started. Clone this and take a look at the source. You’ll notice I specified the Node.js version in the package.json file. This tells Heroku which version of Node to use. Another issue I ran into is that to run Puppeteer on Heroku you have to specify --no-sandbox. The last hurdle was adding the Puppeteer Heroku buildpack. Follow the steps below and you should have a working screenshot app running locally and on Heroku. These instructions assume you are familiar with Node and Heroku.
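
To make the --no-sandbox point concrete, here is a rough sketch of how a screenshot helper might launch Puppeteer on Heroku. It is not copied from the sample project: `launchOptions` and `takeScreenshot` are hypothetical names, and `--disable-setuid-sandbox` is an extra flag that several commenters below report passing alongside `--no-sandbox`.

```javascript
// Hypothetical helper, not the sample project's actual code.
// Chrome flags for Heroku, where the sandbox cannot be used.
function launchOptions() {
  return {
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  };
}

// Takes a PNG screenshot of `url` and resolves with the image data.
async function takeScreenshot(url) {
  // Required lazily so launchOptions() can be inspected without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(launchOptions());
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.screenshot({ type: 'png' });
  } finally {
    // Always close the browser so Chrome processes do not pile up on the dyno.
    await browser.close();
  }
}
```

Closing the browser in a `finally` block is a design choice rather than a requirement; it just keeps a failed page load from leaving Chrome running.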

Setup Steps

  1. Clone sample project: git clone git@github.com:timleland/headless-chrome.git
  2. Make sure you are in the correct directory: cd headless-chrome
  3. Install node dependencies: npm install
  4. Run the app to test: npm start
  5. Open your browser to: http://localhost:8080/?url=google.com
  6. You should have downloaded a PNG screenshot of google.com
  7. To deploy to Heroku, make sure you have the Heroku CLI tool installed
  8. Create a new app on Heroku: heroku create APPNAMEHERE
  9. Add the puppeteer heroku buildpack: heroku buildpacks:add https://github.com/jontewks/puppeteer-heroku-buildpack
  10. Deploy to Heroku: git push heroku master
  11. Now append the query string ?url=site to your Heroku app’s URL.
  12. You should now have a working Heroku app that will screenshot any URL you send it.
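
Steps 5 and 11 both pass the target site through the `url` query string. As an illustration only, one way a handler might turn a request path like `/?url=google.com` into something Puppeteer can open is sketched below. `targetFromRequest` is a hypothetical name, and defaulting bare hostnames to `http://` is my assumption, not necessarily what the sample project does.

```javascript
// Hypothetical helper: pulls the `url` query parameter out of a request path
// such as "/?url=google.com" and returns an absolute URL, or null if absent.
function targetFromRequest(requestPath) {
  // The base is only there so the URL parser accepts a relative path.
  const parsed = new URL(requestPath, 'http://localhost');
  const raw = parsed.searchParams.get('url');
  if (!raw) return null;
  // Assumption: bare hostnames like "google.com" default to http://.
  return /^https?:\/\//i.test(raw) ? raw : `http://${raw}`;
}
```

Returning `null` when the parameter is missing lets the caller answer requests such as `/favicon.ico` without launching Chrome at all.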

Leave your questions and feedback in the comments below.

 



49 thoughts to “Headless Chrome on Heroku”

      1. @Tim

        Yeah, sorry I meant: `heroku buildpacks:add heroku/nodejs` (Note the `add` instead of `set` as `set` seems to overwrite the existing buildpacks)

        Thanks

    1. I am using the same commands but it is giving me the following error ->
      Error: Failed to launch the browser process
      [20:20:0610/055305.613555:ERROR:browser_main_loop.cc(1425)] Unable to open X display.

  1. Hello!

    Thanks for the tutorial in the first place. I’ve followed the steps but I got the following error in my console:

    /app/node_modules/puppeteer/.local-chromium/linux-508693/chrome-linux/chrome: error while loading shared libraries: libcairo-gobject.so.2: cannot open shared object file: No such file or directory

    Do you know anything about this issue?

      1. I’ve already done that and it’s still not working. I’ve added the libcairo-gobject to the buildpack and I’ve managed to get it to run.

  2. Thank you.
    #1
    2018-01-08T21:50:44.412485+00:00 heroku[router]: at=error code=H14 desc="No web processes running" method=GET path="/?url=google.com" host=infinite-headland-44709.herokuapp.com request_id=85fa1725-7ee0-4404-bb19-8dd54fbfffbe fwd="24.5.137.98" dyno= connect= service= status=503 bytes= protocol=https
    2018-01-08T21:50:44.862414+00:00 heroku[router]: at=error code=H14 desc="No web processes running" method=GET path="/favicon.ico" host=infinite-headland-44709.herokuapp.com request_id=46644015-4aed-4038-a941-80321c466eb0 fwd="24.5.137.98" dyno= connect= service= status=503 bytes= protocol=https

    #2 Tried doing this
    heroku ps:scale web=1

    got

    Scaling dynos… !
    ▸ Couldn’t find that process type.

    #4 Tried removing buildpacks
    heroku buildpacks:remove heroku/nodejs
    heroku buildpacks:remove https://github.com/jontewks/puppeteer-heroku-buildpack

    #5 Tried doing this now
    heroku ps:scale web=1

    got

    Scaling dynos… !
    ▸ Couldn’t find that process type.

    Not sure if there are any issues with https://github.com/timleland/headless-chrome.git

  3. After following your instructions I’m getting this weird error when launching puppeteer:

    TypeError: input.on is not a function
    2018-01-19T06:16:08.710918+00:00 app[web.1]: at new Interface (readline.js:181:11)
    2018-01-19T06:16:08.710919+00:00 app[web.1]: at Object.createInterface (readline.js:64:10)
    2018-01-19T06:16:08.710922+00:00 app[web.1]: at waitForWSEndpoint (/app/node_modules/puppeteer/lib/Launcher.js:195:10)
    2018-01-19T06:16:08.710921+00:00 app[web.1]: at new Promise ()
    2018-01-19T06:16:08.710923+00:00 app[web.1]: at Function.launch (/app/node_modules/puppeteer/lib/Launcher.js:133:39)
    2018-01-19T06:16:08.710924+00:00 app[web.1]: at
    2018-01-19T06:16:08.805193+00:00 app[web.1]: events.js:183
    2018-01-19T06:16:08.805197+00:00 app[web.1]: throw er; // Unhandled 'error' event
    2018-01-19T06:16:08.710920+00:00 app[web.1]: at Promise (/app/node_modules/puppeteer/lib/Launcher.js:196:25)

    Any idea what’s going on? Your help is much appreciated :).

    1. For this example use “web: node app.js” as the example does not have an index.js, but an app.js 🙂

  4. Has anyone tried to start Puppeteer with the --proxy-server option on Heroku? I always get a timeout when attempting to connect (not a problem with the proxy, it works in my local env)

    Thanks,
    Alessandro

  5. For anyone trying to make it work on Heroku, you need to launch Puppeteer with the following options:

    puppeteer.launch({ headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox'] })

    Also, make sure you call heroku buildpacks:add, not heroku buildpacks:set; set removes all the buildpacks you already have!

    1. I’m using puppeteer version 1.20.0 and the instructions are working for me, so yes.

      Importantly I also needed to add the Node buildpack mentioned in the comments, without it I was getting ‘node: command not found’ errors.

      My own application is running. But when I tried to run this one out of curiosity, I got an error:
      ‘No web processes running’
      Perhaps what’s needed is adding a Procfile with text: ‘web: node app.js’? Not sure, just an idea, really.

  6. Hi, I added the Heroku buildpack and it shows up in Settings > Buildpacks. However, my app still won’t run. When I run heroku logs --tail I get (node:4) UnhandledPromiseRejectionWarning: Error: Failed to launch chrome!
    2019-10-17T19:04:21.485029+00:00 app[worker.1]: /app/node_modules/puppeteer/.local-chromium/linux-686378/chrome-linux/chrome: error while loading shared libraries: libX11-xcb.so.1: cannot open shared object file: No such file or directory
    I have no idea what to do. Any help is appreciated, thank you so much!
    P.S. In my main file I placed this code:
    const browser = puppeteer.launch({
      'args': [
        '--no-sandbox',
        '--disable-setuid-sandbox'
      ]
    });
    If you have any questions or would like to see any more code feel free to contact me! Instagram: joey2031

    1. I had the exact same issue. Check out the path in the error:

      "/app/node_modules/puppeteer/.local-chromium/ --> linux-686378/chrome-linux/ <--"

      I noticed that this program wanted to use chrome-linux, but I was using the Windows version of Chrome on Windows! I also noticed I was using the wrong version of Node. All I did was switch operating systems and now it works. Try running the program again in a Linux environment like Ubuntu or Fedora.

      Good luck!

  7. Thanks, guys, this code worked! 😀 I’d only add: do what “DAN” says, and also add a Procfile with web: node app.js

  8. I feel like I have tried everything suggested and still no luck.

    My launch:

    puppeteer.launch({
      args: ['--no-sandbox', '--disable-setuid-sandbox']
    });

    I have this buildpack: https://github.com/jontewks/puppeteer-heroku-buildpack

    But still get this message:

    /app/node_modules/puppeteer/.local-chromium/linux-737027/chrome-linux/chrome: error while loading shared libraries: libgbm.so.1: cannot open shared object file: No such file or directory

    HELP, please!

  9. Hello TIM,

    Thanks for the article.

    But I have been trying to deploy my app on Heroku but it is still not working even after doing all these steps.

    My Puppeteer launch statement is as follows:

    const browser = await puppetter.launch({
      headless: true,
      args: ['--no-sandbox',
        '--disable-setuid-sandbox'],
    });

    When I deploy, it just exits with the following error:

    Starting process with command `npm start`
    2020-10-13T16:33:17.284061+00:00 app[web.1]:
    2020-10-13T16:33:17.284089+00:00 app[web.1]: > [email protected] start /app
    2020-10-13T16:33:17.284089+00:00 app[web.1]: > node index
    2020-10-13T16:33:17.284089+00:00 app[web.1]:
    2020-10-13T16:33:17.795600+00:00 app[web.1]: Innitializing…..
    2020-10-13T16:33:17.795944+00:00 app[web.1]: Starting the server….
    2020-10-13T16:33:17.796003+00:00 app[web.1]: Server started..
    2020-10-13T16:33:17.796051+00:00 app[web.1]: Server Listening on port 4000
    2020-10-13T16:33:18.086323+00:00 app[web.1]: DB Connected successfully
    2020-10-13T16:33:18.790572+00:00 app[web.1]: Scraping Done
    2020-10-13T16:33:19.789613+00:00 app[web.1]: Scraping Done
    2020-10-13T16:33:20.857065+00:00 app[web.1]: Scraping Done
    2020-10-13T16:33:21.863133+00:00 app[web.1]: Scraping Done
    2020-10-13T16:33:22.064961+00:00 app[web.1]: events.js:291
    2020-10-13T16:33:22.064963+00:00 app[web.1]: throw er; // Unhandled 'error' event
    2020-10-13T16:33:22.064964+00:00 app[web.1]: ^
    2020-10-13T16:33:22.064965+00:00 app[web.1]:
    2020-10-13T16:33:22.064966+00:00 app[web.1]: Error: read ENOTCONN
    2020-10-13T16:33:22.064966+00:00 app[web.1]: at tryReadStart (net.js:575:20)
    2020-10-13T16:33:22.064967+00:00 app[web.1]: at Socket._read (net.js:586:5)
    2020-10-13T16:33:22.064967+00:00 app[web.1]: at Socket.Readable.read (_stream_readable.js:470:10)
    2020-10-13T16:33:22.064986+00:00 app[web.1]: at Socket.read (net.js:626:39)
    2020-10-13T16:33:22.064987+00:00 app[web.1]: at new Socket (net.js:378:12)
    2020-10-13T16:33:22.064987+00:00 app[web.1]: at Object.Socket (net.js:269:41)
    2020-10-13T16:33:22.064987+00:00 app[web.1]: at createSocket (internal/child_process.js:314:14)
    2020-10-13T16:33:22.064988+00:00 app[web.1]: at ChildProcess.spawn (internal/child_process.js:437:23)
    2020-10-13T16:33:22.064988+00:00 app[web.1]: at Object.spawn (child_process.js:553:9)
    2020-10-13T16:33:22.064989+00:00 app[web.1]: at BrowserRunner.start (/app/node_modules/puppeteer/lib/launcher/BrowserRunner.js:51:34)
    2020-10-13T16:33:22.064990+00:00 app[web.1]: at ChromeLauncher.launch (/app/node_modules/puppeteer/lib/Launcher.js:64:16)
    2020-10-13T16:33:22.064990+00:00 app[web.1]: at async localScrape (/app/Scraper/localScrape.js:10:22)
    2020-10-13T16:33:22.064990+00:00 app[web.1]: Emitted 'error' event on Socket instance at:
    2020-10-13T16:33:22.064991+00:00 app[web.1]: at emitErrorNT (internal/streams/destroy.js:92:8)
    2020-10-13T16:33:22.064991+00:00 app[web.1]: at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
    2020-10-13T16:33:22.064992+00:00 app[web.1]: at processTicksAndRejections (internal/process/task_queues.js:84:21) {
    2020-10-13T16:33:22.064992+00:00 app[web.1]: errno: 'ENOTCONN',
    2020-10-13T16:33:22.064993+00:00 app[web.1]: code: 'ENOTCONN',
    2020-10-13T16:33:22.064993+00:00 app[web.1]: syscall: 'read'
    2020-10-13T16:33:22.064993+00:00 app[web.1]: }
    2020-10-13T16:33:22.079027+00:00 app[web.1]: npm ERR! code ELIFECYCLE
    2020-10-13T16:33:22.079442+00:00 app[web.1]: npm ERR! errno 1
    2020-10-13T16:33:22.088068+00:00 app[web.1]: npm ERR! [email protected] start: `node index`
    2020-10-13T16:33:22.088603+00:00 app[web.1]: npm ERR! Exit status 1
    2020-10-13T16:33:22.088605+00:00 app[web.1]: npm ERR!
    2020-10-13T16:33:22.088605+00:00 app[web.1]: npm ERR! Failed at the [email protected] start script.
    2020-10-13T16:33:22.088778+00:00 app[web.1]: npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
    2020-10-13T16:33:22.097784+00:00 app[web.1]:
    2020-10-13T16:33:22.098004+00:00 app[web.1]: npm ERR! A complete log of this run can be found in:
    2020-10-13T16:33:22.098157+00:00 app[web.1]: npm ERR! /app/.npm/_logs/2020-10-13T16_33_22_089Z-debug.log
    2020-10-13T16:33:22.158923+00:00 heroku[web.1]: Process exited with status 1

  10. I made a Discord bot, and when I push it to a Git repository (because I want to host it on Heroku) it shows this error:

    remote: Resolving deltas: 100% (201/201), done.
    remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
    remote: error: Trace: 7d64b752d73168c3a6458afa6527db0648f1e315d28d8ec81702616a674ae731
    remote: error: See http://git.io/iEPt8g for more information.
    remote: error: File node_modules/puppeteer/.local-chromium/win64-818858/chrome-win/chrome.dll is 138.94 MB; this exceeds GitHub's file size limit of 100.00 MB
    remote: error: File node_modules/puppeteer/.local-chromium/win64-818858/chrome-win/interactive_ui_tests.exe is 141.87 MB; this exceeds GitHub's file size limit of 100.00 MB
    To https://github.com/sanidhyajain1/discord-bot.git
    ! [remote rejected] master -> master (pre-receive hook declined)
    error: failed to push some refs to 'https://github.com/sanidhyajain1/discord-bot.git'

  11. Hey. I followed the guide but I got this error when running:

    /app/node_modules/puppeteer/.local-chromium/linux-848005/chrome-linux/chrome: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory

    Can someone help me out?
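
A note on "error while loading shared libraries" messages like the one above: one way to narrow them down is to ask `ldd` which libraries the bundled Chrome binary cannot resolve. Anything it reports as "not found" most likely has to be supplied by a buildpack. The sketch below is only a diagnostic aid with hypothetical function names; it assumes a Linux environment with `ldd` available (for example a one-off dyno) and that you pass it the chrome path printed in the error.

```javascript
const { execFileSync } = require('child_process');

// Extracts the library names that `ldd` reports as "not found".
function missingLibraries(lddOutput) {
  return lddOutput
    .split('\n')
    .filter((line) => line.includes('=> not found'))
    .map((line) => line.trim().split(' ')[0]);
}

// Runs `ldd` against a binary (Linux only) and lists its unresolved libraries.
function checkBinary(binaryPath) {
  const output = execFileSync('ldd', [binaryPath], { encoding: 'utf8' });
  return missingLibraries(output);
}
```

Calling `checkBinary` with the path from the error message should list every missing library at once, instead of revealing them one failed launch at a time.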

  12. I put my project on Heroku, but after 2 minutes I’m getting “Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch”
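
As the R10 message itself says, Heroku expects the web process to listen on the port it passes in the `PORT` environment variable. A minimal sketch of that idea follows. `resolvePort` and `createServer` are hypothetical names, the 8080 fallback simply matches the local URL in step 5, and the handler is a placeholder rather than the sample project's screenshot logic.

```javascript
const http = require('http');

// Uses the port Heroku provides in PORT, falling back to 8080 for local runs.
function resolvePort(env) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isNaN(port) ? 8080 : port;
}

// Placeholder server; a real app would take the screenshot in this handler.
function createServer() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
  });
}

// In the app's entry point:
//   createServer().listen(resolvePort(process.env));
```

If the server already binds this way and R10 still appears, the process may be doing slow work before it starts listening, though that is a guess without seeing the app.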
