A Heroku buildpack that installs PDF command-line utilities from the Poppler PDF rendering library:
- `pdftotext`: extract text from PDF files
- `pdftoppm`: convert PDF pages to images (PPM, PNG, JPEG)

Loosely based on the Heroku NGINX buildpack.
Add to a Heroku app:

    heroku buildpacks:add https://github.com/[user]/heroku-buildpack-pdftotext

After deployment the binaries are available on `$PATH` at:

- `app/bin/pdftotext`
- `app/bin/pdftoppm`

| Stack | Status |
|---|---|
| heroku-26 | Current |
| heroku-24 | Supported |
The correct binary is selected at compile time based on the $STACK
environment variable set by Heroku.
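
The stack check can be pictured with a minimal sketch like the one below. It is not the buildpack's actual compile script: the function name `select_binary_dir` and the `bin/<stack>/` directory layout are assumptions made for illustration. Only the `$STACK` variable and the two stack names come from this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Heroku stack name to the directory that holds
# its pre-compiled binaries. The bin/<stack>/ layout is an assumption.
select_binary_dir() {
  local stack="$1"
  case "$stack" in
    heroku-26|heroku-24)
      echo "bin/${stack}"
      ;;
    *)
      # Unknown or unset stack: report on stderr and signal failure.
      echo "Unsupported stack: ${stack}" >&2
      return 1
      ;;
  esac
}

# Heroku sets STACK during the build; outside Heroku it may be unset.
if [ -n "${STACK:-}" ]; then
  if dir="$(select_binary_dir "$STACK")"; then
    echo "Using binaries from ${dir}"
  fi
fi
```

Failing loudly on an unknown stack is a deliberate choice in this sketch: a binary linked against the wrong glibc tends to fail later and less clearly.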
Pre-compiled binaries live in bin/ and are checked into the repo. They are
built inside the corresponding heroku/heroku:NN-build Docker image so the
resulting binaries match the runtime stack's glibc.
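
As a rough illustration of that image naming, the sketch below derives the build image from a stack name and prints a `docker run` command without executing it. The helper names, the `/work` mount point, and the way `scripts/build_pdftotext` is invoked are assumptions; the `Makefile` in the repo is the authoritative source for the real invocation.

```shell
#!/usr/bin/env bash
# Map a stack name such as "heroku-24" to its build image, following the
# heroku/heroku:NN-build naming described above.
build_image_for() {
  local stack="$1"
  echo "heroku/heroku:${stack#heroku-}-build"
}

# Print (rather than run) a docker command that would execute the build
# script inside that image. The /work mount point and the script invocation
# are assumptions for illustration.
print_build_cmd() {
  local stack="$1"
  echo "docker run --rm -v \"\$PWD:/work\" -w /work $(build_image_for "$stack") scripts/build_pdftotext"
}

print_build_cmd heroku-24
```

Printing the command first makes it easy to inspect which image would be used before starting a long build.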
Requires Docker. Build all currently-supported stacks:

    make build

Build a specific stack:

    make build-heroku-26

Build every stack including legacy ones:

    make build-all

Override the Poppler version:

    make build POPPLER_VERSION=26.05.0

Open a shell in the heroku-26 build image for debugging:

    make shell

- Poppler is built with `BUILD_SHARED_LIBS=OFF` for static linking.
- Qt, GLib, Boost, GPGme, OpenJPEG and the C++ bindings are disabled to keep the binary small and avoid runtime dependencies beyond what the Heroku stack provides.
- Default Poppler version is pinned in `scripts/build_pdftotext` and the `Makefile`; bumping requires rebuilding the binaries for each supported stack.

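
The build notes above roughly correspond to a CMake configuration like the following sketch. The option names are taken from Poppler's CMake build as best understood here and may differ between Poppler versions; the function name `poppler_cmake_args` and the `Release` build type are assumptions, and the real flag set lives in `scripts/build_pdftotext`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one CMake argument per line for a small,
# statically linked Poppler build. Check the option names against the
# CMakeLists.txt of the Poppler version being built.
poppler_cmake_args() {
  printf '%s\n' \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=OFF \
    -DENABLE_QT5=OFF \
    -DENABLE_QT6=OFF \
    -DENABLE_GLIB=OFF \
    -DENABLE_BOOST=OFF \
    -DENABLE_GPGME=OFF \
    -DENABLE_LIBOPENJPEG=none \
    -DENABLE_CPP=OFF
}

# Example use inside a build script (source and build paths are placeholders):
#   cmake -S poppler-src -B poppler-build $(poppler_cmake_args)
poppler_cmake_args
```

Keeping the flags in one function makes it easier to confirm that every supported stack is built with the same configuration.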
MIT. See LICENSE.md.