docker: Add optional FFmpeg support for whisper.cpp by ts200-G · Pull Request #785 · mostlygeek/llama-swap

ts200-G · 2026-05-23T19:18:34Z

Summary

Solves #783
Adds optional FFmpeg support for whisper.cpp builds as described here

Currently, ffmpeg must be installed manually, and whisper-server launches it as a sub-process when run with --convert. Setting WHISPER_FFMPEG=yes at build time links the FFmpeg libraries into the whisper binary at compile time.

The result is that the binary can natively decode audio formats like Opus, AAC, MP3, etc. without spawning any external process.
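One way to confirm that a built binary is linked against FFmpeg is to inspect its shared-library dependencies. The following is a minimal sketch, not part of the PR: the helper name `linked_ffmpeg_libs` is hypothetical, and the default path is taken from the test config later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list FFmpeg shared libraries a binary depends on.
# Prints nothing if the binary is not linked against FFmpeg (or if ldd
# is unavailable or the file does not exist).
linked_ffmpeg_libs() {
    ldd "$1" 2>/dev/null \
        | grep -E 'libav(codec|format|util)|libswresample' || true
}

# Default path assumed from the whisper-server macro in the test config.
linked_ffmpeg_libs "${1:-/usr/local/bin/whisper-server}"
```

An empty result for a build made with WHISPER_FFMPEG=yes would suggest the flag did not take effect.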

Changes:

Adds the build arg/env toggle
Installs required FFmpeg dev/runtime libraries in build and runtime images.
Adds -DWHISPER_FFMPEG=ON to CMAKE_FLAGS in install-whisper.sh if enabled.
Separates Docker build cache for FFmpeg/non-FFmpeg builds

WHISPER_FFMPEG Behavior:

yes / true => enabled (default)
no / false => disabled
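The toggle handling described above could look roughly like the sketch below. This is an illustration, not the actual install-whisper.sh: the function name `whisper_ffmpeg_flag`, the base value of `CMAKE_FLAGS`, and the choice to reject unrecognized values are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the WHISPER_FFMPEG toggle; not the PR's script.
# Prints the extra CMake flag for a given value, or nothing if disabled.
whisper_ffmpeg_flag() {
    # Normalize to lowercase so YES, True, etc. are accepted.
    value=$(printf '%s' "${1:-yes}" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        yes|true) printf '%s\n' "-DWHISPER_FFMPEG=ON" ;;
        no|false) ;;  # disabled: add no flag
        *)
            # Assumption: unknown values are treated as an error.
            echo "unsupported WHISPER_FFMPEG value: $1" >&2
            return 1
            ;;
    esac
}

# Placeholder base flags; the real script's flags are not shown in this PR.
CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=Release"
flag=$(whisper_ffmpeg_flag "${WHISPER_FFMPEG:-yes}")
CMAKE_FLAGS="$CMAKE_FLAGS${flag:+ $flag}"
echo "$CMAKE_FLAGS"
```

Keeping the mapping in one function makes the default (enabled) and the accepted spellings easy to audit.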

Libraries are not installed conditionally, but the image size difference is minimal:
Build: pkg-config libavcodec-dev libavformat-dev libavutil-dev libswresample-dev (~31.5 MB)
Runtime: libavcodec60 libavformat60 libavutil58 libswresample4 (~20.4 MB)

pkg-config is not mentioned in the whisper.cpp readme, but it is added to the build image because whisper uses it to detect the libraries. libswresample-dev/libswresample4 are also not in the readme, but libswresample is checked for at build time, so they are added for consistency: src
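To see whether the development libraries are visible to pkg-config inside a build container, a check along these lines may help. It is a sketch under assumptions: the function name `check_ffmpeg_dev` is hypothetical, and the module names are the standard FFmpeg pkg-config names rather than anything quoted from the whisper.cpp build files.

```shell
#!/bin/sh
# Hypothetical check: report which FFmpeg modules pkg-config can find.
# Reports "missing" for every module if pkg-config itself is absent.
check_ffmpeg_dev() {
    for mod in libavcodec libavformat libavutil libswresample; do
        if command -v pkg-config >/dev/null 2>&1 \
            && pkg-config --exists "$mod"; then
            echo "$mod: found"
        else
            echo "$mod: missing"
        fi
    done
}

check_ffmpeg_dev
```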

coderabbitai · 2026-05-23T19:18:40Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3c2bc46d-f179-4255-aad9-9ee777169709

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR introduces FFmpeg as an optional dependency in the unified Docker whisper.cpp build pipeline. The install script processes a new WHISPER_FFMPEG configuration flag to conditionally enable FFmpeg support during CMake compilation, while the Dockerfile provides the necessary build-time and runtime libraries across CUDA and Vulkan stages, and the build script exposes the feature to users as a configurable knob.

Changes

FFmpeg-optional whisper.cpp build configuration

| Layer / File(s) | Summary |
|---|---|
| Install script FFmpeg environment handling `docker/unified/install-whisper.sh` | install-whisper.sh accepts `WHISPER_FFMPEG` (defaulting to `yes`), normalizes it to lowercase, sets an internal `WHISPER_FFMPEG_ENABLED` flag, and conditionally appends `-DWHISPER_FFMPEG=ON` to CMake configuration when enabled. |
| Dockerfile builder and runtime stage updates `docker/unified/Dockerfile` | Both CUDA and Vulkan builder stages now install FFmpeg development libraries (`pkg-config`, `libavcodec`, `libavformat`, `libavutil`, `libswresample`). The whisper-build stage accepts `WHISPER_FFMPEG` and `WHISPER_COMMIT_HASH` build args, passes them to install-whisper.sh, and adds BuildKit cache mounts for `ccache` and the whisper build directory. Both CUDA and Vulkan runtime stages now install FFmpeg runtime libraries (`libavcodec60`, `libavformat60`, `libavutil58`, `libswresample4`). |
| Build script configuration and documentation `docker/unified/build-image.sh` | Introduces `WHISPER_FFMPEG` environment variable (defaulting to `yes`), documents it in the `--help` output, and passes it to `docker buildx build` as an additional build argument. |
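The way a build wrapper might forward the environment variable as a build argument can be sketched as follows. This is not the PR's build-image.sh: the function name `build_cmd` is hypothetical, the `BACKEND` default is an assumption, and the sketch only prints the command (a dry run) instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper sketch; not the actual build-image.sh.
WHISPER_FFMPEG="${WHISPER_FFMPEG:-yes}"   # default documented in the PR
BACKEND="${BACKEND:-cuda}"                # assumed default for illustration

# Print the buildx command that would be run (dry run only).
build_cmd() {
    printf 'docker buildx build --build-arg BACKEND=%s --build-arg WHISPER_FFMPEG=%s ./docker/unified\n' \
        "$BACKEND" "$WHISPER_FFMPEG"
}

build_cmd
```

Invoking the wrapper with `WHISPER_FFMPEG=no` in the environment would then yield a command that builds without FFmpeg support.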

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

mostlygeek/llama-swap#597: Initial unified CUDA container infrastructure that this PR extends with FFmpeg support and improved build caching.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding optional FFmpeg support for whisper.cpp builds in Docker. |
| Description check | ✅ Passed | The description thoroughly explains the purpose, implementation details, and rationale for the changes, directly relating to the modified Docker files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

mostlygeek · 2026-05-24T22:25:18Z

@coderabbitai review

coderabbitai · 2026-05-24T22:25:24Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

docker/unified/Dockerfile (1)
63-64: 💤 Low value

Consider the cache segregation strategy.

The cache mount ID includes ${WHISPER_FFMPEG}, which creates separate build caches for FFmpeg-enabled and FFmpeg-disabled builds. While this ensures clean builds, it means users toggling between WHISPER_FFMPEG=yes and WHISPER_FFMPEG=no will not benefit from shared cache and will trigger full rebuilds.

If toggling is expected to be rare, this is fine. If users frequently switch, consider whether a shared cache with proper CMake reconfiguration would be more efficient.
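The effect of including the flag in the cache id can be illustrated with a small sketch that expands the id template quoted in this review, "whisper-${BACKEND}-${WHISPER_FFMPEG}". The function name `cache_id` is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Illustrative only: expand the cache id template quoted in the review.
cache_id() {
    printf 'whisper-%s-%s\n' "$1" "$2"
}

# Each WHISPER_FFMPEG value yields a distinct id, hence a separate cache.
for ffmpeg in yes no; do
    echo "BACKEND=cuda WHISPER_FFMPEG=$ffmpeg -> $(cache_id cuda "$ffmpeg")"
done
```

Dropping the second component would make both builds resolve to the same id, which is the shared-cache alternative the comment describes.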
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docker/unified/Dockerfile` around lines 63 - 64, The current Dockerfile RUN
uses a cache mount id "whisper-${BACKEND}-${WHISPER_FFMPEG}" which splits the
whisper build cache by the WHISPER_FFMPEG flag causing separate caches and full
rebuilds when toggling; to fix, change the cache id to omit ${WHISPER_FFMPEG}
(e.g., "whisper-${BACKEND}") so FFmpeg-enabled/disabled builds share the same
cache, and ensure the CMake/configure step in the whisper build (look for the
whisper.cpp build invocation) properly reconfigures on flag changes to avoid
stale artifacts; if you must keep separate caches, add a short comment
documenting the trade-off and rationale.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docker/unified/Dockerfile`:
- Around line 140-144: The Vulkan runtime Dockerfile RUN command currently
installs the build-time tool "pkg-config" alongside runtime libraries; remove
"pkg-config" from the apt-get install list in the Vulkan runtime stage (the RUN
that installs libgomp1, libvulkan1, mesa-vulkan-drivers, python3, curl,
ca-certificates, pkg-config, libavcodec60, libavformat60, libavutil58,
libswresample4) so only runtime packages remain (leave libgomp1, libvulkan1,
mesa-vulkan-drivers, python3, curl, ca-certificates, and the libav* packages)
mirroring the CUDA runtime stage which correctly omits pkg-config.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4cfed62a-893f-4382-8ee4-89821949c763

📥 Commits

Reviewing files that changed from the base of the PR and between 636b53e and 45acd73.

📒 Files selected for processing (3)

docker/unified/Dockerfile
docker/unified/build-image.sh
docker/unified/install-whisper.sh

coderabbitai · 2026-05-24T22:28:14Z

 RUN apt-get update && apt-get install -y --no-install-recommends \
    libgomp1 libvulkan1 mesa-vulkan-drivers \
    python3 curl ca-certificates \
+    pkg-config libavcodec60 libavformat60 libavutil58 libswresample4 \
    && rm -rf /var/lib/apt/lists/*


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Remove pkg-config from the Vulkan runtime stage.

pkg-config is a build-time tool used to detect FFmpeg during compilation (as noted in the PR description). It should not be included in the runtime image. The CUDA runtime stage (line 126) correctly omits it.

🔧 Proposed fix

 RUN apt-get update && apt-get install -y --no-install-recommends \
     libgomp1 libvulkan1 mesa-vulkan-drivers \
     python3 curl ca-certificates \
-    pkg-config libavcodec60 libavformat60 libavutil58 libswresample4 \
+    libavcodec60 libavformat60 libavutil58 libswresample4 \
     && rm -rf /var/lib/apt/lists/*

ts200-G · 2026-06-05T15:24:32Z

I have removed pkg-config from the Vulkan runtime stage. I also removed "whisper-${BACKEND}" from the cache bind mount id; I couldn't find anything in the Docker docs that mentions this id option to confirm what it does, so I left it alone, since build args invalidate the cache anyway.

Build ran successfully with:

docker buildx build \
  --no-cache \
  --build-arg BACKEND=cuda \
  --build-arg CMAKE_CUDA_ARCHITECTURES=86 \
  --build-arg RUN_UID=10001 \
  --build-arg WHISPER_FFMPEG=yes \
  --build-arg LLAMA_COMMIT_HASH=86591c7536ced84cea49ee5b3e24096632a33c5a \
  --build-arg WHISPER_COMMIT_HASH=99613cb720b65036237d44b52f753b51f75c2797 \
  --build-arg SD_COMMIT_HASH=1f9ee88e09c258053fa59d5e05e23dfb10fa0b13 \
  --build-arg IK_LLAMA_COMMIT_HASH=6b9de3dbaa21ae95ea80638e5ee836795cc48c93 \
  --build-arg LS_VERSION=ccfba0df28ab4d5dcace9056469cbc929249696b \
  -t llama-swap:unified-cuda-rootless \
  ./docker/unified

And tested with this config:

macros:
  "WHISPER": >
    /usr/local/bin/whisper-server
    --host 127.0.0.1
    --port ${PORT}
    --flash-attn
    --inference-path ""
    --request-path /v1/audio/transcriptions
#....

models:
  "stt-large-turbo": &stt_def
    macros:
      "MODEL": /models/stt/ggml-large-v3-turbo-q8_0.bin
      "BACKEND": WHISPER
      "modality": audio->text
      "family": whisper
      "variant": large-v3-turbo
      "params": 0.8B
      "quant": Q8_0
      "dl_url": https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-large-v3-turbo-q8_0.bin
    cmd:
      ${backend}
      --port ${PORT}
      -m ${MODEL}
    proxy: "http://127.0.0.1:${PORT}"
    description: "audio transcriptions"
    checkEndpoint: /v1/audio/transcriptions/
    metadata: &tts_meta
      "model-family": ${family}
      "model-params": ${params}
      "backend": ${backend}
      "modality": ${modality}
      "capabilities":
        audio:
          "transcription": true
          "translation": false # translation data excluded from finetune
          "language-detection": true

Audio is converted correctly without using --convert and without an additional ffmpeg install or sub-process.

Just let me know if you'd like any other changes.
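A request that exercises this path might look like the sketch below. It rests on assumptions not stated in the thread: that llama-swap listens on localhost:8080, that the endpoint accepts multipart form uploads with `model` and `file` fields, and that the input file name is a placeholder. The function name `transcribe` is hypothetical.

```shell
#!/bin/sh
# Hypothetical client sketch. Assumes llama-swap is reachable at
# localhost:8080 and serves the "stt-large-turbo" model from the config.
LLAMA_SWAP_URL="${LLAMA_SWAP_URL:-http://localhost:8080}"

transcribe() {
    # Upload the audio file as-is; no client-side conversion is done.
    curl -sS "$LLAMA_SWAP_URL/v1/audio/transcriptions" \
        -F "model=stt-large-turbo" \
        -F "file=@$1"
}

# Run only when a file argument is given, e.g. an Opus or MP3 file.
if [ -n "${1:-}" ]; then
    transcribe "$1"
fi
```

With an FFmpeg-enabled build, a compressed input should be decoded by whisper-server itself rather than by an external ffmpeg process.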

coderabbitai Bot reviewed May 24, 2026

docker: add optional whisper.cpp FFmpeg support to unified image (mostlygeek#783)

79a2dba

- Add FFmpeg dev libraries at build time and runtime libraries in CUDA/Vulkan runtime stages.
- Add `WHISPER_FFMPEG` build arg (default: yes) to build whisper.cpp with `-DWHISPER_FFMPEG=ON`.

ts200-G force-pushed the whisper-ffmpeg-support branch from 45acd73 to 79a2dba Compare June 5, 2026 15:00

ts200-G wants to merge 1 commit into mostlygeek:main from ts200-G:whisper-ffmpeg-support
