Skip to content

[ci] Guard downstream consumers on empty upstream output URLs#5227

Closed
FIM43-Redeye wants to merge 1 commit into
ROCm:mainfrom
FIM43-Redeye:users/FIM43-Redeye/fork-guard-uploads
Closed

[ci] Guard downstream consumers on empty upstream output URLs#5227
FIM43-Redeye wants to merge 1 commit into
ROCm:mainfrom
FIM43-Redeye:users/FIM43-Redeye/fork-guard-uploads

Conversation

@FIM43-Redeye
Copy link
Copy Markdown
Contributor

@FIM43-Redeye FIM43-Redeye commented May 13, 2026

Summary

When upstream upload steps (S3 wheel/package uploads) produce no URL
output — e.g. fork PRs without credentials, or transient upload
failures — downstream consumer jobs (install tests, wheel tests,
PyTorch builds) still run and then fail at the install step with
"no such package" errors. This PR adds && steps.X.outputs.Y != ''
to each consumer's if: condition so they skip cleanly when the
upstream URL is empty, rather than starting and erroring.

The guard is robust to any cause of a missing upload URL, not only
forks. It is purely a consumer-side defensive check; the upstream
auth gap that produces empty URLs on fork PRs is being addressed
separately at the credential layer in #5066.

Changes

File Edit
ci_linux.yml 2 consumer empty-output guards
ci_windows.yml 2 consumer empty-output guards
multi_arch_ci_linux.yml 5 consumer empty-output guards
multi_arch_ci_windows.yml 2 consumer empty-output guards

Out of scope

Test plan

  • pre-commit run clean on all 4 files (including actionlint)
  • CI exercises the multi-arch path; consumer jobs should now
    skip cleanly when upstream outputs are empty rather than
    erroring during install

Generated using Claude Code.

Copilot AI review requested due to automatic review settings May 13, 2026 02:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates several GitHub Actions workflows to avoid unauthenticated S3 upload attempts on fork PRs (where OIDC role assumption is skipped), and to prevent downstream consumer jobs from running when the upstream upload-produced URLs are empty.

Changes:

  • Add fork/repo guards to S3 upload steps so fork PRs no-op instead of failing with NoCredentialsError.
  • Add downstream job if: guards to skip install/wheel/PyTorch jobs when package_repository_url / package_find_links_url outputs are empty.
  • Document the “empty output on forks” behavior inline in the workflows to clarify why skips occur.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
.github/workflows/multi_arch_build_native_linux_packages.yml Mirrors the existing AWS-credential fork guard on the native package repo upload step.
.github/workflows/build_portable_linux_python_packages.yml Adds an upload-step guard to prevent unauthenticated S3 uploads on forks (but currently too strict for rockrel release callers).
.github/workflows/build_windows_python_packages.yml Adds an upload-step guard to prevent unauthenticated S3 uploads on forks (but currently too strict for rockrel release callers).
.github/workflows/multi_arch_ci_linux.yml Skips install tests / wheel tests / PyTorch builds when the upstream URL outputs are empty.
.github/workflows/multi_arch_ci_windows.yml Skips wheel tests / PyTorch builds when the upstream find-links URL output is empty.
.github/workflows/ci_linux.yml Skips wheel tests / PyTorch builds when the upstream find-links URL output is empty.
.github/workflows/ci_windows.yml Skips wheel tests / PyTorch builds when the upstream find-links URL output is empty.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/build_portable_linux_python_packages.yml Outdated
Comment thread .github/workflows/build_windows_python_packages.yml Outdated
FIM43-Redeye added a commit to FIM43-Redeye/TheRock that referenced this pull request May 13, 2026
Address Copilot review on ROCm#5227: ROCm/rockrel calls the Python package
build workflows via workflow_call for scheduled releases (documented in
multi_arch_release.yml). When that happens, github.repository inside
the called workflow is "ROCm/rockrel", not "ROCm/TheRock".

The original guard `github.repository == 'ROCm/TheRock'` therefore
force-skipped the upload step on rockrel runs even though those runs
have working AWS credentials via the runner-mounted
AWS_SHARED_CREDENTIALS_FILE (the OIDC composite's repo allowlist is
narrower than the actual credential reachability).

Widen the allowlist to {ROCm/TheRock, ROCm/rockrel}, matching the
existing condition used by the Configure AWS Credentials step in
multi_arch_build_native_linux_packages.yml.

Generated using Claude Code.
Copy link
Copy Markdown
Member

@marbre marbre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this. We had a mechanism to authenticate differently when a request comes from a fork. Something seems broken here and it should indeed be supported on approved runs to push builds when the PR comes from a fork.

@geomin12 @ScottTodd, I saw you two have been partly involved in related fixes, can you comment too?

Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All workflows that run on pull requests must work from PRs sent from forks, without losing any functionality.

Not uploading essential files is not an option.

Geo has #5066 trying to fix the deb/rpm upload, and I've seen a few instances recently where the runner base credentials don't work. We need to fix in those ways, not by skipping uploads.

@ScottTodd
Copy link
Copy Markdown
Member

There are some scenarios where skipping uploads could make sense, like running workflows in forks (as I do occasionally for testing, see https://github.com/ROCm/TheRock/blob/main/docs/development/github_actions_debugging.md#working-effectively-from-forks), but in those cases I'm happy enough seeing that the workflows generally work (inputs get propagated correctly, scripts handle them, no syntax errors, builds start running, etc.). GitHub-hosted runners are slow and can run out of disk space on many of our workflows, so I don't really see much value in trying to make the workflows run to successful completion outside of the ROCm org.

Adds `&& outputs.X != ''` to install tests, wheel tests, and PyTorch
builds in ci_linux.yml, ci_windows.yml, multi_arch_ci_linux.yml, and
multi_arch_ci_windows.yml so these jobs cleanly skip when the upstream
upload produces no URL, rather than running and failing during the
install step.

Defensive on the consumer side only; the upstream auth gap that
produces empty URLs on fork PRs is being addressed separately in ROCm#5066.
@FIM43-Redeye FIM43-Redeye force-pushed the users/FIM43-Redeye/fork-guard-uploads branch from 13deea7 to ff1a6e6 Compare May 13, 2026 17:10
@FIM43-Redeye FIM43-Redeye changed the title [ci] Skip S3 uploads and downstream consumers on fork PRs [ci] Guard downstream consumers on empty upstream output URLs May 13, 2026
@FIM43-Redeye
Copy link
Copy Markdown
Contributor Author

@ScottTodd @marbre That's a lot more sensible for sure. Old mindsets from old times, I'm used to treating anything involving credentials as set in stone. Scrapped the fork guards on upload steps. Are the downstream checks helpful or should I close this one?

@ScottTodd ScottTodd self-requested a review May 13, 2026 19:37
Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ehhh this doesn't look right either.

Consider the case of running in a personal fork without self-hosted runners or credentials to upload:

  • The builds on CPU machines are likely to fail due to low disk space or slow build times
  • The uploads are attempted and will fail
  • The tests on GPU machines would fail due to not having files to download and not having self-hosted runners registered to process the jobs

I don't want to sprinkle so many conditions in the workflows - they are designed for this repository where they have everything they need.

A few of the lighter weight workflows like https://github.com/ROCm/TheRock/blob/main/.github/workflows/build_portable_linux_python_packages.yml that consume already built artifacts from your choice of repository + run id and package them are at least possible to test from repository forks without self-hosted runners or credentials - they just fail on the final upload step.

Here's a recent run from my fork for example: https://github.com/ScottTodd/TheRock/actions/runs/24217435275/job/70701188683. That fetched artifacts and built tarballs from them in ~10 minutes:

   python build_tools/build_tarballs.py \
    --run-id="24187929660" \
    --run-github-repo="ROCm/TheRock" \
    --dist-amdgpu-families="gfx1151;gfx1100" \
    --platform="linux" \
    --package-version="7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3" \
    --output-dir="/home/runner/work/TheRock/TheRock/tarballs"

then it failed to upload (as expected) at the end:


[INFO] Uploading to s3://therock-ci-artifacts-external/ScottTodd-TheRock/24217435275-linux/tarballs
S3 upload attempt 1/3 failed for s3://therock-ci-artifacts-external/ScottTodd-TheRock/24217435275-linux/tarballs/therock-dist-linux-gfx1100-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz: Unable to locate credentials. Retrying in 1.0s...
S3 upload attempt 1/3 failed for s3://therock-ci-artifacts-external/ScottTodd-TheRock/24217435275-linux/tarballs/therock-dist-linux-gfx1151-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz: Unable to locate credentials. Retrying in 1.0s...

(there were three 3GB tarballs, so too large for github artifacts and would need to go to some cloud storage like our S3 buckets)

That workflow was functional enough for me to test there before I moved over to a PR in this repository to contribute it.

@FIM43-Redeye
Copy link
Copy Markdown
Contributor Author

Got it, the credential layer is where the fixes live. Thanks for walking me through the fork-test model, that was the bit I was missing. Closing as inapplicable.

@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage May 13, 2026
@FIM43-Redeye FIM43-Redeye deleted the users/FIM43-Redeye/fork-guard-uploads branch May 13, 2026 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants