Test: bump rocm-libraries for MIOpen candidate selection validation#5193

Closed
BrianHarrisonAMD wants to merge 1 commit into main from users/bharriso/test-miopen-candidate-selection-db

Conversation

@BrianHarrisonAMD
Contributor

Purpose

Validation-only PR - not intended to merge.

This bumps the rocm-libraries submodule to 909543b00dabad1af3f5b32e3ee8d498140e1eea from ROCm/rocm-libraries#7320.

The goal is to exercise TheRock multi-arch CI with ci:run-all-archs and confirm the MIOpen candidate-selection model tests no longer fail on gfx950 when the gfx942 DB/model artifact is not installed in the gfx950 test job.

What changed

  • rocm-libraries submodule: 8d33c2d2d1 -> 909543b00d
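One way to confirm the bump locally is to inspect the output of `git submodule status`. The following is a minimal sketch, assuming the submodule path is `rocm-libraries` and that the status line has the usual shape (one state character, the commit SHA, the path, and an optional description). The helper names `parse_status_line` and `is_pinned_to` are hypothetical and not part of TheRock.

```python
from typing import NamedTuple


class SubmoduleStatus(NamedTuple):
    state: str  # ' ' in sync, '+' differs from index, '-' not initialized, 'U' conflict
    sha: str
    path: str


def parse_status_line(line: str) -> SubmoduleStatus:
    """Parse one line of `git submodule status` output."""
    state, rest = line[0], line[1:]
    parts = rest.split()
    # parts[0] is the commit SHA, parts[1] the submodule path; any trailing
    # "(describe)" text is ignored. Paths containing spaces are not handled.
    return SubmoduleStatus(state, parts[0], parts[1])


def is_pinned_to(status: SubmoduleStatus, expected_prefix: str) -> bool:
    """True when the checkout matches the index and the SHA has the given prefix."""
    return status.state == " " and status.sha.startswith(expected_prefix)


# The "(heads/main)" description below is illustrative only.
line = " 909543b00dabad1af3f5b32e3ee8d498140e1eea rocm-libraries (heads/main)"
status = parse_status_line(line)
print(status.path, is_pinned_to(status, "909543b00d"))
```

A leading `+` in the real output would indicate that the checked-out submodule commit differs from the one recorded in the superproject, so the helper treats that case as not pinned.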

Next steps

  1. Apply the ci:run-all-archs label to trigger all-architecture CI.
  2. Monitor the Linux::release / Test gfx950-dcgpu / Test miopen / Test miopen (shard 4/4) (gfx950-dcgpu) job.
  3. If clean, use the run as validation for ROCm/rocm-libraries#7320 ("[MIOpen] Skip foreign-arch candidate selection model tests") and close this PR without merging.

@BrianHarrisonAMD added the ci:run-all-archs label (opt-in to building for all architectures on a pull request) May 11, 2026
adickin-amd pushed a commit to ROCm/rocm-libraries that referenced this pull request May 12, 2026
## Summary

This PR updates the MIOpen AI candidate-selection model tests so
foreign-architecture model artifacts are skipped when they are not
installed for the current GPU job. This addresses the `gfx950` TheRock
failure where the test tried to validate `gfx942` model files that were
produced by the run but not selected for the `gfx950` install context:
[TheRock
run](https://github.com/ROCm/TheRock/actions/runs/25682869406/job/75431307165?pr=5166).

## Risk Assessment

Low risk. This is an isolated MIOpen gtest-only change that does not
affect production code, public APIs, packaging, or build configuration;
the main behavior change is converting missing foreign-arch model files
from failures to skips while preserving native-arch failures.

## Testing Summary

- Static validation confirmed the single-file change has no whitespace
errors and is formatted.
- Targeted MIOpen/gtest validation passed on gfx90a with the gfx942
model files present, confirming the updated suite runs instead of
skipping when the required files exist.
- Targeted missing-file validation passed on gfx90a after moving the
relevant build-local gfx942 model files out of the DB directory,
confirming the updated suite skips foreign-arch missing files without
error.
- MIOpen gtest naming validation passed for the renamed
`GPU_CandidateSelection_FP32` suite.
- TheRock multi-arch validation has been started with a validation-only
draft PR that bumps only the `rocm-libraries` submodule.

## Testing Checklist

- [x] Formatting - `clang-format -i
projects/miopen/test/gtest/conv_ai_candidate_selection_model.cpp` -
Status: Passed
- [x] Diff whitespace check - `git diff --check` - Status: Passed
- [x] Commit hooks - `git commit` - Status: Passed
- [x] MIOpen configure - `cmake -S projects/miopen -B
WIP/worktrees/rocm-libraries/miopen-skip-missing-candidate-db/build-miopen
-GNinja ... -DMIOPEN_TEST_GFX90A=On -DMIOPEN_TEST_DISCRETE=On` - ASICs:
gfx90a - Status: Passed
- [x] MIOpen candidate-selection gtest build - `ninja -C
WIP/worktrees/rocm-libraries/miopen-skip-missing-candidate-db/build-miopen
test_conv_ai_candidate_selection_model` - Status: Passed
- [x] MIOpen candidate-selection gtest -
`test_conv_ai_candidate_selection_model
--gtest_filter='Full/GPU_CandidateSelection_FP32.*'` - ASICs: gfx90a -
Status: Passed
- [x] MIOpen candidate-selection missing-file skip path - moved
build-local `gfx942_ConvHipImplicitGemm*GroupWrwXdlops_*.tn.model` files
out of the DB path, then ran `test_conv_ai_candidate_selection_model
--gtest_filter='Full/GPU_CandidateSelection_FP32.*'` - ASICs: gfx90a -
Status: Passed
- [x] MIOpen gtest name check - `check_names.py --list
test-candidate-selection-list.txt` - Status: Passed
- [x] TheRock multi-arch validation - Status: Pending - Link: [TheRock
PR #5193](ROCm/TheRock#5193), [Multi-Arch CI
run](https://github.com/ROCm/TheRock/actions/runs/25703671638)
- [x] PR CI - GitHub PR checks - Status: Pending
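
The missing-file skip path in the checklist above was exercised by moving model files out of the DB path before running the gtest. A minimal sketch of that step, assuming the files sit directly in one DB directory and should be restored afterwards, could look like this. The context manager `stashed_models` and the directory arguments are hypothetical; only the glob pattern comes from the checklist.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path

# Glob pattern quoted in the checklist entry for the skip-path validation.
PATTERN = "gfx942_ConvHipImplicitGemm*GroupWrwXdlops_*.tn.model"


@contextmanager
def stashed_models(db_dir: Path, stash_dir: Path, pattern: str = PATTERN):
    """Temporarily move matching model files out of db_dir, then restore them."""
    stash_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    try:
        for src in sorted(db_dir.glob(pattern)):
            dst = stash_dir / src.name
            shutil.move(str(src), str(dst))
            moved.append((src, dst))
        # Yield the names that were moved so the caller can log or check them.
        yield [src.name for src, _ in moved]
    finally:
        # Restore every file, even if the body of the with-block raised.
        for src, dst in moved:
            shutil.move(str(dst), str(src))
```

Inside the `with` block one would run `test_conv_ai_candidate_selection_model` with the `--gtest_filter` shown in the checklist; on exit the files return to the DB directory so later runs see the original state.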

## Technical Changes

- Renames the enabled candidate-selection model suite from
`CPU_CandidateSelection_NONE` to `GPU_CandidateSelection_FP32` so it
reflects the arch-dependent model artifact behavior while satisfying
current MIOpen gtest naming rules.
- Adds model-file discovery and fixture gating for the
candidate-selection model files, skipping missing files for non-current
architectures and failing when the missing files are for the current
device architecture.
- Keeps the disabled-AI placeholder test CPU-scoped because it does not
require a GPU-visible architecture context.
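
The gating rule in the second bullet can be sketched in a few lines. This is an illustration in Python, not the C++ gtest fixture from the change. It assumes that a model file's architecture is the part of its name before the first underscore (as in the `gfx942_...` names above); `Outcome`, `model_arch`, and `gate` are hypothetical names.

```python
from enum import Enum
from pathlib import Path


class Outcome(Enum):
    RUN = "run"
    SKIP = "skip"
    FAIL = "fail"


def model_arch(filename: str) -> str:
    """Return the architecture prefix of a model file name, e.g. 'gfx942'."""
    return filename.split("_", 1)[0]


def gate(db_dir: Path, model_files: list[str], device_arch: str) -> Outcome:
    """Decide whether a candidate-selection test should run, skip, or fail."""
    missing = [name for name in model_files if not (db_dir / name).is_file()]
    if not missing:
        return Outcome.RUN
    # A missing file for the device's own architecture is treated as a failure.
    if any(model_arch(name) == device_arch for name in missing):
        return Outcome.FAIL
    # Only foreign-architecture artifacts are missing: skip instead of failing.
    return Outcome.SKIP
```

Under this rule, a `gfx950` job without `gfx942` model files is skipped, while a `gfx942` job missing the same files still fails.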
@BrianHarrisonAMD
Contributor Author

Closing this validation-only PR. The PR 7320 multi-arch verification run has served its purpose, and the next validation will use a new TheRock branch pinned to ROCm/rocm-libraries#7354.

@github-project-automation (bot) moved this from TODO to Done in TheRock Triage May 12, 2026
amontoison pushed a commit to amontoison/rocm-libraries that referenced this pull request May 13, 2026
Labels

ci:run-all-archs Opt-in to building for all architectures on a pull request

Projects

Status: Done
