Test: bump rocm-libraries for MIOpen candidate selection validation #5193
Closed
BrianHarrisonAMD wants to merge 1 commit into
adickin-amd pushed a commit to ROCm/rocm-libraries that referenced this pull request on May 12, 2026:
## Summary

This PR updates the MIOpen AI candidate-selection model tests so foreign-architecture model artifacts are skipped when they are not installed for the current GPU job. This addresses the `gfx950` TheRock failure where the test tried to validate `gfx942` model files that were produced by the run but not selected for the `gfx950` install context: [TheRock run](https://github.com/ROCm/TheRock/actions/runs/25682869406/job/75431307165?pr=5166).

## Risk Assessment

Low risk. This is an isolated MIOpen gtest-only change that does not affect production code, public APIs, packaging, or build configuration; the main behavior change is converting missing foreign-arch model files from failures to skips while preserving native-arch failures.

## Testing Summary

- Static validation confirmed the single-file change has no whitespace errors and is formatted.
- Targeted MIOpen/gtest validation passed on gfx90a with the gfx942 model files present, confirming the updated suite runs instead of skipping when the required files exist.
- Targeted missing-file validation passed on gfx90a after moving the relevant build-local gfx942 model files out of the DB directory, confirming the updated suite skips foreign-arch missing files without error.
- MIOpen gtest naming validation passed for the renamed `GPU_CandidateSelection_FP32` suite.
- TheRock multi-arch validation has been started with a validation-only draft PR that bumps only the `rocm-libraries` submodule.

## Testing Checklist

- [x] Formatting
  - `clang-format -i projects/miopen/test/gtest/conv_ai_candidate_selection_model.cpp`
  - Status: Passed
- [x] Diff whitespace check
  - `git diff --check`
  - Status: Passed
- [x] Commit hooks
  - `git commit`
  - Status: Passed
- [x] MIOpen configure
  - `cmake -S projects/miopen -B WIP/worktrees/rocm-libraries/miopen-skip-missing-candidate-db/build-miopen -GNinja ... -DMIOPEN_TEST_GFX90A=On -DMIOPEN_TEST_DISCRETE=On`
  - ASICs: gfx90a
  - Status: Passed
- [x] MIOpen candidate-selection gtest build
  - `ninja -C WIP/worktrees/rocm-libraries/miopen-skip-missing-candidate-db/build-miopen test_conv_ai_candidate_selection_model`
  - Status: Passed
- [x] MIOpen candidate-selection gtest
  - `test_conv_ai_candidate_selection_model --gtest_filter='Full/GPU_CandidateSelection_FP32.*'`
  - ASICs: gfx90a
  - Status: Passed
- [x] MIOpen candidate-selection missing-file skip path
  - Moved build-local `gfx942_ConvHipImplicitGemm*GroupWrwXdlops_*.tn.model` files out of the DB path, then ran `test_conv_ai_candidate_selection_model --gtest_filter='Full/GPU_CandidateSelection_FP32.*'`
  - ASICs: gfx90a
  - Status: Passed
- [x] MIOpen gtest name check
  - `check_names.py --list test-candidate-selection-list.txt`
  - Status: Passed
- [x] TheRock multi-arch validation
  - Status: Pending
  - Link: [TheRock PR #5193](ROCm/TheRock#5193), [Multi-Arch CI run](https://github.com/ROCm/TheRock/actions/runs/25703671638)
- [x] PR CI
  - GitHub PR checks
  - Status: Pending

## Technical Changes

- Renames the enabled candidate-selection model suite from `CPU_CandidateSelection_NONE` to `GPU_CandidateSelection_FP32` so it reflects the arch-dependent model artifact behavior while satisfying current MIOpen gtest naming rules.
- Adds model-file discovery and fixture gating for the candidate-selection model files, skipping missing files for non-current architectures and failing when the missing files are for the current device architecture.
- Keeps the disabled-AI placeholder test CPU-scoped because it does not require a GPU-visible architecture context.
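The gating rule described under Technical Changes (run when the model file is present, skip when a foreign-arch file is missing, fail when a current-arch file is missing) can be illustrated with a minimal standalone sketch. This is not the code from the referenced commit: `GateResult`, `ArchPrefix`, and `GateModelFile` are hypothetical names, and the sketch assumes the architecture is the file-name prefix before the first underscore, as in the `gfx942_...tn.model` pattern quoted in the checklist.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical outcome of the gating decision for one model file.
enum class GateResult { Run, Skip, Fail };

// Hypothetical helper: returns the text before the first '_' in the file
// name, e.g. "gfx942" for "gfx942_<solver>.tn.model". Returns an empty
// string when the name contains no underscore.
std::string ArchPrefix(const std::filesystem::path& model_file)
{
    const std::string name = model_file.filename().string();
    const auto pos         = name.find('_');
    return pos == std::string::npos ? std::string{} : name.substr(0, pos);
}

// Sketch of the gating rule, assuming `current_arch` is the architecture
// name of the device the test job runs on (e.g. "gfx950").
GateResult GateModelFile(const std::filesystem::path& model_file,
                         const std::string& current_arch)
{
    // File is installed: the test can run regardless of architecture.
    if(std::filesystem::exists(model_file))
        return GateResult::Run;

    // File is missing: a native-arch gap is a real failure, while a
    // foreign-arch gap is expected in a single-arch install and is skipped.
    return ArchPrefix(model_file) == current_arch ? GateResult::Fail
                                                  : GateResult::Skip;
}
```

In a GoogleTest fixture, a result like `Skip` would typically map to `GTEST_SKIP()` and `Fail` to `FAIL()` or a failing assertion; how the actual MIOpen fixture wires this up is not shown in this PR.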
Comment from the PR author (Contributor):

Closing this validation-only PR. The PR 7320 multi-arch verification run has served its purpose, and the next validation will use a new TheRock branch pinned to ROCm/rocm-libraries#7354.
amontoison pushed a commit to amontoison/rocm-libraries that referenced this pull request on May 13, 2026.
## Purpose

Validation-only PR, not intended to merge.

This bumps the `rocm-libraries` submodule to `909543b00dabad1af3f5b32e3ee8d498140e1eea` from ROCm/rocm-libraries#7320.

The goal is to exercise TheRock multi-arch CI with `ci:run-all-archs` and confirm the MIOpen candidate-selection model tests no longer fail on `gfx950` when the `gfx942` DB/model artifact is not installed in the `gfx950` test job.

## What changed

- `rocm-libraries` submodule: `8d33c2d2d1` -> `909543b00d`

## Next steps

- Add the `ci:run-all-archs` label to trigger all-architecture CI.
- Check the `Linux::release / Test gfx950-dcgpu / Test miopen / Test miopen (shard 4/4) (gfx950-dcgpu)` job.