[AMDGPU] comgr: Hotswap dispatcher vtable + .def registry, fix Windows support#2491
Merged
Merged
Conversation
harsh-amd
reviewed
May 12, 2026
harsh-amd
reviewed
May 12, 2026
harsh-amd
reviewed
May 12, 2026
harsh-amd
reviewed
May 12, 2026
chinmaydd
reviewed
May 12, 2026
chinmaydd
reviewed
May 12, 2026
Signed-off-by: xintin <gaurav.verma@amd.com>
Signed-off-by: xintin <gaurav.verma@amd.com>
Author
|
PR #2379 landed in amd-staging and uses the old weak-symbol + |
2c0862e to
28af859
Compare
Signed-off-by: xintin <gaurav.verma@amd.com>
28af859 to
657b312
Compare
|
@xintin when merging, could you shorten down the commit message to a minimal description of the PR ? Jacob and me have been discussing about this and will share a directive soon but we want to move towards commit messages that reflect only what the goal of the PR is. EDIT: I guess directive is the wrong word, but we want to discuss this more broadly. |
lamb-j
added a commit
to ROCm/SPIRV-LLVM-Translator
that referenced
this pull request
May 13, 2026
ROCm/llvm-project#2491 (merged to amd-staging) fixes the HotswapMCTests link failure on Windows that #2479 tracked. The SED that stripped add_subdirectory(test-unit) is no longer needed — test-unit builds and runs cleanly. check-comgr on Windows now exercises lit + ctest + gtest, matching the Linux variant.
lamb-j
added a commit
to ROCm/SPIRV-LLVM-Translator
that referenced
this pull request
May 13, 2026
* [CI] Add Windows variant (Build + 3 test jobs) Adds Windows::release dispatcher job + spirv-ci-windows.yml — full parity with the Linux variant: Build + Test SPIRV translator lit (with baseline-diff + gate) + Test LLVM SPIRV codegen + Test Comgr. Mirrors TheRock's Windows CI conventions: - Runner: azure-windows-scale-rocm - No container (Windows runs natively) - MSVC env via ilammy/msvc-dev-cmd action - Pinned ninja 1.12.1 via Chocolatey (1.13.0 has a known bug) - bash shell defaults - git long paths enabled - No strip step (PE/COFF debug info lives in separate .pdb) Translator-lit uses a separate sticky-comment marker (spirv-ci:translator-lit-windows) so it doesn't fight with the Linux variant's comment on the same PR. After this lands the PR rollup gets 4 more checks: SPIRV Compiler CI / Windows::release / Build SPIRV Compiler CI / Windows::release / Test SPIRV translator lit SPIRV Compiler CI / Windows::release / Test LLVM SPIRV codegen SPIRV Compiler CI / Windows::release / Test Comgr Wall-time risk: Windows build is typically 25-45 min vs ~14 min on Linux. Total wall-clock for a PR roughly 2x. Test-side risk: Windows lit + Comgr fixture generation are unproven on this runner. If individual test jobs flake, can break them out into follow-up PRs. Companion: ROCm/llvm-project copy of this workflow doesn't get the Windows variant in this PR — defer until Windows is proven on the translator side. * [CI] Touch build.ninja after untar (Windows variant) Same fix as #197 for the Linux variant. tar -m sets per-file mtimes from sequential extraction order; build.ninja ends up older than CMakeCache.txt, triggering ninja's regen rule and cascade rebuild. Touch build.ninja explicitly to make it the newest file in the tree. * [CI debug] Locate AMDDeviceLibsConfig.cmake on Windows Comgr's find_package(AMDDeviceLibs) fails on Windows. Add a one-shot debug step before the Comgr configure to print: - $PWD (validates path expansion in MSVC bash) - file path of AMDDeviceLibs*.cmake under build-device-libs - any cmake/ subdirs under build-device-libs Once we see where the file actually lands we can pin AMDDeviceLibs_DIR explicitly. Revert this debug after the targeted fix. * [CI] Use pwd -W on Windows for cmake -D paths Comgr's find_package(AMDDeviceLibs) failed on Windows even though the config file was at the expected build-device-libs/lib/cmake/AMDDeviceLibs/ location. Cause: $PWD in MSYS bash returns a Unix-style path (/c/home/runner/...) that cmake's find_package can't traverse on Windows. Switch device-libs and Comgr configures to compute PWD_WIN=$(pwd -W) which returns Windows-style with forward slashes (C:/home/runner/...) that cmake handles natively. Also pin AMDDeviceLibs_DIR explicitly as belt-and-suspenders. Drop the debug step. * [CI] Skip Comgr gtests on Windows (lit + ctest only) Comgr's HotswapMCTests gtest binary fails to link on Windows because the hotswap apply* path is fully MSVC-guarded out (see ROCm/llvm-project#2479). Replace `ninja check-comgr` with `ninja test test-lit` on Windows so we run the lit suite and ctest without dragging in the gtest aggregator (test-unit). Linux variant unchanged — still runs full check-comgr including the gtest binaries. * [CI] Skip Comgr test-unit subdirectory on Windows Drop test-unit (gtest binaries) at configure time so check-comgr's lit and ctest layers still work on Windows. HotswapMCTests fails to link because of the MSVC-guarded hotswap apply* path (ROCm/llvm-project#2479); rather than special-casing the test step, sed out add_subdirectory(test-unit) from Comgr's CMakeLists.txt before configuring. This is a workflow-side hack — when the upstream Comgr issue is resolved (or Comgr exposes a CMake option to gate test-unit), drop the sed and revert to plain check-comgr. * [CI] Enable zstd via vcpkg for clang-offload-bundler (Windows) Two of the Comgr lit failures on Windows (unbundle, cache) were hitting "Compression not supported" from clang-offload-bundler — LLVM was built without zstd or zlib. The manylinux container ships zstd dev libs on Linux; on Windows we install via vcpkg (already present at $VCPKG_ROOT) and pass CMAKE_TOOLCHAIN_FILE + LLVM_ENABLE_ZSTD=FORCE_ON so the configure fails loudly if zstd isn't located. * [CI] Use $VCPKG_ROOT vcpkg.exe to install zstd (Windows) The runner has two vcpkg installs: C:\vcpkg (first in PATH) and the MSVC-bundled one at $VCPKG_ROOT. Plain `vcpkg install` resolved to C:\vcpkg, but Configure LLVM's CMAKE_TOOLCHAIN_FILE points at the MSVC-bundled vcpkg.cmake, so find_package(zstd) couldn't see the package. Invoke $VCPKG_ROOT\vcpkg.exe directly so install and lookup share the same root. * [CI] Switch to C:\vcpkg for classic-mode install (Windows) The MSVC-bundled vcpkg at $VCPKG_ROOT is manifest-mode only — `vcpkg install <pkg>` errors out with "this distribution does not have a classic mode instance". The other vcpkg on the runner (C:\vcpkg, what plain `vcpkg` resolves to via PATH) supports classic mode. Point both the install command and CMAKE_TOOLCHAIN_FILE at C:\vcpkg so the package zstd installs and the LLVM cmake find_package locate it. * [CI] Pass vcpkg toolchain to device-libs and Comgr (Windows) LLVM built with zstd exports LLVMSupport linking against zstd::libzstd_shared. Downstream find_package(LLVM) callers in device-libs and Comgr need CMAKE_TOOLCHAIN_FILE pointing at the same vcpkg.cmake so the imported zstd target is defined when the export file is processed; otherwise cmake errors with "target was not found". * [CI] Install zstd on Windows test runners (tactical) build.ninja embeds C:/vcpkg/installed/.../zstd.lib as a build-edge input that ninja validates at planning time. Test jobs running ninja against the downloaded artifact need the same vcpkg state as the build runner. Install zstd in each Windows test job to match. This pattern doesn't scale — every new build-time system dep requires updates to N test jobs. Followup planned to drop ninja from test jobs in favor of llvm-lit-direct invocation, which removes the test-time dependency on build-time system libs. * [CI] Drop SED workaround for Comgr test-unit on Windows ROCm/llvm-project#2491 (merged to amd-staging) fixes the HotswapMCTests link failure on Windows that #2479 tracked. The SED that stripped add_subdirectory(test-unit) is no longer needed — test-unit builds and runs cleanly. check-comgr on Windows now exercises lit + ctest + gtest, matching the Linux variant. * [CI] Pin test jobs to Build job's resolved llvm-project SHA Build and test jobs both used \`ref: amd-staging\` for llvm-project checkout. If amd-staging advances mid-run (e.g., upstream merge lands between Build and Test), test jobs land on a newer commit than what Build produced. ninja then sees source > object, decides to rebuild, and uses the OLD compile command from build.ninja — which can lack include paths added by the newer commit (recently hit when libc/shared became a Support dependency for APFloat). Build job now exports the resolved SHA as a job output. Test jobs check out llvm-project at that exact SHA via needs.build.outputs. Both Linux and Windows variants.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Requires: to run CI in order to test the fix.
Summary
Replaces the
LLVM_ATTRIBUTE_WEAK+#if !defined(_MSC_VER)patch dispatch pattern with a function-pointer vtable driven by a centralcomgr-hotswap-patches.defregistry.Addresses issue #2479: on Windows,
LLVM_ATTRIBUTE_WEAKexpands to nothing under MSVC, so the no-op stubs incomgr-hotswap-b0a0.cppbecame regular definitions and either (a) collided with the strong overrides asLNK2005duplicate symbols, or (b) silently won every dispatch when the patch.cppfiles were guarded out with#if !defined(_MSC_VER), disabling hotswap entirely while still reporting build success.The new dispatch contract is:
one slot on
HotswapPatchVTableper patch family;one alpha-ordered line in
comgr-hotswap-patches.defper merged patch; oneregister*Patch(HotswapPatchVTable&)exported from eachcomgr-hotswap-patch-*.cpp;installHotswapPatches()walks the.defto bind every slot;retargetCodeObjectB0A0drives the install once viastd::call_once;a missing registrar is a libamd_comgr / HotswapMCTests link error, not a silent feature disable.
applyWmmaSplitPatchesandapplyScratchPatcheskeep nullptr defaults because no patch file ships them yet.The dispatcher treats nullptr as a no-op slot, preserving the previous "weak stub returns 0" behaviour until those families land their first strong override.
Changes
HOTSWAP_PATCH(Name)line per merged patch (InPlace,Trampoline,Vop3px2Src2,WmmaHazard).HotswapPatchVTable,getHotswapPatchVTable(),installHotswapPatches(). Forward-declare everyregister*Patchfrom the same.defso internal.h is the single prototype source for both libamd_comgr and the unit tests. Remove the per-passapplyXxxexternal declarations -- impls are now file-local in their patch.cpp.apply*stubs and the forward-declaration block. Add the singleton accessor and the.def-driveninstallHotswapPatches. Route the dispatcher through the vtable via a smallrunPerInstPasshelper that treats nullptr slots as 0-return. Install once per process fromretargetCodeObjectB0A0understd::call_once. Liveness / DWARF stubs that still have no strong override anywhere in tree keep theLLVM_ATTRIBUTE_WEAKpattern -- they migrate the same way the first time a real implementation lands.#if !defined(_MSC_VER)guard and its explanatory comment. Rename the dispatcher entry pointapplyXxxto a file-staticapplyXxxImpl. Add a file-scoperegisterXxxPatch(HotswapPatchVTable&)that binds the impl into the vtable. Test-facing helpers (classifyWmmaNops,patchScaleSrc2) now compile unconditionally, fixing theLNK2019failures inHotswapMCTestson MSVC.Tests
Link errors catch
.defentries without matchingregister*Patchdefinitions. The tests cover the runtime binding behavior the linker cannot validate.TEST(HotswapPatchVTable, ...)blocks.RegisterInPlaceBindsOnlyInPlaceSlot-- canonical worked example that pins the "binds only its own slot" invariant for one installer. Wrong-slot bugs in the other threeregister*Patchfunctions are caught transitively by the install end-to-end test below.InstallBindsRegisteredAndLeavesUnregisteredNull-- end-to-end install: defaults null, every.defentry gets bound, slots without a.defentry (wmma-split,scratch) stay null. Pins the dispatcher's no-op contract for unimplemented pass families.ProcessSingletonIdentityAndInstallPersists--getHotswapPatchVTable()is a real Meyers singleton and bindings persist on it; theretargetCodeObjectB0A0install path understd::call_oncerelies on this.comgr-hotswap-b0a0.cpp,comgr-hotswap-patch-inplace.cpp, andcomgr-hotswap-patch-trampoline.cppintoHotswapMCTests(the other two patch sources were already there) soinstallHotswapPatchesand everyregister*Patchresolve at link time. Apatches.defline without a matching definition produces aHotswapMCTestslink error, which is itself a regression guard for the new contract.Follow-up (deferred)
ds_2addr_stride64per #2379, need a one-line.defadd and aregister*Patchshim when they rebase.LLVM_ATTRIBUTE_WEAKliveness / DWARF stubs (buildCfg,computeLiveness,getInstRegDefUse,getBranchImm,verifyPatchCorrectness,addTrampolineSymbols,patchDebug*) onto the same.defcontract the first time any of them grows a real implement.