[Comgr][hotswap] Add AMDGPU MC state setup by tgymnich · Pull Request #2439 · ROCm/llvm-project

tgymnich · 2026-05-07T16:28:28Z

Adds mc_state.{hpp,cpp} providing initMCState — a one-call setup of the LLVM MC stack (Target, Triple, MCInstrInfo, MCRegisterInfo, MCAsmInfo, MCSubtargetInfo, MCContext, MCDisassembler, MCInstPrinter) keyed off an AMDGPU ISA name.

The MC stack is consumed by the disassembler and per-handler raiser code that lands in subsequent commits. Includes the
InlineSrcMgr-on-MCContext defensive init that prevents the SIG6 Either SourceMgr should be available UNREACHABLE abort, with a gtest regression fence.

CMakeLists gains the LLVM build-tree requirement, the AMDGPU target-private include paths (SIDefines.h, AMDGPUBaseInfo.h, the TableGen-generated AMDGPUGen*.inc files), and the LLVM MC + AMDGPU library link surface.

tgymnich · 2026-05-07T16:29:52Z

Warning

Stacked PR

ftynse

Two overall design points:

We should seriously consider having the opcode->canonical-op table be produced from tablegen inputs defining the instruction sets. This will ensure a single source of truth with the lowering path and provide compile-time completeness checking for us instead of runtime checking. It may also somewhat avoid the huge lists of boilerplate macros.
There should be some way to test these. The documentation refers to some corpus of tests, but those are integration tests at best.

More minor design point is to avoid parsing the string from of mnemonics. There should be a better way of obtaining the same information from LLVM.

ftynse · 2026-05-18T07:10:59Z

+// type used by the disassembler's TSFlags / OperandType fields.
+#include "SIDefines.h"
+
+#include "Utils/AMDGPUBaseInfo.h" // AMDGPU::isVOPD


The comment here is excessive

ftynse · 2026-05-18T07:11:58Z

+// Alias `COMGR::hotswap::SIInstrFlags` to the LLVM namespace so existing call
+// sites (`SIInstrFlags::SOPP`, `SIInstrFlags::FLAT`, etc.) keep compiling.


What does "existing call sites" mean here? Don't allow temporal references in comments, they are useless if not harmful over time.

ftynse · 2026-05-18T07:12:15Z

+
+// AMDGPU target-specific operand type for VOP3 source modifiers (abs, neg).
+// Defined in llvm::AMDGPU::OperandType from SIDefines.h.
+constexpr unsigned OPERAND_INPUT_MODS = llvm::AMDGPU::OPERAND_INPUT_MODS;


Why not use the field from llvm::AMDGPU directly?

ftynse · 2026-05-18T07:14:52Z

+//   * `VOP3P` coexists with `VOP3` on some subtargets; check VOP3P first.
+//   * VOPD has no dedicated TSFlags bit (LLVM's VOPD3 bit varies across
+//     versions); use `AMDGPU::isVOPD(opc)` instead.
+inline const char *formatName(uint64_t Flags, unsigned Opc) {


StringRef. Never use const char *.

ftynse · 2026-05-18T07:15:30Z

+  // the handler refusal contract is keyed on.
+  if (Flags & SIInstrFlags::VIMAGE)     return "VIMAGE";
+  if (Flags & SIInstrFlags::TENSOR_CNT) return "VIMAGE";
+  return "Unknown";


I would rather use Expected<StringRef> here or another proper error reporting mechanism that a magic string that is then string-matched to detect error.

ftynse · 2026-05-18T08:34:19Z

+// Rationale: LLVM exposes `AMDGPU::getVCMPXOpFromVCMP` as a V_CMP → V_CMPX
+// mapping, but no public helper that hands back a CmpInst::Predicate or
+// element width. Rather than hand-list 100 opcode→metadata pairs we parse
+// the pseudo name once at init time; the same token grammar is already
+// hard-coded in LLVM's TableGen for these instructions.


Dubious. There's already a huge table above with a lot of boilerplate, and this function + doc is longer than 100 LoC so we don't even save space at a cost of extra complexity and potentially fragile string parsing.

ftynse · 2026-05-18T08:35:21Z

+CanonicalOp OpcodeMap::lookup(unsigned Opcode) const {
+  auto It = Map.find(Opcode);
+  return It != Map.end() ? It->second : CanonicalOp::Unknown;
+}


If only there was DenseMap::lookup/lookup_or.

ftynse · 2026-05-18T08:40:51Z

+// `_nosdst_` collapse: starting with GFX11, VOPC CMPX instructions no longer
+// write a scalar destination register (EXEC receives the mask directly) and
+// LLVM represents this as a `_nosdst_` variant. The non-`_nosdst_` target
+// form keeps the scalar dst (for older subtargets). Both forms share
+// dispatch-relevant TSFlags; the raiser's CMPX handler only writes EXEC and
+// ignores the optional sdst, so collapsing the variant onto the base is
+// safe. The source has one fewer def when the base includes sdst (e64
+// forms) and the same number of defs otherwise (e32, where both lack sdst).


I don't see what this comment relates to, perhaps nosdstDropsScalarDef way below.

ftynse · 2026-05-18T08:42:04Z

+    ByName.try_emplace(MCII.getName(P), P);
+
+  struct Rule {
+    llvm::StringRef Needle;


Nit: consider a more descriptive name.

ftynse · 2026-05-18T08:44:03Z

+  auto StripOnce = [&](llvm::StringRef Name, std::string &Out) -> int {
+    for (size_t I = 0; I < std::size(Rules); ++I) {
+      const Rule &R = Rules[I];
+      if (R.IsSuffix) {
+        if (!Name.ends_with(R.Needle))
+          continue;
+        Out = Name.drop_back(R.Needle.size()).str();
+        return static_cast<int>(I);
+      }
+      size_t Pos = Name.find(R.Needle);
+      if (Pos == llvm::StringRef::npos)
+        continue;
+      Out = (Name.substr(0, Pos).str() + std::string("_") +
+             Name.substr(Pos + R.Needle.size()).str());
+      return static_cast<int>(I);
+    }
+    return -1;
+  };


SmallString + the corresponding ostream would avoid reallocations here.

… in raiser Adds the AMDGPU code-object metadata extraction surface that the rest of the raiser pipeline depends on: * `extractTextSection`, `listKernelNames`, `extractKernelMeta`, `findKernelSymbolOffset`, and `detectIsaFromElf` in `hotswap/code_object_utils.{h,cpp}`. * `KernelMeta` populated with both MsgPack-derived fields (kernarg/group/ private segment sizes, args) and the raw kernel-descriptor register bytes (`compute_pgm_rsrc{1,2}`, `kernel_code_properties`, `kernarg_preload`) read from `<name>.kd` in `.rodata`. Reuses comgr's existing parsing infrastructure rather than duplicating it: each helper wraps the input ELF bytes in an `AMD_COMGR_DATA_KIND_EXECUTABLE` data object via a small `ScopedDataObject` RAII helper and routes through `metadata::getMetadataRoot` / `metadata::getElfIsaName` for the metadata walk and ISA-name extraction. Also adopts llvm::Error / llvm::Expected for the raiser scaffolding entry point (mirrors upstream rocm commit f68dd49): raiseToIR returns Expected<RaiseResult>, validation failures surface as HotswapError values via the existing hotswap-error.h, and the handrolled raise-failure.h is removed. An empty-input scaffolding mode bypass is added so test scaffolding can stub the raiser without standing up a real ISA + metadata pair. The KD-bytes lookup comments reflect partial-success (MsgPack-derived fields are returned even if .rodata KD parse fails). No MC stack, no disassembler, no opcode canonicalisation yet -- those land in subsequent commits. Co-Authored-By: Tim Gymnich <tim@gymni.ch> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-Authored-By: Alex Zinenko <git@ozinenko.com>

Adds the three foundation files the per-format handlers will branch on: * canonical_op.{h,cpp} — `CanonicalOp` enum (one entry per target-independent semantic op the raiser knows how to lift) and a name table for diagnostics. * amdgpu_formats.h — convenience aliases over `SIInstrFlags` so handler dispatches read as `Flags & FORMAT_VALU` / `FORMAT_SOPK` instead of hand-typing the bit layout. * opcode_map.{h,cpp} — `OpcodeMap` builds an MC-opcode -> CanonicalOp table once at first use by walking the MCInstrInfo registered for a given AMDGPU subtarget. Per-handler entries are folded into the table here; the actual `lookup` consumers come in subsequent handler commits. `hotswap/CMakeLists.txt` grows to compile canonical_op.cpp + opcode_map.cpp into the OBJECT lib. No new external dependencies. Co-Authored-By: Tim Gymnich <tim@gymni.ch> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

tgymnich requested review from chinmaydd and lamb-j as code owners May 7, 2026 16:28

This was referenced May 7, 2026

[Comgr][hotswap] Add code-object metadata parsing #2437

Open

[Comgr][hotswap] Add bare-bones hotswap transpiler scaffolding #2438

Merged

tgymnich requested review from kzhuravl, martin-luecke, yoonseoch and yxsamliu May 7, 2026 16:30

tgymnich mentioned this pull request May 7, 2026

Patch landing stack for AMD LLVM fork martin-luecke/rocm-systems#47

Open

3 tasks

chinmaydd added comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature labels May 7, 2026

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from d1a05c9 to 8be9478 Compare May 8, 2026 10:17

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from e8a1fa9 to 9183737 Compare May 8, 2026 10:18

tgymnich marked this pull request as draft May 8, 2026 21:26

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 8be9478 to 6ab4011 Compare May 11, 2026 11:41

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 9183737 to 6122d14 Compare May 11, 2026 11:41

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 6ab4011 to dd6bb68 Compare May 11, 2026 12:32

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 6122d14 to e1822d4 Compare May 11, 2026 12:32

tgymnich marked this pull request as ready for review May 11, 2026 12:40

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from dd6bb68 to 94e24d0 Compare May 11, 2026 12:42

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from e1822d4 to b9c85a5 Compare May 11, 2026 12:42

tgymnich marked this pull request as draft May 11, 2026 15:19

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch 3 times, most recently from 2b8bfde to 70e5099 Compare May 11, 2026 20:21

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from b9c85a5 to dade881 Compare May 11, 2026 20:22

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 70e5099 to 73d63ce Compare May 11, 2026 20:48

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from dade881 to c1170b2 Compare May 11, 2026 20:48

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 73d63ce to 4314144 Compare May 11, 2026 21:26

tgymnich marked this pull request as ready for review May 13, 2026 14:18

This was referenced May 13, 2026

[Comgr][hotswap] Add DecodedInst + basic kernel decoder #2517

Open

[Comgr][hotswap] Add wave projection + ISA profile + abi/wave-mask docs #2518

Open

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 73d12ca to cb7ba8d Compare May 13, 2026 14:25

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 963fd34 to 5f3d5aa Compare May 13, 2026 15:40

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from cb7ba8d to ccdfadc Compare May 15, 2026 11:16

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5f3d5aa to 5a5c860 Compare May 15, 2026 11:16

This was referenced May 15, 2026

[Comgr][hotswap] Add raise_failure + RaiseContext + register file #2551

Open

[Comgr][hotswap] Add raise_cli + lit harness for hotswap-raise fixtures #2552

Open

[Comgr][hotswap] Add SOPP + SOPC handlers #2553

Open

tgymnich requested a review from ftynse May 15, 2026 11:44

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from ccdfadc to 9bc71b6 Compare May 15, 2026 13:06

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5a5c860 to a098906 Compare May 15, 2026 13:06

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9bc71b6 to 9880187 Compare May 15, 2026 15:28

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from a098906 to 4eda47a Compare May 15, 2026 15:28

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9880187 to afbdc65 Compare May 15, 2026 16:06

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4eda47a to 4b9f8b4 Compare May 15, 2026 16:07

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from afbdc65 to 9978b9f Compare May 15, 2026 16:18

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4b9f8b4 to 97a7c25 Compare May 15, 2026 16:18

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9978b9f to a1bb2b1 Compare May 15, 2026 18:54

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 97a7c25 to 4dbeb97 Compare May 15, 2026 18:55

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from a1bb2b1 to e20dd25 Compare May 15, 2026 20:45

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4dbeb97 to 8cbf5bd Compare May 15, 2026 20:45

ftynse reviewed May 18, 2026

View reviewed changes

martin-luecke and others added 2 commits May 18, 2026 13:55

tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 8cbf5bd to d3b2b77 Compare May 18, 2026 14:38

tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from e20dd25 to 877df15 Compare May 18, 2026 14:38

tgymnich mentioned this pull request May 18, 2026

[Comgr][hotswap] Add setpc analysis + sync-translation doc #2588

Open

ftynse force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 877df15 to 9428a8b Compare May 21, 2026 11:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Comgr][hotswap] Add AMDGPU MC state setup#2439

[Comgr][hotswap] Add AMDGPU MC state setup#2439
tgymnich wants to merge 2 commits into
users/tgymnich/hotswap-pr-02-code-object-metadatafrom
users/tgymnich/hotswap-pr-03-mc-state

tgymnich commented May 7, 2026

Uh oh!

tgymnich commented May 7, 2026 •

edited

Loading

Uh oh!

ftynse left a comment

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

ftynse May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		// Alias `COMGR::hotswap::SIInstrFlags` to the LLVM namespace so existing call
		// sites (`SIInstrFlags::SOPP`, `SIInstrFlags::FLAT`, etc.) keep compiling.

Conversation

tgymnich commented May 7, 2026

Uh oh!

tgymnich commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ftynse left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

tgymnich commented May 7, 2026 •

edited

Loading