Skip to content

[Comgr][hotswap] Add AMDGPU MC state setup#2439

Open
tgymnich wants to merge 2 commits into
users/tgymnich/hotswap-pr-02-code-object-metadatafrom
users/tgymnich/hotswap-pr-03-mc-state
Open

[Comgr][hotswap] Add AMDGPU MC state setup#2439
tgymnich wants to merge 2 commits into
users/tgymnich/hotswap-pr-02-code-object-metadatafrom
users/tgymnich/hotswap-pr-03-mc-state

Conversation

@tgymnich
Copy link
Copy Markdown

@tgymnich tgymnich commented May 7, 2026

Adds mc_state.{hpp,cpp} providing initMCState — a one-call setup of the LLVM MC stack (Target, Triple, MCInstrInfo, MCRegisterInfo, MCAsmInfo, MCSubtargetInfo, MCContext, MCDisassembler, MCInstPrinter) keyed off an AMDGPU ISA name.

The MC stack is consumed by the disassembler and per-handler raiser code that lands in subsequent commits. Includes the
InlineSrcMgr-on-MCContext defensive init that prevents the SIG6 Either SourceMgr should be available UNREACHABLE abort, with a gtest regression fence.

CMakeLists gains the LLVM build-tree requirement, the AMDGPU target-private include paths (SIDefines.h, AMDGPUBaseInfo.h, the TableGen-generated AMDGPUGen*.inc files), and the LLVM MC + AMDGPU library link surface.

@tgymnich tgymnich requested review from chinmaydd and lamb-j as code owners May 7, 2026 16:28
@chinmaydd chinmaydd added comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature labels May 7, 2026
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from d1a05c9 to 8be9478 Compare May 8, 2026 10:17
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from e8a1fa9 to 9183737 Compare May 8, 2026 10:18
@tgymnich tgymnich marked this pull request as draft May 8, 2026 21:26
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 8be9478 to 6ab4011 Compare May 11, 2026 11:41
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 9183737 to 6122d14 Compare May 11, 2026 11:41
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 6ab4011 to dd6bb68 Compare May 11, 2026 12:32
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 6122d14 to e1822d4 Compare May 11, 2026 12:32
@tgymnich tgymnich marked this pull request as ready for review May 11, 2026 12:40
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from dd6bb68 to 94e24d0 Compare May 11, 2026 12:42
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from e1822d4 to b9c85a5 Compare May 11, 2026 12:42
@tgymnich tgymnich marked this pull request as draft May 11, 2026 15:19
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch 3 times, most recently from 2b8bfde to 70e5099 Compare May 11, 2026 20:21
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from b9c85a5 to dade881 Compare May 11, 2026 20:22
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 70e5099 to 73d63ce Compare May 11, 2026 20:48
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from dade881 to c1170b2 Compare May 11, 2026 20:48
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 73d63ce to 4314144 Compare May 11, 2026 21:26
@tgymnich tgymnich marked this pull request as ready for review May 13, 2026 14:18
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 73d12ca to cb7ba8d Compare May 13, 2026 14:25
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 963fd34 to 5f3d5aa Compare May 13, 2026 15:40
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from cb7ba8d to ccdfadc Compare May 15, 2026 11:16
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5f3d5aa to 5a5c860 Compare May 15, 2026 11:16
@tgymnich tgymnich requested a review from ftynse May 15, 2026 11:44
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from ccdfadc to 9bc71b6 Compare May 15, 2026 13:06
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5a5c860 to a098906 Compare May 15, 2026 13:06
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9bc71b6 to 9880187 Compare May 15, 2026 15:28
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from a098906 to 4eda47a Compare May 15, 2026 15:28
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9880187 to afbdc65 Compare May 15, 2026 16:06
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4eda47a to 4b9f8b4 Compare May 15, 2026 16:07
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from afbdc65 to 9978b9f Compare May 15, 2026 16:18
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4b9f8b4 to 97a7c25 Compare May 15, 2026 16:18
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 9978b9f to a1bb2b1 Compare May 15, 2026 18:54
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 97a7c25 to 4dbeb97 Compare May 15, 2026 18:55
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from a1bb2b1 to e20dd25 Compare May 15, 2026 20:45
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4dbeb97 to 8cbf5bd Compare May 15, 2026 20:45
Copy link
Copy Markdown

@ftynse ftynse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two overall design points:

  1. We should seriously consider having the opcode->canonical-op table be produced from tablegen inputs defining the instruction sets. This will ensure a single source of truth with the lowering path and provide compile-time completeness checking for us instead of runtime checking. It may also somewhat avoid the huge lists of boilerplate macros.
  2. There should be some way to test these. The documentation refers to some corpus of tests, but those are integration tests at best.

More minor design point is to avoid parsing the string from of mnemonics. There should be a better way of obtaining the same information from LLVM.

Comment thread amd/comgr/src/hotswap/amdgpu-formats.h Outdated
// type used by the disassembler's TSFlags / OperandType fields.
#include "SIDefines.h"

#include "Utils/AMDGPUBaseInfo.h" // AMDGPU::isVOPD
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is excessive

Comment thread amd/comgr/src/hotswap/amdgpu-formats.h Outdated
Comment on lines +24 to +25
// Alias `COMGR::hotswap::SIInstrFlags` to the LLVM namespace so existing call
// sites (`SIInstrFlags::SOPP`, `SIInstrFlags::FLAT`, etc.) keep compiling.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "existing call sites" mean here? Don't allow temporal references in comments, they are useless if not harmful over time.

Comment thread amd/comgr/src/hotswap/amdgpu-formats.h Outdated

// AMDGPU target-specific operand type for VOP3 source modifiers (abs, neg).
// Defined in llvm::AMDGPU::OperandType from SIDefines.h.
constexpr unsigned OPERAND_INPUT_MODS = llvm::AMDGPU::OPERAND_INPUT_MODS;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the field from llvm::AMDGPU directly?

Comment thread amd/comgr/src/hotswap/amdgpu-formats.h Outdated
// * `VOP3P` coexists with `VOP3` on some subtargets; check VOP3P first.
// * VOPD has no dedicated TSFlags bit (LLVM's VOPD3 bit varies across
// versions); use `AMDGPU::isVOPD(opc)` instead.
inline const char *formatName(uint64_t Flags, unsigned Opc) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringRef. Never use const char *.

Comment thread amd/comgr/src/hotswap/amdgpu-formats.h Outdated
// the handler refusal contract is keyed on.
if (Flags & SIInstrFlags::VIMAGE) return "VIMAGE";
if (Flags & SIInstrFlags::TENSOR_CNT) return "VIMAGE";
return "Unknown";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather use Expected<StringRef> here or another proper error reporting mechanism that a magic string that is then string-matched to detect error.

Comment on lines +1588 to +1592
// Rationale: LLVM exposes `AMDGPU::getVCMPXOpFromVCMP` as a V_CMP → V_CMPX
// mapping, but no public helper that hands back a CmpInst::Predicate or
// element width. Rather than hand-list 100 opcode→metadata pairs we parse
// the pseudo name once at init time; the same token grammar is already
// hard-coded in LLVM's TableGen for these instructions.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dubious. There's already a huge table above with a lot of boilerplate, and this function + doc is longer than 100 LoC so we don't even save space at a cost of extra complexity and potentially fragile string parsing.

Comment on lines +1681 to +1684
CanonicalOp OpcodeMap::lookup(unsigned Opcode) const {
auto It = Map.find(Opcode);
return It != Map.end() ? It->second : CanonicalOp::Unknown;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only there was DenseMap::lookup/lookup_or.

Comment thread amd/comgr/src/hotswap/opcode-map.cpp Outdated
Comment on lines +1323 to +1330
// `_nosdst_` collapse: starting with GFX11, VOPC CMPX instructions no longer
// write a scalar destination register (EXEC receives the mask directly) and
// LLVM represents this as a `_nosdst_` variant. The non-`_nosdst_` target
// form keeps the scalar dst (for older subtargets). Both forms share
// dispatch-relevant TSFlags; the raiser's CMPX handler only writes EXEC and
// ignores the optional sdst, so collapsing the variant onto the base is
// safe. The source has one fewer def when the base includes sdst (e64
// forms) and the same number of defs otherwise (e32, where both lack sdst).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what this comment relates to, perhaps nosdstDropsScalarDef way below.

ByName.try_emplace(MCII.getName(P), P);

struct Rule {
llvm::StringRef Needle;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: consider a more descriptive name.

Comment on lines +1454 to +1471
auto StripOnce = [&](llvm::StringRef Name, std::string &Out) -> int {
for (size_t I = 0; I < std::size(Rules); ++I) {
const Rule &R = Rules[I];
if (R.IsSuffix) {
if (!Name.ends_with(R.Needle))
continue;
Out = Name.drop_back(R.Needle.size()).str();
return static_cast<int>(I);
}
size_t Pos = Name.find(R.Needle);
if (Pos == llvm::StringRef::npos)
continue;
Out = (Name.substr(0, Pos).str() + std::string("_") +
Name.substr(Pos + R.Needle.size()).str());
return static_cast<int>(I);
}
return -1;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SmallString + the corresponding ostream would avoid reallocations here.

martin-luecke and others added 2 commits May 18, 2026 13:55
… in raiser

Adds the AMDGPU code-object metadata extraction surface that the rest of
the raiser pipeline depends on:

* `extractTextSection`, `listKernelNames`, `extractKernelMeta`,
  `findKernelSymbolOffset`, and `detectIsaFromElf` in
  `hotswap/code_object_utils.{h,cpp}`.
* `KernelMeta` populated with both MsgPack-derived fields (kernarg/group/
  private segment sizes, args) and the raw kernel-descriptor register
  bytes (`compute_pgm_rsrc{1,2}`, `kernel_code_properties`,
  `kernarg_preload`) read from `<name>.kd` in `.rodata`.

Reuses comgr's existing parsing infrastructure rather than duplicating
it: each helper wraps the input ELF bytes in an
`AMD_COMGR_DATA_KIND_EXECUTABLE` data object via a small
`ScopedDataObject` RAII helper and routes through
`metadata::getMetadataRoot` / `metadata::getElfIsaName` for the metadata
walk and ISA-name extraction.

Also adopts llvm::Error / llvm::Expected for the raiser scaffolding
entry point (mirrors upstream rocm commit f68dd49): raiseToIR
returns Expected<RaiseResult>, validation failures surface as
HotswapError values via the existing hotswap-error.h, and the
handrolled raise-failure.h is removed. An empty-input scaffolding
mode bypass is added so test scaffolding can stub the raiser without
standing up a real ISA + metadata pair. The KD-bytes lookup comments
reflect partial-success (MsgPack-derived fields are returned even if
.rodata KD parse fails).

No MC stack, no disassembler, no opcode canonicalisation yet -- those
land in subsequent commits.

Co-Authored-By: Tim Gymnich <tim@gymni.ch>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Alex Zinenko <git@ozinenko.com>
Adds the three foundation files the per-format handlers will branch on:

  * canonical_op.{h,cpp} — `CanonicalOp` enum (one entry per
    target-independent semantic op the raiser knows how to lift) and a
    name table for diagnostics.
  * amdgpu_formats.h — convenience aliases over `SIInstrFlags` so
    handler dispatches read as `Flags & FORMAT_VALU` / `FORMAT_SOPK`
    instead of hand-typing the bit layout.
  * opcode_map.{h,cpp} — `OpcodeMap` builds an MC-opcode -> CanonicalOp
    table once at first use by walking the MCInstrInfo registered for
    a given AMDGPU subtarget. Per-handler entries are folded into the
    table here; the actual `lookup` consumers come in subsequent
    handler commits.

`hotswap/CMakeLists.txt` grows to compile canonical_op.cpp +
opcode_map.cpp into the OBJECT lib. No new external dependencies.

Co-Authored-By: Tim Gymnich <tim@gymni.ch>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 8cbf5bd to d3b2b77 Compare May 18, 2026 14:38
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from e20dd25 to 877df15 Compare May 18, 2026 14:38
@ftynse ftynse force-pushed the users/tgymnich/hotswap-pr-02-code-object-metadata branch from 877df15 to 9428a8b Compare May 21, 2026 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants