Skip to content

[Comgr][hotswap] Add DecodedInst + basic kernel decoder#2517

Open
tgymnich wants to merge 1 commit into
users/tgymnich/hotswap-pr-03-mc-statefrom
users/tgymnich/hotswap-pr-04-decoder
Open

[Comgr][hotswap] Add DecodedInst + basic kernel decoder#2517
tgymnich wants to merge 1 commit into
users/tgymnich/hotswap-pr-03-mc-statefrom
users/tgymnich/hotswap-pr-04-decoder

Conversation

@tgymnich
Copy link
Copy Markdown

Wraps the MC-layer disassembler in a hotswap-shaped API:

  • decoded_inst.h — DecodedInst value type bundling MCInst, raw bytes, kernel offset, TSFlags, and an explicit IsBranch / BranchTarget decoration so Phase-3 BB layout can run without re-querying MCInstrDesc.
  • parsed_reg.h — ParsedReg{Kind, BaseIdx, NReg} value type so handlers can match on register class without re-parsing MCOperand::getReg() at every consumer.
  • decode.{h,cpp} — decodeKernel walks the .text section once, populates a DecodeResult{Insts, BlockStarts, Offsets} view used by every subsequent raise phase. Block boundaries derive from the branch target set discovered during the linear sweep so Phase 3 can pre-create LLVM BasicBlocks before any IR builder lands.
  • mc_state.cpp — minor reshuffle to expose the MCDisassembler / MCSubtargetInfo pair to the new decode entry point.

@tgymnich tgymnich requested review from chinmaydd and lamb-j as code owners May 13, 2026 14:23
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 963fd34 to 5f3d5aa Compare May 13, 2026 15:40
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from 6e6f193 to 4d23bed Compare May 13, 2026 15:40
@tgymnich tgymnich requested a review from martin-luecke May 13, 2026 15:56
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5f3d5aa to 5a5c860 Compare May 15, 2026 11:16
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from 4d23bed to f434f7f Compare May 15, 2026 11:16
@tgymnich tgymnich requested a review from ftynse May 15, 2026 11:44
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 5a5c860 to a098906 Compare May 15, 2026 13:06
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from f434f7f to 5b815e6 Compare May 15, 2026 13:06
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from a098906 to 4eda47a Compare May 15, 2026 15:28
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from 5b815e6 to 634765b Compare May 15, 2026 15:28
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4eda47a to 4b9f8b4 Compare May 15, 2026 16:07
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from 634765b to a32b336 Compare May 15, 2026 16:07
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4b9f8b4 to 97a7c25 Compare May 15, 2026 16:18
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from a32b336 to 5249422 Compare May 15, 2026 16:18
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 97a7c25 to 4dbeb97 Compare May 15, 2026 18:55
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from 5249422 to eac9fc5 Compare May 15, 2026 18:55
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 4dbeb97 to 8cbf5bd Compare May 15, 2026 20:45
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from eac9fc5 to c1bbdbf Compare May 15, 2026 20:46
Copy link
Copy Markdown

@ftynse ftynse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the DecodedInst data structure looks needlessly fat and redundant. A lot of its fields are duplicating information from MCInst, which it also stores, and should be converted into accessor functions. Some flags are relevant for any instruction with a specific name/class, not a particular instance, so there is no value in storing them per-instance except micro-optimization on the hot path (that would require data prior to implementing them). Even when it's a hot path, a plausible solution would be to have a look-aside table with that information and store pointers into that table instead.

There are also comments about string parsing, though I don't really see any code performing, and would complain about it unless the approach is unavoidable. The only place with string parsing seems to be vcmp that I already complained about in the previous PR.

Comment on lines +21 to +23
}

namespace COMGR::hotswap {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
namespace COMGR::hotswap {


namespace COMGR::hotswap {

struct DecodedInst {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a doc. All of the fields need doc.


} // namespace COMGR::hotswap

#endif
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: add a comment saying which #if this closes.

Comment on lines +26 to +27
std::string Mnemonic;
std::string RawMnemonic;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to store the textual mnemonic, twice? For every instance of an instruction. Can't we rather have an enum entry and/or the opcode. It also looks like we have the MCInst below, so this may even be redundant.

struct DecodedInst {
std::string Mnemonic;
std::string RawMnemonic;
std::string FullText;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really need to store textual representation of the instruction for some reason, as opposed to generating it on-the-fly from MCInt, maybe we can generate full textual assembly once into a memory mapped file or something like that and only store pointers/StringRefs to that instead of piecemeal dynamic allocations of strings for every instance.

DecodedInst Di;
Di.RawMnemonic = getMnemonic(Mc, Inst);
{
std::string S;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SmallString to avoid dynamic allocations.

Comment on lines +696 to +697
Mc.Printer->printInst(&Inst, 0, "", *Mc.SubtargetInfo, Os);
Di.FullText = StringRef(S).ltrim().str();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above about printing the whole function and keeping references instead, assuming this is needed at all.

Mc.Printer->printInst(&Inst, 0, "", *Mc.SubtargetInfo, Os);
Di.FullText = StringRef(S).ltrim().str();
}
Di.Mnemonic = stripEncoding(StringRef(Di.RawMnemonic)).str();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be done on the fly?

break;
}
Off += InstSize;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we want to signal how many bytes we processed if it's not the whole input.

Comment on lines +177 to +179
static constexpr unsigned KMaxSrcs = 24;
unsigned SrcMap[KMaxSrcs] = {};
unsigned ModMap[KMaxSrcs] = {};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are storing 48*4=192 bytes of information per instruction here where the overwhelming majority of instructions won't ever use 24 sources. Sounds ridiculously wasteful. Just use a SmallVector.

@ftynse
Copy link
Copy Markdown

ftynse commented May 18, 2026

A separate question is testing. We need a way to test this as well.

Wraps the MC-layer disassembler in a hotswap-shaped API:

  * decoded_inst.h — `DecodedInst` value type bundling MCInst, raw
    bytes, kernel offset, TSFlags, and an explicit `IsBranch` /
    `BranchTarget` decoration so Phase-3 BB layout can run without
    re-querying MCInstrDesc.
  * parsed_reg.h — `ParsedReg{Kind, BaseIdx, NReg}` value type so
    handlers can match on register class without re-parsing
    `MCOperand::getReg()` at every consumer.
  * decode.{h,cpp} — `decodeKernel` walks the .text section once,
    populates a `DecodeResult{Insts, BlockStarts, Offsets}` view used
    by every subsequent raise phase. Block boundaries derive from the
    branch target set discovered during the linear sweep so Phase 3
    can pre-create LLVM BasicBlocks before any IR builder lands.
  * mc_state.cpp — minor reshuffle to expose the `MCDisassembler` /
    `MCSubtargetInfo` pair to the new decode entry point.

Co-Authored-By: Tim Gymnich <tim@gymni.ch>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-03-mc-state branch from 8cbf5bd to d3b2b77 Compare May 18, 2026 14:38
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-04-decoder branch from c1bbdbf to 36b17dd Compare May 18, 2026 14:38
@chinmaydd chinmaydd added comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature labels May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants