[0035] Specify a way to transpose Thread Scope matrices on load and constraint loads to A-type Matrices.#803
Conversation
| Not all component types support transposing, it is implementation specific. | ||
| Applications need to query the driver to determine if a matrix transpose is | ||
| supported. | ||
|
|
There was a problem hiding this comment.
This sounds like we need some changes to the metadata and PSV0 to ensure that we are surfacing whether or not transpose is used.
There was a problem hiding this comment.
Yep, I can add a Layout field, but before that a couple of questions. Currently we have,
struct MatrixUse {
uint32_t Dimensions[3]; // M, N, K
uint8_t Scope;
uint8_t OperandType;
uint8_t ResultType;
uint8_t RESERVED; // unused but reserved for padding/alignment.
uint32_t Flags; // do we need this?
};
It seems to capture all the required details to validate the Matrix x Matrix Multiply operation, but not for Matrix x Vector operations, OuterProduct and the recently ratified VectorAccumulate. Should we add new structs to capture the relevant validation information for those?
struct MatrixVectorOpsUse {
uint8_t InputVecType;
uint8_t MatrixType;
bool MatrixIsTranposed;
uint8_t BiasType;
}
and a similar structs for OuterProdAcc and VectorAcc. If this seems ok, I'll add these to the PR.
There was a problem hiding this comment.
The right approach might be to make a bigger change. Something like:
struct MatrixUseBase {
uint8_t Scope;
uint8_t OperandType;
}
struct MatrixMatrixUse : public MatrixUseBase {
uint8_t ResultType;
uint8_t RESERVED; // unused but reserved for padding/alignment.
uint32_t Flags; // do we need this?
uint32_t Dimensions[3]; // M, N, K
};
struct MatrixVectorUse : public MatrixUseBase{
uint8_t InputVecType;
uint8_t BiasType;
uint32_t Flags; // 0x0001 - IsTransposed
uint32_t Dimensions[2]; // M, N
uint32_t Unused; // Only exists to make use struc
};
static_assert(sizeof(MatrixMatrixUse) == sizeof(MatrixVectorUse), "These should match!");
This allows us to have a single list of uses that are slightly different.
@jenatali & @JoeCitizen, this will have impact on the runtime side of things. Do you have thoughts?
There was a problem hiding this comment.
Tagging @tex3d too, since a related topic came up in a meeting a few minutes ago.
There was a problem hiding this comment.
@tex3d If you are ok with the above, I'll make the changes to the PR.
There was a problem hiding this comment.
More extensive changes than these are needed. Leave the PSV0 updates to me, and I should have a PR up for these later today.
There was a problem hiding this comment.
Forgot to leave follow-up comment. PR for PSV0/RDAT updates is here:
#832
|
I think this PR just needs a rebase and updating the godbolt link. |
|
@anupamachandra - can you get this ready to merge please? Plan is to update the DXC implementation to match this for the preview (@hekota to put change together) by adding the new enum entries. |
|
Yep, will do. |
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: Jesse Natalie <jenatali@microsoft.com>
07efbf2 to
4a97411
Compare
All the changes are in. This is ready to merge. |
Based on spec update microsoft/hlsl-specs#803.
This PR fixes PR#787.
Changes:
MatrixLayoutEnumto contain Transpose types that specify that the matrices will be transposed on load. This applies only to Thread scope optimal layout matrices. InterlockedAccumulate will continue to support onlyOuterProductOptimallayout for thread scope matrix, and not the transpose variant.