
Fix discrepancy in a_stride_m/a_stride_k for transposed dot kernels#9840

Closed
copybara-service[bot] wants to merge 0 commits into master from test_892702737


@copybara-service
Contributor

Fix discrepancy in a_stride_m/a_stride_k for transposed dot kernels

Currently, for transposed kernels, the stride of k (passed as a_stride_m, because k is the row dimension when A is transposed) is the stride of tile_k values of k. This is inconsistent with the assumption, made in several places, that the stride is for one value of k. As a result, strides have to be multiplied or divided to make them consistent.

In particular, run_dot multiplies the stride by k while the kernels do not, so the same stride can't be used for both run_dot and a kernel. This discrepancy prevents an easy refactoring of run_dot to capture the strides to pass to the kernels, which I think is a necessary step towards addressing some issues (packing A/B in the loops of run_dot).
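
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It is not the library's code: the function names, the values of `tile_k` and the strides, and the simplified offset arithmetic are all assumptions made for illustration. It only shows that a stride meaning "one value of k" and a stride meaning "tile_k values of k" describe the same step in memory when each is used under its own convention, and give a wrong offset when one is used in place of the other.

```python
# Hypothetical illustration only: names, values, and layout are assumptions,
# not taken from run_dot or the actual kernels.

def offset_per_k(k, stride_k):
    """Offset of index k when the stride advances ONE value of k."""
    return k * stride_k

def offset_per_tile(k, stride_tile, tile_k):
    """Offset of index k when the stride advances tile_k values of k."""
    assert k % tile_k == 0, "this sketch only addresses tile-aligned k"
    return (k // tile_k) * stride_tile

tile_k = 4
stride_k = 16                    # per-k convention
stride_tile = stride_k * tile_k  # per-tile convention for the same layout

k = 8
# Each convention, used consistently, reaches the same offset.
same = offset_per_k(k, stride_k) == offset_per_tile(k, stride_tile, tile_k)

# Passing a per-tile stride to code that expects a per-k stride
# overshoots by a factor of tile_k.
mixed = offset_per_k(k, stride_tile)
```

With a single convention, one stride value could be shared by the caller and the kernel without the `* tile_k` or `/ tile_k` conversion.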

@copybara-service copybara-service bot force-pushed the test_892702737 branch 3 times, most recently from ddbf6ad to 85a5bea Compare April 3, 2026 17:05
@copybara-service copybara-service bot closed this Apr 3, 2026
@copybara-service copybara-service bot deleted the test_892702737 branch April 3, 2026 17:21