Problem description
When MIGraphX compiles a model with dynamic dimensions, the compilation time depends heavily on the maximum size of the specified dynamic dimensions.
For the simple reproduction script below, compilation time appears to scale linearly with that maximum size, but for larger models we've observed it to scale quadratically.
The first run of the reproduction script below produces the following compilation times and model sizes:
Compiling model with max size 8 ...
Compilation time: 4.881 s
Model size: 0.119063 MiB
Compiling model with max size 16 ...
Compilation time: 3.184 s
Model size: 0.236793 MiB
Compiling model with max size 32 ...
Compilation time: 7.989 s
Model size: 0.472382 MiB
Compiling model with max size 64 ...
Compilation time: 15.368 s
Model size: 0.943545 MiB
Compiling model with max size 128 ...
Compilation time: 30.448 s
Model size: 1.886622 MiB
Compiling model with max size 256 ...
Compilation time: 47.748 s
Model size: 3.777006 MiB
Compiling model with max size 512 ...
Compilation time: 96.688 s
Model size: 7.562881 MiB
The second run produces the following compilation times (the model sizes are unchanged):
Compiling model with max size 8 ...
Compilation time: 0.396 s
Model size: 0.119063 MiB
Compiling model with max size 16 ...
Compilation time: 0.692 s
Model size: 0.236793 MiB
Compiling model with max size 32 ...
Compilation time: 1.334 s
Model size: 0.472382 MiB
Compiling model with max size 64 ...
Compilation time: 2.672 s
Model size: 0.943545 MiB
Compiling model with max size 128 ...
Compilation time: 5.298 s
Model size: 1.886622 MiB
Compiling model with max size 256 ...
Compilation time: 10.788 s
Model size: 3.777006 MiB
Compiling model with max size 512 ...
Compilation time: 22.186 s
Model size: 7.562881 MiB
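To put a number on "seems to scale linearly", the timings listed above can be fitted on a log-log scale. The following is a minimal sketch rather than part of the original reproduction: the helper name scaling_exponent is made up for illustration, and the lists are copied by hand from the two runs above.

```python
import math

# Values copied from the two runs reported above.
MAX_SIZES = [8, 16, 32, 64, 128, 256, 512]
FIRST_RUN_SECONDS = [4.881, 3.184, 7.989, 15.368, 30.448, 47.748, 96.688]
SECOND_RUN_SECONDS = [0.396, 0.692, 1.334, 2.672, 5.298, 10.788, 22.186]
MODEL_SIZE_MIB = [0.119063, 0.236793, 0.472382, 0.943545, 1.886622, 3.777006, 7.562881]


def scaling_exponent(sizes, values):
    """Least-squares slope of log(value) against log(size).

    A slope near 1 suggests linear scaling, a slope near 2 quadratic scaling.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


if __name__ == "__main__":
    for label, values in [
        ("first run (s)", FIRST_RUN_SECONDS),
        ("second run (s)", SECOND_RUN_SECONDS),
        ("model size (MiB)", MODEL_SIZE_MIB),
    ]:
        print(f"{label}: exponent {scaling_exponent(MAX_SIZES, values):.2f}")
```

The same helper can be pointed at timings from a larger model to check whether the exponent moves towards 2 there.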
We've traced the difference in compilation times to the cache stored in ~/.cache/comgr.
The cache doesn't help much in our specific scenario, because we compile all of our models exactly once.
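To get first-run timings again, the cache directory has to be emptied before the script is run. A minimal sketch follows, assuming that deleting ~/.cache/comgr is safe because its contents are regenerated on demand; the helper name clear_comgr_cache is made up for illustration.

```python
import shutil
from pathlib import Path

# Cache location named in this report; adjust it if your setup differs.
COMGR_CACHE_DIR = Path.home() / ".cache" / "comgr"


def clear_comgr_cache(cache_dir=COMGR_CACHE_DIR):
    """Delete the cache directory. Return True if something was removed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Calling clear_comgr_cache() once before the reproduction script should bring back the slower first-run behaviour.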
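Regardless, both series of compilation times are impractically long for a model that consists of a single embedding layer (96.7 s on the first run and 22.2 s on the second at a maximum size of 512), and they make dynamic dimension support effectively unusable for realistic computer-vision models.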
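As a side note, the model seems to be copied internally max_seq_len times, which is very odd.
This is visible in the output of the final print(migraphx_model) statement in the reproduction script. The saved model size roughly doubling each time the maximum size doubles is consistent with it.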
Steps to reproduce
import math
import migraphx
import os
import time
import torch

DEVICE = "cuda:0"
EMBEDDING_COUNT = 32
EMBEDDING_DIM = 16
BATCH_SIZE = 4

torch.inference_mode(True)
torch.cuda.set_device(DEVICE)

model = torch.nn.Embedding(EMBEDDING_COUNT, EMBEDDING_DIM)
model.eval()

# Export a single embedding layer with both input dimensions marked dynamic.
input_batch = torch.arange(math.ceil(EMBEDDING_COUNT / 2)).repeat(BATCH_SIZE, 1).contiguous()
torch.onnx.export(
    model,
    (input_batch,),
    "model.onnx",
    external_data=False,
    dynamo=True,
    dynamic_shapes=[
        {0: torch.export.Dim.DYNAMIC, 1: torch.export.Dim.DYNAMIC},
    ],
)

for max_seq_len in [8, 16, 32, 64, 128, 256, 512]:
    print("Compiling model with max size", max_seq_len, "...")
    # The batch dimension is pinned to BATCH_SIZE; only the sequence length
    # varies, from 1 to max_seq_len.
    migraphx_model = migraphx.parse_onnx("model.onnx", map_dyn_input_dims={
        "input": [
            migraphx.shape.dynamic_dimension(BATCH_SIZE, BATCH_SIZE, {BATCH_SIZE}),
            migraphx.shape.dynamic_dimension(1, max_seq_len, {1}),
        ],
    })
    # Time only the compile step.
    compilation_time = -time.perf_counter()
    migraphx_model.compile(migraphx.get_target("gpu"), offload_copy=False)
    compilation_time += time.perf_counter()
    migraphx.save(migraphx_model, "model.mxr")
    mxr_size = os.path.getsize("model.mxr") / 1024 / 1024
    print("Compilation time:", f"{compilation_time:.03f}", "s")
    print("Model size:", f"{mxr_size:03f}", "MiB")
    print()

# Dump the compiled program from the last iteration (max size 512).
print(migraphx_model)
Environment
OS: Debian GNU/Linux 12 (bookworm)
CPU: AMD Ryzen 9 9950X
GPU: AMD Radeon AI PRO R9700
ROCm version: 7.2.1
MIGraphX version: 2.16.0.dev+20250912-17-406-gb91f1c0c0