Looking at the code, there is a potential race condition at pipeline depth 3 when the CPU cannot keep up.
It would show up as repeated tokens, because the sampler output would not be updated in time for the next iteration.
The proper fix is to size every buffer explicitly to match the pipeline depth.
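
To illustrate the failure mode, here is a toy model, not the actual code under discussion. It assumes the producer (the sampler) runs `depth` iterations ahead and writes the token for iteration `i` into slot `i % num_slots`, and that the lagging consumer (the CPU) only reads those results afterwards. The names `simulate`, `num_slots` and `depth` are made up for this sketch.

```python
def simulate(num_slots: int, depth: int, steps: int) -> list[int]:
    """Toy model: the producer runs `depth` iterations ahead, then the consumer drains them."""
    slots = [None] * num_slots
    seen = []
    for start in range(0, steps, depth):
        batch = range(start, min(start + depth, steps))
        # Producer side: iteration i writes its token into slot i % num_slots.
        # The token value equals the iteration index to keep the output readable.
        for i in batch:
            slots[i % num_slots] = i
        # Consumer side: it has fallen behind and only now reads the batch.
        for i in batch:
            seen.append(slots[i % num_slots])
    return seen


# One shared slot with three iterations in flight: the same token is read repeatedly.
print(simulate(num_slots=1, depth=3, steps=6))  # [2, 2, 2, 5, 5, 5]

# One slot per in-flight iteration: every token is read exactly once, in order.
print(simulate(num_slots=3, depth=3, steps=6))  # [0, 1, 2, 3, 4, 5]
```

In this model, any `num_slots` smaller than `depth` lets a later iteration overwrite a slot before it has been read, which is why the buffer count should follow the pipeline depth rather than be chosen independently.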