[Perf] Enable forcing perf dispatch use specific implementation#401
[Perf] Enable forcing perf dispatch use specific implementation#401hughperkins wants to merge 15 commits intomainfrom
Conversation
Document the @qd.perf_dispatch auto-tuning decorator: basic usage, decorator order, compatibility filtering, geometry hashing, tuning parameters, and benchmarking internals.
Two env vars allow overriding the auto-tuning: - QD_PERFDISPATCH_FORCE=dispatcher:impl[,dispatcher2:impl2,...] forces specific implementations for named dispatchers. - QD_PERFDISPATCH_FORCE_INDEX=N forces the Nth registered implementation for all dispatchers (per-dispatcher override takes priority). When either env var is set, all dispatchers print their name and registered implementations on first call, making it easy to discover valid values.
Only keep the per-dispatcher QD_PERFDISPATCH_FORCE env var.
Add section on forcing specific implementations and discovering available dispatcher/implementation names.
Use larger arrays, marker array c with IntEnum, do_work_py, and .to_numpy() for assertions — matching the patterns that work reliably on metal/vulkan GPU backends.
| self._dispatch_impl_set: set[DispatchImpl] = set() | ||
| self._dispatch_impl_list: list[DispatchImpl] = [] | ||
| self._forced_impl: DispatchImpl | None = None |
There was a problem hiding this comment.
Why are you storing list and set? This is weird.
tests/python/test_perf_dispatch.py
Outdated
| @@ -1,9 +1,11 @@ | |||
| from enum import IntEnum | |||
| from typing import cast | |||
| from unittest import mock | |||
There was a problem hiding this comment.
Don't use this. Use pytest.
| if ":" not in pair: | ||
| print( | ||
| f"[perf_dispatch] WARNING: ignoring malformed QD_PERFDISPATCH_FORCE entry '{pair}' (expected 'dispatcher:impl')" | ||
| ) |
There was a problem hiding this comment.
I don't think a simple print is ok. It should be an error. Why would you tolerate this?
| f"[perf_dispatch] WARNING: ignoring malformed QD_PERFDISPATCH_FORCE entry '{pair}' (expected 'dispatcher:impl')" | ||
| ) | ||
| continue | ||
| dispatcher_name, impl_name = pair.split(":", 1) |
There was a problem hiding this comment.
dispatcher_name, impl_name = pair.split(":") would guard against invalid characters for free.
| ) | ||
| continue | ||
| dispatcher_name, impl_name = pair.split(":", 1) | ||
| result[dispatcher_name.strip()] = impl_name.strip() |
There was a problem hiding this comment.
You could just do dispatcher_name, impl_name = map(str.strip, pair.split(":")) to avoid all these strip if you want.
| return result | ||
|
|
||
|
|
||
| _FORCE_MAP: dict[str, str] = _parse_force_map(_QD_PERFDISPATCH_FORCE_RAW) |
There was a problem hiding this comment.
What about a frozendict? (if it is already a deps, otherwise not worth it)
There was a problem hiding this comment.
frozendict is not a depdenncy of quadrants https://github.com/Genesis-Embodied-AI/quadrants/blob/main/pyproject.toml
…e-implementation # Conflicts: # docs/source/user_guide/index.md
|
Opus 4.6 review: SummaryThe branch adds two things:
What's done well
Issues to address1.
|
…o single _dispatch_impls list
|
additional thougths from reviewer opus on concern 1: This one I'd actually leave as-is. A few reasons:
If it ever does become noisy (many dispatchers), it could be refined later by only printing the full listing once at process exit or by adding a filter. But that's a problem for |
|
update from reviewer opus on concern 5: Honestly, this one is borderline. _parse_force_map is a simple string parser that's only called once at module load with an env var value. The happy paths (valid entries, |
|
reviewer opus update on concern 3: This is technically correct — there's a self._resolve_force() call plus a self._forced_impl is not None check on every call — but it's not worth fixing. Two boolean checks |
Consistent with the rest of the test suite.
Rejects entries with multiple colons via unpacking ValueError rather than silently accepting them.
…tead of warning Fail loudly so users don't silently get normal benchmarking when they intended to force an implementation.
Issue: #
Brief Summary
copilot:summary
Walkthrough
copilot:walkthrough