flowcontrol: Support dynamic priority provisioning #2006
Conversation
This commit implements dynamic provisioning for priority bands, enabling the system to accept requests with priority levels that were not explicitly configured at startup.

Key Changes:

1. Hybrid State Model:
   - Adopts `sync.Map` in `FlowRegistry` and `registryShard` for operational state (queues, stats). This enables lock-free reads on the hot path and safe concurrent writes when provisioning new bands.
   - Retains `map` + `RWMutex` for the `Config` definition to ensure a consistent source of truth for topology limits and names.
2. Dynamic Provisioning Logic:
   - `prepareNewFlow` implements an optimistic lock-free check for band existence.
   - `ensurePriorityBand` handles the atomic JIT creation of new bands, updating both the global config definitions and the active shard runtimes.
3. Configuration Extensions:
   - Added `DefaultPriorityBand` to `Config` to serve as the template for dynamically created bands.
   - Updated validation logic to support empty static band lists, defaulting instead to a purely dynamic configuration.
   - Removed obsolete test cases enforcing static band requirements.
This commit extends the test suite to validate the new dynamic priority provisioning capabilities in the FlowRegistry and Shard components.

Key Updates:

1. Shard Tests:
   - Added `TestShard_DynamicProvisioning` to verify that `addPriorityBand` correctly updates internal state (`sync.Map`, sorted levels) and maintains idempotency.
   - Verified that accessing missing configuration triggers the expected invariant violation panic.
2. Registry Tests:
   - Added `TestFlowRegistry_DynamicProvisioning` to verify the end-to-end flow: creating a new priority via `WithConnection` updates the global config, stats, and propagates to all active shards.
   - Added concurrency tests to ensure multiple clients requesting the same new priority simultaneously do not race or duplicate work.
   - Verified that dynamic bands are correctly persisted across scaling events.
3. Config Tests:
   - Updated `TestNewConfig` to support empty static band configurations (now valid).
   - Added verification for `DefaultPriorityBand` template initialization and cloning behavior.
Removes the explicit hardcoded configuration of the "Default" priority band (Priority 0) in the main entry point. The system now relies entirely on the newly implemented `DefaultPriorityBand` mechanism to dynamically provision Priority 0 just-in-time upon the first request. This simplifies the wiring code and validates the zero-config startup path.
✅ Deploy Preview for gateway-api-inference-extension ready!
Hi @LukeAVanDrie. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Justification for
/ok-to-test |
```go
// protected by its own lock.
fr.mu.RLock()
_, exists := fr.config.PriorityBands[key.Priority]
fr.mu.RUnlock()
```
This fine grained management of locking is error prone, do we really need it?
In the current JIT implementation, yes. In the controller implementation you suggested, probably not depending on how we handle the race condition between reconciliation and request serving.
```go
// Apply defaults to the template.
builder.config.DefaultPriorityBand.applyDefaults()
if builder.config.DefaultPriorityBand.PriorityName == "" {
	builder.config.DefaultPriorityBand.PriorityName = "Dynamic-Default"
}
```
this should be the infObj object name.
This would be much better for observability. The JIT path in this PR makes this tricky. If we manage to move to the controller path, this becomes much easier to populate.
Quick question though... How do we handle N InferenceObjectives mapped to the same priority level? Right now, priority bands are unique per int. It would not be simple (and perhaps not desirable) to break that.
I guess we could make the name a composite of the objectives mapped to it, e.g. `"InfObjA,InfObjB,..."`? The name itself then becomes dynamic data and is no longer immutable (though I don't foresee any issues with this).
Good point on multiple infObjs mapping to the same priority. I think the name has no value and it is rather confusing since naturally one will expect the infObj name to be logged. So I recommend removing it.
```go
// If the band was missing, we must acquire the Write Lock to create it.
if !exists {
	if err := fr.ensurePriorityBand(key.Priority); err != nil {
```
I was expecting that priority bands are created/deleted as a trigger of InfObj reconcile events. When a new infObj is created we create a band, when deleted we delete the band. doing it in the request processing path is an anti-pattern.
Driving this from the InferenceObjective controller is cleaner; however, it introduces specific synchronization and lifecycle challenges that the JIT approach handled implicitly. We need to decide how to handle these scenarios:
Consistency:

1. Operator applies `InferenceObjective { Priority: 10 }`.
2. Client immediately sends a curl matching that objective.
3. The request hits the EPP before the Reconciler loop has updated the `FlowRegistry`.
4. The `FlowRegistry` sees "Priority 10", checks its map, sees nothing.

With the current JIT behavior: it works. With the InferenceObjective reconciliation path it fails (503 Service Unavailable or similar) or MUST fall back to a default priority band (e.g., "Priority 0"). I am not yet sure just how narrow we can make this race condition. Perhaps we can make these reconciliation steps fully atomic, but I am not so certain that this will be simple or feasible.
No InferenceObjective:
An InferenceObjective is not required. Does "Priority 0" exist if no InferenceObjective defines it?
Traffic without an objective defaults to "Priority 0," so we need to always provision a "Default/0" band on startup, regardless of CRDs, to catch fall-through traffic. This places us right back at a hybrid of static + controller-managed config.
Lifecycle Management:
If we listen to InferenceObjective events, we must also handle Delete. When deleted, we must drain and remove the associated priority band. What if flows are currently active (have buffered requests) in that band? We need a graceful drain mechanism before destroying the sync.Map entry.
This isn't strictly an issue with the CRD reconciliation approach; my current PR only adds bands (which is likely okay for now). The state management issue just becomes more apparent under the CRD model though.
> Consistency:

This scenario can also happen when we look up the InfObj; it may not have been created by the time the request got to the EPP, so it is not a new problem.

> Perhaps, we can make these reconciliation steps fully atomic, but I am not so certain that this will be simple or feasible.

We only need to be eventually consistent. Creating a new infObj should trigger a state update in the EPP that includes the availability of the infObj itself as well as all associated state, like the band; this state update in the EPP can happen atomically.

> An InferenceObjective is not required. Does "Priority 0" exist if no InferenceObjective defines it?
>
> Traffic without an objective defaults to "Priority 0," so we need to always provision a "Default/0" band on startup, regardless of CRDs, to catch fall-through traffic. This places us right back at a hybrid of static + controller-managed config.

The default priority is a special case: we create a band for it on startup.

All that being said, I am OK proceeding with the current approach you have, but please open an issue to track this. I am sure we will revisit it when we introduce the flow CRD.
We have two distinct lifecycles intersecting here:
My initial implementation attempts to unify these into a single JIT path for simplicity. If we split them, I am happy to move to this split model (once we resolve these open questions), but I believe the
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, LukeAVanDrie

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR implements dynamic priority provisioning, allowing the Flow Control layer to accept and process requests with priority levels that were not explicitly configured at startup.
Previously, priority bands had to be statically defined in the config, and requests matching an unknown priority were rejected. This change enables a "zero-config" operational experience where the system automatically provisions new priority bands (with their corresponding queues, policies, stats tracking, etc.) Just-In-Time (JIT) based on a default template (which itself is optionally configurable).
Key Changes:
Hybrid State Model (`sync.Map` + `Mutex`):

- `FlowRegistry` and `registryShard` now use `sync.Map` for operational state (queues, statistics). This ensures the data path remains lock-free for reads while allowing safe, concurrent additions of new bands.
- A `map` protected by `RWMutex` is retained for the `Config` definition to ensure a consistent source of truth for topology limits (`MaxBytes`) and observability metadata.

Dynamic Provisioning Logic:

- Added `ensurePriorityBand`, which atomically updates the global configuration and orchestrates the rollout of the new band to all active shards.
- Added `addPriorityBand`, which instantiates the runtime components (Queues, DispatchPolicies) for the new priority level.

Configuration & Bootstrapping:

- Added `DefaultPriorityBand` to `Config`. This template controls the policy and queue settings for dynamically created bands.

Which issue(s) this PR fixes:
Fixes #1792
Does this PR introduce a user-facing change?: