
Broker startup does not reconcile Helix InstanceConfig tags on restart, leading to stale broker tags and misrouting #17432

@praveenc7

Summary

When a Pinot broker is not shut down cleanly and later restarts (often on a different pod/host), its intended/actual tag can differ from the tag stored in Helix InstanceConfig. Today the broker does not reliably converge to the intended tag on restart, because the startup tag reconciliation logic only runs for new brokers. As a result, the broker can remain stuck with stale Helix tags and announce under them. This causes misrouting for us, since we rely on Helix tags for broker routing.

In updateInstanceConfigAndBrokerResourceIfNeeded(), tags are only updated from config when the broker has no tags in Helix. This means that on restarts (where InstanceConfig already has tags), the broker never reconciles its tags with CONFIG_OF_BROKER_INSTANCE_TAGS, so it can stay incorrect until someone intervenes manually or rebuilds the broker resources.
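To make the gap concrete, here is a minimal sketch of the behavior described above, modeled with plain Java collections rather than the real Helix types. The class and method names are hypothetical and this is not the actual Pinot code; it only mirrors the "apply configured tags only when Helix has none" rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the behavior described in this issue, not the real Pinot code.
final class CurrentTagBehaviorSketch {
  private CurrentTagBehaviorSketch() {}

  /**
   * Returns the tags the broker ends up with after startup when configured
   * tags are applied only if Helix has no tags stored for the instance.
   */
  static List<String> tagsAfterStartup(List<String> helixTags, List<String> configuredTags) {
    if (helixTags.isEmpty()) {
      // New broker: nothing stored yet, so the configured tags are written.
      return new ArrayList<>(configuredTags);
    }
    // Restarted broker: stored tags win, even if they no longer match the config.
    return new ArrayList<>(helixTags);
  }
}
```

With stored tags `[foo]` and configured tags `[bar]`, this model returns `[foo]`, which is the stale state the issue describes.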

Example

Before shutdown

• Actual tag: foo
• Tag in Helix InstanceConfig: foo

After restart (pod moved)

• Actual tag: bar
• Tag in Helix InstanceConfig: foo (stale)

Result: the broker may announce as foo even though it should be bar. We rely on this logic for routing, so at times this results in misrouting of queries.

Proposal

On broker startup:

  1. Resolve the broker’s intended/actual tag (from config/env/deployment).
  2. Read Helix InstanceConfig for the broker and check the stored tag.
  3. If they do not match (InstanceConfig=foo, intended=bar):
    • Update InstanceConfig to bar even if tags are non-empty (not only when empty), and do so before startup.
  4. Proceed with announcement only after InstanceConfig reflects bar.

This ensures a broker always starts with the correct tag and announces only for that tag.
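The steps above could look roughly like the sketch below. It is an illustration under assumptions, not a patch: tags are modeled as plain string lists, the class, method, and field names are hypothetical, and leaving Helix untouched when no intended tags are configured is my assumption rather than something decided in this issue. In the broker, the same comparison would run against the tags read from the Helix InstanceConfig before the broker announces itself.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed startup reconciliation, not actual Pinot code.
final class BrokerTagReconcilerSketch {
  private BrokerTagReconcilerSketch() {}

  /** Outcome of a reconciliation: the tags to store and whether a write is needed. */
  static final class Result {
    final List<String> tags;
    final boolean updated;

    Result(List<String> tags, boolean updated) {
      this.tags = tags;
      this.updated = updated;
    }
  }

  /**
   * Compares the tags stored in Helix with the intended tags from config.
   * On a mismatch the intended tags replace the stored ones, even when the
   * stored tags are non-empty.
   */
  static Result reconcile(List<String> helixTags, List<String> intendedTags) {
    Set<String> current = new LinkedHashSet<>(helixTags);
    Set<String> intended = new LinkedHashSet<>(intendedTags);

    // Assumption: with no intended tags configured, keep whatever Helix has.
    // Set equality ignores ordering, so a reordered tag list is not a mismatch.
    if (intended.isEmpty() || current.equals(intended)) {
      return new Result(List.copyOf(current), false);
    }
    // Mismatch: the caller should persist these tags before announcing.
    return new Result(List.copyOf(intended), true);
  }
}
```

For the example in this issue, `reconcile(List.of("foo"), List.of("bar"))` yields tags `[bar]` with `updated` set to true, so the caller knows it has to write InstanceConfig before announcing.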

Labels

PEP-Request: Pinot Enhancement Proposal request to be reviewed.
