Add Azure Monitor alerting for cloud-level resources #143
Draft
ian-flores wants to merge 3 commits into main
Add prometheus.exporter.azure config blocks to Alloy for Azure workloads covering PostgreSQL, NetApp Files, Load Balancer, Storage, and NAT Gateway (conditional on public_subnet_cidr). Create a Monitoring Reader role assignment and workload identity for the Alloy service account.

Ref: ptd-config#2779
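As a rough illustration of the config generation this commit describes, the sketch below renders one prometheus.exporter.azure block per resource type, adds the NAT Gateway block only when public_subnet_cidr is set, and returns an empty string for non-Azure clouds. It is not the code in this PR: the function name, the target table, the chosen metric names, and the use of resource_graph_query_filter to scope by resource group are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical table: exporter label -> (Azure resource type, metric names).
# The metric selection is illustrative, not the list used by the PR.
AZURE_TARGETS: Dict[str, Tuple[str, List[str]]] = {
    "postgres": (
        "Microsoft.DBforPostgreSQL/flexibleServers",
        ["cpu_percent", "memory_percent", "storage_percent", "active_connections"],
    ),
    "netapp": (
        "Microsoft.NetApp/netAppAccounts/capacityPools/volumes",
        ["VolumeConsumedSizePercentage", "AverageReadLatency", "AverageWriteLatency"],
    ),
    "loadbalancer": (
        "Microsoft.Network/loadBalancers",
        ["DipAvailability", "VipAvailability", "UsedSnatPorts"],
    ),
    "storage": (
        "Microsoft.Storage/storageAccounts",
        ["Availability", "SuccessE2ELatency"],
    ),
}

# Only rendered when the workload has a public subnet (assumed semantics).
NAT_GATEWAY_TARGET: Tuple[str, List[str]] = (
    "Microsoft.Network/natGateways",
    ["SNATConnectionCount", "DatapathAvailability"],
)


def render_azure_monitor_config(
    cloud: str,
    subscription_id: str,
    resource_group: str,
    public_subnet_cidr: Optional[str] = None,
) -> str:
    """Return Alloy config text for Azure Monitor scraping, or "" for other clouds."""
    if cloud != "azure":
        return ""

    targets = dict(AZURE_TARGETS)
    if public_subnet_cidr:
        targets["natgateway"] = NAT_GATEWAY_TARGET

    blocks = []
    for label, (resource_type, metrics) in targets.items():
        metric_list = ", ".join(f'"{m}"' for m in metrics)
        blocks.append(
            f'prometheus.exporter.azure "{label}" {{\n'
            f'  subscriptions = ["{subscription_id}"]\n'
            f'  resource_type = "{resource_type}"\n'
            f"  metrics       = [{metric_list}]\n"
            # Assumed scoping: restrict discovery to one resource group.
            f"  resource_graph_query_filter = \"where resourceGroup =~ '{resource_group}'\"\n"
            f"}}"
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_azure_monitor_config("azure", "sub-123", "rg-demo", "10.0.0.0/24"))
```

Keeping the NAT Gateway entry outside the base table makes the conditional explicit and easy to test on its own.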
Create Grafana provisioned alert YAML files for Azure cloud resources:
- azure_postgres.yaml: CPU, storage, memory, connections, deadlocks
- azure_netapp.yaml: capacity, read/write latency
- azure_loadbalancer.yaml: health probe, data path, SNAT exhaustion
- azure_storage.yaml: availability, E2E latency

Ref: ptd-config#2779
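For readers unfamiliar with the shape of a Grafana provisioned alert file, the sketch below checks an already-parsed file (a plain dict, as a YAML loader would return) for the top-level apiVersion, at least one group, and the uid, title, condition, and data keys on each rule. The sample document, its uid, and the query expression are hypothetical placeholders, not the contents of azure_postgres.yaml, and the validator is only a guess at the kind of structure check the PR's tests perform.

```python
from typing import Any, Dict, List

# Keys each provisioned rule is expected to carry in this sketch.
REQUIRED_RULE_KEYS = {"uid", "title", "condition", "data"}


def validate_alert_file(doc: Dict[str, Any]) -> List[str]:
    """Return a list of structural problems; an empty list means the file looks sane."""
    problems: List[str] = []
    if doc.get("apiVersion") != 1:
        problems.append("apiVersion must be 1")

    groups = doc.get("groups") or []
    if not groups:
        problems.append("no groups defined")

    for group in groups:
        group_name = group.get("name", "<unnamed>")
        rules = group.get("rules") or []
        if not rules:
            problems.append(f"{group_name}: no rules defined")
        for rule in rules:
            missing = REQUIRED_RULE_KEYS - rule.keys()
            if missing:
                title = rule.get("title", "<untitled>")
                problems.append(f"{group_name}: rule {title} missing {sorted(missing)}")
    return problems


# Hypothetical parsed alert file; names and the query are placeholders.
SAMPLE_ALERT_FILE: Dict[str, Any] = {
    "apiVersion": 1,
    "groups": [
        {
            "orgId": 1,
            "name": "Azure PostgreSQL",
            "folder": "Example",
            "interval": "5m",
            "rules": [
                {
                    "uid": "example_postgres_cpu_high",
                    "title": "PostgreSQL CPU high",
                    "condition": "B",
                    "for": "10m",
                    "data": [
                        {
                            "refId": "A",
                            "datasourceUid": "example-datasource",
                            "model": {"expr": "<cpu metric query goes here>"},
                        }
                    ],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    print(validate_alert_file(SAMPLE_ALERT_FILE))
```

Returning a list of problems rather than raising on the first one lets a test report every malformed rule in a file at once.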
Add 27 tests covering:
- Azure Monitor Alloy config generation (metric blocks, NAT conditional, subscription/resource group interpolation, AWS returns empty)
- Alert YAML file validation (existence, structure, metric queries)
- Alloy monitoring identity method existence and signature

Ref: ptd-config#2779
Summary
Add Azure Monitor-based alerting for Azure cloud resources, equivalent to PR #139 for AWS CloudWatch.
Closes: ptd-config#2779
Add prometheus.exporter.azure config blocks to Alloy for Azure workloads (PostgreSQL, NetApp Files, Load Balancer, Storage, NAT Gateway)
Alert rules
Bonus fix
Azure clusters previously received zero alerts, not even the cloud-agnostic ones (pods, nodes, healthchecks, applications, mimir). This PR deploys all of them.
Test plan