feat: add metrics API with Postgres and ClickHouse support by alexluong · Pull Request #734 · hookdeck/outpost

alexluong · 2026-03-08T19:49:24Z

implements #210

Split LogStore into Records + Metrics sub-interfaces. LogStore is now the combined interface so all existing consumers are unaffected. Typed responses per endpoint (EventMetricsResponse, AttemptMetricsResponse) with all fields as pointers. Stub implementations for CH, PG, and mem drivers return errNotImplemented. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add a comprehensive metrics test dataset spanning January 2000 with 300 tenant-1 events (50 sparse across 5 days + 250 dense bell-curve on Jan 15) and 5 tenant-2 events for isolation. All dimension cycling produces round numbers (100/topic, 150/dest, 180 success, 120 failed, error_rate=0.4, etc). Covers all granularities (1m, 1h, 1d, 1w, 1M), dimensions, filters, and measures for both event and attempt metrics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Dynamic SQL builder for event and attempt metrics with parameterized queries, time bucketing (date_bin/date_trunc), dimension grouping, conditional aggregates (FILTER WHERE), 30s query timeout fallback, and row limit enforcement with truncation flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Naive ClickHouse implementation using uniqExact/uniqExactIf for dedup-safe aggregation over ReplacingMergeTree without FINAL. Includes 30s query timeout fallback and 100k row limit with truncation detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
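To illustrate why distinct counting is dedup-safe here: a ReplacingMergeTree table can hold several copies of the same row until a background merge collapses them, so a plain `count()` without FINAL may overcount, while `uniqExact(id)` counts each ID once. The Go sketch below only simulates that behavior in memory; the row type and helper names are hypothetical.

```go
package main

import "fmt"

// attemptRow is a stand-in for one stored row; ID identifies the attempt.
type attemptRow struct {
	ID     string
	Status string
}

// countRows behaves like count(): unmerged duplicates inflate the result.
func countRows(rows []attemptRow) int { return len(rows) }

// uniqExactIf behaves like uniqExactIf(id, cond): it counts distinct IDs
// among the rows that satisfy cond.
func uniqExactIf(rows []attemptRow, cond func(attemptRow) bool) int {
	seen := make(map[string]struct{})
	for _, r := range rows {
		if cond(r) {
			seen[r.ID] = struct{}{}
		}
	}
	return len(seen)
}

// uniqExact behaves like uniqExact(id): distinct IDs over all rows.
func uniqExact(rows []attemptRow) int {
	return uniqExactIf(rows, func(attemptRow) bool { return true })
}

func main() {
	rows := []attemptRow{
		{ID: "a1", Status: "success"},
		{ID: "a1", Status: "success"}, // duplicate not yet merged away
		{ID: "a2", Status: "failed"},
	}
	isSuccess := func(r attemptRow) bool { return r.Status == "success" }

	fmt.Println(countRows(rows))              // 3
	fmt.Println(uniqExact(rows))              // 2
	fmt.Println(uniqExactIf(rows, isSuccess)) // 1
}
```

The trade-off is that exact distinct counting keeps every seen ID in memory, which is presumably why the commit calls the approach naive.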

All three drivers (mem, pg, ch) now implement the metrics interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Wire up QueryEventMetrics and QueryAttemptMetrics as GET /api/v1/metrics/events and GET /api/v1/metrics/attempts with query param parsing, allowlist validation, JWT tenant scoping, and response transformation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…re, add metrics OpenAPI spec memlogstore QueryEventMetrics was missing tenant_id dimension causing empty results when grouping by tenant. QueryAttemptMetrics was missing both tenant_id and attempt_number dimensions. Also adds MetricsResponse schemas and /metrics/events, /metrics/attempts paths to the OpenAPI spec. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Split metrics drivertest into DataCorrectness (existing value assertions) and Characteristics (structural contract tests for dense bucket filling, ordering, alignment, deterministic count, zero measures, no-data ranges, and dimension × time filling). Shared dataset setup, single provisioning. Characteristics tests will fail until bucket filling is implemented. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Enables new(expr) syntax for cleaner pointer initialization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
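For context, responses with all-pointer fields need many "pointer to a literal" values. The sketch below shows the generic helper that was commonly written before Go 1.26 and notes in a comment the `new(expr)` form the commit refers to; `ptr` and `metricsPoint` are hypothetical names, and the block deliberately sticks to syntax that older toolchains also compile.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. With Go 1.26, a call such as
// ptr(int64(300)) can instead be written new(int64(300)).
func ptr[T any](v T) *T { return &v }

// metricsPoint has pointer fields so that nil can mean "not requested".
type metricsPoint struct {
	Count     *int64
	ErrorRate *float64
}

func main() {
	p := metricsPoint{Count: ptr(int64(300)), ErrorRate: ptr(0.4)}
	fmt.Println(*p.Count, *p.ErrorRate)
}
```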

Sparse time-series responses caused dashboard charts to render fat bars instead of slim bars across all time slots. This adds a shared bucket filling layer (internal/logstore/bucket/) called by all 3 backends after query, producing dense responses with zero-filled gaps.

- Extract TruncateTime into shared bucket package
- FillEventBuckets / FillAttemptBuckets with dimension-aware filling
- Update drivertest assertions for dense bucket counts
- All characteristics tests now pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The API allowlist accepted these filters but all three query builders (pg, ch, mem) silently ignored them, causing filters like attempt_number=0 to return unfiltered results. Add WHERE clauses in all drivers and conformance tests to prevent regression. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ateRange to TimeRange in Go code Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge main (PR #732) and update metrics handlers to use ParseArrayQueryParam and resolveTenantIDsFilter. Update OpenAPI spec filters to oneOf string/array schema with bracket notation. Update test query strings to indexed bracket format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add per-second throughput rate measures computed as count / bucket_duration_seconds. Events endpoint gets `rate`, attempts gets `rate`, `successful_rate`, `failed_rate`. Rate computation lives in shared driver/rate.go, called by each driver after bucket filling. Dependency measures are auto-enriched (e.g. requesting `rate` without `count` internally adds `count` for SQL but omits it from API response). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

6-chart metrics dashboard: delivery volume (stacked bar), error rate (line, 0-100%), retries (multi-line), avg attempt number, status code breakdown, and topic breakdown. Shared timeframe selector (1h/24h/7d/30d), all charts use /metrics/attempts endpoint. Includes dataviz CSS vars for info (blue) and warning (orange) themes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…hart Restructure the metrics grid to 3 rows: events/deliveries, error breakdown (3-col), and retry pressure. Add new "Events / count" chart using attempt_number=0 filter. Support title/subtitle pattern with muted subtitles and add filters param to useMetrics hook. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add sparkline + event count to each destination row using 4h granularity with attempt_number=0 filter. Includes Sparkline component with stacked success/failed bars, empty-bar rendering, and granularity override for useMetrics hook. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move Loading import to top of MetricsChart.tsx. Use arrays/objects directly in useMetrics instead of serializing to strings and re-splitting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- seed_metrics.sh: generates realistic event→attempt chains with configurable error rates, retry chains, and time distribution
- qa_metrics.sh: 11 named scenarios (healthy, failing, spike, empty, single, all-fail, all-success, recent, many-topics, many-codes, retry-heavy) with verification checklists
- README documenting usage and scenarios

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alexluong · 2026-03-11T09:57:52Z

Added seed & QA scripts in scripts/metrics/:

seed_metrics.sh — generates event→attempt chains with configurable error rates, retries, time distribution
qa_metrics.sh — 11 scenarios (healthy, failing, spike, empty, single, all-fail, all-success, recent, many-topics, many-codes, retry-heavy)

Ran full manual QA against Postgres. All 11 scenarios passing across all timeframes (1h/24h/7d/30d). Dashboard renders correctly: error rate Y-axis shows percentages, stacked bars work, breakdowns sort correctly, empty states render, single-event edge case works, attempt_number=0 filter correctly separates event count from delivery count.

Add manual=false filter alongside attempt_number=0 to prevent manual retries (which also start at attempt_number=0) from inflating event counts in the destination metrics chart and destinations list sparkline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Manual retries start a new chain at attempt_number=0, inflating first_attempt_count. Add AND NOT manual to CH, PG, and memlogstore queries. Add FIXME for test dataset which assigns manual and attempt_number independently. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…anularity, filters Migrate bench suite to current metrics API (TimeRange, tenant via filters) and add bench cases for rate measures, multi-value granularities (2d/w/M), new dimensions (code, attempt_number), and new filters (code, manual, attempt_number, multi-filter). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PR #740 changed attempt_number from 0-based to 1-based. Update all metrics query logic, test data, seed scripts, and bench seeds to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
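Combining this change with the earlier manual-retry fix, a first-attempt count under 1-based numbering might look like the sketch below. It assumes that, after PR #740, a manual retry starts its new chain at `attempt_number=1` (the earlier commits describe this for the 0-based scheme); the `attempt` type and function name are hypothetical.

```go
package main

import "fmt"

// attempt is a stand-in for one delivery attempt row.
type attempt struct {
	EventID       string
	AttemptNumber int  // 1-based after PR #740
	Manual        bool // true when triggered by a manual retry
}

// firstAttemptCount counts automatic first attempts only. Without the
// Manual check, a manual retry would be counted as a new event.
func firstAttemptCount(attempts []attempt) int {
	n := 0
	for _, a := range attempts {
		if a.AttemptNumber == 1 && !a.Manual {
			n++
		}
	}
	return n
}

func main() {
	attempts := []attempt{
		{EventID: "e1", AttemptNumber: 1},               // original delivery
		{EventID: "e1", AttemptNumber: 2},               // automatic retry
		{EventID: "e1", AttemptNumber: 1, Manual: true}, // manual retry, new chain
		{EventID: "e2", AttemptNumber: 1},               // second event
	}
	fmt.Println(firstAttemptCount(attempts)) // 2
}
```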

alexluong commented Mar 8, 2026 (review thread on go.mod, resolved)

alexbouchardd requested changes Mar 9, 2026

alexluong and others added 18 commits March 11, 2026 01:54

refactor: remove unimplemented check from metrics conformance tests

2aaec3e

All three drivers (mem, pg, ch) now implement the metrics interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style: gofmt metrics handlers

55fbc47

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: remove duplicate derefIntFromIntPtr helper

5dbd26f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: bump Go from 1.24 to 1.26

8aa26a8

Enables new(expr) syntax for cleaner pointer initialization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: rename date_range to time in metrics API query params and D…

84c4a1b

…ateRange to TimeRange in Go code Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: remove eligible_for_retry from event metrics dimensions and filters

010bc04

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add rate measure coverage to drivertest

55e2ac2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alexbouchardd approved these changes Mar 11, 2026

alexluong and others added 9 commits March 11, 2026 16:54

fix(portal): replace ResponsiveContainer to fix -1px dimension warnings

5987dc1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(portal): clean up import order and simplify useMetrics URL building

3bc00ad

Move Loading import to top of MetricsChart.tsx. Use arrays/objects directly in useMetrics instead of serializing to strings and re-splitting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: Fix metrics table histogram display

ec8e02c

chore: Destination page metrics improvements

d98a42f

chore: Update time range picker styling

9dd9fda

fix: Fix metric breakdown height and dimension ellipsis

eccac65

alexluong and others added 2 commits March 13, 2026 22:40

bench: add metrics benchmarking suite for ClickHouse and PostgreSQL

a585217

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alexluong and others added 2 commits March 13, 2026 22:48

Merge remote-tracking branch 'origin/main' into metrics

11b9197

fix: align metrics with 1-based attempt_number indexing

340bff9

PR #740 changed attempt_number from 0-based to 1-based. Update all metrics query logic, test data, seed scripts, and bench seeds to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alexluong merged commit 67ccfea into main Mar 13, 2026
4 checks passed

alexluong deleted the metrics branch March 13, 2026 16:14

alexluong merged 42 commits into main from metrics
