OpenTelemetry trace propagation & email bounce/complaint suppression contract#965

Open
aabxtract wants to merge 2 commits into Disciplr-Org:main from aabxtract:feature/otel-tracing

@aabxtract

PR Description

feat+test: OpenTelemetry trace propagation & email bounce/complaint suppression contract


Summary

This PR delivers two independently scoped changes that were developed in parallel on feature/otel-tracing and
test/email-bounce-suppression:

  1. OpenTelemetry Distributed Tracing (feature/otel-tracing)

Wires W3C-compliant distributed tracing across the full HTTP → job → blockchain pipeline so a single create_vault
request can be followed from the inbound Express handler through Soroban submission, ETL job execution, and webhook
delivery as one connected trace.

New files:

  • src/observability/tracing.ts — zero-dependency tracer with OTLPExporter (OTLP/HTTP), InMemoryExporter (tests),
    no-op fallback, TracerImpl with batched flushing and head-based sampling
  • src/observability/tracingMiddleware.ts — Express middleware: extracts/creates W3C traceparent, starts a server
    span per request, injects outgoing traceparent header, attaches trace context + correlation ID to req
  • src/tests/tracing.test.ts — 48 tests covering parse/serialize utilities, parent→child span linkage, withSpan
    success/error paths, no-op tracer, initTracing options, middleware traceparent propagation, correlation-ID fallback
    chain, and job queue span instrumentation
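For reviewers unfamiliar with the header format, the parse/serialize utilities mentioned above can be pictured roughly as follows. This is a minimal sketch based on the W3C Trace Context `traceparent` layout (`version-traceid-parentid-flags`), not the code in this PR: the `TraceContext` shape and both function names are assumptions, and the sketch only accepts the four-field layout.

```typescript
// Hypothetical context shape; the real types in src/observability/tracing.ts are not shown here.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean; // bit 0 of the trace-flags byte
}

// Builds a matcher for exactly `len` lowercase hex characters.
const hex = (len: number): RegExp => new RegExp(`^[0-9a-f]{${len}}$`);

// Returns null for anything that is not a well-formed four-field traceparent.
function parseTraceparent(header: string | undefined): TraceContext | null {
  if (!header) return null;
  const parts = header.trim().split("-");
  if (parts.length !== 4) return null;
  const [version, traceId, spanId, flags] = parts as [string, string, string, string];
  // Version "ff" is forbidden, and all-zero IDs are invalid per the W3C spec.
  if (!hex(2).test(version) || version === "ff") return null;
  if (!hex(32).test(traceId) || /^0+$/.test(traceId)) return null;
  if (!hex(16).test(spanId) || /^0+$/.test(spanId)) return null;
  if (!hex(2).test(flags)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

// Always emits version 00; only the sampled flag is carried over.
function serializeTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}
```

A middleware built on this would fall back to generating a fresh trace ID whenever `parseTraceparent` returns null, which matches the "extracts/creates" behaviour described for tracingMiddleware.ts.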

Instrumented call sites:

  • src/services/soroban.ts — submitTransaction wrapped in soroban. span with soroban.contract_id,
    soroban.tx_hash, and RPC failover events
  • src/services/webhooks.ts — deliverOnce wrapped in webhook.http_deliver span with subscriber ID, URL, event type,
    and HTTP status
  • src/services/boundedWebhookDispatcher.ts — dispatch() wrapped in webhook.deliver span
  • src/jobs/queue.ts — runJob wrapped in job. span with job ID, attempt, and max-attempts
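The call sites above all follow the same "wrap a unit of work in a span" pattern. The sketch below shows one plausible shape for such a wrapper, covering the success and error paths that the tests exercise. The `Span` and `Tracer` interfaces, the `withSpan` signature, the recording tracer, and the `http.status_code` attribute key are illustrative assumptions rather than the PR's actual API.

```typescript
// Hypothetical minimal interfaces; the PR's TracerImpl may look different.
type AttrValue = string | number | boolean;
interface Span {
  setAttribute(key: string, value: AttrValue): void;
  setStatus(status: "ok" | "error", message?: string): void;
  end(): void;
}
interface Tracer {
  startSpan(name: string): Span;
}

// Runs `fn` inside a span, records the outcome, and always ends the span.
async function withSpan<T>(
  tracer: Tracer,
  name: string,
  attributes: Record<string, AttrValue>,
  fn: (span: Span) => Promise<T>,
): Promise<T> {
  const span = tracer.startSpan(name);
  for (const [key, value] of Object.entries(attributes)) span.setAttribute(key, value);
  try {
    const result = await fn(span);
    span.setStatus("ok");
    return result;
  } catch (err) {
    span.setStatus("error", err instanceof Error ? err.message : String(err));
    throw err; // the caller still sees the original failure
  } finally {
    span.end();
  }
}

// Tiny recording tracer, used only to demonstrate the wrapper.
function createRecordingTracer(): { tracer: Tracer; log: string[] } {
  const log: string[] = [];
  const tracer: Tracer = {
    startSpan(name: string): Span {
      log.push(`start ${name}`);
      return {
        setAttribute: (key, value) => { log.push(`attr ${key}=${value}`); },
        setStatus: (status, message) => { log.push(message ? `status ${status}: ${message}` : `status ${status}`); },
        end: () => { log.push(`end ${name}`); },
      };
    },
  };
  return { tracer, log };
}
```

Rethrowing in the `catch` branch keeps the instrumentation transparent: a wrapped `deliverOnce` or `runJob` fails exactly as before, and the span merely records that it did.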

Bootstrap integration:

  • src/app.ts — tracingMiddleware mounted before all routes
  • src/index.ts — initTracing() called at startup (no-op when OTEL_EXPORTER_OTLP_ENDPOINT is unset)
  • src/server/shutdown.ts — shutdownTracing() flushes pending spans during graceful shutdown
  • src/config/env.ts — validates OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_SERVICE_NAME, OTEL_TRACES_SAMPLER,
    OTEL_TRACES_SAMPLER_ARG

Env contract (no-op when unset):

┌─────────────────────────────┬────────────────────────────────────────────────────────────────────────┐
│ Variable                    │ Purpose                                                                │
├─────────────────────────────┼────────────────────────────────────────────────────────────────────────┤
│ OTEL_EXPORTER_OTLP_ENDPOINT │ Collector URL (e.g. http://jaeger:4318). Tracing disabled when absent. │
├─────────────────────────────┼────────────────────────────────────────────────────────────────────────┤
│ OTEL_SERVICE_NAME           │ Service name tag on spans. Defaults to disciplr-backend.               │
├─────────────────────────────┼────────────────────────────────────────────────────────────────────────┤
│ OTEL_TRACES_SAMPLER         │ always_on / always_off / traceidratio. Defaults to always_on.          │
├─────────────────────────────┼────────────────────────────────────────────────────────────────────────┤
│ OTEL_TRACES_SAMPLER_ARG     │ Sampling ratio 0.0–1.0 when sampler is traceidratio.                   │
└─────────────────────────────┴────────────────────────────────────────────────────────────────────────┘
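To make the env contract concrete, here is a rough sketch of how these four variables could be resolved into a config and turned into a head-based sampling decision. Everything not in the table is an assumption: the `TracingConfig` shape, the function names, throwing on invalid values, the ratio defaulting to 1.0 when unset, and the choice to bucket on the first 8 hex characters of the trace ID. The PR's own validation in src/config/env.ts and its sampler may differ.

```typescript
const SAMPLERS = ["always_on", "always_off", "traceidratio"] as const;
type Sampler = (typeof SAMPLERS)[number];

// Hypothetical resolved config; not the shape used in src/config/env.ts.
interface TracingConfig {
  enabled: boolean; // false when no collector endpoint is configured
  endpoint: string | undefined;
  serviceName: string;
  sampler: Sampler;
  ratio: number; // only consulted when sampler is "traceidratio"
}

function isSampler(value: string): value is Sampler {
  return (SAMPLERS as readonly string[]).includes(value);
}

// Takes a plain record so it can be fed process.env or a test fixture.
function resolveTracingConfig(env: Record<string, string | undefined>): TracingConfig {
  const endpoint = env["OTEL_EXPORTER_OTLP_ENDPOINT"];
  const sampler = env["OTEL_TRACES_SAMPLER"] ?? "always_on";
  if (!isSampler(sampler)) throw new Error(`Unsupported OTEL_TRACES_SAMPLER: ${sampler}`);
  const rawRatio = env["OTEL_TRACES_SAMPLER_ARG"];
  const ratio = rawRatio === undefined ? 1 : Number(rawRatio);
  if (Number.isNaN(ratio) || ratio < 0 || ratio > 1) {
    throw new Error("OTEL_TRACES_SAMPLER_ARG must be a number between 0.0 and 1.0");
  }
  return {
    enabled: endpoint !== undefined && endpoint !== "",
    endpoint,
    serviceName: env["OTEL_SERVICE_NAME"] ?? "disciplr-backend",
    sampler,
    ratio,
  };
}

// Head-based decision: made once per trace, deterministically from the trace ID,
// so every service that sees the same ID reaches the same verdict.
function shouldSample(config: TracingConfig, traceId: string): boolean {
  if (!config.enabled || config.sampler === "always_off") return false;
  if (config.sampler === "always_on") return true;
  // Map the first 32 bits of the trace ID onto [0, 1) and compare with the ratio.
  const bucket = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  return bucket < config.ratio;
}
```

With an empty environment the resolved config has `enabled: false`, so `shouldSample` returns false for every trace, which is the no-op behaviour the table describes.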


  2. Email Bounce & Complaint Suppression Contract (test/email-bounce-suppression)

Proves the suppression rules enforced by bounceStore.ts so the notification pipeline never re-mails a poisoned
address. All three suppression paths are covered and isolated.

Changed files:

  • src/services/notifications/bounceStore.ts — extended with recordHardBounce, recordSoftBounce, recordComplaint,
    getSuppressionInfo (typed SuppressionReason), getSoftBounceCount, setSoftBounceCap, getBounces, getComplaints, and
    clearBounces
  • src/tests/notifications.bounceStore.test.ts — 40 contract tests across 7 describe groups

Suppression rules proven by the tests:

┌───────────────────────────────────────┬───────────────────────────────────────────────────────────────────────┐
│ Event                                 │ Outcome                                                               │
├───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┤
│ Hard bounce (or SMTP 550/554/5.1.1)   │ Immediate suppression; reason: "hard_bounce"                          │
├───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┤
│ Soft bounce × N < cap                 │ Not suppressed; count tracked                                         │
├───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┤
│ Soft bounce × N ≥ cap                 │ Suppressed; reason: "soft_bounce_cap"                                 │
├───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┤
│ Spam complaint                        │ Immediate suppression; reason: "complaint" (overrides bounce history) │
├───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┤
│ isSuppressed() / getSuppressionInfo() │ Short-circuit query for the notification pipeline                     │
└───────────────────────────────────────┴───────────────────────────────────────────────────────────────────────┘

Precedence order: complaint > hard bounce > soft-bounce cap
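
The rules and precedence above can be read as a small state machine. The following in-memory sketch is one way to express them. It reuses the function names listed for bounceStore.ts, but the signatures, the factory function, the return shape of `getSuppressionInfo`, the default cap of 3, and the lowercase/trim address normalization are all assumptions and not taken from the PR.

```typescript
type SuppressionReason = "hard_bounce" | "soft_bounce_cap" | "complaint";

interface SuppressionInfo {
  suppressed: boolean;
  reason: SuppressionReason | null;
}

// Hypothetical factory; the real module may expose free functions over shared state.
function createBounceStore(initialCap = 3) {
  let softBounceCap = initialCap; // assumed default, configurable via setSoftBounceCap
  const hardBounced = new Set<string>();
  const complained = new Set<string>();
  const softCounts = new Map<string, number>();
  // Assumed normalization so "A@x.com" and "a@x.com" share one record.
  const normalize = (email: string): string => email.trim().toLowerCase();

  // Checks are ordered to match the stated precedence:
  // complaint > hard bounce > soft-bounce cap.
  function getSuppressionInfo(email: string): SuppressionInfo {
    const key = normalize(email);
    if (complained.has(key)) return { suppressed: true, reason: "complaint" };
    if (hardBounced.has(key)) return { suppressed: true, reason: "hard_bounce" };
    if ((softCounts.get(key) ?? 0) >= softBounceCap) {
      return { suppressed: true, reason: "soft_bounce_cap" };
    }
    return { suppressed: false, reason: null };
  }

  return {
    recordHardBounce: (email: string): void => { hardBounced.add(normalize(email)); },
    recordSoftBounce: (email: string): void => {
      const key = normalize(email);
      softCounts.set(key, (softCounts.get(key) ?? 0) + 1);
    },
    recordComplaint: (email: string): void => { complained.add(normalize(email)); },
    getSoftBounceCount: (email: string): number => softCounts.get(normalize(email)) ?? 0,
    setSoftBounceCap: (cap: number): void => { softBounceCap = cap; },
    getSuppressionInfo,
    isSuppressed: (email: string): boolean => getSuppressionInfo(email).suppressed,
  };
}
```

Because the reason is derived at query time rather than stored, a later complaint automatically takes precedence over earlier bounce history without any explicit state transition.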


Test plan

  • npm test src/tests/tracing.test.ts — 48 tests, all pass
  • npm test src/tests/notifications.bounceStore.test.ts — 40 tests, all pass
  • No-op behaviour verified: tracing is disabled when OTEL_EXPORTER_OTLP_ENDPOINT is unset, and existing test suites
    are unaffected
  • Manual smoke: set OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 and confirm spans appear in a local
    Jaeger/Grafana Alloy instance
  • Confirm OTEL_TRACES_SAMPLER=always_off produces zero spans in collector

closes #678
closes #654

Add suppression state machine to bounceStore with hard bounce, soft bounce
cap, and complaint tracking. Tests prove: hard bounce suppresses immediately,
soft bounces retry up to a configurable cap then suppress, complaints suppress
regardless of bounce history, and isSuppressed/getSuppressionInfo provide
a queryable short-circuit for the notification pipeline.
@drips-wave

drips-wave Bot commented Jun 29, 2026

@aabxtract Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀
