Skip to content

fix: API - DNS#191

Merged
gocanto merged 7 commits intomainfrom
chore/investigate-api-502-error
May 2, 2026
Merged

fix: API - DNS#191
gocanto merged 7 commits intomainfrom
chore/investigate-api-502-error

Conversation

@gocanto
Copy link
Copy Markdown
Collaborator

@gocanto gocanto commented May 2, 2026

Summary by CodeRabbit

  • New Features

    • Added /health endpoint, global structured request logging (with safe-query filtering) and database ping latency logs.
    • Optional CLI prewarm and Dockerised CLI build with cached Go build volume for faster runs.
  • Documentation

    • Setup guide documents prewarm step, CLI run behaviour, new log-directory env vars and an outage investigation pointer.
  • Tests

    • New tests for proxy/MTLS, compose wiring and healthchecks, and request/health/logging behaviours.
  • Chores

    • Help text and service startup/healthcheck wiring updated; base image pin revisions bumped.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 2, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 15acf461-2ac3-4e15-9a9a-d36981133bf3

📥 Commits

Reviewing files that changed from the base of the PR and between 9a35bcd and b9cc0d1.

📒 Files selected for processing (5)
  • go.mod
  • handler/keep_alive_db.go
  • handler/keep_alive_db_test.go
  • pkg/endpoint/server.go
  • pkg/endpoint/server_test.go

Walkthrough

Adds CLI Docker build/prewarm targets and orchestration, mounts a persistent Go build cache and API log path in docker-compose, introduces a /health handler and route, adds structured request and DB-ping logging, updates base image pins/checksums, and adds infra tests plus docs/help text changes.

Changes

CLI Docker Build Cache & Prewarm Infrastructure

Layer / File(s) Summary
Data / Env
.env.example, docker-compose.yml
Adds API_LOGS_PATH example and a named volume go_build_cache (local).
Compose wiring
docker-compose.yml
Mounts go_build_cache into api-runner at /tmp/go-build, sets Go cache env vars; mounts ${API_LOGS_PATH:-./storage/logs/api} into api and sets ENV_APP_LOGS_DIR to /app/storage/logs/logs_%s.log; adds api healthcheck; bumps BASE_IMAGE_VERSION build arg for api-runner and api.
Makefile variables & targets
infra/makefile/build.mk
Adds CLI_DOCKER_BINARY_HOST, CLI_DOCKER_BINARY_CONTAINER, CLI_DOCKER_BUILD_INPUTS; updates .PHONY; bumps BASE_IMAGE_REVISION; adds build-cli-docker and prewarm-cli-docker with compose selection and exit propagation.
DB state helpers
infra/makefile/db.mk
Adds DB_DOCKER_STATE_FUNCS defining db_running() and db_health() shell helpers.
CLI execution & orchestration
infra/makefile/app.mk
Refactors run-cli to select docker compose vs docker-compose, ensure base images, conditionally start DB, poll DB health with bounded retries, build CLI, and run it via compose_cmd run --rm --no-deps api-runner $(CLI_DOCKER_BINARY_CONTAINER).
Docs & help text
Makefile, docs/SETUP.md
Help text adds prewarm-cli-docker; SETUP docs document API_LOGS_PATH/ENV_APP_LOGS_DIR, optional make prewarm-cli-docker, and clarify make run-cli behaviour.

Caddy & Docker-Compose Integration Tests

Layer / File(s) Summary
Compose parsing helpers
infra/caddy/docker_compose_test.go
Adds YAML helpers (mappingValue, readComposeRoot, parseComposeRoot, sequenceContains) and test rejecting empty/comment-only compose documents.
Compose assertions
infra/caddy/docker_compose_test.go
Adds tests asserting caddy_prod is on caddy_net with alias proxy, caddy_prod.depends_on.api.condition == "service_healthy", and api has a /health healthcheck plus persistent logs mount and ENV_APP_LOGS_DIR.
Caddyfile assertions
infra/caddy/caddyfile_prod_test.go
Adds tests that Caddyfile.prod lists /api/generate-signature* (and /api/metrics) in @protected_public and that the :8443 listener enforces MTLS and reverse-proxies /api/generate-signature* to api:8080 with uri strip_prefix /api.

Application Health, Routing & Request Logging

Layer / File(s) Summary
Health handler
handler/health.go, handler/health_test.go
Adds HealthHandler (NewHealthHandler, Handle) returning payload.KeepAliveResponse{Message:"ok", DateTime:...} and test verifying status, message and DateTime parse.
Router wiring & test
metal/router/router.go, metal/router/router_health_test.go
Adds Router.Health() registering GET /health using the router pipeline and a test asserting /health returns 200.
Kernel boot ordering
metal/kernel/app.go
Calls modem.Health() in App.Boot() after keep-alive hooks and before metrics.
DB-ping structured logging
handler/keep_alive_db.go, handler/keep_alive_db_test.go
Wraps DB ping with timing and structured slog logs (success/error with duration and request context); test now asserts error-level log on ping failure.
Request logging middleware
pkg/endpoint/server.go, pkg/endpoint/server_test.go, go.mod
NewServerHandler now wraps handlers with requestLogHandler using httpsnoop to capture status/bytes/duration and emits structured slog logs (error for 5xx, warn for 4xx, info otherwise); adds tests validating logging fields, flusher preservation and safe query filtering; httpsnoop added to go.mod.

Base Image Pinning & Checksums

Layer / File(s) Summary
Builder/runtime Dockerfiles
infra/docker/base-images/Dockerfile.builder, infra/docker/base-images/Dockerfile.runtime, infra/docker/dockerfile-api
Pins default GO build args, bumps OpenSSL/musl/tzdata package revisions and updates ARG BASE_IMAGE_VERSION default from ...-r2 to ...-r3.
Checksums & generator
infra/docker/base-images/checksums/*, infra/docker/base-images/generate-checksums.sh
Updates checksum files and checksum-generation script to reflect musl-1.2.5-r23 and tzdata-2026b-r0 artifacts.

Sequence Diagram

sequenceDiagram
    participant User
    participant Make as Make (run-cli)
    participant Compose as Docker Compose
    participant DB as api-db
    participant Runner as api-runner
    participant Cache as go_build_cache

    User->>Make: make run-cli
    Make->>Make: detect compose command
    Make->>Compose: ensure base images
    Make->>Compose: start api-db (if needed)
    Make->>DB: poll /health (bounded retries)
    DB-->>Make: healthy / timeout
    Make->>Compose: run build-cli-docker / prewarm-cli-docker
    Compose->>Runner: run build step, mount Cache (/tmp/go-build)
    Runner->>Cache: read/write Go cache
    Runner-->>Compose: produce /app/bin/metal-cli
    Make->>Compose: compose run --rm api-runner /app/bin/metal-cli
    Runner-->>User: CLI output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'fix: API - DNS' is vague and does not clearly convey the actual scope of changes, which involve implementing health checks, logging, Docker optimisations, and base image updates. Consider a more descriptive title that reflects the main objectives, such as 'add health check endpoint and improve logging with Docker optimisations' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Review rate limit: 2/5 reviews remaining, refill in 29 minutes and 56 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@gocanto gocanto marked this pull request as ready for review May 2, 2026 02:53
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request optimizes the local development environment by introducing Docker-based CLI binary building, Go build caching, and automated health checks for the database during the run-cli process. It also adds new tests to verify Caddy configurations and Docker Compose structure. Feedback focuses on improving the robustness of the new tests—specifically by adding bounds checks for YAML parsing and using more efficient string matching—and ensuring the Makefile build triggers correctly by including untracked files in the dependency list.

Comment thread infra/caddy/docker_compose_test.go Outdated
Comment thread infra/caddy/caddyfile_prod_test.go
Comment thread infra/caddy/caddyfile_prod_test.go Outdated
Comment thread infra/caddy/caddyfile_prod_test.go Outdated
Comment thread infra/makefile/build.mk Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
infra/makefile/app.mk (1)

112-128: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Accept plain relative secret paths.

Lines 112-128 reject valid overrides such as DB_SECRET_USERNAME=database/infra/secrets/pg_username because only absolute paths and .//../ prefixes are treated as files. That makes make run-cli fail even when the value is a legitimate path. Treat any non-empty override as a path and validate it with -f.

Suggested simplification
-			case "$$secret_value" in \
-				/*|./*|../*) \
-					if [ ! -f "$$secret_value" ]; then \
-						if [ -z "$$missing_files" ]; then \
-							missing_files="  - $$secret_name ($$secret_value)"; \
-						else \
-							missing_files="$$missing_files\n  - $$secret_name ($$secret_value)"; \
-						fi; \
-					fi; \
-					;; \
-				*) \
-					if [ -z "$$missing_files" ]; then \
-						missing_files="  - $$secret_name (literal value — must be a file path)"; \
-					else \
-						missing_files="$$missing_files\n  - $$secret_name (literal value — must be a file path)"; \
-					fi; \
-					;; \
-			esac; \
+			if [ ! -f "$$secret_value" ]; then \
+				if [ -z "$$missing_files" ]; then \
+					missing_files="  - $$secret_name ($$secret_value)"; \
+				else \
+					missing_files="$$missing_files\n  - $$secret_name ($$secret_value)"; \
+				fi; \
+			fi; \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/makefile/app.mk` around lines 112 - 128, The current case block rejects
valid relative secret paths by only accepting values that start with /, ./, or
../; update the logic around secret_value/secret_name/missing_files so any
non-empty override is treated as a file path and validated with [ -f
"$$secret_value" ] (instead of the case pattern). Replace the case branches so
that if secret_value is non-empty you check file existence and append to
missing_files if the file is missing; otherwise treat empty values as
literal/missing as before—use the existing variables secret_value, secret_name
and missing_files to build the same messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infra/caddy/caddyfile_prod_test.go`:
- Around line 35-49: The test currently checks that tokens in the required slice
exist anywhere in caddyfile, so scope the assertions to the :8443 site block:
locate the start of the ":8443 {" block in caddyfile (use strings.Index on the
literal ":8443 {"), find its matching closing brace (scan forward counting
braces to get the block substring), and then run the regexp.QuoteMeta /
regexp.MustCompile checks against that block substring instead of the whole
caddyfile; keep using the same required slice and t.Fatalf for failures so the
checks still validate presence/order within the MTLS listener block.

In `@infra/makefile/build.mk`:
- Line 25: The CLI_DOCKER_BUILD_INPUTS variable currently uses "git ls-files" so
untracked/new .go files are ignored causing bin/metal-cli to be reused
incorrectly; update CLI_DOCKER_BUILD_INPUTS to derive prerequisites from the
working tree (e.g. a recursive filesystem scan like find or Make's
wildcard/$(shell find . -name '*.go')) or add a rebuild sentinel (e.g. a FORCE
or phony dependency) so the make target that builds bin/metal-cli (and the make
run-cli path) always considers new/untracked source files; modify the variable
and the related build targets that consume CLI_DOCKER_BUILD_INPUTS (and any
cached-binary logic around bin/metal-cli) accordingly to ensure rebuilds when
local sources change.

---

Outside diff comments:
In `@infra/makefile/app.mk`:
- Around line 112-128: The current case block rejects valid relative secret
paths by only accepting values that start with /, ./, or ../; update the logic
around secret_value/secret_name/missing_files so any non-empty override is
treated as a file path and validated with [ -f "$$secret_value" ] (instead of
the case pattern). Replace the case branches so that if secret_value is
non-empty you check file existence and append to missing_files if the file is
missing; otherwise treat empty values as literal/missing as before—use the
existing variables secret_value, secret_name and missing_files to build the same
messages.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7b69d325-46ac-4c11-bbf5-7d0c2b5a6c42

📥 Commits

Reviewing files that changed from the base of the PR and between a65b729 and 443929f.

📒 Files selected for processing (8)
  • Makefile
  • docker-compose.yml
  • docs/SETUP.md
  • infra/caddy/caddyfile_prod_test.go
  • infra/caddy/docker_compose_test.go
  • infra/makefile/app.mk
  • infra/makefile/build.mk
  • infra/makefile/db.mk

Comment thread infra/caddy/caddyfile_prod_test.go Outdated
Comment thread infra/makefile/build.mk Outdated
@gocanto gocanto marked this pull request as draft May 2, 2026 03:32
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@handler/keep_alive_db.go`:
- Around line 37-47: The DB ping failure is being logged twice: once with
slog.Error and again inside endpoint.LogInternalError; remove the explicit
slog.Error call in the keep-alive handler and rely on endpoint.LogInternalError
to perform the single error logging (ensure endpoint.LogInternalError is passed
the same error and context such as "database ping failed", request_id and other
metadata if needed so no diagnostic info is lost).

In `@pkg/endpoint/server.go`:
- Around line 155-157: The handler is currently appending the raw query string
(r.URL.RawQuery) into attrs which can leak sensitive data; replace this by
parsing r.URL.Query() and either whitelist allowed parameter names or redact
values before appending to attrs (e.g., only append specific keys or append keys
with masked values like "<redacted>"); update the code around the spots
referencing r.URL.RawQuery and attrs to use the sanitized map/values instead of
the raw query string.
- Around line 107-137: statusRecorder currently embeds http.ResponseWriter but
does not forward optional interfaces, causing downstream asserts (e.g.,
http.Hijacker, http.Flusher, http.Pusher) to fail; update the statusRecorder
type (and its creation in ServeHTTP) to either (a) add wrapper methods that
implement these optional interfaces by type-asserting the embedded
ResponseWriter and forwarding calls (e.g., func (r *statusRecorder) Hijack()
(net.Conn, *bufio.ReadWriter, error) { if h, ok :=
r.ResponseWriter.(http.Hijacker); ok { return h.Hijack() } return nil, nil,
fmt.Errorf("hijack not supported") }), and similarly implement Flush(), Push(),
etc., or (b) add an Unwrap() ResponseWriter method that returns the underlying
ResponseWriter so callers can perform their own assertions; implement whichever
approach you prefer and ensure ServeHTTP still constructs rec :=
&statusRecorder{ResponseWriter: w} so the forwarding/unwrapping works for
downstream handlers like h.next.ServeHTTP(rec, r).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 076d15ae-6377-439a-b8dd-2e0d52e88b69

📥 Commits

Reviewing files that changed from the base of the PR and between 443929f and 4c61776.

📒 Files selected for processing (12)
  • .env.example
  • docker-compose.yml
  • docs/SETUP.md
  • handler/health.go
  • handler/health_test.go
  • handler/keep_alive_db.go
  • infra/caddy/docker_compose_test.go
  • metal/kernel/app.go
  • metal/router/router.go
  • metal/router/router_health_test.go
  • pkg/endpoint/server.go
  • pkg/endpoint/server_test.go

Comment thread handler/keep_alive_db.go Outdated
Comment thread pkg/endpoint/server.go Outdated
Comment thread pkg/endpoint/server.go Outdated
@gocanto gocanto marked this pull request as ready for review May 2, 2026 03:43
@gocanto gocanto changed the title Chore/investigate api 502 error fix: API - DNS May 2, 2026
@gocanto gocanto marked this pull request as draft May 2, 2026 03:50
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infra/docker/base-images/Dockerfile.builder`:
- Line 40: The Dockerfile.builder currently installs openssl directly
(openssl=3.5.6-r0), breaking reproducible builds; update generate-checksums.sh
to add "openssl-3.5.6-r0.apk" to the BUILDER_PACKAGES array, include its
checksum in the verified artifacts list, ensure the script downloads that .apk
with the other packages and runs the same checksum verification flow, and then
update Dockerfile.builder/bootstrap steps to consume the verified openssl .apk
(rather than pulling from the live index) so the bootstrap
key-generation/signing steps use the checksummed, verified package.

In `@infra/makefile/build.mk`:
- Line 25: The bin/metal-cli target only watches Go sources, so add
docker-compose.yml, infra/docker/dockerfile-api and the BASE_IMAGE_VERSION into
the CLI_DOCKER_BUILD_INPUTS and make the build invocation force image rebuild;
update the CLI_DOCKER_BUILD_INPUTS shell find to also include
"docker-compose*.yml" and "infra/docker/dockerfile-api" (or the exact Dockerfile
path) and append a token that reflects BASE_IMAGE_VERSION (e.g. include
$(BASE_IMAGE_VERSION) in the variable expansion) so changes to the base image
version change the prerequisite checksum; then change the recipe that runs
docker-compose up for the api-runner (the rule that builds the image) to call
docker-compose build or docker-compose up --build so the api-runner image is
rebuilt when those files or BASE_IMAGE_VERSION change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 60a51cf0-5e25-48af-8cef-bd5d6435f274

📥 Commits

Reviewing files that changed from the base of the PR and between 4c61776 and 9a35bcd.

📒 Files selected for processing (12)
  • docker-compose.yml
  • infra/caddy/caddyfile_prod_test.go
  • infra/caddy/docker_compose_test.go
  • infra/docker/base-images/Dockerfile.builder
  • infra/docker/base-images/Dockerfile.runtime
  • infra/docker/base-images/checksums/builder-aarch64.sha256
  • infra/docker/base-images/checksums/builder-x86_64.sha256
  • infra/docker/base-images/checksums/runtime-aarch64.sha256
  • infra/docker/base-images/checksums/runtime-x86_64.sha256
  • infra/docker/base-images/generate-checksums.sh
  • infra/docker/dockerfile-api
  • infra/makefile/build.mk

Comment thread infra/docker/base-images/Dockerfile.builder
Comment thread infra/makefile/build.mk
@gocanto gocanto marked this pull request as ready for review May 2, 2026 03:54
@gocanto gocanto merged commit be7a269 into main May 2, 2026
5 of 6 checks passed
@gocanto gocanto deleted the chore/investigate-api-502-error branch May 2, 2026 03:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant