Merge feature/ci-cd-optimization improvements #4845

Merged
arkid15r merged 24 commits into main from feature/ci-cd-optimization-merge
Jun 6, 2026
@arkid15r (Collaborator) commented Jun 6, 2026

Proposed change

Apply CI/CD optimization improvements.

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

ahmedxgouda and others added 21 commits May 9, 2026 21:11
* Extract checks

* Remove docker layer from cspell

* Give read permission to the checks

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
* Extract backend tests workflow and remove docker layer

* Add permissions

* Add FORCE_COLOR env

* Fix ordering

* Optimize dependencies installation

* Update cache path and comments indentation

* Remove dead code

* Update code

* Update permissions

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract codecov upload to a separate workflow

* Add permissions

* Add checkout

* Update code

* Add back the checkout step

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Remove docker layer from fuzz tests and setup-backend-environment workflow

* Update code

* Update code

* Add poetry run in entrypoint.fuzz.sh

* Update code

* Update code

* Refactor

* Update code

* Apply rabbit's suggestions

* Remove redundant BACKEND_PORT from migration step

* Update caching

* Disable collecting coverage for fuzz tests

* Reorder steps in setup-backend-environment action

* Apply suggestions

* Remove code-quality-checks dependency temporarily

* Update code

* Update code

* Add run-code-quality-checks dependency

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract infrastructure tests

* Remove the code-quality-checks dependency temporarily

* Apply rabbit suggestions and fix terraform error

* Add caching

* Reorder

* Update path

* Add code-quality-checks as dependency

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract tests and remove docker layer

* Update code

* Update code

* Update permissions

* Remove code-quality-checks dependency temporarily

* Update permissions

* Update code

* Add caching

* Add run-code-quality-checks dependency

* Update code

* Reorder

* Update code

* Update code

* Update code

* Update code

* Update caching

* Update caching

* Refactor

* Add checkout

* Update setup-frontend-environment action description

* Update code

* Remove redundant manual caching step

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract e2e tests into a separate workflow

* Remove docker layer from frontend

* Use custom action

* Update code

* Add next.js caching

* Remove docker layer from e2e tests

* Update code

* Add e2e dependencies installation step

* Add playwright installation and caching

* Update code

* Update code

* Update envs

* Update code

* Update playwright config and add upload artifact step

* Pin service container images by digest

* Update e2e tests name

* Apply rabbit suggestions

* Fix syntax

* Update code

* Add playwright apt caching

* Try playwright container

* Fix pipx

* Drop set up Python cache

* Bump playwright version

* Update code

* Update install poetry action

* Update code

* Update code

* Update browsers

* Update code

* Revert some changes

* Clean up some steps

* Try chromium only

* Update code

* Add logs

* Update smoke test

* Update code

* Update code

* Clean up code

* Bump workers count

* Revert workers number change

* Rebalance CPUs

* More CPUs to playwright

* Update code

* Update code

* Update code

* Remove apt caching

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract set-release-version

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
* Extract build-images

* Update build-production-images

* Update code

* Remove unused docker hub

* Update code

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract run-lighthouse-ci into a separate reusable workflow

* Update .github/workflows/run-lighthouse-ci.yaml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Extract run-zap-baseline-scan into a reusable workflow

* Update code
* Extract bootstrap-infrastructure

* Update CI/CD

* Update CI/CD

* Add terraform caching

* Refactor terraform bootstrapping

* Update code

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract scan-images

* Update scan-production-images

* Update code

* Update Trivy caching

* Apply cubic suggestion

* Update code

* Update code

* Refactor trivy

* Update code

* Update code

* Update code

* Update code

* Update code

* Update code

* Update code

* Update code

* Revert "Upload SBOM action creation"

This reverts commit 51132e4.

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract deploy-nest into a separate reusable workflow

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
* Extract production jobs into a separate workflow

* Reorder

Update pnpm
* Generalize production and staging workflows

* Reorder

* Extract checks and tests into a reusable workflow and update run-ci-cd

* Update run-fuzz-tests.yaml

* Update code

* Update code

* Update naming

* Update code

* Update code

* Update e2e/playwright.config.ts

* Update code

* Update code

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
@github-actions Bot added the docs, backend, frontend, ci, and infrastructure labels Jun 6, 2026
@coderabbitai Bot (Contributor) commented Jun 6, 2026

Summary by CodeRabbit

  • New Features

    • Added ECR cache repositories for optimized container image building.
    • Introduced reusable CI/CD workflow architecture for production and staging deployments.
  • Infrastructure

    • New infrastructure bootstrap and deployment capabilities.
    • Enhanced image scanning and SBOM generation for security compliance.
  • Tests

    • Updated E2E tests with improved device emulation support.
    • Removed explicit timeout overrides for better default behavior.
  • Chores

    • Updated security scanning ignore rules and spell-check dictionary.
    • Updated CI/CD pipeline status badges in documentation.

Walkthrough

Rewrites CI into reusable workflows and composite actions, adds image build/scan/deploy/infrastructure bootstrap flows, introduces an ECR cache Terraform module with tests/docs, updates E2E Playwright presets and removes explicit per-test navigation timeouts, and applies small docs/script/dictionary tweaks.

Changes

GitHub Actions CI/CD Refactor (single cohort)

| Layer / File(s) | Summary |
|---|---|
| Composite actions & tooling: `.github/actions/install-poetry/action.yaml`, `.github/actions/install-backend-dependencies/action.yaml`, `.github/actions/install-frontend-dependencies/action.yaml`, `.github/actions/run-trivy-scan/action.yaml`, `.github/actions/apply-infrastructure-changes/action.yaml` | Adds composite actions for Poetry/pip caching and install, backend/frontend dependency install, Trivy scanning with cache, and a Terraform apply action that writes backend/tfvars, init/plan/validate/apply with plugin cache. |
| Backend environment setup (composite): `.github/actions/setup-backend-environment/action.yaml` | Composite action that bootstraps backend test env: installs deps, fetches/restores DB dump, runs migrations, starts Gunicorn, and waits for health endpoint. |
| Reusable checks & tests: `.github/workflows/run-code-checks.yaml`, `.github/workflows/run-code-tests.yaml`, `.github/workflows/run-backend-tests.yaml`, `.github/workflows/run-frontend-tests.yaml`, `.github/workflows/run-coverage-upload.yaml`, `.github/workflows/dependency-review.yaml`, `.github/workflows/code-ql.yaml` | Introduces reusable workflows for code checks, tests, dependency review, CodeQL, and coverage upload; adjusts job permissions and dependency-install steps. |
| CI/CD orchestrator & entrypoints: `.github/workflows/run-ci-cd.yaml`, `.github/workflows/ci.yaml`, `.github/workflows/ci-cd-staging.yaml`, `.github/workflows/ci-cd-production.yaml` | Rewrites run-ci-cd into a workflow_call orchestrator with typed inputs, and adds CI, staging, and production entry workflows that call the orchestrator with environment-specific inputs. |
| Image build/scan & deploy: `.github/workflows/run-image-build.yaml`, `.github/workflows/run-image-scan.yaml`, `.github/workflows/run-deploy.yaml`, `.github/workflows/run-release-version-resolution.yaml`, `.github/workflows/run-lighthouse-ci.yaml`, `.github/workflows/run-zap-baseline-scan.yaml` | Adds reusable workflows to build/push images with cache, scan images and generate SBOMs, run deploy (assume role, apply TF, run ECS tasks, wait for service stability), resolve release version, and run post-deploy Lighthouse/ZAP scans. |
| Infrastructure bootstrap & tests: `.github/workflows/run-infrastructure-bootstrap.yaml`, `.github/workflows/run-infrastructure-tests.yaml` | Adds infrastructure bootstrap workflow that calls the Terraform apply composite action and an infra-tests workflow that installs Terraform, caches plugin dir, and runs test target. |
| ECR cache Terraform module: `infrastructure/modules/ecr-cache/*`, `infrastructure/live/main.tf`, `infrastructure/live/README.md` | Adds ecr-cache Terraform module (repository + lifecycle policy), variables/outputs/docs/tests, pins provider lockfile, and wires backend/frontend cache modules into live main.tf; updates .trivyignore.yaml for cache-specific rules. |
| E2E Playwright updates: `e2e/helpers/devices.ts`, `e2e/playwright.config.ts`, `e2e/**` | Adds an iPhone13 Chromium device preset, switches mobile project to Chromium preset, and removes explicit per-test page.goto timeout overrides across many e2e specs. |
| Misc small changes: `README.md`, `cspell/custom-dict.txt`, `backend/entrypoint.fuzz.sh`, removed/updated workflow files | Updates CI badges in README, adjusts cspell dictionary, quotes fuzz test paths in entrypoint, and removes/rewires some legacy workflow files. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • OWASP/Nest#4807: overlaps on adding/using run-lighthouse-ci.yaml and wiring base_url into the CI flow.
  • OWASP/Nest#4711: introduces the same Terraform composite action apply-infrastructure-changes and similar bootstrap wiring.
  • OWASP/Nest#4687: related extraction of E2E/test workflows and changes to run-ci-cd.yaml delegation.

Suggested reviewers

  • kasya
  • rudransh-shrivastava
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'Merge feature/ci-cd-optimization improvements' is overly vague and uses the word 'improvements' without specifying what core aspect of CI/CD is being optimized. | Replace with a more specific title that highlights the primary optimization (e.g., 'Refactor CI/CD workflows into reusable composites' or 'Extract composite GitHub Actions for CI/CD workflows'). |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description is related to the changeset: it mentions applying CI/CD optimization improvements, which aligns with the extensive GitHub Actions workflow restructuring present in the changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@arkid15r arkid15r marked this pull request as ready for review June 6, 2026 17:36

@cubic-dev-ai Bot (Contributor) left a comment

19 issues found across 63 files

Confidence score: 1/5

  • High merge risk: there are multiple high-confidence, high-severity CI/CD breakages where reusable workflows in .github/workflows/ci-cd-staging.yaml and .github/workflows/run-ci-cd.yaml do not forward secrets, which can cause downstream jobs to fail at runtime.
  • Most severe issue is a security concern in .github/workflows/run-release-version-resolution.yaml: unquoted interpolation of ${{ github.event.release.tag_name }} in a shell run: block can allow command injection if a tag name is attacker-controlled.
  • There are additional reliability regressions in deployment/scanning paths (e.g., .github/workflows/run-deploy.yaml task-response handling and .github/actions/run-trivy-scan/action.yaml cache/fail-fast behavior), increasing the chance of silent failures or flaky pipelines.
  • Pay close attention to .github/workflows/run-release-version-resolution.yaml, .github/workflows/ci-cd-staging.yaml, .github/workflows/run-ci-cd.yaml, .github/workflows/run-deploy.yaml, .github/actions/run-trivy-scan/action.yaml - security exposure and likely CI/CD execution failures need to be addressed before merge.
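
The command-injection concern above comes from `${{ ... }}` expressions being substituted into the script text before the shell parses it. A minimal sketch of one mitigation follows, assuming the step maps the tag into an environment variable first. `RELEASE_TAG`, `validate_release_tag`, and the version pattern are illustrative assumptions, not names or rules taken from the repository.

```shell
# Assumption: the workflow step exposes the tag through the environment, e.g.
#   env:
#     RELEASE_TAG: ${{ github.event.release.tag_name }}
# so the tag text is never pasted into the script body itself.

# validate_release_tag is a hypothetical helper. The accepted pattern
# (optional "v", then two or three dot-separated numbers) is an example policy.
validate_release_tag() {
  _tag="$1"
  # Reject empty values and any character outside digits, dots, and "v".
  case "$_tag" in
    '' | *[!0-9.v]*)
      echo "Rejected release tag" >&2
      return 1
      ;;
  esac
  # Check the overall shape of the version string.
  if printf '%s\n' "$_tag" | grep -Eq '^v?[0-9]+(\.[0-9]+){1,2}$'; then
    printf '%s\n' "$_tag"
  else
    echo "Rejected release tag" >&2
    return 1
  fi
}

# Possible use inside the step (GITHUB_OUTPUT is provided by the runner):
#   version="$(validate_release_tag "$RELEASE_TAG")"
#   echo "version=$version" >> "$GITHUB_OUTPUT"
```

Passing the value through `env:` is what removes the injection path; the validation is an extra guard that also keeps unexpected characters out of later steps.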
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/actions/run-trivy-scan/action.yaml">

<violation number="1" location=".github/actions/run-trivy-scan/action.yaml:23">
P1: Cache path mismatch: `.trivy-cache` is cached but Trivy uses `~/.cache/trivy` by default. Without setting `TRIVY_CACHE_DIR=.trivy-cache`, the cache step is ineffective and Trivy will not benefit from it.</violation>

<violation number="2" location=".github/actions/run-trivy-scan/action.yaml:29">
P1: Trivy composite action can mask failed scan commands because it runs input script with `bash -c` without fail-fast flags.</violation>
</file>

<file name=".github/actions/setup-backend-environment/action.yaml">

<violation number="1" location=".github/actions/setup-backend-environment/action.yaml:98">
P2: Readiness check uses `/a` instead of dedicated `/status/` health check endpoint. The `/a` path works only through a fragile chain of Django redirects (APPEND_SLASH + admin auth redirect). The repo already has `/status/` — a purpose-built, lightweight health check endpoint.</violation>
</file>

<file name=".github/workflows/run-code-tests.yaml">

<violation number="1" location=".github/workflows/run-code-tests.yaml:64">
P2: Dead code: `run-fuzz-graphql-tests` job is permanently disabled via `if: false`</violation>
</file>

<file name=".github/actions/install-poetry/action.yaml">

<violation number="1" location=".github/actions/install-poetry/action.yaml:23">
P3: Unnecessary `chown` to the same user — files are already owned by the runner user. This adds a potential failure point on systems where the runner lacks `CAP_CHOWN`, with no benefit.</violation>
</file>

<file name=".github/workflows/run-code-checks.yaml">

<violation number="1" location=".github/workflows/run-code-checks.yaml:16">
P2: Missing timeout-minutes on three jobs: run-frontend-checks, run-pre-commit-checks, and run-spelling-checks. Every other job in the repository's workflows includes timeout-minutes; without it, a hung step will consume the full GitHub Actions runner allocation before being killed.</violation>

<violation number="2" location=".github/workflows/run-code-checks.yaml:143">
P2: pnpm version mismatch in run-spelling-checks job: uses 10.33.3 but cspell/package.json declares pnpm@11.5.1. Using `--frozen-lockfile` with a lockfile generated by a different major pnpm version can cause lockfile parsing errors or silently produce a different dependency tree.</violation>
</file>

<file name=".github/workflows/run-frontend-tests.yaml">

<violation number="1" location=".github/workflows/run-frontend-tests.yaml:27">
P2: Five-minute timeout on `run-a11y-tests` may be too tight. The two prerequisite steps (checkout + install-frontend-dependencies) can consume 2+ minutes on a cold cache, leaving minimal headroom for the actual accessibility test suite. Consider raising to 10 minutes or monitoring CI runtimes post-merge to validate the window.</violation>
</file>

<file name=".github/workflows/run-lighthouse-ci.yaml">

<violation number="1" location=".github/workflows/run-lighthouse-ci.yaml:36">
P2: Job-level timeout of 5 minutes is too tight for the full pipeline (checkout + dependency installation + Lighthouse CI audit of 8 URLs). The dependency install step alone can take 1-2 minutes on a cold cache, leaving only 3-4 minutes for auditing 8 pages — each of which typically takes 20-40 seconds. On slower GitHub Actions runners this can cause intermittent timeout failures.</violation>
</file>

<file name=".github/actions/apply-infrastructure-changes/action.yaml">

<violation number="1" location=".github/actions/apply-infrastructure-changes/action.yaml:46">
P3: Cache step is missing `restore-keys`: without a fallback prefix-based key, every cache miss results in a full plugin download with no partial restore from prior cached runs.</violation>
</file>

<file name="infrastructure/live/main.tf">

<violation number="1" location="infrastructure/live/main.tf:127">
P1: Chicken-and-egg: image build needs cache repos before Terraform creates them. Run-image-build runs before run-deploy in the CI/CD pipeline, but the cache repos (`backend_build_cache`, `frontend_build_cache`) are only provisioned when Terraform is applied during `run-deploy`. Docker's `cache-to: type=registry` will fail because the ECR repo doesn't exist yet.</violation>
</file>

<file name="frontend/next.config.ts">

<violation number="1" location="frontend/next.config.ts:65">
P2: Rewrite source paths changed from `/csrf`/`/graphql` to `/csrf/`/`/graphql/`, but `skipTrailingSlashRedirect: true` prevents Next.js from normalizing trailing slashes in E2E mode. Requests to `/csrf` or `/graphql` without trailing slashes will no longer match the rewrite and won't be proxied to the backend, breaking those E2E flows.</violation>
</file>

<file name="infrastructure/modules/ecr-cache/main.tf">

<violation number="1" location="infrastructure/modules/ecr-cache/main.tf:15">
P3: Missing `force_delete = true` on the ECR cache repository. Since this repo will always contain images during normal operation, `terraform destroy` will fail with a `RepositoryNotEmptyException`. Consider setting `force_delete = true` since cache images are rebuildable and the lifecycle policy already manages image retention.</violation>
</file>

<file name=".github/workflows/run-deploy.yaml">

<violation number="1" location=".github/workflows/run-deploy.yaml:158">
P1: Migrate task does not validate ECS run-task response before extracting task ARN, unlike the index-data task which properly checks for failures and empty tasks. If run-task fails (e.g., insufficient capacity, bad networking), downstream steps will get "null" as the task ARN and fail with confusing errors rather than a clear failure message.</violation>

<violation number="2" location=".github/workflows/run-deploy.yaml:221">
P2: Index-data task is fire-and-forget: the step validates it started but never waits for it to finish or checks its exit code. A failure in this critical data-indexing step (which runs ~30 min async) will go completely undetected — the deployment reports success while the index task silently fails.</violation>
</file>
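
For the `bash -c` finding on `run-trivy-scan` listed above, the difference between a plain `bash -c` and a fail-fast invocation can be sketched as follows. `run_scan_script` is a hypothetical wrapper name; the composite action's real input name and step layout are not shown in this conversation.

```shell
# run_scan_script is a hypothetical wrapper around the caller-supplied script.
# -e          stop at the first failing command
# -u          treat unset variables as errors
# -o pipefail make a pipeline fail if any stage fails
run_scan_script() {
  bash -euo pipefail -c "$1"
}

# Without the flags, only the last command decides the exit status:
#   bash -c 'false; true'          exits 0, so the failure is hidden
# With the wrapper, the first failure stops the script:
#   run_scan_script 'false; true'  exits non-zero
```

An alternative with the same effect is to require callers to start their script with `set -euo pipefail`, but enforcing it in the wrapper avoids relying on every caller remembering to do so.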


Comment thread .github/workflows/ci-cd-staging.yaml
Comment thread .github/workflows/run-ci-cd.yaml
      uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
      with:
        key: ${{ env.TRIVY_CACHE_KEY }}
        path: .trivy-cache

P1: Cache path mismatch: .trivy-cache is cached but Trivy uses ~/.cache/trivy by default. Without setting TRIVY_CACHE_DIR=.trivy-cache, the cache step is ineffective and Trivy will not benefit from it.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/actions/run-trivy-scan/action.yaml, line 23:

<comment>Cache path mismatch: `.trivy-cache` is cached but Trivy uses `~/.cache/trivy` by default. Without setting `TRIVY_CACHE_DIR=.trivy-cache`, the cache step is ineffective and Trivy will not benefit from it.</comment>

<file context>
@@ -0,0 +1,30 @@
+      uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae  # v5.0.5
+      with:
+        key: ${{ env.TRIVY_CACHE_KEY }}
+        path: .trivy-cache
+        restore-keys: trivy-${{ runner.os }}-${{ hashFiles('docker/trivy/Dockerfile') }}-
+
</file context>
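
If the finding holds, one way to make the cached path and Trivy's working cache agree is to set `TRIVY_CACHE_DIR`, the environment form of Trivy's `--cache-dir` flag. The sketch below assumes the scan runs in the same shell step that sets the variable; the fallback to `$PWD` is only for running it outside a runner.

```shell
# Point Trivy at the directory that the cache step saves and restores.
# GITHUB_WORKSPACE is set by the Actions runner; $PWD is a local fallback.
TRIVY_CACHE_DIR="${GITHUB_WORKSPACE:-$PWD}/.trivy-cache"
export TRIVY_CACHE_DIR
mkdir -p "$TRIVY_CACHE_DIR"

# Later Trivy calls in this shell now read and write that directory, e.g.
#   trivy image "$IMAGE_REF"      # IMAGE_REF is a placeholder
# To make the value visible to later steps as well, it could be appended to
# the runner's environment file:
#   echo "TRIVY_CACHE_DIR=$TRIVY_CACHE_DIR" >> "$GITHUB_ENV"
```

Setting the variable once keeps the cache step and every Trivy invocation in sync, whereas passing `--cache-dir` on each command would have to be repeated in every caller-supplied script.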

Comment thread .github/workflows/run-release-version-resolution.yaml Outdated
  use_fargate_spot    = var.frontend_use_fargate_spot
}

module "backend_build_cache" {

P1: Chicken-and-egg: image build needs cache repos before Terraform creates them. Run-image-build runs before run-deploy in the CI/CD pipeline, but the cache repos (backend_build_cache, frontend_build_cache) are only provisioned when Terraform is applied during run-deploy. Docker's cache-to: type=registry will fail because the ECR repo doesn't exist yet.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At infrastructure/live/main.tf, line 127:

<comment>Chicken-and-egg: image build needs cache repos before Terraform creates them. Run-image-build runs before run-deploy in the CI/CD pipeline, but the cache repos (`backend_build_cache`, `frontend_build_cache`) are only provisioned when Terraform is applied during `run-deploy`. Docker's `cache-to: type=registry` will fail because the ECR repo doesn't exist yet.</comment>

<file context>
@@ -124,6 +124,20 @@ module "frontend" {
   use_fargate_spot    = var.frontend_use_fargate_spot
 }
 
+module "backend_build_cache" {
+  source = "../modules/ecr-cache"
+
</file context>
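
One possible way to soften the ordering problem described in this comment is to enable the registry cache only when the cache repository already exists, so the first build after the module is introduced runs uncached instead of failing. This is a sketch under assumptions: `registry_cache_args` is a hypothetical helper, the build is assumed to be driven by the `docker buildx build` CLI rather than an action, and the `:cache` tag follows the single-tag convention mentioned in the module comments.

```shell
# registry_cache_args prints BuildKit cache flags for an ECR repository,
# or nothing when the repository does not exist yet.
#   $1: ECR repository name (hypothetical example: nest-backend-cache)
#   $2: registry host
registry_cache_args() {
  _repo="$1"
  _registry="$2"
  if aws ecr describe-repositories --repository-names "$_repo" >/dev/null 2>&1; then
    _ref="$_registry/$_repo:cache"
    printf '%s\n' \
      "--cache-from=type=registry,ref=$_ref" \
      "--cache-to=type=registry,ref=$_ref,mode=max"
  fi
}

# Possible use (word splitting of the output is intentional here):
#   docker buildx build $(registry_cache_args "$CACHE_REPO" "$ECR_REGISTRY") -t "$IMAGE" .
```

The other option raised by the comment's framing is to provision the cache repositories before any image build, for example in the bootstrap stage, which removes the need for a conditional at build time.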

  exit 1
fi

echo "Index-data task started successfully (runs async, ~30 min)"

P2: Index-data task is fire-and-forget: the step validates it started but never waits for it to finish or checks its exit code. A failure in this critical data-indexing step (which runs ~30 min async) will go completely undetected — the deployment reports success while the index task silently fails.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/run-deploy.yaml, line 221:

<comment>Index-data task is fire-and-forget: the step validates it started but never waits for it to finish or checks its exit code. A failure in this critical data-indexing step (which runs ~30 min async) will go completely undetected — the deployment reports success while the index task silently fails.</comment>

<file context>
@@ -0,0 +1,240 @@
+            exit 1
+          fi
+
+          echo "Index-data task started successfully (runs async, ~30 min)"
+
+      - name: Wait for backend service stability
</file context>
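
If the deployment were meant to reflect the outcome of one-off ECS tasks, two checks could be added: rejecting an unusable task ARN right after `run-task` (the gap reported for the migrate step), and waiting for the task to stop before reading its exit code (the gap reported here). Both helpers below are hypothetical names, and the note about the waiter giving up after roughly ten minutes reflects AWS CLI defaults at the time of writing.

```shell
# Fail early when the ARN extracted from a run-task response is unusable.
# jq prints "null" and the AWS CLI with --output text prints "None"
# when the queried field is missing.
require_task_arn() {
  case "$1" in
    '' | null | None)
      echo "ECS run-task returned no task ARN" >&2
      return 1
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

# Wait for a task to stop, then fail unless its first container exited with 0.
#   $1: cluster  $2: task ARN  $3: maximum number of waiter calls (default 4)
wait_for_task_success() {
  _cluster="$1"
  _task="$2"
  _max_waits="${3:-4}"
  _attempt=1
  # One waiter call polls for about ten minutes, so a ~30 minute task
  # needs several calls in a row.
  until aws ecs wait tasks-stopped --cluster "$_cluster" --tasks "$_task"; do
    if [ "$_attempt" -ge "$_max_waits" ]; then
      echo "Timed out waiting for task $_task" >&2
      return 1
    fi
    _attempt=$((_attempt + 1))
  done
  _code="$(aws ecs describe-tasks --cluster "$_cluster" --tasks "$_task" \
    --query 'tasks[0].containers[0].exitCode' --output text)"
  if [ "$_code" != "0" ]; then
    echo "Task $_task exited with code $_code" >&2
    return 1
  fi
}
```

Waiting inline lengthens the deploy job by the task's full runtime. If the asynchronous start is intentional, a separate follow-up job or an alarm on the task's stopped event would surface failures without blocking the deployment.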

run: |
  set -euo pipefail
  mkdir -p "$POETRY_CI_HOME/.cache/pip"
  chown -R "$(id -u):$(id -g)" "$POETRY_CI_HOME"

P3: Unnecessary chown to the same user — files are already owned by the runner user. This adds a potential failure point on systems where the runner lacks CAP_CHOWN, with no benefit.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/actions/install-poetry/action.yaml, line 23:

<comment>Unnecessary `chown` to the same user — files are already owned by the runner user. This adds a potential failure point on systems where the runner lacks `CAP_CHOWN`, with no benefit.</comment>

<file context>
@@ -0,0 +1,32 @@
+      run: |
+        set -euo pipefail
+        mkdir -p "$POETRY_CI_HOME/.cache/pip"
+        chown -R "$(id -u):$(id -g)" "$POETRY_CI_HOME"
+        chmod -R u+rwX "$POETRY_CI_HOME"
+      shell: bash
</file context>

Comment thread .github/actions/install-backend-dependencies/action.yaml
- name: Cache Terraform plugins
  uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
  with:
    key: ${{ runner.os }}-infrastructure-${{ hashFiles('infrastructure/**/.terraform.lock.hcl') }}

P3: Cache step is missing restore-keys: without a fallback prefix-based key, every cache miss results in a full plugin download with no partial restore from prior cached runs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/actions/apply-infrastructure-changes/action.yaml, line 46:

<comment>Cache step is missing `restore-keys`: without a fallback prefix-based key, every cache miss results in a full plugin download with no partial restore from prior cached runs.</comment>

<file context>
@@ -0,0 +1,98 @@
+    - name: Cache Terraform plugins
+      uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae  # v5.0.5
+      with:
+        key: ${{ runner.os }}-infrastructure-${{ hashFiles('infrastructure/**/.terraform.lock.hcl') }}
+        path: ${{ inputs.plugin-cache-dir }}
+
</file context>

# BuildKit registry cache reuses a single tag (e.g. :cache) and overwrites it each build.
# App release images use IMMUTABLE repos with scan-on-push in modules/service; this repo is cache-only.
# NOSEMGREP: terraform.aws.security.aws-ecr-mutable-image-tags.aws-ecr-mutable-image-tags, terraform.lang.security.ecr-image-scan-on-push.ecr-image-scan-on-push
resource "aws_ecr_repository" "main" {

P3: Missing force_delete = true on the ECR cache repository. Since this repo will always contain images during normal operation, terraform destroy will fail with a RepositoryNotEmptyException. Consider setting force_delete = true since cache images are rebuildable and the lifecycle policy already manages image retention.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At infrastructure/modules/ecr-cache/main.tf, line 15:

<comment>Missing `force_delete = true` on the ECR cache repository. Since this repo will always contain images during normal operation, `terraform destroy` will fail with a `RepositoryNotEmptyException`. Consider setting `force_delete = true` since cache images are rebuildable and the lifecycle policy already manages image retention.</comment>

<file context>
@@ -0,0 +1,46 @@
+# BuildKit registry cache reuses a single tag (e.g. :cache) and overwrites it each build.
+# App release images use IMMUTABLE repos with scan-on-push in modules/service; this repo is cache-only.
+# NOSEMGREP: terraform.aws.security.aws-ecr-mutable-image-tags.aws-ecr-mutable-image-tags, terraform.lang.security.ecr-image-scan-on-push.ecr-image-scan-on-push
+resource "aws_ecr_repository" "main" {
+  image_tag_mutability = "MUTABLE"
+  name                 = var.name
</file context>

@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/actions/apply-infrastructure-changes/action.yaml:
- Around line 13-24: The inputs tfbackend-path and tfvars-path are declared but
not used in the Terraform commands; update the action steps that run terraform
so they reference the provided inputs instead of hardcoded filenames: use the
tfbackend-path value for backend config when running terraform init (e.g., pass
-backend-config="${{ inputs.tfbackend-path }}" or the equivalent reference used
in the action) and pass tfvars-path to terraform plan/apply via -var-file="${{
inputs.tfvars-path }}"; ensure any prior steps that write files use
tfbackend-content and tfvars-content to create files at those input paths so the
referenced paths exist.

In @.github/actions/run-trivy-scan/action.yaml:
- Line 24: The restore-keys entry currently uses the full Dockerfile hash
(restore-keys: trivy-${{ runner.os }}-${{ hashFiles('docker/trivy/Dockerfile')
}}-) which prevents fallback cache hits when that file changes; change the
restore-keys to a broader prefix such as trivy-${{ runner.os }}- (or use a less
specific hash like the directory) so the cache key for trivy can fall back to
older caches when the Dockerfile hash differs, updating the restore-keys value
accordingly.

In @.github/workflows/dependency-review.yaml:
- Around line 3-15: Add a concurrency group to the GitHub Actions workflow so
superseded dependency-review runs are canceled: update the workflow that defines
the pull_request trigger (the block with keys "on", "pull_request", "branches",
"paths", and "permissions") by adding a top-level "concurrency" key with a
unique group expression that ties runs to the PR (e.g., use github.workflow with
github.ref or github.event.pull_request.number) and set cancel-in-progress:
true; this ensures only the latest dependency-review job for a given PR proceeds
and earlier queued runs are cancelled.

In @.github/workflows/run-backend-tests.yaml:
- Around line 19-20: The actions/checkout step currently uses
actions/checkout@de0fac2e... without disabling persisted credentials; update the
checkout step (the "Check out repository" step that uses actions/checkout) to
add persist-credentials: false so the workflow token is not written to git
config and will not persist to subsequent steps.

In @.github/workflows/run-code-checks.yaml:
- Around line 18-19: Every checkout step that uses the actions/checkout action
should disable persisted Git credentials to reduce token exposure: update each
occurrence of "uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd"
(and the other checkout occurrences) to include the input "persist-credentials:
false" under that step so the checkout action does not write credentials to the
runner for downstream steps; apply this change to all checkout steps in the
workflow.

In @.github/workflows/run-e2e-tests.yaml:
- Around line 45-46: Update the GitHub Actions checkout step (the
actions/checkout@... usage) to disable persisted credentials by adding the
option persist-credentials: false to that step; locate the "Check out
repository" step where uses: actions/checkout@de0fac2e... is declared and add
the persist-credentials: false key under that step so subsequent build/test
commands won't inherit the checkout token.

In @.github/workflows/run-frontend-tests.yaml:
- Around line 18-19: The checkout steps using "uses:
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd" must be hardened by
adding persist-credentials: false to their step definitions; locate both
checkout steps (the ones with uses: actions/checkout@...) and add a
persist-credentials: false top-level key under each step to disable saved
GITHUB_TOKEN credentials.

In @.github/workflows/run-image-build.yaml:
- Around line 68-69: The checkout step named "Check out repository" currently
uses actions/checkout@de0fac2e4500d... with default credential persistence;
update that step to explicitly set persist-credentials: false to avoid
persisting GITHUB_TOKEN credentials for later git commands (i.e., add the
persist-credentials: false key under the checkout step for actions/checkout).

In @.github/workflows/run-infrastructure-tests.yaml:
- Around line 19-20: Update the GitHub Actions checkout step (the
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd usage) to harden
credentials by adding the persist-credentials: false option; locate the step
named "Check out repository" and add the persist-credentials: false key under
that step so the checkout action does not pass the workflow token to subsequent
steps.

In @.github/workflows/run-release-version-resolution.yaml:
- Around line 25-26: Replace the direct interpolation of `${{
github.event.release.tag_name }}` in the shell run block with an environment
variable (e.g., RELEASE_TAG_NAME) so the value is passed via env and not
expanded into the shell command, and then write the output using a safe
formatted write like printf 'release_version=%s\n' "$RELEASE_TAG_NAME" >>
"$GITHUB_OUTPUT" (use RELEASE_TAG_NAME and GITHUB_OUTPUT symbols to locate where
to change the if check and output write). Ensure the if test references the env
variable (non-empty check of RELEASE_TAG_NAME) and that all uses of `${{
github.event.release.tag_name }}` are replaced by the env var to mitigate shell
injection.

In `@infrastructure/modules/ecr-cache/main.tf`:
- Around line 15-26: The aws_ecr_repository.main resource is missing an
encryption_configuration causing it to use AWS-managed keys; add an
encryption_configuration block with encryption_type = "KMS" and kms_key =
var.kms_key_arn in the aws_ecr_repository.main resource, add a new module
variable (e.g., variable "kms_key_arn" with no default and a short description)
to accept the CMK ARN from the calling layer, and update callers to pass the CMK
ARN; ensure the CMK's IAM policy and grants follow the repo least-privilege
hardening baseline (only allow kms:Encrypt/Decrypt/GenerateDataKey for the ECR
service principal and the minimal principals that need access).

In `@infrastructure/modules/ecr-cache/tests/ecr-cache.tftest.hcl`:
- Around line 38-44: The test run "test_lifecycle_policy_retains_three_images"
only asserts aws_ecr_lifecycle_policy.main.policy.rules[0].selection.countNumber
== 3, which allows semantic drift if selection.countType or selection.tagStatus
change; update the assert block to also decode and check
rules[0].selection.countType (e.g. equals the expected value like
"imageCountMoreThan") and rules[0].selection.tagStatus (e.g. equals the expected
value like "untagged"/"tagged" as appropriate for your policy) so all three
selection fields are validated together.
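
The release-version item above (for run-release-version-resolution.yaml) can be sketched in shell roughly as follows. This is a minimal sketch, not the workflow's actual step: it assumes the tag reaches the script through an `env:` entry named RELEASE_TAG_NAME (the name the finding suggests) and that the runner provides GITHUB_OUTPUT; the function name `write_release_version` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: appends the release version to the file named by
# GITHUB_OUTPUT. The tag is read from the RELEASE_TAG_NAME environment
# variable, so its value is never spliced into the script text itself.
write_release_version() {
  local tag="${RELEASE_TAG_NAME:-}"
  local output_file="${GITHUB_OUTPUT:?GITHUB_OUTPUT must be set}"

  # Non-empty check on the env var; a non-zero return lets the caller
  # decide on a fallback value.
  if [ -z "$tag" ]; then
    return 1
  fi

  # printf with a fixed format string writes the value literally.
  printf 'release_version=%s\n' "$tag" >> "$output_file"
}
```

Because the value is only ever expanded inside double quotes as data, a tag containing shell metacharacters is written to the output file verbatim rather than executed.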

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 357815c8-8082-45fd-9cf0-2d5e7b08cf16

📥 Commits

Reviewing files that changed from the base of the PR and between 1691329 and cc11eb0.

📒 Files selected for processing (63)
  • .github/actions/apply-infrastructure-changes/action.yaml
  • .github/actions/install-backend-dependencies/action.yaml
  • .github/actions/install-frontend-dependencies/action.yaml
  • .github/actions/install-poetry/action.yaml
  • .github/actions/run-trivy-scan/action.yaml
  • .github/actions/setup-backend-environment/action.yaml
  • .github/pr-labeler.yml
  • .github/workflows/ci-cd-production.yaml
  • .github/workflows/ci-cd-staging.yaml
  • .github/workflows/ci.yaml
  • .github/workflows/code-ql.yaml
  • .github/workflows/dependency-review.yaml
  • .github/workflows/label-issues.yaml
  • .github/workflows/pr-labeler.yaml
  • .github/workflows/run-backend-tests.yaml
  • .github/workflows/run-ci-cd.yaml
  • .github/workflows/run-code-checks.yaml
  • .github/workflows/run-code-tests.yaml
  • .github/workflows/run-coverage-upload.yaml
  • .github/workflows/run-deploy.yaml
  • .github/workflows/run-e2e-tests.yaml
  • .github/workflows/run-frontend-tests.yaml
  • .github/workflows/run-fuzz-tests.yaml
  • .github/workflows/run-image-build.yaml
  • .github/workflows/run-image-scan.yaml
  • .github/workflows/run-infrastructure-bootstrap.yaml
  • .github/workflows/run-infrastructure-tests.yaml
  • .github/workflows/run-lighthouse-ci.yaml
  • .github/workflows/run-release-version-resolution.yaml
  • .github/workflows/run-zap-baseline-scan.yaml
  • .github/workflows/setup-backend-environment/action.yaml
  • .github/workflows/update-nest-test-images.yaml
  • .trivyignore.yaml
  • README.md
  • backend/entrypoint.fuzz.sh
  • cspell/custom-dict.txt
  • e2e/components/Footer.spec.ts
  • e2e/components/Header.spec.ts
  • e2e/helpers/devices.ts
  • e2e/pages/About.spec.ts
  • e2e/pages/CalendarButton.spec.ts
  • e2e/pages/ChapterDetails.spec.ts
  • e2e/pages/Chapters.spec.ts
  • e2e/pages/CommitteeDetails.spec.ts
  • e2e/pages/Committees.spec.ts
  • e2e/pages/Community.spec.ts
  • e2e/pages/Contribute.spec.ts
  • e2e/pages/Home.spec.ts
  • e2e/pages/MentorshipPrograms.spec.ts
  • e2e/pages/OrganizationDetails.spec.ts
  • e2e/pages/Organizations.spec.ts
  • e2e/pages/ProjectDetails.spec.ts
  • e2e/pages/UserDetails.spec.ts
  • e2e/playwright.config.ts
  • frontend/next.config.ts
  • infrastructure/live/README.md
  • infrastructure/live/main.tf
  • infrastructure/modules/ecr-cache/.terraform.lock.hcl
  • infrastructure/modules/ecr-cache/README.md
  • infrastructure/modules/ecr-cache/main.tf
  • infrastructure/modules/ecr-cache/outputs.tf
  • infrastructure/modules/ecr-cache/tests/ecr-cache.tftest.hcl
  • infrastructure/modules/ecr-cache/variables.tf
💤 Files with no reviewable changes (3)
  • .github/workflows/update-nest-test-images.yaml
  • .github/workflows/label-issues.yaml
  • .github/workflows/setup-backend-environment/action.yaml

Comment on lines +13 to +24
  tfbackend-path:
    description: Path to write Terraform backend config
    required: true
  tfbackend-content:
    description: Content for Terraform backend config
    required: true
  tfvars-path:
    description: Path to write Terraform variables
    required: true
  tfvars-content:
    description: Content for Terraform variables
    required: true

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Wire tfbackend-path and tfvars-path into Terraform commands.

The action requires these path inputs, but Line 70 hardcodes the backend file and Line 80 does not consume tfvars-path. This can produce plans/applies with unintended backend/state or variable values.

Suggested fix
     - name: Initialize Terraform
       env:
         TF_PLUGIN_CACHE_DIR: ${{ inputs.plugin-cache-dir }}
+        TF_BACKEND_PATH: ${{ inputs.tfbackend-path }}
-      run: terraform init -backend-config=terraform.tfbackend
+      run: terraform init -backend-config="$TF_BACKEND_PATH"
       shell: bash
       working-directory: ${{ inputs.working-directory }}
@@
     - name: Plan Terraform changes
-      run: terraform plan -out=tfplan
+      env:
+        TF_VARS_PATH: ${{ inputs.tfvars-path }}
+      run: terraform plan -var-file="$TF_VARS_PATH" -out=tfplan
       shell: bash
       working-directory: ${{ inputs.working-directory }}

Also applies to: 67-83

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/actions/apply-infrastructure-changes/action.yaml around lines 13 -
24, The inputs tfbackend-path and tfvars-path are declared but not used in the
Terraform commands; update the action steps that run terraform so they reference
the provided inputs instead of hardcoded filenames: use the tfbackend-path value
for backend config when running terraform init (e.g., pass -backend-config="${{
inputs.tfbackend-path }}" or the equivalent reference used in the action) and
pass tfvars-path to terraform plan/apply via -var-file="${{ inputs.tfvars-path
}}"; ensure any prior steps that write files use tfbackend-content and
tfvars-content to create files at those input paths so the referenced paths
exist.

Comment thread .github/actions/run-trivy-scan/action.yaml Outdated
Comment thread .github/workflows/dependency-review.yaml
Comment thread .github/workflows/run-backend-tests.yaml
Comment thread .github/workflows/run-code-checks.yaml
Comment thread .github/workflows/run-image-build.yaml
Comment thread .github/workflows/run-infrastructure-tests.yaml
Comment thread .github/workflows/run-release-version-resolution.yaml Outdated
Comment on lines +15 to +26
resource "aws_ecr_repository" "main" {
  image_tag_mutability = "MUTABLE"
  name                 = var.name

  tags = merge(var.common_tags, {
    Name = "${var.name}-ecr"
  })

  image_scanning_configuration {
    scan_on_push = false
  }
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify ECR repositories across infrastructure modules define explicit encryption configuration.
rg -n --type=tf 'resource "aws_ecr_repository"|encryption_configuration|kms_key' infrastructure/modules

Repository: OWASP/Nest

Length of output: 1923


🏁 Script executed:

#!/bin/bash
# Minimal verification: confirm the current contents of ecr-cache module and search for other ECR repository resources.
set -euo pipefail

echo "=== Target file: infrastructure/modules/ecr-cache/main.tf ==="
if [ -f infrastructure/modules/ecr-cache/main.tf ]; then
  nl -ba infrastructure/modules/ecr-cache/main.tf | sed -n '1,120p'
else
  echo "Missing: infrastructure/modules/ecr-cache/main.tf"
fi

echo
echo "=== Find aws_ecr_repository resources under infrastructure/modules ==="
rg -n --type=tf 'resource\s+"aws_ecr_repository"' infrastructure/modules || true

echo
echo "=== Check for encryption_configuration / kms_key usage around ECR repositories ==="
rg -n --type=tf 'aws_ecr_repository|encryption_configuration|kms_key' infrastructure/modules || true

Repository: OWASP/Nest

Length of output: 1942


Add explicit KMS encryption to the ECR cache repository

infrastructure/modules/ecr-cache/main.tf (aws_ecr_repository.main) has no encryption_configuration, so it falls back to default AWS-managed encryption; wire in encryption_configuration { encryption_type = "KMS"; kms_key = <CMK ARN> } using a parameter/variable from the calling layer, and keep the CMK’s least-privilege posture consistent with the rest of the infra hardening baseline.

🧰 Tools
🪛 Checkov (3.2.530)

[high] 15-26: Ensure ECR image scanning on push is enabled

(CKV_AWS_163)


[low] 15-26: Ensure ECR Image Tags are immutable

(CKV_AWS_51)


[low] 15-26: Ensure that ECR repositories are encrypted using KMS

(CKV_AWS_136)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@infrastructure/modules/ecr-cache/main.tf` around lines 15 - 26, The
aws_ecr_repository.main resource is missing an encryption_configuration causing
it to use AWS-managed keys; add an encryption_configuration block with
encryption_type = "KMS" and kms_key = var.kms_key_arn in the
aws_ecr_repository.main resource, add a new module variable (e.g., variable
"kms_key_arn" with no default and a short description) to accept the CMK ARN
from the calling layer, and update callers to pass the CMK ARN; ensure the CMK's
IAM policy and grants follow the repo least-privilege hardening baseline (only
allow kms:Encrypt/Decrypt/GenerateDataKey for the ECR service principal and the
minimal principals that need access).

Sources: Learnings, Linters/SAST tools

Comment on lines +38 to +44
run "test_lifecycle_policy_retains_three_images" {
  command = plan

  assert {
    condition     = jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countNumber == 3
    error_message = "ECR cache lifecycle policy must retain the last 3 images."
  }

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Strengthen lifecycle-policy assertion to prevent semantic drift.

The test currently checks only countNumber; it won’t fail if countType or tagStatus changes.

✅ Suggested test hardening
   assert {
-    condition     = jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countNumber == 3
+    condition = (
+      jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countNumber == 3
+      && jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countType == "imageCountMoreThan"
+      && jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.tagStatus == "any"
+    )
     error_message = "ECR cache lifecycle policy must retain the last 3 images."
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-run "test_lifecycle_policy_retains_three_images" {
-  command = plan
-  assert {
-    condition     = jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countNumber == 3
-    error_message = "ECR cache lifecycle policy must retain the last 3 images."
-  }
+run "test_lifecycle_policy_retains_three_images" {
+  command = plan
+  assert {
+    condition = (
+      jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countNumber == 3
+      && jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.countType == "imageCountMoreThan"
+      && jsondecode(aws_ecr_lifecycle_policy.main.policy).rules[0].selection.tagStatus == "any"
+    )
+    error_message = "ECR cache lifecycle policy must retain the last 3 images."
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@infrastructure/modules/ecr-cache/tests/ecr-cache.tftest.hcl` around lines 38
- 44, The test run "test_lifecycle_policy_retains_three_images" only asserts
aws_ecr_lifecycle_policy.main.policy.rules[0].selection.countNumber == 3, which
allows semantic drift if selection.countType or selection.tagStatus change;
update the assert block to also decode and check rules[0].selection.countType
(e.g. equals the expected value like "imageCountMoreThan") and
rules[0].selection.tagStatus (e.g. equals the expected value like
"untagged"/"tagged" as appropriate for your policy) so all three selection
fields are validated together.

@arkid15r arkid15r marked this pull request as draft June 6, 2026 17:55

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/run-image-build.yaml (1)

76-87: ⚠️ Potential issue | 🟠 Major

Switch AWS credential setup to GitHub OIDC (remove static access keys)

  • .github/workflows/run-image-build.yaml (Configure AWS credentials): still uses aws-access-key-id / aws-secret-access-key with aws-actions/configure-aws-credentials even though it assumes role-to-assume; replace with OIDC by adding permissions: { id-token: write, contents: read } and removing the static AWS secrets.
  • Same static-keys pattern exists in .github/workflows/run-image-scan.yaml, .github/workflows/run-deploy.yaml, and .github/workflows/run-infrastructure-bootstrap.yaml (bootstrap uses BOOTSTRAP_AWS_ACCESS_KEY_ID / BOOTSTRAP_AWS_SECRET_ACCESS_KEY).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/run-image-build.yaml around lines 76 - 87, Update the
"Configure AWS credentials" step to use GitHub OIDC instead of static keys:
remove the aws-access-key-id and aws-secret-access-key inputs from the
aws-actions/configure-aws-credentials step (keep role-to-assume,
role-session-name, role-external-id, role-duration-seconds,
role-skip-session-tagging), and set permissions: id-token: write and
contents: read at the workflow level so the
action can exchange an OIDC token for the target role; apply the same change
(remove static AWS secrets and add OIDC permissions) to the other workflows
mentioned (run-image-scan.yaml, run-deploy.yaml,
run-infrastructure-bootstrap.yaml — also remove BOOTSTRAP_AWS_ACCESS_KEY_ID /
BOOTSTRAP_AWS_SECRET_ACCESS_KEY in the bootstrap workflow).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/run-release-version-resolution.yaml:
- Around line 24-30: Validate RELEASE_TAG before exporting release_version: in
the run block that currently sets release_version from RELEASE_TAG, add a
validation step to ensure RELEASE_TAG matches Docker tag rules (regex
^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$ and length <=128) and only export echo
"release_version=$RELEASE_TAG" >> "$GITHUB_OUTPUT" if it passes; otherwise fall
back to the existing date+SHA fallback. Update the conditional around
RELEASE_TAG so the pipeline never forwards an invalid tag (used later as
BACKEND_IMAGE/FRONTEND_IMAGE via inputs.release_version) — reference the
RELEASE_TAG variable and the release_version output in the run block to locate
and change the logic.
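
The validation requested above could look like the following shell sketch. The regex and the 128-character limit are taken from the comment; the function names `is_valid_docker_tag` and `resolve_release_version` and the fallback argument are hypothetical, and the workflow's real date+SHA fallback is not reproduced here.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the argument is a valid Docker tag.
# The pattern allows one leading [A-Za-z0-9_] plus up to 127 further
# characters, so the 128-character limit is enforced by the regex itself.
is_valid_docker_tag() {
  local tag="${1:-}"
  local pattern='^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$'
  [[ "$tag" =~ $pattern ]]
}

# Hypothetical helper: prints the tag when it is non-empty and valid,
# otherwise prints the caller-supplied fallback value.
resolve_release_version() {
  local tag="${1:-}"
  local fallback="${2:-}"
  if [[ -n "$tag" ]] && is_valid_docker_tag "$tag"; then
    printf '%s\n' "$tag"
  else
    printf '%s\n' "$fallback"
  fi
}
```

In a workflow step the result would then be appended to "$GITHUB_OUTPUT" as `release_version=...`, so an invalid tag is never forwarded to the image names built from inputs.release_version.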


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 823237cf-60dc-4b21-a5d5-b8508c5eae03

📥 Commits

Reviewing files that changed from the base of the PR and between cc11eb0 and 74bd672.

📒 Files selected for processing (15)
  • .github/actions/install-backend-dependencies/action.yaml
  • .github/actions/run-trivy-scan/action.yaml
  • .github/workflows/ci-cd-production.yaml
  • .github/workflows/ci-cd-staging.yaml
  • .github/workflows/dependency-review.yaml
  • .github/workflows/run-backend-tests.yaml
  • .github/workflows/run-ci-cd.yaml
  • .github/workflows/run-code-checks.yaml
  • .github/workflows/run-coverage-upload.yaml
  • .github/workflows/run-e2e-tests.yaml
  • .github/workflows/run-frontend-tests.yaml
  • .github/workflows/run-image-build.yaml
  • .github/workflows/run-infrastructure-bootstrap.yaml
  • .github/workflows/run-infrastructure-tests.yaml
  • .github/workflows/run-release-version-resolution.yaml

Comment thread .github/workflows/run-release-version-resolution.yaml Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment


3 issues found across 15 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/actions/run-trivy-scan/action.yaml">

<violation number="1" location=".github/actions/run-trivy-scan/action.yaml:23">
P1: Cache path mismatch: `.trivy-cache` is cached but Trivy uses `~/.cache/trivy` by default. Without setting `TRIVY_CACHE_DIR=.trivy-cache`, the cache step is ineffective and Trivy will not benefit from it.</violation>

<violation number="2" location=".github/actions/run-trivy-scan/action.yaml:29">
P1: Trivy composite action can mask failed scan commands because it runs input script with `bash -c` without fail-fast flags.</violation>
</file>

<file name=".github/actions/setup-backend-environment/action.yaml">

<violation number="1" location=".github/actions/setup-backend-environment/action.yaml:98">
P2: Readiness check uses `/a` instead of dedicated `/status/` health check endpoint. The `/a` path works only through a fragile chain of Django redirects (APPEND_SLASH + admin auth redirect). The repo already has `/status/` — a purpose-built, lightweight health check endpoint.</violation>
</file>

<file name=".github/workflows/run-code-tests.yaml">

<violation number="1" location=".github/workflows/run-code-tests.yaml:64">
P2: Dead code: `run-fuzz-graphql-tests` job is permanently disabled via `if: false`</violation>
</file>

<file name=".github/actions/install-poetry/action.yaml">

<violation number="1" location=".github/actions/install-poetry/action.yaml:23">
P3: Unnecessary `chown` to the same user — files are already owned by the runner user. This adds a potential failure point on systems where the runner lacks `CAP_CHOWN`, with no benefit.</violation>
</file>

<file name=".github/workflows/run-code-checks.yaml">

<violation number="1" location=".github/workflows/run-code-checks.yaml:16">
P2: Missing timeout-minutes on three jobs: run-frontend-checks, run-pre-commit-checks, and run-spelling-checks. Every other job in the repository's workflows includes timeout-minutes; without it, a hung step can keep running until GitHub Actions' default 360-minute job limit before being killed.</violation>

<violation number="2" location=".github/workflows/run-code-checks.yaml:143">
P2: pnpm version mismatch in run-spelling-checks job: uses 10.33.3 but cspell/package.json declares pnpm@11.5.1. Using `--frozen-lockfile` with a lockfile generated by a different major pnpm version can cause lockfile parsing errors or silently produce a different dependency tree.</violation>
</file>

<file name=".github/workflows/run-frontend-tests.yaml">

<violation number="1" location=".github/workflows/run-frontend-tests.yaml:27">
P2: Five-minute timeout on `run-a11y-tests` may be too tight. The two prerequisite steps (checkout + install-frontend-dependencies) can consume 2+ minutes on a cold cache, leaving minimal headroom for the actual accessibility test suite. Consider raising to 10 minutes or monitoring CI runtimes post-merge to validate the window.</violation>
</file>

<file name=".github/workflows/run-lighthouse-ci.yaml">

<violation number="1" location=".github/workflows/run-lighthouse-ci.yaml:36">
P2: Job-level timeout of 5 minutes is too tight for the full pipeline (checkout + dependency installation + Lighthouse CI audit of 8 URLs). The dependency install step alone can take 1-2 minutes on a cold cache, leaving only 3-4 minutes for auditing 8 pages — each of which typically takes 20-40 seconds. On slower GitHub Actions runners this can cause intermittent timeout failures.</violation>
</file>

<file name=".github/actions/apply-infrastructure-changes/action.yaml">

<violation number="1" location=".github/actions/apply-infrastructure-changes/action.yaml:46">
P3: Cache step is missing `restore-keys`: without a fallback prefix-based key, every cache miss results in a full plugin download with no partial restore from prior cached runs.</violation>
</file>

<file name="infrastructure/live/main.tf">

<violation number="1" location="infrastructure/live/main.tf:127">
P1: Chicken-and-egg: image build needs cache repos before Terraform creates them. Run-image-build runs before run-deploy in the CI/CD pipeline, but the cache repos (`backend_build_cache`, `frontend_build_cache`) are only provisioned when Terraform is applied during `run-deploy`. Docker's `cache-to: type=registry` will fail because the ECR repo doesn't exist yet.</violation>
</file>

<file name="frontend/next.config.ts">

<violation number="1" location="frontend/next.config.ts:65">
P2: Rewrite source paths changed from `/csrf`/`/graphql` to `/csrf/`/`/graphql/`, but `skipTrailingSlashRedirect: true` prevents Next.js from normalizing trailing slashes in E2E mode. Requests to `/csrf` or `/graphql` without trailing slashes will no longer match the rewrite and won't be proxied to the backend, breaking those E2E flows.</violation>
</file>

<file name="infrastructure/modules/ecr-cache/main.tf">

<violation number="1" location="infrastructure/modules/ecr-cache/main.tf:15">
P3: Missing `force_delete = true` on the ECR cache repository. Since this repo will always contain images during normal operation, `terraform destroy` will fail with a `RepositoryNotEmptyException`. Consider setting `force_delete = true` since cache images are rebuildable and the lifecycle policy already manages image retention.</violation>
</file>

<file name=".github/workflows/run-deploy.yaml">

<violation number="1" location=".github/workflows/run-deploy.yaml:158">
P1: Migrate task does not validate ECS run-task response before extracting task ARN, unlike the index-data task which properly checks for failures and empty tasks. If run-task fails (e.g., insufficient capacity, bad networking), downstream steps will get "null" as the task ARN and fail with confusing errors rather than a clear failure message.</violation>

<violation number="2" location=".github/workflows/run-deploy.yaml:221">
P2: Index-data task is fire-and-forget: the step validates it started but never waits for it to finish or checks its exit code. A failure in this critical data-indexing step (which runs ~30 min async) will go completely undetected — the deployment reports success while the index task silently fails.</violation>
</file>
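
The fail-fast concern raised above for the Trivy composite action can be reproduced with a small shell sketch. The `scan_script` value is a hypothetical stand-in for the action's input script, not the real scan command; the point is only how `bash -c` reports the exit status with and without `-e` and `-o pipefail`.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the action's input script: the first command
# fails, the second succeeds.
scan_script=$'false\necho "scan finished"'

# Without fail-fast flags, bash -c returns the status of the last command,
# so the earlier failure is masked.
lenient_status=0
bash -c "$scan_script" >/dev/null || lenient_status=$?

# With -e and -o pipefail, the shell exits at the first failing command
# and the failure propagates to the caller.
strict_status=0
bash -e -o pipefail -c "$scan_script" >/dev/null || strict_status=$?

echo "lenient=$lenient_status strict=$strict_status"  # prints: lenient=0 strict=1
```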


Comment thread .github/workflows/ci-cd-production.yaml Outdated
Comment thread .github/workflows/run-ci-cd.yaml
Comment thread .github/workflows/run-release-version-resolution.yaml Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment


0 issues found across 7 files (changes from recent commits).

Requires human review: Auto-approval blocked by 17 unresolved issues from previous reviews.



@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/run-image-scan.yaml (1)

129-136: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

gh release upload requires repository context.

The gh release upload command needs either a git repository context (via checkout) or an explicit --repo flag. Without either, this step will fail because gh cannot determine which repository to upload to.

🐛 Proposed fix: Add --repo flag
       - name: Upload SBOM to release
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           RELEASE_TAG: ${{ inputs.release_tag }}
         run: |
-          gh release upload "$RELEASE_TAG" --clobber \
+          gh release upload "$RELEASE_TAG" --repo "${{ github.repository }}" --clobber \
             "backend-sbom-$RELEASE_VERSION.cdx.json" \
             "frontend-sbom-$RELEASE_VERSION.cdx.json"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/run-image-scan.yaml around lines 129 - 136, The "Upload
SBOM to release" step uses the gh release upload command without repository
context; update the step to provide repo context by either ensuring a prior
actions/checkout@v3 runs in the job or adding the --repo flag to the gh release
upload invocation (use the current repository via the GITHUB_REPOSITORY env or a
literal owner/repo), keeping GH_TOKEN and RELEASE_TAG env usage intact so gh can
authenticate and upload the files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: da6c6ed1-ed9c-4d64-8128-78721dd881e4

📥 Commits

Reviewing files that changed from the base of the PR and between 74bd672 and e9ac5d9.

📒 Files selected for processing (7)
  • .github/workflows/ci-cd-production.yaml
  • .github/workflows/ci-cd-staging.yaml
  • .github/workflows/run-ci-cd.yaml
  • .github/workflows/run-deploy.yaml
  • .github/workflows/run-image-build.yaml
  • .github/workflows/run-image-scan.yaml
  • .github/workflows/run-infrastructure-bootstrap.yaml

@sonarqubecloud

sonarqubecloud Bot commented Jun 6, 2026


@cubic-dev-ai cubic-dev-ai Bot left a comment


0 issues found across 3 files (changes from recent commits).

Requires human review: Auto-approval blocked by 16 unresolved issues from previous reviews.


@codecov

codecov Bot commented Jun 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.76%. Comparing base (1691329) to head (6081960).
⚠️ Report is 26 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #4845      +/-   ##
==========================================
- Coverage   98.86%   98.76%   -0.11%     
==========================================
  Files         538      538              
  Lines       17157    16993     -164     
  Branches     2406     2406              
==========================================
- Hits        16963    16783     -180     
- Misses        105      121      +16     
  Partials       89       89              
Flag Coverage Δ
backend 99.45% <ø> (ø)
frontend 96.80% <ø> (-0.47%) ⬇️

Flags with carried forward coverage won't be shown.
see 62 files with indirect coverage changes


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1691329...6081960.


@arkid15r arkid15r marked this pull request as ready for review June 6, 2026 18:47
@arkid15r arkid15r enabled auto-merge June 6, 2026 18:47
coderabbitai[bot]
coderabbitai Bot previously requested changes Jun 6, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/run-coverage-upload.yaml:
- Around line 47-49: The "Import Codecov GPG verification key" step currently
imports the key blindly; change it to download the key to a file, verify its
fingerprint equals 27034E7FDB850E0BBC2C62FF806BB28AED779869 using gpg
--with-fingerprint (or gpg --fingerprint) and exit non‑zero if the fingerprint
does not match, then only import the key into the keyring after successful
verification; update the step that runs curl ... | gpg --import to perform these
download->verify->import actions and fail the job on mismatch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 85a9b5a3-acff-417d-a611-5c75ddbfc543

📥 Commits

Reviewing files that changed from the base of the PR and between e9ac5d9 and 6081960.

📒 Files selected for processing (3)
  • .github/workflows/run-coverage-upload.yaml
  • .github/workflows/run-release-version-resolution.yaml
  • cspell/custom-dict.txt

Comment on lines +47 to +49
- name: Import Codecov GPG verification key
run: curl -sf https://uploader.codecov.io/verification.gpg | gpg --import

Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the official current GPG fingerprint for Codecov uploader verification key (uploader.codecov.io/verification.gpg), as documented by Codecov?

💡 Result:

The official current GPG key fingerprint that Codecov documents for the uploader verification key (served at uploader.codecov.io/verification.gpg) is 2703 4E7F DB85 0E0B BC2C 62FF 806B B28A ED77 9869 [1][2]. Codecov also documents the corresponding Key ID as ED779869 [1][2].

Citations:


Pin and verify the imported Codecov GPG key fingerprint before trusting it.

File: .github/workflows/run-coverage-upload.yaml (lines 47-49) — the workflow currently imports https://uploader.codecov.io/verification.gpg directly from the network (curl ... | gpg --import). Add a fingerprint check against Codecov’s published value (27034E7FDB850E0BBC2C62FF806BB28AED779869) and fail if it doesn’t match.

Suggested hardening
-      - name: Import Codecov GPG verification key
-        run: curl -sf https://uploader.codecov.io/verification.gpg | gpg --import
+      - name: Verify and import Codecov GPG verification key
+        shell: bash
+        run: |
+          set -euo pipefail
+          curl -fsSL https://uploader.codecov.io/verification.gpg -o /tmp/codecov-verification.gpg
+          # Codecov-published fingerprint for the uploader verification key:
+          # 2703 4E7F DB85 0E0B BC2C 62FF 806B B28A ED77 9869
+          # Check the downloaded file first; a mismatch fails the step before anything is imported.
+          test "$(gpg --show-keys --with-colons /tmp/codecov-verification.gpg | awk -F: '/^fpr:/ {print $10; exit}')" = "27034E7FDB850E0BBC2C62FF806BB28AED779869"
+          # Import only after the fingerprint check has passed.
+          gpg --batch --import /tmp/codecov-verification.gpg
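
The fingerprint comparison in the suggestion above can also be split into small helpers, which makes the parsing step checkable without a network call. This is only a sketch: the function names are hypothetical, it assumes bash plus a GnuPG build that supports `--show-keys`, and it relies on the colon-listing convention that `fpr` records carry the fingerprint in field 10. It accepts the expected value in either the spaced form (`2703 4E7F ...`) or the compact form.

```shell
# Read `gpg --with-colons` output on stdin and print the first fingerprint.
# In the colon format, "fpr" records hold the fingerprint in field 10.
first_fingerprint() {
  awk -F: '$1 == "fpr" { print $10; exit }'
}

# Turn a human-readable fingerprint (spaces, any case) into compact uppercase hex.
normalize_fingerprint() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# verify_key_file KEY_FILE EXPECTED_FINGERPRINT
# Returns 0 only if the first key in KEY_FILE has the expected fingerprint.
# Nothing is imported into the keyring here; --show-keys only inspects the file.
verify_key_file() {
  local key_file="$1" expected actual
  expected="$(normalize_fingerprint "$2")"
  actual="$(gpg --show-keys --with-colons "$key_file" | first_fingerprint)"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example use in a step (path is a placeholder):
#   verify_key_file /tmp/codecov-verification.gpg \
#     "2703 4E7F DB85 0E0B BC2C 62FF 806B B28A ED77 9869" \
#     && gpg --batch --import /tmp/codecov-verification.gpg
```

Keeping the import behind `&&` means a mismatching key never reaches the keyring, which is the download, verify, import order the review asks for.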

@arkid15r arkid15r added this pull request to the merge queue Jun 6, 2026
Merged via the queue into main with commit 53e5799 Jun 6, 2026
59 of 60 checks passed
@arkid15r arkid15r deleted the feature/ci-cd-optimization-merge branch June 6, 2026 20:04
@arkid15r arkid15r restored the feature/ci-cd-optimization-merge branch June 6, 2026 21:03
@arkid15r arkid15r deleted the feature/ci-cd-optimization-merge branch June 6, 2026 21:12

Labels

backend, ci, docs, frontend, infrastructure
