Skip to content

[DO NOT MERGE] CI validation for #562 (llm-proxy behind hub, oidc images)#566

Draft
fabianvf wants to merge 23 commits into
konveyor:mainfrom
fabianvf:ci-562-images
Draft

[DO NOT MERGE] CI validation for #562 (llm-proxy behind hub, oidc images)#566
fabianvf wants to merge 23 commits into
konveyor:mainfrom
fabianvf:ci-562-images

Conversation

@fabianvf
Copy link
Copy Markdown
Contributor

Draft / do-not-merge. Scratch PR to run the real llm-proxy-test CI against jortel's oidc branch (#562) plus fixes, iterating to green before #562 merges.

Contains, on top of #562:

  1. :bug: fix for the line-575 reconcile crash (skipped Keycloak-detection register lacks .resources).
  2. :bug: fix for the line-753 reconcile crash (host-less UI Ingress has no .spec.rules[0].host).
  3. TEST-ONLY image pins → quay.io/jortel/tackle2-hub:oidc and quay.io/jortel/tackle2-ui:oidc, so CI exercises the real hub-OIDC auth flow instead of the stale keycloak-based :latest images.

(1) and (2) land cleanly via jortel#2jortel:oidc. (3) is throwaway. Base is main so the diff shows all of #562 + these.

Refs: #562, konveyor/tackle2-hub#1042

jortel and others added 17 commits May 18, 2026 17:56
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
Signed-off-by: Jeff Ortel <jortel@redhat.com>
llm-proxy is now reached only through tackle2-hub's /services/llm-proxy
reverse-proxy endpoint. The hub validates the bearer token, enforces the
llm-proxy RBAC scope, and forwards user identity in the X-Hub-User /
X-Hub-User-Attributes headers it injects in ServiceHandler.Forward.

Operator changes:
- Hub Deployment gains LLM_PROXY_URL env (only when kai_llm_proxy_enabled),
  pointing at the cluster-internal Service URL. Hub reads this at startup
  to register the llm-proxy route in its serviceRoutes map.
- UI Deployment drops KAI_LLM_PROXY_URL; the new UI image's server-side
  proxy targets TACKLE_HUB_URL with pathRewrite /llm-proxy -> /services/llm-proxy.
- llm-proxy ConfigMap drops the oauth2_token Keycloak JWKS auth block.
  llm-proxy itself does no token validation; the namespace NetworkPolicy
  is the only barrier against in-namespace callers bypassing the hub.
  Same trust posture as kai-solution-server today.
- Renamed kai_llm_proxy_url -> kai_llm_proxy_internal_url to signal
  "only hub should call this directly".
- test-llm-proxy.sh rewritten for the new auth flow: uses HTTP Basic Auth
  against hub's seeded admin user and hits the LLM through
  /hub/services/llm-proxy/* so the test path goes through the existing
  /hub UI proxy and does not depend on the new tackle2-ui image.

Companion changes required in:
- tackle2-hub: register llm-proxy in serviceRoutes + inject X-Hub-User
  identity headers in ServiceHandler.Forward.
- tackle2-ui: server/src/proxies.js retargets /llm-proxy at TACKLE_HUB_URL
  with pathRewrite to /services/llm-proxy. (Browser use only; not needed
  for the e2e test in this PR.)

Based on PR konveyor#562 (Keycloak dependency removal / hub OIDC provider).
✨ Route llm-proxy through hub /services/llm-proxy
The "Set federated IDP issuer" set_fact guarded the conditionally-
registered rhsso_keycloak_cr / rhbk_keycloak_cr vars with `is defined`.
Those k8s_info tasks only run `when: app_profile == 'mta'`; on other
profiles they are skipped, but a skipped task still registers a result
dict that has no `.resources` key. `is defined` is True for that dict,
so the template then evaluated `.resources` and failed with
"'dict object' has no attribute 'resources'", aborting the reconcile
before any workloads were created.

Keep the `is defined` guard (short-circuits a genuinely undefined var)
and add `default([])` on `.resources` so the skipped-register case
(defined dict, missing attribute) collapses to an empty list.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
"Set app base URL from UI Ingress" (and the matching Hub OIDC issuer
task) read ui_ingress_info.resources[0].spec.rules[0].host, guarded
only by `resources | length > 0`. The default UI Ingress
(ingress-ui.yml.j2) defines a host-less rule (`- http:` with no
`host:`), so on any non-OpenShift cluster (e.g. minikube CI) the rule
exists but has no `host` key, and the template failed with
"'dict object' has no attribute 'host'", aborting the reconcile before
llm-proxy / kai-db were created.

Add a `...rules[0].host is defined` guard to both tasks so a host-less
Ingress leaves app_base_url / hub_oidc_issuer at their defaults instead
of crashing.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a2bbe926-3391-4032-bf5c-ecc7b1343cb2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Do not merge. Points the hub/ui images at jortel's personal quay :oidc
tags (the OIDC-provider hub from tackle2-hub#1042 and the matching UI)
so the draft CI on operator#562 can exercise the real hub-OIDC auth
flow instead of crashing on the stale keycloak-based :latest images.

Source of truth is helm/values.yaml (make bundle regenerates the CSV
from the helm chart); the bundle CSV is updated to match. The real
bundle must pin published konveyor images.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
fabianvf added 5 commits May 28, 2026 14:28
Two changes so the OIDC stack comes up on minikube CI:

- deployment-ui.yml.j2: re-add KAI_LLM_PROXY_URL (gated on
  kai_llm_proxy_enabled) so the current UI image, which still proxies
  /llm-proxy directly, boots. Marked TODO to remove once a UI image that
  routes /llm-proxy through the hub is published. The hub reaches
  llm-proxy via LLM_PROXY_URL, not this.

- llm-proxy-test.yml: the hub's builtin OIDC provider requires an issuer
  derived from app_base_url; minikube's Ingress is host-less, so supply
  app_base_url from the minikube IP via the test CR.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
- main.yml: derive hub_oidc_issuer whenever app_base_url is set (not only
  when an ingress host exists), and re-apply the Hub Deployment after the
  issuer is derived. The hub is created earlier in the play (before the
  UI Route/Ingress), so without the re-apply it always starts with an
  empty OIDC_ISSUER and the builtin provider refuses to boot.

- test-llm-proxy.sh: port-forward to the hub Service and hit
  /services/llm-proxy directly, instead of via the UI's /hub proxy, so
  the llm-proxy test validates the hub auth+route path without depending
  on the UI image.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
kai_database_image_fqin aliased keycloak_database_image_fqin to reuse
the shared postgres image for the llm-proxy DB. 562 deleted that var,
leaving a dangling reference that failed reconcile with
"AnsibleUndefinedVariable: keycloak_database_image_fqin is undefined".
Source the same image from RELATED_IMAGE_TACKLE_POSTGRES directly.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
The llm-proxy test reaches the hub directly and does not exercise the
UI. Until a UI image matching 562's OIDC env contract is published, run
the UI at 0 replicas so the install's deployment-readiness gate isn't
blocked by the crashlooping UI image. Test-only; revert when a working
UI image exists.

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
- workflow debug step: also dump tackle-hub logs on failure
- test-llm-proxy.sh: capture and print the HTTP status code + body on a
  failed chat-completion request, so auth (401/403) vs route/forward
  (5xx) failures are distinguishable

Signed-off-by: Fabian von Feilitzsch <fabian@fabianism.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants