feat(security): SSO/JWT authentication migration (Phase 1) #1569

Draft
jsell-rh wants to merge 41 commits into main from jsell/spec/sso-authentication

Conversation

@jsell-rh jsell-rh commented May 12, 2026

Summary

Phase 1 of SSO authentication migration — independently shippable. Replaces OpenShift OAuth proxy with direct OIDC authentication via Keycloak.

What's implemented

  • Frontend BFF OIDC: Next.js acts as confidential OIDC client. Browser gets httpOnly session cookie, never sees a raw JWT. Authorization Code Flow with PKCE. Transparent token refresh (5-min access tokens, 30-min sessions).
  • Backend JWT validation: JWKS-based validation via lestrrat-go/jwx/v2. Validates signature, expiration, issuer, and audience.
  • K8s impersonation: Backend SA + Impersonate-User/Impersonate-Group headers preserve all existing RBAC enforcement without cluster OIDC federation.
  • Dual-path auth: JWT validation first, K8s TokenReview fallback for API keys (SA tokens). Both paths use impersonation.
  • SSAR cache fix: Cache key includes impersonated identity to prevent cross-user authorization leaks.
  • Local Keycloak: Kind cluster includes Keycloak with pre-configured realm, dev users, and protocol mappers. Version-controlled realm export.
  • Feature-flagged migration: sso-authentication Unleash flag (infrastructure, not user-facing). make kind-sso-toggle switches between SSO and legacy mode.
  • Session expired UX: Global 401 detection via React Query cache, blocking dialog with login redirect, no retry storms.
  • E2E test auth: client_credentials grant from Keycloak with K8s SA fallback.
  • Keycloak proxy: /sso/* catch-all route proxies Keycloak through the frontend origin, combined with KC_HOSTNAME_BACKCHANNEL_DYNAMIC for consistent token issuers.
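
To make the K8s impersonation item above concrete, here is a minimal sketch of the headers the backend would attach when calling the Kubernetes API with its own ServiceAccount credentials. It is an illustration only, not the code in this PR: the `Identity` type and `impersonationHeaders` helper are hypothetical, and a client-go based backend would more likely set an impersonation config on its REST client than build headers by hand.

```go
package main

import (
	"fmt"
	"net/http"
)

// Identity is a hypothetical holder for claims taken from a validated JWT.
type Identity struct {
	Email  string
	Groups []string
}

// impersonationHeaders builds the Kubernetes impersonation headers for a
// request that is authenticated with the backend ServiceAccount's own token.
// It refuses an empty user so a missing claim never results in a request
// that runs with the ServiceAccount's own permissions.
func impersonationHeaders(id Identity) (http.Header, error) {
	if id.Email == "" {
		return nil, fmt.Errorf("refusing to impersonate: empty user")
	}
	h := http.Header{}
	h.Set("Impersonate-User", id.Email)
	for _, g := range id.Groups {
		// One header value per group; Add appends instead of replacing.
		h.Add("Impersonate-Group", g)
	}
	return h, nil
}

func main() {
	h, err := impersonationHeaders(Identity{
		Email:  "developer@local.dev",
		Groups: []string{"example-group"}, // hypothetical group name
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(h.Get("Impersonate-User"), h.Values("Impersonate-Group"))
}
```

Because the API server evaluates RBAC against the impersonated user and groups, existing RoleBindings keep applying without the cluster having to trust the OIDC issuer directly.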

Key files

| Area | Files |
| --- | --- |
| Spec | specs/security/sso-authentication.spec.md (12 requirements, 30 scenarios) |
| Workflow | workflows/security/sso-migration.workflow.md |
| Backend JWT | components/backend/jwtauth/validator.go, validator_test.go |
| Backend SSO | components/backend/handlers/sso.go, middleware.go, server/server.go, server/k8s.go |
| Frontend OIDC | components/frontend/src/lib/oidc.ts, session.ts, auth.ts |
| Frontend routes | src/app/api/auth/sso/{login,callback,logout}/route.ts, src/app/sso/[...path]/route.ts |
| Frontend UX | src/components/session-expired-dialog.tsx, src/lib/query-client.ts |
| Manifests | overlays/kind/keycloak-*.yaml, sso-credentials.yaml, backend-sso-patch.yaml |
| RBAC | base/rbac/backend-clusterrole.yaml (impersonate verb added) |

Default behavior

  • make kind-up deploys with legacy auth (SA token, no Keycloak redirect)
  • make kind-sso-toggle enables Keycloak OIDC for both frontend and backend
  • Dev credentials: developer / developer

Production deployment prerequisites

To deploy SSO in a non-Kind environment, you need:

  1. An OIDC provider — either Red Hat SSO directly, or a Keycloak instance (with optional Identity Brokering to RH SSO in Phase 2+). Any OIDC-compliant provider works.
  2. A confidential client registered in the provider with:
    • Authorization Code grant enabled
    • Valid redirect URI pointing to the frontend callback (https://<frontend>/api/auth/sso/callback)
    • Post-logout redirect URI to frontend root
    • Web origins matching the frontend host
  3. A K8s Secret (sso-credentials) with SSO_ISSUER_URL, SSO_CLIENT_ID, SSO_CLIENT_SECRET, SSO_AUDIENCE, and SESSION_SECRET
  4. Backend env vars: SSO_ISSUER_URL and SSO_AUDIENCE (from the secret)
  5. Frontend env vars: SSO_ENABLED=true, SSO_ISSUER_URL, SSO_CLIENT_ID, SSO_CLIENT_SECRET, SSO_REDIRECT_URI, SESSION_SECRET
  6. Unleash flag: sso-authentication enabled for the target environment
  7. Existing RoleBindings work as-is — they already use email addresses as subjects, which matches the JWT email claim used for impersonation
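
The environment variables in items 4 and 5 can be checked before a rollout with something like the sketch below. The variable names come from the list above; the `missingVars` helper and the idea of a preflight check are assumptions for illustration and are not part of this PR. `SSO_ENABLED` is left out because it is a toggle rather than a required value.

```go
package main

import (
	"fmt"
	"os"
)

// Required variables per component, as listed in the prerequisites.
var backendVars = []string{"SSO_ISSUER_URL", "SSO_AUDIENCE"}
var frontendVars = []string{
	"SSO_ISSUER_URL", "SSO_CLIENT_ID", "SSO_CLIENT_SECRET",
	"SSO_REDIRECT_URI", "SESSION_SECRET",
}

// missingVars returns the names in required for which lookup yields an
// empty string. lookup is injected so the check can run against os.Getenv
// or against a map in tests.
func missingVars(required []string, lookup func(string) string) []string {
	var missing []string
	for _, name := range required {
		if lookup(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	fmt.Println("backend missing:", missingVars(backendVars, os.Getenv))
	fmt.Println("frontend missing:", missingVars(frontendVars, os.Getenv))
}
```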

Identity Brokering (running your own Keycloak that federates login to RH SSO) is not required for Phase 1. It is a Phase 2+ convenience for environments that want client management autonomy without RH SSO realm admin access.

The /sso/* proxy route and KC_HOSTNAME_BACKCHANNEL_DYNAMIC config are only needed in Kind (where the browser and server reach Keycloak via different URLs). In production, the browser and server use the same URL, so standard openid-client discovery works without the proxy.

What's NOT in scope (by design)

  • OAuth proxy removal (deferred to flag removal phase)
  • CLI OIDC implementation (Keycloak client configured, CLI code is separate)
  • Identity Brokering setup (ops task, not code — see spec § Roadmap)

Test plan

  • Frontend login via Keycloak → session cookie → JWT forwarded to backend
  • Backend validates JWT, impersonates user, RBAC enforced
  • API key fallback via TokenReview still works
  • Token refresh works silently (verified via pod logs)
  • Session expired dialog appears on refresh token expiry
  • Logout destroys session + Keycloak single sign-out
  • make kind-sso-toggle switches between SSO and legacy mode
  • Legacy mode (SA token auth) works when SSO is off
  • E2E token extraction via Keycloak client_credentials
  • All integration auth routes (GitHub, GitLab, Jira, etc.) unaffected

🤖 Generated with Claude Code

Define desired state for migrating from OpenShift OAuth proxy to direct
SSO/JWT authentication. Key decisions:

- BFF pattern: Next.js as OIDC confidential client, browser gets session cookie
- K8s impersonation: backend SA + Impersonate-User/Group preserves RBAC
- Dual-path auth: JWT first, TokenReview fallback for API keys
- Feature-flagged migration for incremental rollout
- Supersedes ADR-0002 (raw token passthrough → impersonation)

Includes migration workflow with consumer impact map and implementation notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
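
The dual-path decision above (JWT first, TokenReview fallback for API keys) can be sketched as follows. This is a simplified illustration under stated assumptions: `Authenticator`, `authenticate`, and the two fake validators are hypothetical names, and the real paths would wrap JWKS validation and the Kubernetes TokenReview API instead of the string-prefix checks used here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Identity is the caller identity that both paths resolve to.
type Identity struct {
	User   string
	Source string // "jwt" or "tokenreview"
}

// Authenticator is a hypothetical abstraction over one validation path.
type Authenticator func(token string) (Identity, error)

var errUnauthenticated = errors.New("unauthenticated")

// authenticate tries JWT validation first and only falls back to
// TokenReview when the token is not an acceptable JWT.
func authenticate(token string, validateJWT, tokenReview Authenticator) (Identity, error) {
	if token == "" {
		return Identity{}, errUnauthenticated
	}
	if id, err := validateJWT(token); err == nil {
		id.Source = "jwt"
		return id, nil
	}
	if id, err := tokenReview(token); err == nil {
		id.Source = "tokenreview"
		return id, nil
	}
	return Identity{}, errUnauthenticated
}

// fakeJWT stands in for JWKS-based validation (illustration only).
func fakeJWT(token string) (Identity, error) {
	if strings.HasPrefix(token, "jwt:") {
		return Identity{User: strings.TrimPrefix(token, "jwt:")}, nil
	}
	return Identity{}, errors.New("not a valid JWT")
}

// fakeTokenReview stands in for a TokenReview call (illustration only).
func fakeTokenReview(token string) (Identity, error) {
	if strings.HasPrefix(token, "sa:") {
		return Identity{User: "system:serviceaccount:" + strings.TrimPrefix(token, "sa:")}, nil
	}
	return Identity{}, errors.New("token review rejected")
}

func main() {
	// The namespace and ServiceAccount names below are made up.
	for _, tok := range []string{"jwt:developer@local.dev", "sa:example-ns:example-sa", "garbage"} {
		id, err := authenticate(tok, fakeJWT, fakeTokenReview)
		fmt.Println(id.User, id.Source, err)
	}
}
```

Whichever path succeeds, the resolved identity is what gets impersonated, so both token types end up under the same RBAC enforcement.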

netlify Bot commented May 12, 2026

Deploy Preview for cheerful-kitten-f556a0 canceled.

Name Link
🔨 Latest commit 622d993
🔍 Latest deploy log https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/6a07457b40260000086e5555


coderabbitai Bot commented May 12, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

jsell-rh and others added 27 commits May 12, 2026 15:05
Reference the IAM consolidation proposal (PR #1466) as the long-term
direction. This spec is Phase 1; future phases cover API keys → SSO
service accounts, runner → OIDC token exchange, DB RBAC reconciler,
and credential consolidation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… requirements

- OIDC callback must coexist with existing integration auth routes
- SSO client configuration requirements (one per environment, audience isolation)
- Post-logout redirect URI and web origins specified

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Kind/local-dev environments include Keycloak with pre-configured realm
- Replaces static JWKS ConfigMap, DISABLE_AUTH mock mode, and OC_TOKEN
- Same JWT validation code path as production (no dev-only auth logic)
- Realm config version-controlled as JSON export
- E2E tests use local Keycloak in Kind environments
- Design decision: Keycloak Identity Brokering for deployed environments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Keycloak Deployment, realm JSON, and env var config for Kind overlay
- Maps what it replaces (static JWKS, DISABLE_AUTH, test-user SA)
- Identity Brokering section for deployed environments
- Updated manifest changes to include Kind overlay additions/removals

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d impersonation RBAC

Slice 1 of SSO authentication migration (Phase 1):

- Deploy Keycloak to Kind cluster with pre-configured realm (ambient-code)
  including confidential frontend client, public CLI client, and E2E
  client_credentials client. Dev users: developer/developer, admin/admin.
- Add jwtauth package with JWKS-based JWT validation using lestrrat-go/jwx/v2.
  Validates signature, expiration, issuer, and audience. Extracts OIDC claims
  (sub, email, preferred_username, groups).
- Add impersonate verb on users, groups, and serviceaccounts to backend-api
  ClusterRole for K8s impersonation under SSO auth.
- Fix Kind overlay: relax runAsNonRoot for ambient-api-server, make
  control-plane OIDC env vars optional.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
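
For readers unfamiliar with the claim checks named in this commit (expiration, issuer, audience), the sketch below shows them on an already-parsed claim set. It deliberately omits signature verification against the JWKS, which the jwtauth package delegates to lestrrat-go/jwx/v2. The `Claims` type, the `checkClaims` function, and the example issuer and audience values are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Claims holds the registered claims checked here. The token's signature is
// assumed to have been verified already; this sketch does not do that.
type Claims struct {
	Issuer    string
	Audience  []string
	ExpiresAt int64 // "exp" as seconds since the Unix epoch
}

// checkClaims rejects expired tokens, tokens from an unexpected issuer, and
// tokens whose audience list does not contain wantAudience.
func checkClaims(c Claims, wantIssuer, wantAudience string, nowUnix int64) error {
	if nowUnix >= c.ExpiresAt {
		return errors.New("token expired")
	}
	if c.Issuer != wantIssuer {
		return fmt.Errorf("unexpected issuer %q", c.Issuer)
	}
	for _, aud := range c.Audience {
		if aud == wantAudience {
			return nil
		}
	}
	return fmt.Errorf("audience %q not present", wantAudience)
}

func main() {
	now := time.Now().Unix()
	c := Claims{
		Issuer:    "https://idp.example/realms/ambient-code", // made-up issuer URL
		Audience:  []string{"example-backend"},               // made-up audience
		ExpiresAt: now + 300,                                 // 5-minute access token
	}
	fmt.Println(checkClaims(c, "https://idp.example/realms/ambient-code", "example-backend", now))
}
```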
…mpersonation

Slice 2 of SSO authentication migration (Phase 1):

- Wire JWT validation into backend middleware: forwardedIdentityMiddleware
  validates JWT against Keycloak JWKS, extracts identity from OIDC claims
  (sub, email, preferred_username, groups), and stores validated claims
  in Gin context for reuse by handlers.
- Add dual-path auth in getK8sClientsDefault: JWT validation first, then
  TokenReview fallback for API keys (K8s ServiceAccount tokens).
- Use K8s impersonation (Impersonate-User/Group) instead of raw bearer
  token when SSO is enabled. Backend SA token + impersonation preserves
  all existing RBAC enforcement.
- Fix SSAR cache key to include impersonated identity instead of shared
  SA token, preventing cross-user authorization cache leaks.
- Gate SSO path behind "sso-authentication" Unleash feature flag.
- Add SSO env vars (SSO_ISSUER_URL, SSO_AUDIENCE) to backend Kind overlay.
- Fix Keycloak realm: add audience mapper and protocol mappers for sub,
  email, preferred_username claims in access token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
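
The SSAR cache fix in this commit comes down to what goes into the cache key. Under impersonation every request carries the same backend SA token, so a key derived from the token would be shared by all users. A minimal sketch of an identity-based key, assuming hypothetical `AccessCheck` and `ssarCacheKey` names (the PR's actual key format is not shown in this conversation):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// AccessCheck describes the authorization question being cached.
type AccessCheck struct {
	Namespace, Verb, Resource string
}

// ssarCacheKey derives the key from the impersonated user and groups plus
// the access check, never from the bearer token. Each field is
// length-prefixed so adjacent values cannot run together, and groups are
// sorted so their order does not change the key.
func ssarCacheKey(user string, groups []string, c AccessCheck) string {
	sorted := append([]string(nil), groups...)
	sort.Strings(sorted)
	fields := append([]string{user, c.Namespace, c.Verb, c.Resource}, sorted...)
	h := sha256.New()
	for _, f := range fields {
		fmt.Fprintf(h, "%d:%s", len(f), f)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	chk := AccessCheck{Namespace: "example-project", Verb: "get", Resource: "pods"}
	a := ssarCacheKey("alice@example.com", []string{"devs"}, chk)
	b := ssarCacheKey("bob@example.com", []string{"devs"}, chk)
	fmt.Println("keys differ per user:", a != b)
}
```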
Slice 3 of SSO authentication migration (Phase 1):

- Add openid-client v6, iron-session v8, and jose as dependencies
- Create OIDC client layer (src/lib/oidc.ts): discovery, authorization URL
  construction with PKCE, code exchange, token refresh, end-session URL
- Create encrypted session cookie management (src/lib/session.ts):
  iron-session with httpOnly/secure/sameSite cookies, transparent token
  refresh when access token is within 60s of expiry
- Add SSO API routes:
  - /api/auth/sso/login: generates PKCE, stores verifier/state in cookies,
    redirects to Keycloak authorization endpoint
  - /api/auth/sso/callback: exchanges code for tokens, stores in session
  - /api/auth/sso/logout: destroys session, redirects to Keycloak logout
- Add Next.js middleware: redirects unauthenticated page requests to SSO
  login when SSO_ENABLED=true
- Modify buildForwardHeadersAsync: SSO path extracts JWT from session,
  sets Authorization: Bearer and X-Forwarded-* headers from JWT claims.
  All 97+ consumers are unaffected.
- Update navigation logout to use SSO logout route when enabled
- Update /api/me to accept Authorization header for auth check
- Add SSO env vars to Kind frontend deployment patch
- Support SSO_PUBLIC_ISSUER_URL for Kind dev (browser vs cluster URLs)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keycloak supports any port for localhost redirect URIs per RFC 8252
section 7.3. Registering http://localhost/* (without port) accepts
callbacks on any ephemeral port, eliminating port-forward mismatches.

Also set webOrigins to "+" (all valid redirect origins) for CORS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
openid-client v6 requires a standard URL instance (not NextURL).
Construct callback URL from SSO_REDIRECT_URI base to match the
redirect_uri sent during authorization, since request.url inside
the container resolves to 0.0.0.0:3000 rather than localhost:11646.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In Kind, Keycloak's iss response parameter uses the public URL
(localhost:30090) while openid-client validates against the internal
URL (keycloak-service:8080). Remap the iss param before passing to
authorizationCodeGrant so RFC 9207 issuer validation passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Set KC_HOSTNAME to the internal service URL so Keycloak uses a
consistent issuer in all tokens and OIDC responses, regardless of
whether the browser reaches it via localhost:30090 or the server
reaches it via keycloak-service:8080. This eliminates issuer
mismatches in ID token validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In Kind, the browser reaches Keycloak via localhost:30090 while backend
and frontend servers use keycloak-service:8080. Keycloak sets the token
issuer based on the authorization session URL, causing mismatches.

Fixes:
- Add alt issuer support to JWT validator (AddAltIssuer) so the backend
  accepts tokens from both internal and public Keycloak URLs. Production
  environments use a single URL and don't need alt issuers.
- Use standard openid-client authorizationCodeGrant in production (full
  ID token validation). Fall back to manual token exchange in dev when
  SSO_PUBLIC_ISSUER_URL differs from SSO_ISSUER_URL.
- Set cookies directly on redirect response in login route (cookies()
  API mutations don't transfer to NextResponse.redirect).
- Derive post-login redirect origin from SSO_REDIRECT_URI to avoid
  container-internal 0.0.0.0:3000 address.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ints

The OIDC discovery config was cached as a module-level singleton with no
expiry. If Keycloak restarted and got a new ClusterIP, token refresh
calls would fail silently (ECONNREFUSED) and the session would be
destroyed, logging the user out.

Add a 5-minute TTL so the config is re-discovered periodically. This
matches the Keycloak JWKS cache interval and ensures endpoint URLs
stay current after dependency restarts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
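
The TTL behaviour described in this commit can be sketched as a small cache that re-runs discovery once the stored value is older than five minutes. The sketch is in Go for illustration only (the change itself lives in the frontend's TypeScript OIDC layer), and `Discovery`, `discoveryCache`, and `demo` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Discovery is a stand-in for the OIDC metadata document.
type Discovery struct{ TokenEndpoint string }

// discoveryCache caches one Discovery value and refetches it after ttl.
// The clock and fetch function are injected so the behaviour is testable.
type discoveryCache struct {
	mu        sync.Mutex
	ttl       time.Duration
	now       func() time.Time
	fetch     func() (Discovery, error)
	value     Discovery
	fetchedAt time.Time
	loaded    bool
}

// Get returns the cached value while it is fresh and refetches otherwise.
// A failed fetch returns the error and leaves the old entry untouched.
func (c *discoveryCache) Get() (Discovery, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.loaded && c.now().Sub(c.fetchedAt) < c.ttl {
		return c.value, nil
	}
	v, err := c.fetch()
	if err != nil {
		return Discovery{}, err
	}
	c.value, c.fetchedAt, c.loaded = v, c.now(), true
	return v, nil
}

// demo drives the cache with a fake clock and reports how many fetches ran.
func demo() int {
	clock := time.Unix(0, 0)
	fetches := 0
	c := &discoveryCache{
		ttl: 5 * time.Minute,
		now: func() time.Time { return clock },
		fetch: func() (Discovery, error) {
			fetches++
			return Discovery{TokenEndpoint: "http://keycloak-service:8080/token"}, nil
		},
	}
	c.Get()                            // first call fetches
	clock = clock.Add(4 * time.Minute) // still inside the TTL
	c.Get()                            // served from cache
	clock = clock.Add(2 * time.Minute) // 6 minutes after the fetch
	c.Get()                            // stale, so discovery runs again
	return fetches
}

func main() {
	fmt.Println("fetches:", demo()) // prints "fetches: 2"
}
```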
getUserSubjectFromContext now prefers userEmail (matching the
Impersonate-User header) when creating RoleBindings. Previously it
used userName (preferred_username), causing a mismatch: the
RoleBinding subject would be "developer" but impersonation would
use "developer@local.dev", so RBAC checks would fail.

This ensures lazy RoleBinding creation in CreateProject works
correctly with SSO impersonation — no manual RoleBindings needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Callback route: redirect to /api/auth/sso/login instead of showing
  JSON error when OIDC state cookies are missing or exchange fails.
  Handles stale Keycloak sessions that skip the login page.
- Logout route: derive post-logout redirect URI from SSO_REDIRECT_URI
  to avoid 0.0.0.0:3000 container address.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NEXT_PUBLIC_* env vars are inlined at build time in Next.js client
components, so they're unavailable when the image is built without
them. Instead, expose ssoEnabled from the /api/me server route and
read it in the navigation component via useCurrentUser().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The project layout had its own handleLogout hardcoded to /oauth/sign_out,
separate from the main navigation. Unified both to use the runtime
ssoEnabled flag from useCurrentUser().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Slice 4 of SSO authentication migration (Phase 1):

- Update extract-token.sh to obtain JWT from Keycloak via
  client_credentials grant (ambient-e2e client). Falls back to K8s
  SA token when Keycloak is not available.
- Add audience and sub protocol mappers to ambient-e2e Keycloak client
  so tokens have proper aud claim for backend validation.
- Add ClusterRoleBinding for e2e service account identity
  (service-account-ambient-e2e) so E2E tests can access projects.
- No developer RoleBindings — JIT provisioning via CreateProject
  handles first-time access correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the OIDC session expires and token refresh fails, the user now
sees a blocking dialog instead of silent 401 errors:

- Global 401 detection via QueryCache/MutationCache onError handlers
- Skip retries on 401 to prevent request storms against the IdP
- Non-dismissable AlertDialog with "Log in" button that preserves
  returnTo path so users land back on the same page
- No "expiring soon" warning — server-side refresh handles access
  token renewal transparently; only surfaces when refresh token dies

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add logging to getAccessToken so token refresh attempts and failures
  are visible in pod logs (was silently swallowing errors).
- Fix middleware to return 401 JSON for RSC/fetch requests instead of
  redirecting to Keycloak. Cross-origin redirects fail as XHR and cause
  CORS errors. Full page navigations still redirect to login.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
openid-client's refreshTokenGrant validates the ID token iss claim
in the refresh response, which fails when the token was issued by
localhost:30090 but the refresh goes through keycloak-service:8080.

Use manual fetch to the token endpoint in split-URL mode (same
approach as code exchange). Production uses the library's standard
refreshTokenGrant with full validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The split-URL problem (browser→localhost:30090, server→keycloak-service:8080)
caused token issuer mismatches that broke refresh tokens and ID token
validation. Every workaround added complexity.

Root fix: proxy Keycloak through the frontend at /sso/* so browser and
server both reach Keycloak through the same origin. Combined with
KC_HOSTNAME=http://keycloak-service:8080, all tokens now have a
consistent issuer that matches the discovery endpoint.

Changes:
- Add /sso/[...path] catch-all route that proxies to Keycloak,
  rewriting Location headers on redirects
- Set KC_HOSTNAME to internal service URL for consistent token issuer
- Update SSO_PUBLIC_ISSUER_URL to use the proxy path
- Exclude /sso from auth middleware matcher
- Remove unused next.config.js rewrites (build-time, not runtime)

This eliminates: alt issuers on the backend, manual token exchange
fallbacks, iss parameter remapping in callbacks, and CORS errors on
session expiry redirects. Production deployments use a single URL
and don't need the proxy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
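
The Location rewriting mentioned for the /sso/[...path] route can be sketched as a pure function that maps an upstream redirect target back under the proxy prefix. This is an illustrative Go sketch, not the route's TypeScript code: `rewriteLocation` is a hypothetical helper, and the example path is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteLocation maps a redirect target on the upstream origin back under
// the public proxy prefix. Targets on any other origin are returned
// unchanged.
func rewriteLocation(location, upstreamBase, publicPrefix string) string {
	base := strings.TrimSuffix(upstreamBase, "/")
	if !strings.HasPrefix(location, base) {
		return location
	}
	rest := location[len(base):]
	// Require a path or query boundary so that, for example, port 80801 is
	// not mistaken for port 8080.
	if rest != "" && rest[0] != '/' && rest[0] != '?' {
		return location
	}
	return strings.TrimSuffix(publicPrefix, "/") + rest
}

func main() {
	fmt.Println(rewriteLocation(
		"http://keycloak-service:8080/realms/ambient-code/example",
		"http://keycloak-service:8080",
		"/sso",
	))
}
```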
Eliminates the split-URL issuer mismatch by properly configuring
Keycloak's hostname-backchannel-dynamic feature:

- KC_HOSTNAME=http://localhost:11646/sso — all tokens use the
  public URL as issuer, login pages render with proxy URLs
- KC_HOSTNAME_BACKCHANNEL_DYNAMIC=true — internal services get
  backchannel URLs (token_endpoint, jwks_uri) via keycloak-service:8080

Frontend changes:
- Manual OIDC discovery to bypass openid-client v6's issuer validation
  (known issue: github.com/panva/openid-client/issues/737)
- Remove all split-URL workarounds (manual token exchange, iss
  remapping, URL rewriting in auth/logout/refresh)
- openid-client's standard authorizationCodeGrant and refreshTokenGrant
  now work correctly for all flows

Backend changes:
- JWT validator uses discovered issuer from OIDC metadata (not the
  discovery URL) so it accepts the public issuer in tokens

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Frontend README: replace OC_TOKEN/OAuth proxy header docs with SSO
  env vars and OIDC session model description
- Frontend .env.example: add SSO_* vars, move OC_* to legacy section
- Backend README: replace DISABLE_AUTH migration guide with Keycloak
  dev auth instructions (JWT and SA token examples)
- E2E README: update quick start to use extract-token.sh (Keycloak
  client_credentials with K8s SA fallback)
- Kind dev guide: add Keycloak to bootstrap steps, document dev
  credentials and session lifetimes
- CONTRIBUTING.md: add Keycloak to kind-up description, update
  access instructions with login info
- OPENSHIFT_OAUTH.md: mark as legacy, link to SSO spec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…facing

The sso-authentication Unleash flag controls which auth path the
backend uses. It is not visible in workspace settings and is not
user-configurable — ops enables it per-environment during migration.
Kind dev cluster creates and enables it automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jsell-rh and others added 2 commits May 14, 2026 17:39
Toggles SSO on/off for both frontend (SSO_ENABLED env var) and backend
(sso-authentication Unleash flag) in a single command. Legacy mode
(SA token auth) is the default after kind-up; run kind-sso-toggle to
enable Keycloak OIDC.

Also updates Kind dev guide and backend README to document the toggle
and clarify that legacy mode is the default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jsell-rh jsell-rh changed the title from "spec(security): SSO/JWT authentication migration" to "feat(security): SSO/JWT authentication migration (Phase 1)" on May 14, 2026
@jsell-rh jsell-rh requested a review from markturansky May 14, 2026 21:45
@jsell-rh jsell-rh self-assigned this May 14, 2026
Extract fileTabs.updateTaskStatus to a const so the useEffect
dependency array references the stable callback directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jsell-rh
Contributor Author

CI note: 3 pre-existing flaky test failures

The following 3 tests fail intermittently due to shared mutable state (BaseKubeConfig) between Ginkgo tests in the handler suite:

  • Projects Handler > Project Lifecycle Management > CreateProject > Should handle existing project gracefully
  • Projects Handler > Project Namespace Management > Should create namespace with proper labels
  • Projects Handler > Error Scenarios > Should handle concurrent project creation

These tests pass in isolation but fail when run as part of the full suite due to test ordering. The SSO changes do not affect this path — SSOEnabled() returns false without Unleash (test environment), so getK8sClientsDefault takes the legacy code path unchanged.

jsell-rh and others added 10 commits May 14, 2026 17:58
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two fixes:
- models_test.go captured K8sClientMw/DynamicClient at module load time
  (when nil), then restored them to nil in AfterEach, poisoning
  subsequent tests. Move capture to BeforeEach so values are saved
  after SetupHandlerDependencies runs.
- getUserSubjectFromContext now falls back to userID context value
  (set by SetTestToken in tests) after checking userEmail, userIDOriginal,
  and userName. This ensures tests that only set userID still work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
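
The fallback order described for getUserSubjectFromContext can be sketched as a simple precedence lookup. This assumes the order in which the commit message lists the keys is the order in which they are checked, and it uses a plain map in place of the Gin context. `userSubject` is a hypothetical name, not the function in the PR.

```go
package main

import "fmt"

// userSubject returns the first non-empty value among the context keys, in
// precedence order. Email comes first so the RoleBinding subject matches
// the value sent in the Impersonate-User header.
func userSubject(ctx map[string]string) (string, bool) {
	for _, key := range []string{"userEmail", "userIDOriginal", "userName", "userID"} {
		if v := ctx[key]; v != "" {
			return v, true
		}
	}
	return "", false
}

func main() {
	subject, ok := userSubject(map[string]string{
		"userName":  "developer",
		"userEmail": "developer@local.dev",
	})
	fmt.Println(subject, ok)
}
```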
buildForwardHeadersSSO now falls back to Bearer token from request
when no SSO session cookie exists. This enables:
- SSO users: session cookie → JWT forwarded
- E2E tests / API clients: Bearer token in request → forwarded directly

Also adds Keycloak to wait-for-ready.sh to prevent race conditions
where frontend starts before Keycloak is ready.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The ambient-e2e service account JWT was missing the preferred_username
claim, causing the backend to fall back to the 'sub' claim (a UUID)
for K8s impersonation. The RBAC expects 'service-account-ambient-e2e'.

This adds a protocol mapper to include the service account's username
in JWT tokens, enabling proper identity mapping for E2E tests in SSO mode.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The E2E tests failed because:
1. frontend-test-patch.yaml hardcoded SSO_ENABLED=true, but the backend
   SSO flag was off — causing Keycloak JWTs to be sent to K8s API which
   rejected them
2. extract-token.sh preferred Keycloak tokens, but the backend wasn't
   configured to validate them

Fixes:
- Set SSO_ENABLED=false by default in the Kind overlay. Use
  `make kind-sso-toggle` to enable SSO explicitly.
- extract-token.sh now defaults to K8s SA token (works in both modes).
  Set E2E_USE_SSO=true to use Keycloak client_credentials instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a second E2E pass that enables SSO (frontend env + Unleash flag),
re-extracts a Keycloak JWT via client_credentials, and runs the full
Cypress suite again. This ensures both auth paths are exercised in CI.

The SSO pass reuses the same Kind cluster — just toggles the auth
mode, restarts affected deployments, and re-runs tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cy.visit() page navigations can't carry custom headers, so the SSO
middleware redirects to Keycloak which Cypress can't handle. Fix by
adding /api/auth/sso/e2e-login route that accepts a token and creates
a session cookie (non-production only).

Changes:
- New /api/auth/sso/e2e-login POST route: accepts {token}, creates
  iron-session cookie. Returns 404 in production.
- Cypress beforeEach: calls e2e-login route when SSO_MODE is true
  to create session cookie before page visits.
- cypress.config.ts: passes E2E_USE_SSO env var as SSO_MODE to tests.
- e2e.yml: adds frontend health check before SSO E2E pass.
- Revert middleware Bearer token check (doesn't help for navigations).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two root causes for SSO E2E failures:

1. The e2e-login route checked NODE_ENV !== "production", but the
   Docker image sets NODE_ENV=production. Changed to check
   E2E_TEST_HELPERS env var (opt-in, added to Kind overlay).

2. All SSO public URLs used port 11646 (port-forward for local dev),
   but CI uses NodePort on port 80. Changed KC_HOSTNAME,
   SSO_REDIRECT_URI, and SSO_PUBLIC_ISSUER_URL to use http://localhost
   (no port = port 80).

Also added Keycloak readiness check and backend JWT validator
verification to the CI toggle step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Kind overlay defaults SSO URLs to http://localhost (port 80) for
CI, but local dev uses dynamic port-forward ports (11000+offset).

kind-sso-toggle now patches SSO_REDIRECT_URI, SSO_PUBLIC_ISSUER_URL,
and KC_HOSTNAME with the correct KIND_FWD_FRONTEND_PORT when enabling
SSO. This makes SSO work seamlessly in local dev without manual URL
configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The backend's InitJWTValidator needs Keycloak for OIDC discovery.
When toggling SSO on, KC_HOSTNAME changes cause Keycloak to restart.
Wait for Keycloak to be ready, then restart the backend so OIDC
discovery succeeds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>