Add Control Plane GitHub flow #5

Open
justin808 wants to merge 5 commits into main from add-cpflow-github-flow

@justin808 justin808 commented Apr 15, 2026

Summary

  • add Control Plane configuration and release script for split Rails and renderer workloads
  • add generated GitHub Actions for review apps, staging deploys, and production promotion
  • update the demo Docker image and renderer bootstrap for containerized production-style runtime

Validation

  • git diff --check
  • YAML parse for .github and .controlplane files
  • cpflow config render for review app
  • bundle install
  • npm ci
  • bin/rails db:prepare
  • bin/rails test
  • docker build -t react-on-rails-demo-16-4-0-rc5-control-plane .
  • split-container smoke for /hello_server and /hello_world with forwarded HTTPS headers

Note

Medium Risk
Adds new CI/CD automation (review apps, staging deploys, production promotion) and changes the production Docker image/runtime (bundled Node + renderer host/workers), which could impact build/deploy behavior if misconfigured.

Overview
Adds Control Plane deployment scaffolding via .controlplane/ templates and config, including a persistent /rails/storage volume, split rails/renderer workloads, and a release phase script that runs bin/rails db:prepare.

Introduces GitHub Actions workflows and composite actions to build/push images, opt-in deploy/delete PR review apps (with fork safety checks and PR comment status updates), auto-deploy staging on configured branches, nightly cleanup of stale review apps, and manual staging-to-production promotion with health checks and rollback.

Updates the Dockerfile to bundle a Node runtime in the production image (runs npm ci, installs YAML libs, adjusts permissions, and binds Rails to 0.0.0.0:3000), and extends client/node-renderer.js to support env-driven host/cache/workers plus a Fastify listen host patch for containerized deployments.

Reviewed by Cursor Bugbot for commit 19afc08. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Control Plane deployment infrastructure supporting staging, review, and production environments
    • Introduced automated GitHub Actions workflows for building, deploying, and promoting applications
    • Enabled pull request review apps with deployment and cleanup commands
  • Documentation

    • Added Control Plane configuration and setup documentation
    • Updated README with deployment workflow guidance
  • Chores

    • Updated Docker image to include Node.js for asset compilation and rendering support
    • Configured Node renderer with environment-driven host, port, and worker settings

coderabbitai Bot commented Apr 15, 2026

Walkthrough

This change introduces comprehensive Control Plane deployment infrastructure for a React on Rails application. It adds configuration files for staging, review, and production environments, GitHub Actions workflows for deploying and promoting applications, composite actions for building and managing Control Plane resources, and updates the Dockerfile and Node renderer to support the new deployment architecture.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Control Plane Configuration | .controlplane/controlplane.yml, .controlplane/readme.md | Main configuration file defining staging, review, and production app targets with environment overrides and reusable aliases; documentation describing deployment setup, runtime secrets, and workflow sequences. |
| Control Plane Scripts | .controlplane/release_script.sh | Bash script executing bin/rails db:prepare during the release phase with strict error handling. |
| Control Plane Workload Templates | .controlplane/templates/app.yml, rails.yml, renderer.yml, storage.yml | GVC template setting Rails environment variables and port configuration; Rails workload with persistent volume and autoscaling disabled; Node renderer workload on port 3800 with same-GVC firewall rules; persistent volume definition for Rails storage. |
| GitHub Actions Setup & Build | .github/actions/cpflow-setup-environment/action.yml, cpflow-build-docker-image/action.yml | Composite action installing Ruby, Control Plane CLI, cpflow gem, and configuring authentication; composite action building and tagging Docker images with optional SSH key support and extra build arguments. |
| GitHub Actions Delete Action | .github/actions/cpflow-delete-control-plane-app/action.yml, delete-app.sh | Composite action for removing Control Plane apps; Bash script validating app names against review prefix and executing deletion via cpflow CLI. |
| Review App Workflows | .github/workflows/cpflow-deploy-review-app.yml, cpflow-delete-review-app.yml, cpflow-review-app-help.yml, cpflow-help-command.yml | Workflows for deploying/deleting review apps from PR comments or manual triggers, with status tracking via GitHub deployments and PR comments; help workflows displaying available commands. |
| Staging & Production Workflows | .github/workflows/cpflow-deploy-staging.yml, cpflow-promote-staging-to-production.yml, cpflow-cleanup-stale-review-apps.yml | Workflow deploying staging on push with optional branch gating; promotion workflow with environment validation, health checks, and automatic rollback; cleanup workflow for stale review apps on schedule or manual trigger. |
| Application Updates | Dockerfile, client/node-renderer.js, README.md | Dockerfile adds Node.js build stage and npm installation; renderer script supports environment-driven host configuration and worker count validation; README documents new Control Plane deployment capabilities. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Control Plane flows through our burrows now,
Staging hops, review apps take a bow,
Production promotion with health-checks so keen,
The finest deployment infra we've seen!
Rollback magic if things go awry,
Our Rails app reaches for the sky! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add Control Plane GitHub flow' is clear, specific, and directly summarizes the primary change: adding Control Plane infrastructure and GitHub Actions workflows for deployment automation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@github-actions

Control Plane review app commands

/deploy-review-app
Create the review app or redeploy the PR branch to it.

/delete-review-app
Delete the review app and its temporary resources.

/help
Show the required GitHub variables, secrets, and workflow behavior.

Comment thread .github/workflows/cpflow-deploy-staging.yml
Comment thread .github/workflows/cpflow-promote-staging-to-production.yml Outdated
Comment thread client/node-renderer.js
return app;
};

Object.assign(patchedFastify, originalFastify, { __cpflowDefaultHostPatched: true });

Fastify patch leaves named exports pointing to unpatched original

Low Severity

Object.assign(patchedFastify, originalFastify, ...) copies Fastify's self-referencing .fastify and .default properties from the original unpatched function. After patching, require('fastify').fastify and require('fastify').default still return originalFastify, so any code (in the renderer or its dependencies) using const { fastify } = require('fastify') silently bypasses the host-binding patch entirely.
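
A minimal sketch of the problem and one possible fix, using a stand-in function rather than the real Fastify module. The names originalFastify and patchedFastify mirror the snippet above; the body of the wrapper and the `patched` marker are hypothetical and only illustrate how the self-references get copied.

```javascript
// Stand-in for require('fastify'): a factory function that, like Fastify's
// CommonJS export, carries self-referencing `.fastify` and `.default` properties.
function originalFastify(options) {
  return { options, patched: false };
}
originalFastify.fastify = originalFastify;
originalFastify.default = originalFastify;

// Wrapper corresponding to the patched factory in the PR (behavior is illustrative).
function patchedFastify(options) {
  const app = originalFastify(options);
  app.patched = true;
  return app;
}

// Copying every enumerable property reproduces the issue: the named exports
// on the wrapper still point at the unpatched original.
Object.assign(patchedFastify, originalFastify, { __cpflowDefaultHostPatched: true });
const bypassesPatch = patchedFastify.fastify === originalFastify;

// One possible fix: re-point the self-references at the wrapper after copying.
patchedFastify.fastify = patchedFastify;
patchedFastify.default = patchedFastify;

const { fastify } = patchedFastify;
console.log(bypassesPatch, fastify({}).patched); // true true
```

With the two re-assignments in place, destructured imports such as const { fastify } = require('fastify') would resolve to the wrapper as well, assuming the module cache entry is replaced with patchedFastify.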

Reviewed by Cursor Bugbot for commit 275e16c.

Comment thread .github/actions/cpflow-setup-environment/action.yml Outdated
@justin808 justin808 marked this pull request as ready for review April 15, 2026 14:28

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 88dc100d1a

  cpflow_version:
    description: cpflow gem version
    required: false
    default: "__CPFLOW_VERSION__"

P1: Set cpflow_version default to a real gem version

This action leaves cpflow_version as the unreplaced template token __CPFLOW_VERSION__, and all new workflows call this action without overriding that input. As a result, setup runs gem install cpflow -v __CPFLOW_VERSION__, which fails before any deploy/cleanup logic can execute, effectively breaking review-app deploy/delete, staging deploy, stale-app cleanup, and production promotion flows unless every caller manually supplies a valid version.
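
A guard along the following lines could fail fast with a clear message before gem install runs. This is a sketch only: the assertResolvedVersion helper and both patterns are hypothetical, and since the action itself is a composite shell action, the same check would more likely be written in Bash there.

```javascript
// Hypothetical helper: rejects unreplaced template tokens such as
// "__CPFLOW_VERSION__" and anything that does not look like a dotted gem version.
function assertResolvedVersion(version) {
  if (/^__[A-Z0-9_]+__$/.test(version)) {
    throw new Error(`cpflow_version is an unreplaced template token: ${version}`);
  }
  if (!/^\d+(\.\d+)*([.-][0-9A-Za-z]+)*$/.test(version)) {
    throw new Error(`cpflow_version does not look like a gem version: ${version}`);
  }
  return version;
}

// "1.2.3" is an arbitrary example value, not a known cpflow release.
console.log(assertResolvedVersion("1.2.3"));

try {
  assertResolvedVersion("__CPFLOW_VERSION__");
} catch (error) {
  console.error(error.message);
}
```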

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (1)
.controlplane/templates/rails.yml (1)

18-23: Parameterize the scale settings before using this template for production.

minScale: 1, maxScale: 1, and capacityAI: false force a single Rails replica everywhere this template is used. If production reuses this workload, a rollout or node loss leaves no redundancy. Consider making these values environment-specific so production can run at least two instances.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.controlplane/templates/rails.yml around lines 18 - 23, The template
currently hardcodes defaultOptions.autoscaling.minScale: 1,
defaultOptions.autoscaling.maxScale: 1 and defaultOptions.capacityAI: false
which forces a single Rails replica; change these fields to be parameterized
(e.g., read from environment/template variables or a values map) and set
production-safe defaults (e.g., minScale >= 2, maxScale > minScale, capacityAI
true for prod) so production deployments get redundancy; update the code that
renders this template to pass environment-specific values for
defaultOptions.autoscaling.minScale, defaultOptions.autoscaling.maxScale and
defaultOptions.capacityAI and ensure any CI/dev defaults remain
backward-compatible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.controlplane/templates/app.yml:
- Around line 27-30: Make RENDERER_PASSWORD mandatory for production by updating
the app.yml template to require the RENDERER_PASSWORD env var instead of leaving
it commented/optional; ensure the production environment block always includes
RENDERER_PASSWORD (remove any fallback to the predictable "devPassword" in your
renderer or Rails config) and validate startup to fail when RENDERER_PASSWORD is
not set so services cannot start with a default password.

In @.github/actions/cpflow-build-docker-image/action.yml:
- Around line 45-50: The docker arg parsing treats multi-word flags as single
tokens, so change handling to require or convert flags-with-values into
key=value form rather than space-separated tokens: update the
docker_build_extra_args read loop and the place that appends the hardcoded
"--ssh default" so flags are emitted as "--ssh=default" (or require users to
provide "--build-arg=FOO=bar") and ensure docker_build_args contains separate
array elements without embedded spaces used by cpflow build-image expansion;
adjust logic that builds docker_build_args and the hardcoded ssh addition to
produce key=value style tokens instead of space-separated pairs.

In @.github/actions/cpflow-setup-environment/action.yml:
- Around line 19-22: The action.yml sets cpflow_version default to the
unresolved placeholder "__CPFLOW_VERSION__", causing the subsequent gem install
command that uses cpflow_version to fail; fix by replacing the placeholder
default for the input cpflow_version with a real version string (e.g., a
specific semver like "0.x.y") or remove the default and mark cpflow_version
required so callers must provide it, and ensure the installer logic that runs
`gem install cpflow -v ...` (which references cpflow_version) will receive a
valid version string.

In @.github/workflows/cpflow-delete-review-app.yml:
- Around line 73-84: Make the initial PR comment step (id create-comment)
best-effort by adding continue-on-error: true to that step so failures don't
stop the job, and update the finalizer step that reads
steps.create-comment.outputs.comment-id to only run when the create-comment step
succeeded (use a conditional that checks success and existence of
steps.create-comment.outputs.comment-id) so the finalizer doesn't assume the
output exists.

In @.github/workflows/cpflow-deploy-review-app.yml:
- Around line 200-213: The create-comment step (id: create-comment, uses:
actions/github-script@v7) must not abort the workflow on transient failures: add
continue-on-error: true to that step so deploy continues even if
issues.createComment fails; then guard any later steps that reference the
created comment (the updateComment steps and the finalizer that use the
comment-id output) by adding an extra condition like &&
steps.create-comment.outcome == 'success' before they access
steps.create-comment.outputs.comment-id to avoid referencing an undefined
output.

In @.github/workflows/cpflow-promote-staging-to-production.yml:
- Around line 153-208: The workflow allows running the release phase
(release-phase.outputs.flag used with cpflow deploy-image) which executes
bin/rails db:prepare before traffic switch, but the "Roll back on failure" step
(reads steps.capture-current.outputs.rollback_state and only restores images)
doesn't revert DB schema, risking incompatibility; fix by either (A) enforcing
migrations are backward-compatible before allowing the release-phase flag (add a
gating check that release-phase did not run destructive migrations or require an
explicit BACKWARD_COMPATIBLE=true output from that step), or (B) implement a DB
rollback/manual-stop in the failure path: extend the "Roll back on failure" step
to detect that db:prepare ran (e.g., a release-phase output flag) and then run a
safe rollback action (invoke a rails db:rollback job/helm/cpln task or scale
down new release to halt traffic and surface a manual intervention requirement)
so schema and app image are restored together; reference cpflow deploy-image,
release-phase.outputs.flag, steps.capture-current.outputs.rollback_state and the
"Roll back on failure" step when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5ac8b2d-9820-4b3b-a1ff-9977f49db3bf

📥 Commits

Reviewing files that changed from the base of the PR and between da414ff and 88dc100.

📒 Files selected for processing (21)
  • .controlplane/controlplane.yml
  • .controlplane/readme.md
  • .controlplane/release_script.sh
  • .controlplane/templates/app.yml
  • .controlplane/templates/rails.yml
  • .controlplane/templates/renderer.yml
  • .controlplane/templates/storage.yml
  • .github/actions/cpflow-build-docker-image/action.yml
  • .github/actions/cpflow-delete-control-plane-app/action.yml
  • .github/actions/cpflow-delete-control-plane-app/delete-app.sh
  • .github/actions/cpflow-setup-environment/action.yml
  • .github/workflows/cpflow-cleanup-stale-review-apps.yml
  • .github/workflows/cpflow-delete-review-app.yml
  • .github/workflows/cpflow-deploy-review-app.yml
  • .github/workflows/cpflow-deploy-staging.yml
  • .github/workflows/cpflow-help-command.yml
  • .github/workflows/cpflow-promote-staging-to-production.yml
  • .github/workflows/cpflow-review-app-help.yml
  • Dockerfile
  • README.md
  • client/node-renderer.js

Comment on lines +27 to +30 of .controlplane/templates/app.yml
    # Optional renderer secret:
    # - name: RENDERER_PASSWORD
    #   value: cpln://secret/react-on-rails-demo-16-4-0-rc5-secrets/RENDERER_PASSWORD
    #

⚠️ Potential issue | 🟠 Major

Make RENDERER_PASSWORD required for production deployments.

Line 27 marks this as optional, but without it the app falls back to a predictable default password (devPassword) shared by Rails and renderer. That weakens auth between internal services.

🔐 Suggested template change
-    # Optional renderer secret:
-    # - name: RENDERER_PASSWORD
-    #   value: cpln://secret/react-on-rails-demo-16-4-0-rc5-secrets/RENDERER_PASSWORD
+    # Required renderer secret:
+    - name: RENDERER_PASSWORD
+      value: cpln://secret/react-on-rails-demo-16-4-0-rc5-secrets/RENDERER_PASSWORD
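
On the renderer side, a startup guard could complement the template change. The sketch below assumes the renderer reads RENDERER_PASSWORD and NODE_ENV from the environment and currently falls back to "devPassword", as described above; the resolveRendererPassword helper is hypothetical and not part of the PR.

```javascript
// Hypothetical startup guard for client/node-renderer.js: allow the
// "devPassword" fallback only outside production.
function resolveRendererPassword(env) {
  const password = env.RENDERER_PASSWORD;
  if (password) {
    return password;
  }
  if (env.NODE_ENV === "production") {
    // Fail at boot so the service never starts with a predictable password.
    throw new Error("RENDERER_PASSWORD must be set in production");
  }
  return "devPassword";
}

// Development keeps working without the secret.
console.log(resolveRendererPassword({ NODE_ENV: "development" })); // devPassword
```

In the renderer itself the call would presumably be resolveRendererPassword(process.env), with a matching check on the Rails side so both services refuse the default together.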

Comment thread .github/actions/cpflow-build-docker-image/action.yml
Comment thread .github/actions/cpflow-setup-environment/action.yml Outdated
Comment on lines +73 to +84 of .github/workflows/cpflow-delete-review-app.yml
      - name: Create initial PR comment
        id: create-comment
        uses: actions/github-script@v7
        with:
          script: |
            const comment = await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: Number(process.env.PR_NUMBER),
              body: "🗑️ Deleting Control Plane review app..."
            });
            core.setOutput("comment-id", comment.data.id);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/cpflow-delete-review-app.yml

Repository: shakacode/react_on_rails-demo-16-4-0-rc5

Length of output: 5218


Prevent comment creation from blocking the deletion.

If the issues.createComment call fails, the job halts before the deletion step (line 86) runs. The finalizer also assumes steps.create-comment.outputs.comment-id exists (line 116), so a comment failure cascades into a failed update. Make the comment best-effort by adding continue-on-error: true, and guard the finalizer to only run when the comment succeeded.

Suggested fix
       - name: Create initial PR comment
         id: create-comment
+        continue-on-error: true
         uses: actions/github-script@v7
         with:
           script: |
             const comment = await github.rest.issues.createComment({
               owner: context.repo.owner,
               repo: context.repo.repo,
               issue_number: Number(process.env.PR_NUMBER),
               body: "🗑️ Deleting Control Plane review app..."
             });
             core.setOutput("comment-id", comment.data.id);

       - name: Delete review app
         uses: ./.github/actions/cpflow-delete-control-plane-app
         with:
           app_name: ${{ env.APP_NAME }}
           cpln_org: ${{ vars.CPLN_ORG_STAGING }}
           review_app_prefix: ${{ vars.REVIEW_APP_PREFIX }}

       - name: Finalize delete status
-        if: always()
+        if: always() && steps.create-comment.outcome == 'success'
         uses: actions/github-script@v7
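
An alternative to continue-on-error is to make the script itself tolerant of failures. The sketch below uses a stubbed github client so it runs outside Actions; createCommentBestEffort and the stub are hypothetical, while github.rest.issues.createComment is the call used in the step above.

```javascript
// Hypothetical helper: returns the new comment id, or null when the API call
// fails, so later steps can skip their update instead of failing the job.
async function createCommentBestEffort(github, params) {
  try {
    const comment = await github.rest.issues.createComment(params);
    return comment.data.id;
  } catch (error) {
    console.warn(`Could not create PR comment: ${error.message}`);
    return null;
  }
}

// Stub standing in for the Octokit client that actions/github-script provides.
const failingGithub = {
  rest: {
    issues: {
      createComment: async () => {
        throw new Error("503 Service Unavailable");
      },
    },
  },
};

createCommentBestEffort(failingGithub, {
  owner: "example-owner",
  repo: "example-repo",
  issue_number: 1,
  body: "Deleting Control Plane review app...",
}).then((id) => console.log(id)); // logs null after the warning
```

Either approach leaves the comment id possibly missing, so the finalizer still needs the guard described above.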

Comment on lines +200 to +213 of .github/workflows/cpflow-deploy-review-app.yml
      - name: Create initial PR comment
        if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
        id: create-comment
        uses: actions/github-script@v7
        with:
          script: |
            const result = await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: Number(process.env.PR_NUMBER),
              body: "🚀 Starting Control Plane review app deployment..."
            });
            core.setOutput("comment-id", result.data.id);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/cpflow-deploy-review-app.yml

Repository: shakacode/react_on_rails-demo-16-4-0-rc5

Length of output: 17877


Don't let PR comment failures abort the deploy.

A transient failure in issues.createComment on line 200 stops the workflow before the build and deploy steps run, since subsequent steps use the implicit success() condition and skip if a previous step fails. Commenting is purely status reporting and should not block deployment. Add continue-on-error: true to the initial comment step and gate the two updateComment calls (lines 252 and 283) with && steps.create-comment.outcome == 'success' to avoid referencing an undefined output. The finalizer at line 320 also needs a guard check before calling updateComment with the commentId.

Suggested fix
       - name: Create initial PR comment
         if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
         id: create-comment
+        continue-on-error: true
         uses: actions/github-script@v7
         with:
           script: |
             const result = await github.rest.issues.createComment({
               owner: context.repo.owner,
               repo: context.repo.repo,
               issue_number: Number(process.env.PR_NUMBER),
               body: "🚀 Starting Control Plane review app deployment..."
             });
             core.setOutput("comment-id", result.data.id);

       - name: Update PR comment with build status
-        if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
+        if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success') && steps.create-comment.outcome == 'success'
         uses: actions/github-script@v7

       - name: Update PR comment with deploy status
-        if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
+        if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success') && steps.create-comment.outcome == 'success'
         uses: actions/github-script@v7

       - name: Finalize deployment status
         if: always() && steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
         uses: actions/github-script@v7
         with:
           script: |
             const commentId = Number("${{ steps.create-comment.outputs.comment-id }}");
             const deploymentId = "${{ steps.init-deployment.outputs.result }}";
             const appUrl = "${{ steps.workload.outputs.workload_url }}";
             const success = "${{ job.status }}" === "success";

             if (deploymentId) {
               await github.rest.repos.createDeploymentStatus({
                 owner: context.repo.owner,
                 repo: context.repo.repo,
                 deployment_id: Number(deploymentId),
                 state: success ? "success" : "failure",
                 environment: `review/${process.env.APP_NAME}`,
                 environment_url: success && appUrl ? appUrl : undefined,
                 log_url: process.env.WORKFLOW_URL,
                 description: success ? "Review app ready" : "Review app deployment failed"
               });
             }

+            if (!commentId) {
+              return;
+            }
+
             await github.rest.issues.updateComment({
               owner: context.repo.owner,
               repo: context.repo.repo,
               comment_id: commentId,
               body: success ? successBody : failureBody
             });
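
The !commentId guard in the suggested fix works because a missing step output renders as an empty string in a GitHub Actions expression, and Number("") is 0. A slightly stricter variant is sketched below; parseCommentId is a hypothetical helper name.

```javascript
// When create-comment fails or is skipped, the expression
// ${{ steps.create-comment.outputs.comment-id }} renders as an empty string.
function parseCommentId(rawOutput) {
  const id = Number(rawOutput);
  // Number("") is 0 and Number("abc") is NaN; neither is a usable comment id.
  return Number.isInteger(id) && id > 0 ? id : null;
}

console.log(parseCommentId(""), parseCommentId("12345")); // null 12345
```

Returning null for anything that is not a positive integer lets the finalizer skip updateComment without relying on truthiness alone.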

Comment on lines +153 to +208 of .github/workflows/cpflow-promote-staging-to-production.yml
      - name: Deploy image to production
        shell: bash
        run: |
          set -euo pipefail
          cpflow deploy-image -a "${{ vars.PRODUCTION_APP_NAME }}" ${{ steps.release-phase.outputs.flag }} --org "${{ vars.CPLN_ORG_PRODUCTION }}" --verbose

      - name: Wait for deployment health
        id: health-check
        shell: bash
        run: |
          set -euo pipefail

          workload_name="${PRIMARY_WORKLOAD:-rails}"

          for attempt in $(seq 1 "${HEALTH_CHECK_RETRIES}"); do
            echo "Health check attempt ${attempt}/${HEALTH_CHECK_RETRIES}"

            endpoint="$(cpln workload get "${workload_name}" --gvc "${{ vars.PRODUCTION_APP_NAME }}" --org "${{ vars.CPLN_ORG_PRODUCTION }}" -o json | jq -r '.status.endpoint // empty')"
            if [[ -n "${endpoint}" ]]; then
              http_status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "${endpoint}" 2>/dev/null || echo 000)"
              echo "Endpoint: ${endpoint}, HTTP status: ${http_status}"

              if [[ "${http_status}" == "200" || "${http_status}" == "301" || "${http_status}" == "302" ]]; then
                echo "healthy=true" >> "$GITHUB_OUTPUT"
                exit 0
              fi
            fi

            if [[ "${attempt}" -lt "${HEALTH_CHECK_RETRIES}" ]]; then
              sleep "${HEALTH_CHECK_INTERVAL}"
            fi
          done

          echo "healthy=false" >> "$GITHUB_OUTPUT"
          exit 1

      - name: Roll back on failure
        if: failure() && steps.capture-current.outputs.rollback_state != ''
        env:
          ROLLBACK_STATE: ${{ steps.capture-current.outputs.rollback_state }}
        shell: bash
        run: |
          set -euo pipefail

          while IFS=$'\t' read -r workload_name previous_images; do
            rollback_args=()

            while IFS=$'\t' read -r index image; do
              rollback_args+=(--set "spec.containers[${index}].image=${image}")
            done < <(echo "${previous_images}" | jq -r 'to_entries[] | "\(.key)\t\(.value)"')

            cpln workload update "${workload_name}" \
              --gvc "${{ vars.PRODUCTION_APP_NAME }}" \
              --org "${{ vars.CPLN_ORG_PRODUCTION }}" \
              "${rollback_args[@]}"
          done < <(echo "${ROLLBACK_STATE}" | jq -r 'to_entries[] | "\(.key)\t\(.value.images | @json)"')

⚠️ Potential issue | 🟠 Major

Image-only rollback is not sufficient after db:prepare.

Line 157 can run the release phase, and this PR’s release script runs bin/rails db:prepare before traffic switches. The rollback path at Lines 189-208 only restores workload images, so a failed promotion can still leave production on a newer schema that the previous image may not support. This needs either a hard requirement for backward-compatible migrations on this path or an explicit DB rollback/manual-stop strategy instead of presenting it as a full rollback.
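
To make the scope of the rollback concrete, the jq pipeline in the "Roll back on failure" step can be restated in JavaScript. The shape of the rollback state (workload name mapped to an images array) is inferred from the jq filters, and the sample values are hypothetical. Nothing in the resulting arguments touches the database.

```javascript
// Mirrors the jq pipeline in the "Roll back on failure" step: one
// `cpln workload update` argument list per workload, touching only image fields.
function buildRollbackArgs(rollbackState) {
  return Object.entries(rollbackState).map(([workloadName, { images }]) => ({
    workloadName,
    args: images.flatMap((image, index) => [
      "--set",
      `spec.containers[${index}].image=${image}`,
    ]),
  }));
}

// Hypothetical captured state; real image references will differ.
const sampleState = {
  rails: { images: ["example-app:41"] },
  renderer: { images: ["example-app:41"] },
};

console.log(JSON.stringify(buildRollbackArgs(sampleState), null, 2));
```

Every generated argument is a spec.containers[N].image assignment, which is why a schema change applied by db:prepare survives the rollback.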

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit a1791d9.

Comment thread Dockerfile

# Install application gems
COPY Gemfile Gemfile.lock ./
COPY package.json package-lock.json ./

Dockerfile layer ordering invalidates gem cache unnecessarily

Low Severity

COPY package.json package-lock.json ./ is placed before the bundle install RUN layer. Any change to package.json or package-lock.json invalidates Docker's build cache for all subsequent layers, including the expensive bundle install step, even though gem dependencies haven't changed. Moving the COPY and npm ci after bundle install would keep the gem layer cached independently.

ruby -e 'require "yaml"; app = ARGV.fetch(0); data = YAML.load_file(".controlplane/controlplane.yml", aliases: true); app_config = data.fetch("apps").fetch(app); workloads = Array(app_config["app_workloads"]); workloads = ["rails"] if workloads.empty?; puts workloads.join(",")' "${{ vars.PRODUCTION_APP_NAME }}"
)"

echo "names=${workloads}" >> "$GITHUB_OUTPUT"

YAML parsing runs before Ruby version is set up

Medium Severity

The Resolve production app workloads step calls ruby -e with YAML.load_file(..., aliases: true) before the Setup production environment step installs Ruby 3.4.6. The aliases: keyword parameter requires Ruby 3.1+, but the system Ruby on GitHub Actions runners may be older (e.g., Ruby 3.0.x on Ubuntu 22.04), causing an ArgumentError. Moving this step after the environment setup — or swapping the two steps — would ensure the correct Ruby version is available.

