
feat: incident detection automation #92

Merged
waltergalvao merged 9 commits into main from feat/incident-detection-implementation
Feb 27, 2026

Conversation

waltergalvao (Contributor) commented Feb 23, 2026

Greptile Summary

This PR implements automated incident detection by analyzing deployments for rollbacks, reverts, and hotfixes. The implementation detects incidents through three strategies: version-based rollback detection, GitHub revert PR pattern matching, and configurable hotfix pattern matching (via PR title, branch name, or labels).
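The three-strategy dispatch described above can be sketched roughly as follows. The Deployment shape, detector signatures, and the placeholder heuristics inside each detector are illustrative assumptions, not the actual service code; only the strategy order (rollback, then revert, then hotfix) is taken from the summary.

```typescript
// Illustrative sketch of the three detection strategies and their dispatch
// order. Shapes and heuristics are assumptions for demonstration only.
interface Deployment {
  id: number;
  version: string;
  prTitles: string[];
  sourceBranches: string[];
}

type Detector = (d: Deployment) => string | null; // returns detection method or null

// Placeholder heuristic: real rollback detection compares deployment versions.
const detectRollback: Detector = (d) =>
  d.version.startsWith("rollback-") ? "rollback" : null;

// GitHub's default revert PR title format (the hardcoded pattern noted below).
const detectRevert: Detector = (d) =>
  d.prTitles.some((t) => /^Revert "/.test(t)) ? "revert" : null;

// Placeholder: real hotfix detection uses configurable patterns.
const detectHotfix: Detector = (d) =>
  d.sourceBranches.some((b) => /^hotfix\//i.test(b)) ? "hotfix" : null;

// First matching strategy wins: rollback, then revert, then hotfix.
function detectIncident(d: Deployment): string | null {
  for (const detect of [detectRollback, detectRevert, detectHotfix]) {
    const method = detect(d);
    if (method) return method;
  }
  return null;
}
```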

Key changes:

  • Extended PullRequest schema with sourceBranch, body, and labels for incident detection
  • Implemented detection service using RE2 library for safe regex matching with 255-char limit
  • Created worker to process incident detection jobs triggered after deployment creation/linking
  • Refactored BullMQ queue initialization to support lazy loading and testing disablement
  • Added comprehensive integration tests with 1400+ lines covering all detection scenarios

Issues found:

  • Missing unique constraint on Incident model creates race condition risk for duplicate incidents (critical)
  • Hardcoded revert pattern only matches GitHub's default format, limiting detection coverage
  • Minor type safety concern with labels array casting

Test environment changes:

  • Changed integration test NODE_ENV from test to production - verify environment-specific logic still behaves correctly
  • Added Redis/Dragonfly container and disabled BullMQ during tests

Confidence Score: 3/5

  • This PR has solid implementation but requires addressing the database constraint issue before merge
  • Score reflects well-tested detection logic with proper security measures (RE2 for regex), but the missing unique constraint on Incident model creates a real risk of duplicate incidents under concurrent deployments. The hardcoded revert pattern and NODE_ENV change are lower priority concerns.
  • Pay close attention to apps/api/prisma/schema.prisma (missing unique constraint) and verify apps/api/package.json NODE_ENV change doesn't affect environment-specific behavior
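For reference, the constraint the review asks for would look roughly like the fragment below in Prisma schema syntax. The field names are assumptions taken from the review comments, not the actual Incident model:

```prisma
model Incident {
  id                Int      @id @default(autoincrement())
  workspaceId       Int
  causeDeploymentId Int
  fixDeploymentId   Int
  detectedAt        DateTime

  // One incident per cause deployment per workspace; a concurrent worker
  // that loses the race gets a P2002 unique-constraint error instead of
  // silently inserting a duplicate incident.
  @@unique([workspaceId, causeDeploymentId])
}
```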

Important Files Changed

  • apps/api/prisma/schema.prisma: Added sourceBranch, body, and labels columns to PullRequest model for incident detection. Missing unique constraint on Incident model creates race condition risk for duplicate incidents.
  • apps/api/src/app/incidents/services/incident-detection.service.ts: New service implementing three incident detection methods (rollback, revert, hotfix). Uses safe regex with the RE2 library. Hardcoded revert pattern and minor type safety concerns with labels.
  • apps/api/src/bull-mq/queues.ts: Refactored queue initialization to be lazy-loaded via initQueues() function. Added new AUTOMATION_INCIDENT_DETECTION queue. Safe implementation with proper initialization guards.
  • apps/api/src/lib/string.ts: New safeRegex() utility using the RE2 library with a 255-char limit and error handling. Prevents ReDoS attacks and handles invalid patterns gracefully.
  • apps/api/src/app/incidents/services/incident-detection.integration.test.ts: Comprehensive integration tests covering all detection methods (rollback, revert, hotfix) with positive and negative test cases. Well-structured test setup.
  • apps/api/package.json: Changed integration test NODE_ENV from test to production and added LOG_LEVEL=warn. Added re2 dependency for safe regex. Verify environment-specific logic still works correctly.
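As a rough sketch, a safeRegex-style helper along the lines described above could look like this. Native RegExp stands in for RE2 here (the re2 package is a native dependency); the case-insensitive flag is an assumption, and one later review comment notes the RE2 version at one point dropped it.

```typescript
// Sketch of a safeRegex-style helper: cap the pattern length and return null
// instead of throwing on invalid syntax. The real utility compiles with the
// RE2 engine, which rules out catastrophic backtracking by construction;
// the built-in RegExp is used here only as a stand-in.
const MAX_PATTERN_LENGTH = 255;

function safeRegex(pattern: string): RegExp | null {
  if (pattern.length > MAX_PATTERN_LENGTH) return null;
  try {
    // Case-insensitive, assuming user-facing hotfix settings should match
    // "Hotfix:" and "hotfix:" alike.
    return new RegExp(pattern, "i");
  } catch {
    return null; // invalid syntax: callers treat null as "no match"
  }
}
```

Callers can then guard on null rather than wrapping every `.test()` call in try/catch.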

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    Start[Deployment Created/Linked] --> Queue[Add to AUTOMATION_INCIDENT_DETECTION Queue]
    Queue --> Worker[Worker: automation-incident-detection]
    Worker --> CheckAutomation{Automation<br/>Enabled?}
    CheckAutomation -->|No| End[End]
    CheckAutomation -->|Yes| CheckCustomer{Active<br/>Customer?}
    CheckCustomer -->|No| End
    CheckCustomer -->|Yes| FetchDeployment[Fetch Deployment<br/>with PRs]
    
    FetchDeployment --> DetectRollback{Detect<br/>Rollback?}
    DetectRollback -->|Yes| ValidateSame{Cause == Fix<br/>Deployment?}
    DetectRollback -->|No| DetectRevert{Detect<br/>Revert?}
    
    DetectRevert -->|Yes| ValidateSame
    DetectRevert -->|No| DetectHotfix{Detect<br/>Hotfix?}
    
    DetectHotfix -->|Yes| ValidateSame
    DetectHotfix -->|No| End
    
    ValidateSame -->|Yes| Error[Throw DataIntegrityException]
    ValidateSame -->|No| CheckDuplicate{Incident<br/>Exists?}
    
    CheckDuplicate -->|Yes| Log[Log: Already Exists]
    CheckDuplicate -->|No| CreateIncident[Create Incident Record]
    
    Log --> End
    CreateIncident --> End
    Error --> End
    
    subgraph Detection Methods
        RollbackLogic[Rollback: Find previous<br/>deployment with same version]
        RevertLogic[Revert: Match PR title<br/>with revert pattern]
        HotfixLogic[Hotfix: Match configurable<br/>regex patterns]
    end

Last reviewed commit: c9d7a08

sweetr-dev Bot added the 'large' label (Large PR - Consider splitting up into smaller PRs to reduce risk and review time) on Feb 23, 2026

coderabbitai Bot commented Feb 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds sourceBranch, body, and labels to PullRequest; wires them through GitHub sync, tests, and seeds; implements incident-detection automation with detectors, incident creation, worker, and queue; enqueues detection jobs from deployment flows; switches BullMQ to lazy init and adds safeRegex utility.

Changes

Cohort / File(s) Summary
Database schema & migration
apps/api/prisma/schema.prisma, apps/api/prisma/migrations/.../migration.sql
Adds PullRequest columns: sourceBranch (text, default ""), body (text, default ""), labels (jsonb, default []).
GitHub PR sync
apps/api/src/app/github/services/github-pull-request.service.ts
Fetches headRefName, body, labels from GitHub and upserts into Prisma fields sourceBranch, body, labels.
Deployment flows
apps/api/src/app/deployment/services/deployment-create-from-merge.service.ts, apps/api/src/app/deployment/services/deployment-pr-linking.service.ts
Enqueues AUTOMATION_INCIDENT_DETECTION jobs (deploymentId, workspaceId) after deployment creation or PR linking.
Incident detection service & types
apps/api/src/app/incidents/services/incident-detection.service.ts, apps/api/src/app/incidents/services/incident-detection.types.ts
New automation entry handleIncidentDetectionAutomation({workspaceId, deploymentId}) with detectors: detectRollback, detectRevert, detectHotfix; creates incidents and deduplicates by cause/fix.
Worker & queue infra
apps/api/src/app/incidents/workers/automation-incident-detection.worker.ts, apps/api/src/bull-mq/queues.ts, apps/api/src/bull-mq/init-bull-mq.ts
Adds AUTOMATION_INCIDENT_DETECTION queue enum, introduces initQueues() lazy initialization, updates addJob helpers, and registers the new worker.
Bull board router
apps/api/src/bull-mq/bull-board.router.ts
Now calls initQueues() when constructing BullMQ adapters instead of using an eagerly-initialized queues constant.
Utilities
apps/api/src/lib/string.ts
Adds `safeRegex(pattern: string): RE2 | null` utility with a 255-char pattern limit and error handling.
Tests & seeds
apps/api/src/app/activity-events/services/work-log.unit.test.ts, apps/api/src/app/incidents/services/incident-detection.integration.test.ts, apps/api/test/seed/index.ts
Updated test builders to include new PR fields; added comprehensive incident-detection integration tests; extended seed helpers (seedPullRequest fields and new seedAutomation).
Config & CI
apps/api/package.json, apps/api/src/env.ts, .github/workflows/test-integration.yml
Added re2 dependency; adjusted integration test env and scripts; added BULLMQ_ENABLED flag and allowed empty values for several env vars.
Misc wiring
apps/api/src/bull-mq/bull-board.router.ts, apps/api/src/bull-mq/init-bull-mq.ts
Init flow now calls initQueues() early and guards startup with BULLMQ_ENABLED.
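The lazy-initialization pattern referenced in the queue changes typically looks something like the sketch below. The Queue shape, queue name, and the disable switch are simplified assumptions; the real code wires these to BullMQ and the BULLMQ_ENABLED env flag.

```typescript
// Simplified sketch of lazy queue initialization: queues are created on first
// use rather than at module load, so test runs (with BullMQ disabled) never
// need a Redis connection. Names and shapes are assumptions.
type Queue = { name: string };

enum SweetQueue {
  AUTOMATION_INCIDENT_DETECTION = "automation-incident-detection",
}

let queues: Map<SweetQueue, Queue> | null = null;

function initQueues(): Map<SweetQueue, Queue> {
  if (queues) return queues; // initialization guard: safe to call repeatedly
  queues = new Map(
    Object.values(SweetQueue).map((name) => [name as SweetQueue, { name }])
  );
  return queues;
}

function addJob(queue: SweetQueue, payload: object, bullMqEnabled = true): void {
  if (!bullMqEnabled) return; // stands in for the BULLMQ_ENABLED=false test path
  const q = initQueues().get(queue); // queues materialize on first job
  void q;
  void payload; // a real implementation would enqueue the payload on q here
}
```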

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested reviewers

  • sweetrdev
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title 'feat: incident detection automation' clearly and concisely summarizes the main change: adding automated incident detection with three detection methods (rollback, revert, hotfix).
  • Description check: ✅ Passed. The PR description comprehensively describes the changeset, detailing the incident detection automation feature with rollback, revert, and hotfix detection methods, database schema changes, BullMQ integration, and testing.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 9 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/api/src/app/incidents/services/incident-detection.service.ts">

<violation number="1" location="apps/api/src/app/incidents/services/incident-detection.service.ts:213">
P2: Invalid regex strings in hotfix settings will throw in `new RegExp(...)` and crash incident detection. Guard regex compilation (e.g., try/catch or pre-validate) so a bad pattern doesn't break the automation run.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts Outdated

@greptile-apps greptile-apps Bot left a comment


19 files reviewed, 10 comments

Edit Code Review Agent Settings | Greptile

Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts Outdated

@greptile-apps greptile-apps Bot left a comment


10 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts Outdated
Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/api/src/app/deployment/services/deployment-pr-linking.service.ts (1)

108-165: ⚠️ Potential issue | 🟡 Minor

detectRollback may be bypassed for API-triggered deployments.

The AUTOMATION_INCIDENT_DETECTION job is only enqueued when filteredPullRequests is non-empty (the empty-list early-return at line 133 exits before line 162). For an API-triggered rollback, GitHub's compareCommits returns an empty commits array when the head is behind the base (status: "behind", which maps to changeType: ROLLBACK), so findMergedPullRequestsByCommitHashes finds no PRs, filteredPullRequests is empty, and the function returns before enqueuing the job. Since detectRollback operates solely on deployment versions (no PR dependency), this silently skips rollback detection for all API-triggered rollback deployments.

Consider unconditionally enqueueing the job at the end of handleDeploymentPullRequestAutoLinking after updateDeploymentChangeType is called (and skipping only for BASELINE), or at least enqueuing it even when no PRs are found so that detectRollback can still run.
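The suggested fix could be sketched as below. The function and enum names are taken from the review comment, the surrounding linking logic is elided, and the in-memory addJob is a stand-in for the BullMQ helper:

```typescript
// Sketch of the reviewer's suggestion: always enqueue incident detection after
// the change type is known, skipping only BASELINE deployments, so rollback
// detection runs even when no PRs were linked. Names are assumptions.
enum DeploymentChangeType {
  BASELINE = "BASELINE",
  ROLLBACK = "ROLLBACK",
  FEATURE = "FEATURE",
}

type JobPayload = { deploymentId: number; workspaceId: number };
const enqueued: Array<{ queue: string } & JobPayload> = [];

function addJob(queue: string, payload: JobPayload): void {
  enqueued.push({ queue, ...payload }); // stand-in for the real BullMQ addJob
}

function finishAutoLinking(
  deploymentId: number,
  workspaceId: number,
  changeType: DeploymentChangeType,
  linkedPullRequestIds: number[]
): void {
  // ...PR lookup and linkPullRequestsToDeployment would run here when
  // linkedPullRequestIds is non-empty, but no longer return early...
  void linkedPullRequestIds;
  if (changeType !== DeploymentChangeType.BASELINE) {
    addJob("AUTOMATION_INCIDENT_DETECTION", { deploymentId, workspaceId });
  }
}
```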

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/app/deployment/services/deployment-pr-linking.service.ts` around
lines 108 - 165, The code currently returns early when filteredPullRequests is
empty which prevents enqueuing the AUTOMATION_INCIDENT_DETECTION job and thus
skips detectRollback for API-triggered rollbacks; update
handleDeploymentPullRequestAutoLinking so that after computing changeType (or
after calling updateDeploymentChangeType) you always call
addJob(SweetQueue.AUTOMATION_INCIDENT_DETECTION, { deploymentId, workspaceId })
unless changeType === DeploymentChangeType.BASELINE (or other explicit skip),
instead of only calling addJob when filteredPullRequests is non-empty—ensure you
keep the existing linking logic (findMergedPullRequestsByCommitHashes,
filterPullRequestsBySubdirectory, linkPullRequestsToDeployment) but remove the
early return that prevents the job enqueue for empty filteredPullRequests so
detectRollback can run for API-triggered rollbacks.
🧹 Nitpick comments (1)
apps/api/src/app/github/services/github-pull-request.service.ts (1)

130-134: Consider using GITHUB_MAX_PAGE_LIMIT instead of the hardcoded 100 for consistency.

The rest of the file uses the GITHUB_MAX_PAGE_LIMIT constant (e.g., files(first: ${GITHUB_MAX_PAGE_LIMIT})). Using the same constant here keeps the limit in one place and avoids a silent mismatch if the constant ever changes.

♻️ Proposed change
-            labels(first: 100) {
+            labels(first: ${GITHUB_MAX_PAGE_LIMIT}) {
               nodes {
                 name
               }
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/app/github/services/github-pull-request.service.ts` around lines
130 - 134, Replace the hardcoded labels(first: 100) limit with the shared
constant GITHUB_MAX_PAGE_LIMIT to match the rest of the file; locate the GraphQL
labels selection in github-pull-request.service.ts (the labels(...) block used
when building the pull request query) and change the literal 100 to use the
GITHUB_MAX_PAGE_LIMIT constant (same interpolation/style used for files(first:
${GITHUB_MAX_PAGE_LIMIT})), ensuring consistency and a single source of truth
for page limits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/src/app/incidents/services/incident-detection.service.ts`:
- Around line 175-191: In detectRevert, originalPr.deploymentEvents is currently
searched with de => de.deploymentId !== deployment.id which can pick a
deployment from the wrong environment; change the search to filter
deploymentEvents to those whose applicationId and environmentId match the
current deployment (i.e., de.applicationId === deployment.applicationId &&
de.environmentId === deployment.environmentId && de.deploymentId !==
deployment.id) before selecting a cause, and ensure the Prisma query that loads
originalPr.deploymentEvents includes applicationId and environmentId (or the
full deployment relation) so those fields are available for this check; update
the variable name deploymentLink accordingly.
- Around line 63-82: Add a DB-level uniqueness constraint on Incident
(@@unique([workspaceId, causeDeploymentId])) in the schema via a migration, then
change the non-atomic check-then-create in handleDeploymentIncidentDetection to
a safe operation: either use Prisma's upsert on
getPrisma(workspaceId).incident.upsert(...) keyed by
workspaceId+causeDeploymentId, or call
getPrisma(workspaceId).incident.create(...) and catch & ignore the
unique-constraint violation for causeDeploymentId; reference the existing
getPrisma(...).incident.findFirst/check and getPrisma(...).incident.create calls
to locate the code to replace and handle the conflict.
- Around line 208-228: The detectHotfix logic currently compiles user-supplied
regexes inside the pullRequests.some() callback, risking ReDoS and repeated
compilation; hoist and validate/compile prTitleRegEx, branchRegEx, and
prLabelRegEx once before the .some loop (e.g., create compiledTitleRegex,
compiledBranchRegex, compiledLabelRegex), wrap each new RegExp(...) in a
try/catch to reject or skip invalid/unsafe patterns, and if possible switch to a
safe engine like re2 or implement a timeout/bounded-execution wrapper for regex
tests; then use the precompiled regex variables inside the some() callback
instead of reconstructing them per PR.


Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts
Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts
Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@apps/api/prisma/migrations/20260223044415_alter_table_pull_requests_add_columns/migration.sql`:
- Around line 1-4: This PR includes a schema migration that should be split into
a dedicated migrations PR: remove the migration file that alters the
"PullRequest" table (the migration adding columns "body", "labels", and
"sourceBranch") from this feature branch/commit, then create a new branch only
containing the generated migration SQL (the ALTER TABLE ... ADD COLUMN
statements) and open a separate PR for that migration; once the migration PR is
merged and applied, rebase this feature branch onto main so the feature code
(which references PullRequest.body, PullRequest.labels,
PullRequest.sourceBranch) depends on the already-merged migration.


@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
apps/api/src/app/incidents/services/incident-detection.service.ts (1)

64-87: ⚠️ Potential issue | 🟡 Minor

Non-atomic dedup check still allows duplicate incident inserts under concurrency.

findFirst then create is race-prone when two workers process the same payload concurrently.

Use a DB unique constraint for the dedup key and switch to conflict-safe write (upsert or create+unique-violation handling).

#!/bin/bash
set -euo pipefail

echo "== Check current incident dedup logic =="
rg -n -C3 'incident\.findFirst|incident\.create|incident\.upsert' apps/api/src/app/incidents/services/incident-detection.service.ts

echo
echo "== Check Incident model uniqueness constraints in Prisma schema files =="
fd 'schema\.prisma$' -x sh -c '
  echo "-- {} --"
  rg -n -C8 "model Incident|@@unique|workspaceId|causeDeploymentId|fixDeploymentId" "{}"
'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/app/incidents/services/incident-detection.service.ts` around
lines 64 - 87, Current findFirst then create flow
(getPrisma(...).incident.findFirst / .create) is racy; add a DB-level unique
constraint on the dedup key (e.g. Incident model @@unique on workspaceId,
causeDeploymentId, fixDeploymentId) and change the service to a conflict-safe
write: either use Prisma's upsert on the incident model (referencing
incident.upsert) or perform create inside a try/catch and swallow/handle
unique-constraint errors (detect Prisma unique constraint error code) so
concurrent workers cannot insert duplicates; keep references to
getPrisma(workspaceId), incident.create, and incident.upsert when making the
change.
🧹 Nitpick comments (1)
apps/api/test/seed/index.ts (1)

330-341: Type settings as Prisma.InputJsonValue at the API boundary instead of casting.

Using settings?: object + as Prisma.InputJsonValue cast defers type validation to runtime and weakens compile-time safety. All callsites pass JSON-compatible objects, so the signature should declare the intended type directly.

♻️ Proposed fix
 export async function seedAutomation(
   ctx: SeedWorkspace,
   input: {
     type: AutomationType;
     enabled?: boolean;
-    settings?: object;
+    settings?: Prisma.InputJsonValue;
   }
 ): Promise<{ automationId: number }> {
   const automation = await getPrisma(ctx.workspaceId).automation.create({
     data: {
       type: input.type,
       enabled: input.enabled ?? true,
-      settings: (input.settings ?? {}) as Prisma.InputJsonValue,
+      settings: input.settings ?? {},
       workspaceId: ctx.workspaceId,
     },
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/test/seed/index.ts` around lines 330 - 341, Change the input
parameter type so settings is declared as Prisma.InputJsonValue instead of a
plain object and remove the runtime cast where the value is passed to
getPrisma(...).automation.create; update the function signature (the input param
in the function that calls getPrisma(...).automation.create) from settings?:
object to settings?: Prisma.InputJsonValue so TypeScript enforces
JSON-compatible types for settings and you can pass settings directly to the
create data (used in the automation.create call) without using "as
Prisma.InputJsonValue".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/package.json`:
- Line 23: The "test:integration" npm script currently sets NODE_ENV=production
causing integration env validation failures; update the "test:integration"
script (the npm script named "test:integration" in package.json) to set
NODE_ENV=test instead of NODE_ENV=production so integration runs use the test
environment and avoid production-only behavior during tests.

In `@apps/api/src/app/incidents/services/incident-detection.service.ts`:
- Around line 124-133: Update the query that finds causeDeployment in
detectRollback so it cannot pick deployments that occurred after the current
deployment: modify the where.deployedAt clause in
getPrisma(...).deployment.findFirst to include an upper bound (e.g., deployedAt:
{ gt: rolledBackTo.deployedAt, lte: deployment.deployedAt } or deployedAt: { gt:
rolledBackTo.deployedAt, lt: deployment.deployedAt }) so the result is between
rolledBackTo.deployedAt and the current deployment's deployedAt; keep the
existing id not: deployment.id and other filters intact.

In `@apps/api/src/lib/string.ts`:
- Around line 3-8: safeRegex currently only catches syntax errors but allows
catastrophic backtracking patterns; update safeRegex to run a complexity check
with safe-regex2 (or equivalent) before constructing the RegExp: import the
checker, call it on the incoming pattern and if it flags unsafe, logger.warn
with the pattern and return null; only if the checker passes, proceed to new
RegExp(pattern, "i") and return it. Keep the function name safeRegex and its
return behavior (RegExp|null) so callers in incident-detection service (where
.test() is used against PR titles/branches/labels) get null for unsafe patterns
and avoid ReDoS; ensure the case-insensitive flag is preserved when compiling.


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17460ad and 0bb7b72.

📒 Files selected for processing (10)
  • apps/api/package.json
  • apps/api/src/app/deployment/services/deployment-pr-linking.service.ts
  • apps/api/src/app/github/services/github-pull-request.service.ts
  • apps/api/src/app/incidents/services/incident-detection.integration.test.ts
  • apps/api/src/app/incidents/services/incident-detection.service.ts
  • apps/api/src/bull-mq/bull-board.router.ts
  • apps/api/src/bull-mq/init-bull-mq.ts
  • apps/api/src/bull-mq/queues.ts
  • apps/api/src/lib/string.ts
  • apps/api/test/seed/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/api/src/app/deployment/services/deployment-pr-linking.service.ts

Comment thread apps/api/package.json Outdated
Comment thread apps/api/src/app/incidents/services/incident-detection.service.ts
Comment thread apps/api/src/lib/string.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/api/src/lib/string.ts">

<violation number="1" location="apps/api/src/lib/string.ts:11">
P2: safeRegex now builds RE2 without the previous case-insensitive flag, so hotfix detection becomes case-sensitive and can miss existing patterns that relied on `i`. Preserve the original `i` behavior by passing the flag when constructing RE2.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread apps/api/src/lib/string.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
apps/api/package.json (1)

23-23: ⚠️ Potential issue | 🔴 Critical

Set integration runs back to test environment.

Line 23 still runs integration tests with NODE_ENV=production, which can trigger prod-only paths and is consistent with the current integration pipeline failure.

🔧 Proposed fix
-    "test:integration": "cross-env LOG_LEVEL=warn NODE_ENV=production DATABASE_URL=postgresql://app_user:app_user@localhost:5433/sweetr_test SUPERUSER_DATABASE_URL=postgresql://postgres:postgres@localhost:5433/sweetr_test vitest run --project integration"
+    "test:integration": "cross-env LOG_LEVEL=warn NODE_ENV=test DATABASE_URL=postgresql://app_user:app_user@localhost:5433/sweetr_test SUPERUSER_DATABASE_URL=postgresql://postgres:postgres@localhost:5433/sweetr_test vitest run --project integration"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/package.json` at line 23, The "test:integration" npm script is
setting NODE_ENV=production which triggers prod-only behavior; update the script
(the "test:integration" entry in package.json) to use NODE_ENV=test (or remove
NODE_ENV so it defaults to test) instead of NODE_ENV=production so integration
tests run in the test environment; keep the other env vars (LOG_LEVEL,
DATABASE_URL, SUPERUSER_DATABASE_URL) and the vitest command unchanged.
🧹 Nitpick comments (2)
apps/api/src/app/incidents/services/incident-detection.service.ts (2)

64-87: Duplicate check may conflict with the schema's unique constraint.

The code checks for duplicates using both causeDeploymentId AND fixDeploymentId, but the schema has @@unique([workspaceId, causeDeploymentId]). If a unique constraint violation occurs (due to stricter DB constraint), this code path won't handle it gracefully—it would throw an unhandled exception.

Consider wrapping the create in a try-catch to handle constraint violations, or align the constraint with the code logic as noted in the schema review.

🛡️ Suggested defensive handling
+  try {
     await getPrisma(workspaceId).incident.create({
       data: {
         causeDeploymentId: result.causeDeploymentId,
         fixDeploymentId: result.fixDeploymentId,
         detectedAt: deployment.deployedAt,
         workspaceId,
       },
     });
+  } catch (error) {
+    if (error.code === 'P2002') {
+      logger.info("handleDeploymentIncidentDetection: Incident already exists (constraint)", {
+        causeDeploymentId: result.causeDeploymentId,
+      });
+      return;
+    }
+    throw error;
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/app/incidents/services/incident-detection.service.ts` around
lines 64 - 87, The current duplicate check uses incident.findFirst by
causeDeploymentId+fixDeploymentId but the DB enforces @@unique([workspaceId,
causeDeploymentId]), so incident.create(...) can still raise a unique-constraint
error; wrap the create call in a try-catch around
getPrisma(workspaceId).incident.create(...) (the block that writes
causeDeploymentId, fixDeploymentId, detectedAt: deployment.deployedAt,
workspaceId), catch PrismaClient known request errors
(Prisma.PrismaClientKnownRequestError) and handle code 'P2002' by logging a
clear message via logger.info (including causeDeploymentId, workspaceId and the
existingIncident context) and returning gracefully; for other errors rethrow or
log as error. Ensure you import/identify Prisma error class or check error.code
=== 'P2002' so the race-condition duplicate is handled safely.

236-239: Type cast for pr.labels could silently fail on malformed data.

The cast (pr.labels as string[]) assumes the JSON field always contains a string array. The cast itself cannot fail at runtime, so malformed data (e.g., a value stored as an object) passes through silently and then throws once .some is called on it during label iteration.

Consider adding a runtime guard for additional safety, though this is low risk if data ingestion is controlled:

🛡️ Optional defensive check
     if (compiledLabelRegex) {
-      const labels = (pr.labels as string[]) ?? [];
+      const labels = Array.isArray(pr.labels) ? (pr.labels as string[]) : [];
       if (labels.some((label) => compiledLabelRegex.test(label))) return true;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/app/incidents/services/incident-detection.service.ts` around
lines 236 - 239, The current cast (pr.labels as string[]) can hide malformed
data; update the check around compiledLabelRegex to first validate pr.labels is
an array of strings (e.g., Array.isArray(pr.labels) and typeof each element ===
'string') before iterating, default to an empty array if validation fails, and
optionally log or metric a warning; apply this change where compiledLabelRegex
and pr.labels are used in incident-detection.service.ts so the labels variable
is built from a safe, validated array.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/test-integration.yml:
- Around line 70-82: The CI is missing required env vars declared in env.ts
(specifically PORT and FRONTEND_URL) causing envsafe failures; update the
test-integration workflow to export values for PORT and FRONTEND_URL (e.g.,
PORT=3000 and FRONTEND_URL=http://localhost:3000) alongside the existing env
entries so envsafe finds defaults, and ensure any other required variables from
env.ts are also present in the env block to satisfy validation.

In `@apps/api/prisma/schema.prisma`:
- Line 772: The database unique constraint on the Incident model currently
enforces @@unique([workspaceId, causeDeploymentId]) but the incident detection
logic checks duplicates using (workspaceId, causeDeploymentId, fixDeploymentId);
update one side to match: either change the Prisma schema unique to
@@unique([workspaceId, causeDeploymentId, fixDeploymentId]) so it allows
multiple fixDeploymentId values for the same cause, or modify the
duplicate-check in the incident detection service (the code that queries by
workspaceId, causeDeploymentId, fixDeploymentId) to only check by workspaceId
and causeDeploymentId; pick the option that matches the intended domain rule and
make the corresponding change to the Incident model or the duplicate-checking
logic.

In `@apps/api/src/app/incidents/services/incident-detection.integration.test.ts`:
- Around line 1-25: The integration tests fail due to missing required
environment variables used by the test setup (triggered during
createTestContextWithGitProfile in incident-detection.integration.test.ts);
update the CI/test setup to provide the envsafe-required variables or modify the
test harness to inject safe defaults/mocks: either add the missing environment
variables to the CI pipeline (repository secrets or workflow env) or update the
test bootstrap (createTestContextWithGitProfile / test/integration-setup env
loader) to load a test-specific .env or provide fallback values so tests can run
in CI without manual env configuration.


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bb7b72 and 36057e4.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (8)
  • .github/workflows/test-integration.yml
  • apps/api/package.json
  • apps/api/prisma/schema.prisma
  • apps/api/src/app/incidents/services/incident-detection.integration.test.ts
  • apps/api/src/app/incidents/services/incident-detection.service.ts
  • apps/api/src/bull-mq/init-bull-mq.ts
  • apps/api/src/env.ts
  • apps/api/src/lib/string.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/api/src/bull-mq/init-bull-mq.ts


greptile-apps Bot commented Feb 27, 2026

Additional Comments (1)

apps/api/prisma/schema.prisma
Missing unique constraint allows duplicate incidents from race conditions. If two incident detection jobs run concurrently for the same deployment, both can pass the findFirst check (line 64-70 in incident-detection.service.ts) and create duplicate incidents.

Add unique constraint:

model Incident {
  id Int @id @default(autoincrement())

  teamId Int?
  team   Team? @relation(fields: [teamId], references: [id], onDelete: SetNull)

  leaderId Int?
  leader   GitProfile? @relation(fields: [leaderId], references: [id], onDelete: SetNull)

  detectedAt DateTime
  resolvedAt DateTime?

  causeDeploymentId Int
  causeDeployment   Deployment @relation("CauseDeployment", fields: [causeDeploymentId], references: [id], onDelete: Cascade)

  fixDeploymentId Int?
  fixDeployment   Deployment? @relation("FixDeployment", fields: [fixDeploymentId], references: [id], onDelete: SetNull)

  postmortemUrl String?

  createdAt DateTime @default(now())
  updatedAt DateTime @default(now()) @updatedAt

  workspaceId Int
  workspace   Workspace @relation(fields: [workspaceId], references: [id], onDelete: Cascade)

  archivedAt DateTime?

  @@unique([workspaceId, causeDeploymentId, fixDeploymentId])
  @@index([causeDeploymentId])
  @@index([fixDeploymentId])
  @@index([teamId])
  @@index([leaderId])
  @@index([workspaceId, detectedAt])
}

coderabbitai[bot]
coderabbitai Bot previously approved these changes Feb 27, 2026
@waltergalvao waltergalvao merged commit e3dabeb into main Feb 27, 2026
10 of 12 checks passed
@waltergalvao waltergalvao deleted the feat/incident-detection-implementation branch February 27, 2026 07:50
): Promise<DetectionResult | null> => {
if (!settings.revert?.enabled) return null;

const revertPattern = /^Revert "(.+)"$/;

Hardcoded revert pattern only matches GitHub's exact `Revert "..."` title format. Won't detect reverts using conventional commit style (revert:), different casing, or manual revert messages. Unlike hotfix detection (which has configurable regex), revert detection lacks configurability.
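A configurable alternative could iterate over a pattern list, sketched here with plain RegExp for brevity (the service uses RE2 in production; the pattern list and function name below are illustrative, not the project's API):

```typescript
// Illustrative sketch only: broaden revert detection beyond GitHub's
// default `Revert "..."` title. Patterns and names are hypothetical.
const revertPatterns: RegExp[] = [
  /^Revert "(.+)"$/, // GitHub UI default
  /^revert:\s*(.+)$/i, // conventional-commit style
  /^\[revert\]\s*(.+)$/i, // bracketed prefix some teams use
];

const extractRevertedTitle = (title: string): string | null => {
  for (const pattern of revertPatterns) {
    const match = title.match(pattern);
    if (match) return match[1];
  }
  return null;
};
```

Making the list workspace-configurable (as hotfix detection already is) would keep the two strategies symmetric.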

if (compiledBranchRegex?.test(pr.sourceBranch)) return true;

if (compiledLabelRegex) {
const labels = (pr.labels as string[]) ?? [];

Type cast to string[] without runtime validation. If pr.labels contains malformed data (non-array), .some() on line 251 will throw. Consider Array.isArray() check:

const labels = Array.isArray(pr.labels) ? pr.labels : [];
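A slightly stronger runtime guard also rejects arrays containing non-string elements; the helper name below is hypothetical, not part of the codebase:

```typescript
// Hypothetical helper: validate an unknown JSON value into a string
// array, defaulting to [] on any malformed shape (object, string,
// or an array with non-string elements).
const toStringArray = (value: unknown): string[] =>
  Array.isArray(value) && value.every((v) => typeof v === "string")
    ? value
    : [];

// Usage at the call site:
// const labels = toStringArray(pr.labels);
```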


greptile-apps Bot commented Feb 27, 2026

Additional Comments (1)

apps/api/prisma/schema.prisma
Missing unique constraint to prevent duplicate incidents. Race condition between duplicate check (line 75-82 in incident-detection.service.ts) and insert (line 91-98) could create duplicate incidents for the same cause/fix deployment pair.

Add:

  @@index([causeDeploymentId])
  @@index([fixDeploymentId])
  @@index([teamId])
  @@index([leaderId])
  @@index([workspaceId, detectedAt])
  @@unique([workspaceId, causeDeploymentId, fixDeploymentId])

@coderabbitai coderabbitai Bot mentioned this pull request Apr 9, 2026

Labels

large Large PR - Consider splitting up into smaller PRs to reduce risk and review time
