Commit d8f2c17

heyvaldemar, Vladimir Mikhalev, and claude authored
ci: add lint + Trivy upstream-image scan jobs (#19)
Closes the supply-chain gap between this deployment-template repo and the aws-kubectl-docker image-publishing reference. After this PR, the deployment-verification workflow matches aws-kubectl's publish.yml shape modulo the cosign/SBOM/SLSA steps that don't apply to a repo that doesn't publish its own image.

New jobs in .github/workflows/deployment-verification.yml:

- `lint` — blocking, runs first:
  - shellcheck (via koalaman/shellcheck-alpine:stable) on every *.sh in the repo root. Catches regressions of the warnings fixed in PR #18.
  - actionlint (via rhysd/actionlint:1.7.12) on every workflow YAML. Catches typos, invalid step references, common GitHub Actions footguns the YAML parser doesn't catch.
  - deploy-and-test now depends on lint via `needs:` so CI fails fast on lint errors before burning the 15-minute compose-up slot.

- `scan-trivy` — matrix job, one per pinned upstream image:
  - postgres:16@sha256:71e27bf6...
  - traefik:3.2@sha256:e561a37f...
  - quay.io/keycloak/keycloak:26.2.5@sha256:4883630e...

  Each scans for CRITICAL/HIGH fixable CVEs via aquasecurity/trivy-action@v0.35.0 (pinned to commit SHA), uploads SARIF to the GitHub Security tab under distinct categories (trivy-postgres / trivy-traefik / trivy-keycloak) for separate triage.

  continue-on-error: true — CVE findings surface for triage (Dependabot digest bump → PR) but don't hard-block CI. A hard block would cause red CI on every new CVE disclosure which isn't actionable inside this PR.

  fail-fast: false on the matrix — one image scan failing doesn't cancel the other two.

Run-block hygiene:

- Two existing `timeout 5m bash -c '...'` blocks (HTTP smoke checks) trip shellcheck SC2016 because `$APP_HOSTNAME` inside single quotes looks like a missed expansion. The deferred-expansion is intentional (inner bash inherits the job-level env:). Added explicit `# shellcheck disable=SC2016` directives with rationale comments immediately before the affected `timeout` invocations so actionlint passes without changing behaviour.

CHANGELOG [Unreleased] → Changed updated with detailed bullet covering the new lint + scan-trivy jobs and the PR-18 fixes.

No change to deploy-and-test behaviour: same env, same network setup, same compose invocation, same HTTPS + Traefik-dashboard smoke checks, same teardown. Just gated on `lint` success.

Co-authored-by: Vladimir Mikhalev <ask@sre.gg>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9241e87 commit d8f2c17

2 files changed

Lines changed: 100 additions & 0 deletions

.github/workflows/deployment-verification.yml

Lines changed: 83 additions & 0 deletions
@@ -21,9 +21,87 @@ permissions:
   contents: read

 jobs:
+  lint:
+    name: Lint shell scripts and workflow YAML
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    permissions:
+      contents: read
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+
+      - name: ShellCheck
+        # Uses the official koalaman/shellcheck-alpine image directly rather
+        # than an intermediate GitHub Action, so there is one less supply-chain
+        # layer to pin and review.
+        run: |
+          docker run --rm -v "$PWD:/mnt" -w /mnt \
+            koalaman/shellcheck-alpine:stable \
+            shellcheck ./*.sh
+
+      - name: actionlint (GitHub Actions workflow linting)
+        # Uses the rhysd/actionlint image directly pinned to a specific
+        # version. Surfaces workflow typos, invalid references to jobs/
+        # outputs, and common GitHub Actions footguns the YAML parser
+        # doesn't catch. actionlint itself is a single Go binary.
+        run: |
+          docker run --rm -v "$PWD:/mnt" -w /mnt \
+            rhysd/actionlint:1.7.12 \
+            -color
+
+  scan-trivy:
+    name: Scan pinned upstream image with Trivy
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    # Trivy findings don't block the pipeline — they surface in the Security
+    # tab where they can be triaged and fixed via Dependabot upstream-digest
+    # bumps. A hard block here would cause CI failures on every new CVE
+    # disclosure, which isn't actionable inside this PR.
+    continue-on-error: true
+    permissions:
+      contents: read
+      security-events: write
+    strategy:
+      # One job per upstream image — findings show up separately in the
+      # GitHub Security tab under distinct categories (trivy-postgres,
+      # trivy-traefik, trivy-keycloak).
+      fail-fast: false
+      matrix:
+        include:
+          - name: postgres
+            image: "postgres:16@sha256:71e27bf60b70bded003791b5573f8b808365613f341df20ffcf0c1ed7bc13ddf"
+          - name: traefik
+            image: "traefik:3.2@sha256:e561a37f8710d9cf41c78bdf421d822b2c0b48267ec0552e644565fb55466ea9"
+          - name: keycloak
+            image: "quay.io/keycloak/keycloak:26.2.5@sha256:4883630ef9db14031cde3e60700c9a9a8eaf1b5c24db1589d6a2d43de38ba2a9"
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+
+      - name: Trivy scan of ${{ matrix.name }}
+        uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1 # v0.35.0
+        with:
+          image-ref: ${{ matrix.image }}
+          format: sarif
+          output: trivy-${{ matrix.name }}.sarif
+          severity: CRITICAL,HIGH
+          ignore-unfixed: true
+
+      - name: Upload Trivy SARIF (${{ matrix.name }}) to GitHub Security
+        uses: github/codeql-action/upload-sarif@95e58e9a2cdfd71adc6e0353d5c52f41a045d225 # v4.35.2
+        with:
+          sarif_file: trivy-${{ matrix.name }}.sarif
+          category: trivy-${{ matrix.name }}
+
   deploy-and-test:
     name: docker compose up + HTTPS + Traefik dashboard smoke
     runs-on: ubuntu-latest
+    # Wait for lint to pass so we don't burn the 15-minute compose-up slot
+    # on a workflow that has shellcheck/actionlint errors. scan-trivy runs
+    # in parallel (not a dependency) since findings don't block deployment.
+    needs: lint
     timeout-minutes: 15
     permissions:
       contents: read
@@ -85,6 +163,9 @@ jobs:
       - name: Wait for the application to be ready via Traefik
         run: |
           echo "Checking the routing and availability of the application via Traefik..."
+          # $APP_HOSTNAME is intentionally expanded by the inner bash -c
+          # (which inherits the job-level env:), not by the outer shell.
+          # shellcheck disable=SC2016
           timeout 5m bash -c 'while ! curl -fsSLk "https://$APP_HOSTNAME"; do
             echo "Waiting for the application to be ready..."
             sleep 10
@@ -93,6 +174,8 @@ jobs:
       - name: Wait for the Traefik dashboard to be ready
         run: |
           echo "Checking the routing and availability of the Traefik dashboard..."
+          # Same deferred-expansion pattern as above.
+          # shellcheck disable=SC2016
           timeout 5m bash -c 'while ! curl -fsSLk --write-out "%{http_code}" --output /dev/null "https://$APP_TRAEFIK_HOSTNAME" | grep -E "200|401"; do
             echo "Waiting for the application to be ready..."
             sleep 10

CHANGELOG.md

Lines changed: 17 additions & 0 deletions
@@ -23,6 +23,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `CHANGELOG.md` — this file, Keep-a-Changelog format.

 ### Changed
+- Deployment-verification workflow gained two new jobs:
+  - **`lint`** — runs `shellcheck` (via `koalaman/shellcheck-alpine:stable`) on
+    every `*.sh` in the repo root, and `actionlint` (via
+    `rhysd/actionlint:1.7.12`) on every workflow YAML. Blocks `deploy-and-test`
+    via `needs: lint` so CI fails fast on typos and footguns before burning
+    the 15-minute compose-up slot.
+  - **`scan-trivy`** — matrix job scanning each pinned upstream image
+    (`postgres:16@sha256:…`, `traefik:3.2@sha256:…`,
+    `quay.io/keycloak/keycloak:26.2.5@sha256:…`) for CRITICAL/HIGH fixable
+    CVEs via `aquasecurity/trivy-action@v0.35.0`, uploading per-image SARIF
+    to the GitHub Security tab under categories `trivy-postgres`,
+    `trivy-traefik`, `trivy-keycloak`. Runs parallel to `deploy-and-test`
+    with `continue-on-error: true` — findings don't block deployment but
+    surface for triage via Dependabot digest bumps.
+- Resolved pre-existing shellcheck warnings (SC2034, SC2086, SC2162) in
+  `keycloak-restore-database.sh`; fixed the README `server.cfg` leftover
+  phrasing. Both shipped in PR #18.
 - README rewritten for evaluator-first audience. Replaces the prior affiliate-and-socials-heavy footer with a focused technical structure: badges, table of contents, "Why this stack?" comparison table vs manual install / Helm / other compose examples, Getting started quickstart, Features + Typical use cases, Supply chain trust, Production checklist (7-item deployment-readiness check), preserved Backups and Restore sections, Security Notes, compact maintainer footer (YouTube · Blog · LinkedIn). Removes: first-person bio, "My Courses", "My Services", Patreon tiers, affiliate kit.co links, crypto wallet addresses, Discord invite, octocat gif, footer SVG.
 - Upstream images pinned by `sha256` digest in addition to tag in `.env.example`
   and the CI ephemeral `.env`. Three images are pinned:
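
Note on the SC2016 suppressions: the commit message explains that `$APP_HOSTNAME` inside the single-quoted `bash -c '...'` string is deliberately left for the inner bash to expand. The following is a minimal standalone sketch of that behaviour, not part of the workflow. The hostnames are hypothetical, and the inline `APP_HOSTNAME=...` prefix merely stands in for the job-level `env:` that the inner shell inherits.

```shell
#!/usr/bin/env bash
# Hypothetical value visible to the OUTER shell only for the sake of contrast.
export APP_HOSTNAME="outer.example.test"

# Single quotes: the outer shell passes the literal text $APP_HOSTNAME through,
# so the INNER bash expands it from its own environment (here: the prefixed value).
# shellcheck disable=SC2016
deferred=$(APP_HOSTNAME="inner.example.test" bash -c 'printf "%s" "https://$APP_HOSTNAME"')

# Double quotes: the OUTER shell expands the variable before bash -c ever runs,
# so the prefixed value is never consulted.
eager=$(APP_HOSTNAME="inner.example.test" bash -c "printf '%s' 'https://$APP_HOSTNAME'")

echo "deferred: $deferred"
echo "eager:    $eager"
```

In the workflow both shells see the same job-level value, so either quoting style would produce the same URL there; the single-quoted form simply keeps the whole loop body as one literal script for `timeout` to run, which is why the directive suppresses the warning rather than changing the quoting.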
