
"Fix" intermittent producer test: Test_Producer_CanSafelyCompleteJobsWhileFetchingNewOnes #1248

Open
brandur wants to merge 1 commit into master from brandur-fix-producer-test

@brandur brandur commented May 11, 2026

"Fix" an intermittently failing producer test observed in CI [1].

--- FAIL: Test_Producer_CanSafelyCompleteJobsWhileFetchingNewOnes (21.87s)
    riverdbtest.go:220: Dropped 0 expired postgres schema(s) in 10.195µs
    producer_test.go:61: Generated postgres schema "river_2026_05_11t08_13_21_schema_01" with migrations [1 2 3 4 5 6] on line "main" in 252.908761ms [2 generated] [0 reused]
    logger.go:256: time=2026-05-11T08:13:27.411Z level=INFO msg="producer: Producer job counts" num_completed_jobs=1859 num_jobs_running=637 num_jobs_stuck=0 queue=default
    logger.go:256: time=2026-05-11T08:13:32.411Z level=INFO msg="producer: Producer job counts" num_completed_jobs=3713 num_jobs_running=611 num_jobs_stuck=0 queue=default
    logger.go:256: time=2026-05-11T08:13:37.411Z level=INFO msg="producer: Producer job counts" num_completed_jobs=4471 num_jobs_running=200 num_jobs_stuck=0 queue=default
    producer_test.go:160: timed out waiting for last job to run
    logger.go:256: time=2026-05-11T08:13:42.411Z level=INFO msg="producer: Producer job counts" num_completed_jobs=9317 num_jobs_running=274 num_jobs_stuck=0 queue=default
    riverdbtest.go:293: Checked in postgres schema "river_2026_05_11t08_13_21_schema_01"; 1 idle schema(s) [6 generated] [23 reused]
FAIL
FAIL	github.com/riverqueue/river	55.241s

I put "fix" in quotes because this isn't a good fix, and it isn't a good
fix because this isn't a good test. It spins up so many jobs that it's
slow under the best of conditions, and on a slow machine there's a real
risk of timeout. Even when it succeeds, it takes a long time and slows
CI down. This change raises the allowed wait from 20 seconds to 45
seconds, but even if that stops the failures, the test case alone may
take up to 45 seconds.

Still, I understand that hot take might be controversial (lol), so I
avoided making any more substantive changes for the time being.

[1] https://github.com/riverqueue/river/actions/runs/25658336810/job/75312202309?pr=1247

@brandur brandur requested a review from bgentry May 11, 2026 08:29