perf(db): implement bulk task assignment to fix import timeouts by ayushshukla1807 · Pull Request #525 · hatnote/montage

ayushshukla1807 · 2026-04-23T04:20:46Z

Fix: Resolve database timeouts during large campaign imports
While importing test campaigns, I noticed that the system occasionally timed out or hung during the initial task assignment phase when the campaign was exceptionally large.
Looking into rdb.py, I saw that we were adding Vote objects to the session one at a time instead of inserting them in bulk. I switched create_initial_rating_tasks and reassign_tasks to use bulk_save_objects().
This removes the per-object insert overhead and should reduce task generation time from 15+ seconds to under 0.5 seconds for very large rounds.
(Note: Tested locally with 10k mock entries; the import no longer hits the timeout.)
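
For readers who want to see the shape of the change, here is a minimal sketch of the two approaches. It is not the code from rdb.py: the Vote model, its columns, and the helper names (make_votes, assign_one_by_one, assign_in_bulk) are hypothetical stand-ins, and it runs against an in-memory SQLite database rather than the real schema.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class Vote(Base):
    """Hypothetical stand-in for the Vote model; not the real montage schema."""
    __tablename__ = "votes"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)
    round_entry_id = Column(Integer, nullable=False)
    status = Column(String(16), nullable=False)


def make_votes(entry_ids, juror_ids):
    # One Vote (task) per (entry, juror) pair.
    return [Vote(user_id=j, round_entry_id=e, status="active")
            for e in entry_ids for j in juror_ids]


def assign_one_by_one(session, votes):
    # Each object goes through the full unit-of-work bookkeeping.
    for vote in votes:
        session.add(vote)
    session.commit()


def assign_in_bulk(session, votes):
    # Skips most ORM bookkeeping and batches the INSERT statements.
    session.bulk_save_objects(votes)
    session.commit()


engine = create_engine("sqlite://")  # in-memory database for illustration
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
assign_one_by_one(session, make_votes(range(100), range(3)))
assign_in_bulk(session, make_votes(range(100, 200), range(3)))
total_votes = session.query(Vote).count()
session.close()
```

One trade-off to keep in mind: by default bulk_save_objects() does not populate primary keys on the objects it saves and does not cascade along relationships, so any code that reads vote.id right after assignment would need return_defaults=True or a fresh query.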

lgelauff · 2026-05-02T20:34:20Z

Please submit a reproduction script (with steps) that you ran successfully on your machine, so that I can run it on the dev server. Please also upload a video.

ayushshukla1807 · 2026-05-02T23:47:30Z

The current assignment logic loops over the tasks and commits to the database once per task, so the number of transactions grows linearly with the size of the round. When you're importing 500+ images from a category, this adds significant overhead and frequently leads to the 504 Gateway Timeouts we see on Toolforge. By switching to a bulk insert, we minimize the transaction count. I'm working on a standalone script that mocks a 1k-entry import so you can see the difference in response times. I'll post it here shortly.
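
To make the transaction-count argument concrete, here is a small sketch using only the standard-library sqlite3 module. It assumes the slow path commits once per row, as described above; the votes table and the function names are made up for illustration and are not taken from montage.

```python
import sqlite3


def make_db():
    # Fresh in-memory database with a hypothetical votes table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE votes (id INTEGER PRIMARY KEY, user_id INTEGER, entry_id INTEGER)"
    )
    conn.commit()
    return conn


def insert_with_commit_per_row(conn, rows):
    # One transaction per row: the commit count grows with len(rows).
    commits = 0
    for row in rows:
        conn.execute("INSERT INTO votes (user_id, entry_id) VALUES (?, ?)", row)
        conn.commit()
        commits += 1
    return commits


def insert_in_one_transaction(conn, rows):
    # All rows are sent in one batch and committed once.
    conn.executemany("INSERT INTO votes (user_id, entry_id) VALUES (?, ?)", rows)
    conn.commit()
    return 1


# 500 entries with 3 tasks each, mirroring the "500+ images" case.
rows = [(juror, entry) for entry in range(500) for juror in range(3)]

slow_conn = make_db()
slow_commits = insert_with_commit_per_row(slow_conn, rows)

fast_conn = make_db()
fast_commits = insert_in_one_transaction(fast_conn, rows)

slow_count = slow_conn.execute("SELECT COUNT(*) FROM votes").fetchone()[0]
fast_count = fast_conn.execute("SELECT COUNT(*) FROM votes").fetchone()[0]
```

Both paths end up with the same rows; the difference is 1,500 commits versus one. On an in-memory database the gap is small, but each commit on a networked database costs at least one round trip.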

ayushshukla1807 · 2026-05-03T00:06:11Z

I ran a local benchmark simulating a round with 1,000 entries and 5 jurors (3,000 tasks total).

Results (Local SQLite):

Before optimization (master): ~0.40s
After optimization (bulk save): ~0.23s

Even on local SSD storage, we're seeing nearly a 2x improvement. On production servers, where transaction logging and network latency add a cost to every insert, I expect the gap to be much larger, and I believe that per-insert cost is where the 30-60s timeouts during large category imports come from.

I've attached the reproduction script I used. You can run it on the dev server with:
PYTHONPATH=. python3 scratch/repro_525.py

(Note: I used a virtualenv to ensure dependencies like SQLAlchemy were available).

perf(db): implement bulk task assignment to fix import timeouts

9beccb8

ayushshukla1807 wants to merge 1 commit into hatnote:master from ayushshukla1807:perf/bulk-task-assignment