Skip to content

events: bound webhook delivery and tenant tracker locks#1048

Open
HansDaigle wants to merge 1 commit into
karrioapi:mainfrom
HansDaigle:hans/tame-tracker-worker-backlog
Open

events: bound webhook delivery and tenant tracker locks#1048
HansDaigle wants to merge 1 commit into
karrioapi:mainfrom
HansDaigle:hans/tame-tracker-worker-backlog

Conversation

@HansDaigle
Copy link
Copy Markdown
Contributor

@HansDaigle HansDaigle commented Mar 22, 2026

Summary

  • scope the periodic tracker scheduler Huey lock by tenant schema
  • add a configurable webhook POST timeout via WEBHOOK_REQUEST_TIMEOUT
  • catch requests.RequestException during webhook delivery and treat missing responses as failed delivery attempts

Review updates

  • Rebuilt on current karrioapi/karrio:main as a single focused commit.
  • Dropped the older tracking-pipeline rewrite because current main already has dispatcher/per-carrier tracking tasks, task deduplication, flat delay, aged-out tracker retirement, and bulk tracker saves.
  • Kept only the two deltas called out as useful in review: per-schema scheduler locks and webhook timeout/error handling.

Validation

DJANGO_SETTINGS_MODULE=karrio.server.settings .venv-server/bin/python -m pytest   modules/events/karrio/server/events/tests/test_tracking_tasks.py   modules/events/karrio/server/events/tests/test_webhooks.py -q

Result: 14 passed, 13 warnings.

@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 22, 2026

@HansDaigle is attempting to deploy a commit to the karrio Team on Vercel.

A member of the Team first needs to authorize it.

Copy link
Copy Markdown
Member

@danh91 danh91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Hans, the problem you diagnosed is real — cumulative batch sleeps starving notify_webhooks is a genuine issue on self-hosted deployments with large tracker backlogs, and the missing webhook timeout is a real risk. Good catch on both.

That said, this PR is diffing against a version of tracking.py that no longer exists on main. The codebase moved significantly while this was in development:

  • 851d60692replace monolithic tracker update with dispatcher + per-carrier worker tasks
  • 2cd75e5a5clean up tracking pipeline and add task lock for deduplication

The current main already has per-carrier process_carrier_tracking_batch Huey sub-tasks, a huey_instance.lock_task guard on background_trackers_update, and a flat TRACKER_BATCH_DELAY replacing the cumulative sleep. Merging this PR as-is would regress several things that landed since your branch forked: _retire_aged_out_trackers (prevents stale trackers accumulating indefinitely), bulk_save_trackers (single bulk UPDATE per carrier batch instead of N individual saves), and the configurable batch delay.

What you're bringing that is genuinely new and worth landing:

  1. Per-schema Huey lockmain uses a single global lock key ("background_trackers_update") that covers all tenants at once. Your get_scheduler_lock_name(schema) approach locks per-tenant, so a slow schema doesn't block all others. This is a real improvement for multi-tenant deployments.

  2. Webhook timeout + error handlingmain still uses identity(lambda: requests.post(...)) with no timeout and no exception guard. A slow or unresponsive endpoint blocks a worker indefinitely. Your WEBHOOK_REQUEST_TIMEOUT setting, the try/except RequestException, and the response is not None guard in update_notified_webhooks are all correct and the test coverage is solid.

Suggested path: Pull the latest main, and rebuild this PR around just those two contributions — the per-schema lock on __init__.py and the timeout/error handling on webhook.py. The tracking pipeline changes in the middle largely duplicate what's already there; once you're on current main that should become clear. The resulting diff should be small and clean, and both additions have a clear path to merge.

@HansDaigle HansDaigle force-pushed the hans/tame-tracker-worker-backlog branch from 117f78d to 6ce987f Compare May 12, 2026 04:33
@HansDaigle HansDaigle changed the title events: split tracker refresh tasks and time out webhooks events: bound webhook delivery and tenant tracker locks May 12, 2026
@HansDaigle
Copy link
Copy Markdown
Contributor Author

Rebased against current karrioapi/karrio:main and narrowed the PR to the two review-approved deltas: per-schema tracker scheduler locks and webhook timeout/error handling. Local verification: affected events tracking/webhook tests -> 14 passed, 13 warnings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants