aws_dynamodb_cdc: Support namespace or client_id for checkpoint isolation #4319

@tushar686

Context: In our development environment, multiple engineers frequently run the same Redpanda Connect pipeline against a shared DynamoDB table (CDC stream) for testing. Currently, all instances point to the same checkpoint_table.

Problem: Checkpoints in the DynamoDB table are currently keyed only by (StreamArn, ShardID). There is no concept of a "Consumer Group" or logical scope. When multiple developers run the pipeline simultaneously:

- They contend for the same checkpoint rows.
- Sequence numbers are overwritten by different consumers.
- This leads to skipped events, duplicates, and inconsistent state across the dev team.
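To make the collision concrete, here is a minimal in-memory sketch of the keying described above. It is not the input's actual code: `checkpointKey`, `checkpointStore`, and the example ARN and shard ID are all made up for illustration, and a Go map stands in for the shared checkpoint table.

```go
package main

import "fmt"

// checkpointKey models the keying described in this issue:
// one row per (StreamArn, ShardID), with no consumer identity.
type checkpointKey struct {
	StreamArn string
	ShardID   string
}

// checkpointStore is an in-memory stand-in for the shared checkpoint table.
type checkpointStore map[checkpointKey]string

// save records the last processed sequence number for a shard,
// replacing whatever was stored under the same key.
func (s checkpointStore) save(streamArn, shardID, seq string) {
	s[checkpointKey{streamArn, shardID}] = seq
}

func main() {
	store := checkpointStore{}
	const arn, shard = "arn:aws:dynamodb:example-stream", "shardId-0001"

	store.save(arn, shard, "300") // developer A has processed up to 300
	store.save(arn, shard, "120") // developer B, further behind, overwrites A

	// Only one row exists, and it now holds B's position.
	fmt.Println(len(store), store[checkpointKey{arn, shard}]) // prints "1 120"
}
```

If developer A restarts after this, it would resume from 120 and reprocess events; in the reverse order, developer B would resume from 300 and skip events.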

While creating separate checkpoint tables per developer is a workaround, it creates significant infrastructure sprawl and is operationally difficult to manage at scale.

Ask: Could the aws_dynamodb_cdc input be updated to support an optional namespace, consumer_group, or similar identifier?

The goal is to allow multiple independent logical readers to share a single physical checkpoint table without colliding. This would bring the DynamoDB CDC input in line with the "one store, many logical readers" pattern and significantly improve the developer experience by reducing infrastructure sprawl.
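One possible shape for this, sketched under assumptions: the checkpoint key gains a namespace component, so each logical reader gets its own rows in the same table. The `Namespace` field, `namespacedKey`, and `namespacedStore` are hypothetical names for illustration only; nothing here reflects an existing option of the input.

```go
package main

import "fmt"

// namespacedKey extends the (StreamArn, ShardID) key with a hypothetical
// namespace. This field does not exist in the current input.
type namespacedKey struct {
	Namespace string
	StreamArn string
	ShardID   string
}

// namespacedStore is an in-memory stand-in for a single shared checkpoint table.
type namespacedStore map[namespacedKey]string

// save records a reader's position under its own namespace.
func (s namespacedStore) save(ns, streamArn, shardID, seq string) {
	s[namespacedKey{ns, streamArn, shardID}] = seq
}

// load returns the position stored for one namespace, if any.
func (s namespacedStore) load(ns, streamArn, shardID string) (string, bool) {
	seq, ok := s[namespacedKey{ns, streamArn, shardID}]
	return seq, ok
}

func main() {
	store := namespacedStore{}
	const arn, shard = "arn:aws:dynamodb:example-stream", "shardId-0001"

	store.save("dev-a", arn, shard, "300")
	store.save("dev-b", arn, shard, "120")

	a, _ := store.load("dev-a", arn, shard)
	b, _ := store.load("dev-b", arn, shard)

	// Two rows in one table; neither reader disturbs the other.
	fmt.Println(len(store), a, b) // prints "2 300 120"
}
```

An empty namespace could plausibly map to today's key layout so that existing checkpoint tables keep working, though that is a design choice for the maintainers.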
