Mismatch Between Expected Count and (Received + Recovered) Count #137

@zwk4zhendeC

In the receiver restart tests documented in wp-examples
(https://github.com/wp-labs/wp-examples/blob/main/stability_test/report.md),
I observed that the data deviation rate of some components is non-negligible.

Under normal, bug-free conditions, data deviation should occur in only one scenario:
the receiver has already persisted the data but crashes before acknowledging it to the sender. The sender then retries and produces duplicate data, which is expected and unavoidable.
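
To make the expected scenario concrete, here is a minimal simulation of it. It is only a sketch: `Receiver`, `send_with_retry`, and the crash trigger are hypothetical names for illustration and are not taken from wparse or from the test harness.

```python
class Receiver:
    """Hypothetical receiver that persists a record, then may crash before acking."""

    def __init__(self, crash_before_ack_on=None):
        self.stored = []
        # Records that trigger one simulated crash after being persisted.
        self.crash_before_ack_on = set(crash_before_ack_on or [])

    def handle(self, record):
        self.stored.append(record)  # the data is persisted first
        if record in self.crash_before_ack_on:
            self.crash_before_ack_on.discard(record)  # crash only once
            raise ConnectionError("receiver crashed before acknowledging")
        return "ack"


def send_with_retry(receiver, records, max_attempts=3):
    """Hypothetical sender: resend a record until it is acknowledged."""
    for record in records:
        for _ in range(max_attempts):
            try:
                receiver.handle(record)
                break
            except ConnectionError:
                continue  # no ack seen, so the sender has to retry


receiver = Receiver(crash_before_ack_on=[2])
send_with_retry(receiver, [1, 2, 3])
print(receiver.stored)  # [1, 2, 2, 3]
```

Record `2` is stored twice because the sender cannot tell a lost acknowledgement from a lost write. In this model the stored count can only exceed the sent count, never fall below it.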

However, in my actual tests, I observed two unexpected cases:

  • In both the vlogs and Doris restart tests, a negative deviation rate occurred
  • In particular, vlogs showed a deviation rate of −40.79%

These results do not match the expected behavior.
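
For reference, here is how I am reading the deviation rate. This is an assumption based on the issue title, not the formula confirmed by the report: the rate is taken as (received + recovered − expected) / expected, and the counts in the example are made up purely for illustration.

```python
def deviation_rate(expected, received, recovered=0):
    """Assumed definition: relative difference between the observed and expected counts, in percent."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (received + recovered - expected) / expected * 100.0


# Illustrative counts only, not taken from the test report.
print(f"{deviation_rate(1000, 1010):+.2f}%")     # +1.00%
print(f"{deviation_rate(1000, 950, 20):+.2f}%")  # -3.00%
```

Under this reading, retries that duplicate data give a positive rate, while a negative rate means fewer records were observed than were sent.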

Additionally, while reviewing the source code, I found a potential cause of incorrect deviation rates:

  • During batch inserts, if part of the batch is successfully inserted while the remaining part fails, wparse treats the entire batch as failed, which leads to unnecessary retries and duplicate data being counted.

This behavior may contribute to the incorrect deviation rates observed above.
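
The difference between retrying the whole batch and retrying only the failed part can be sketched as follows. This is not wparse code: `FlakySink`, `retry_whole_batch`, and `retry_failed_rows` are hypothetical, and the sketch assumes the sink can report which rows of a batch failed, which may not hold for every backend.

```python
class FlakySink:
    """Hypothetical sink that rejects the given rows on their first attempt."""

    def __init__(self, reject_once):
        self.stored = []
        self.reject_once = set(reject_once)

    def insert_batch(self, rows):
        """Store what it can and return the rows that failed."""
        failed = []
        for row in rows:
            if row in self.reject_once:
                self.reject_once.discard(row)
                failed.append(row)
            else:
                self.stored.append(row)
        return failed


def retry_whole_batch(sink, rows, max_attempts=3):
    """Treat any failure as a failure of the entire batch and resend all rows."""
    for _ in range(max_attempts):
        if not sink.insert_batch(rows):
            return True
    return False


def retry_failed_rows(sink, rows, max_attempts=3):
    """Resend only the rows the sink reported as failed."""
    pending = list(rows)
    for _ in range(max_attempts):
        pending = sink.insert_batch(pending)
        if not pending:
            return True
    return False


batch = [1, 2, 3, 4]

whole = FlakySink(reject_once=[3])
retry_whole_batch(whole, batch)
print(whole.stored)    # [1, 2, 4, 1, 2, 3, 4]

partial = FlakySink(reject_once=[3])
retry_failed_rows(partial, batch)
print(partial.stored)  # [1, 2, 4, 3]
```

With the whole-batch retry, rows `1`, `2`, and `4` are stored twice, so 7 rows are counted for a batch of 4. Retrying only the failed row stores each row exactly once.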
