
Test cases reported multiple times with pytest.mark.flaky #1992

@HuwCheston

Description


Hi, thanks for this great library.

I'm running into an issue when marking tests with pytest.mark.flaky. When a test passes on, say, the second try, the first attempt is logged on the confident-ai.com dashboard as a failure and the second attempt as a pass.

This leads to many more test cases being shown on the dashboard than were actually run in practice.

Is there a better way to handle retry logic than pytest.mark.flaky? Should it be handled inside the test itself, e.g. by catching AssertionError up to N times and only raising once N is exceeded, as in the sketch below?
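
To illustrate what I mean by handling it internally, here is a rough sketch; retry_on_assertion is a hypothetical helper written for this issue, not an existing deepeval or pytest API:

import functools

def retry_on_assertion(n: int):
    # Hypothetical helper: re-run the decorated test body up to `n` times,
    # swallowing AssertionError until the final attempt.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, n + 1):
                try:
                    return fn(*args, **kwargs)
                except AssertionError:
                    if attempt == n:
                        raise  # N exceeded: surface the last failure
        return wrapper
    return decorator

With this, pytest only ever sees one outcome per test (whichever the last attempt produced), though I'm not sure whether each inner assert_test call would still be reported to the dashboard separately.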

Minimal example

This won't work "out-of-the-box", but should show how I'm constructing the test cases.

import pytest

from deepeval import assert_test
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams


@pytest.mark.parametrize(
    "query",
    # big list of queries
    [...]
)
@pytest.mark.flaky(reruns=5)
def test_with_flaky(query: str):
    # Assume that this takes the query, makes a generation with the agent
    # and then returns a `deepeval.test_case.LLMTestCase` object
    test_case = create_basic_test_case(query)

    metric = GEval(
        name="flaky test",
        criteria="...",
        evaluation_params=[LLMTestCaseParams.INPUT, LLMTestCaseParams.ACTUAL_OUTPUT],
    )

    # Maybe fails the first time, but works the second
    assert_test(test_case, [metric])

and the CLI commands used to run the tests:

poetry run deepeval login --api-key $CONFIDENT_API_KEY
poetry run deepeval test run test_file.py -n 2
