Add validation for EOS/FRR BFD check for OSPFv2/v3#3519
Conversation
|
There will still be a bit to write for the underlying BFD that needs to be written. But this is able to show if BFD is working on BGP/OSPF. Further testing such as 'strict' mode is likely to be left for another time. |
Stop. One feature per PR ;)
That's correct.
Still doesn't mean it's correct ;) It's just that the happy path is not broken. You should tweak device configurations to introduce failures and check what happens. |
ipspace
left a comment
There was a problem hiding this comment.
You should:
- Reorder the sequence of checks (adjacency first, then BFD)
- Report only exceptions (otherwise you won't check the other stuff). Alternatively, if you envision BFD check to be used separately after the initial adjacency check, the you can use _common.report_state in BFD check (because we expect the adjacency to be up)
- End the check functions with raising the happy adjacency status message (if you're changing the code, make it consistent ;)
| if not present: | ||
| raise Exception(f'Unexpected {proto_name} neighbor {id} in state {n_state.adjacencyState}') | ||
|
|
||
| if bfd: |
There was a problem hiding this comment.
I would report OSPF adjacency problem before missing BFD because it's more relevant. Also, when checking multiple conditions, you cannot use _common.report_state (because it exits no matter what).
| if not present: | ||
| raise Exception(f'Unexpected OSPFv2 neighbor {id} in state {n_state.nbrState}') | ||
|
|
||
| if bfd: |
There was a problem hiding this comment.
You cannot use the "raise in all cases" approach when checking multiple conditions. You can only raise an error condition (Exception) and continue to the next check otherwise. However, it would be nice to always end with raise log.Result(adjacency_exit_msg) to get a nice success message about adjacency being in the right state.
As before, check adjacency state before checking BFD.
|
|
||
| if bfd: | ||
| exit_msg = f'OSPFv3 neighbor {id} is in BFD state {n_state.peerBfdInfo.status}' | ||
| if n_state.peerBfdInfo and n_state.peerBfdInfo.status == "Up": |
There was a problem hiding this comment.
Here, the sequence of checks is correct, but as above, report only the error and complete the function with raise log.Result(adjacency_msg).
| @@ -24,3 +25,17 @@ links: | |||
| cost: 42 | |||
| network_type: broadcast | |||
| - abr: | |||
|
|
|||
| validate: | |||
| adj: | |||
There was a problem hiding this comment.
If you check adjacency state before checking BFD state, you can roll these tests into one ;)
| nodes: [ r1, bb ] | ||
| plugin: ospf_neighbor(nodes.abr.ospf.router_id) | ||
| bfd: | ||
| description: Check OSPF adjacencies |
There was a problem hiding this comment.
If you want to have two checks, then this one should be named "Check BFD status"
| bfd: | ||
| description: Check OSPF adjacencies | ||
| wait: ospfv2_adj_p2p | ||
| wait_msg: Waiting for OSPF adjacency process to complete |
There was a problem hiding this comment.
... and the wait message should be "Waiting for BFD to start"
| wait_msg: Waiting for OSPF adjacency process to complete | ||
| nodes: [ r1, bb ] | ||
| plugin: ospf6_neighbor(nodes.abr.ospf.router_id) | ||
| bfd: |
|
@ipspace thanks for the many comments, I will work through them. It was mainly a 'get it working state', I did confirm that it will error when BFD is off. Better testing and explaining will be in the next update. |
|
Updated with your suggestions, still kept it as OSPF but added 'BFD' to it. Tested with both EOS and FRR. Both hosts working BFDOne host failure (ABR/DUT) has disabled ospf bfd |
ipspace
left a comment
There was a problem hiding this comment.
Did a bit of polishing and made the integration tests part of the regular testing process. Merging in a few minutes once the integration tests go through.
|
|
||
| ospf.bfd.ipv4: True | ||
|
|
||
| groups: |
There was a problem hiding this comment.
A few notes on the integration test (not your fault, you just used what was there).
You have to define the probe devices as FRR (preferably with CLAB provider), otherwise all lab devices use the same device type.
| ospf: | ||
| area: 1 | ||
| cost: 42 | ||
| network_type: broadcast |
There was a problem hiding this comment.
No idea why this is here, but let's say it makes sense and we want to check BFD on multiple areas and link types ;)
| mtu: 1500 | ||
|
|
||
| nodes: | ||
| dut: |
There was a problem hiding this comment.
The tested device is usually named "dut", so it's easier to figure out what device you have to troubleshoot.
Checked with both FRR/EOS.
Working on the BGP part currently.
Was unsure if adding extra parameter to the show + validate was correct or how it should be done.
Can say it works and does not seem to break existing tests.