Remove txn from pool when contract formation or renewal fails#444
Remove txn from pool when contract formation or renewal fails#444ChrisSchinnerl wants to merge 6 commits into
Conversation
There was a problem hiding this comment.
⚠️ Not ready to approve
The deferred cleanup in server.go may remove the wrong transaction ID if V2TransactionSet updates proofs (changing the ID), which can leave the stuck transaction in the pool and undermine the fix.
Pull request overview
This PR addresses a failure mode in RHPv4 contract formation/renewal/refresh where a transaction set can be added to the host’s transaction pool but not broadcast (e.g., due to a persistence/internal error), leaving the pool thinking relevant outputs are spent and causing subsequent “double-spent” failures.
Changes:
- Extend the RHPv4 server’s chain interface to support explicitly removing v2 transactions from the txpool on RPC failure paths.
- Implement
RemoveV2PoolTransactionsin the chain manager and add coverage verifying dependent removal, no-op semantics, and input reuse. - Add an RHPv4 RPC test that simulates a host failing after pool-add and verifies recovery on retry, plus a changeset entry.
File summaries
| File | Description |
|---|---|
rhp/v4/server.go |
Removes locally-added v2 txns from the pool when formation/refresh/renew won’t be broadcast. |
rhp/v4/rpc_test.go |
Adds an integration-style test for renewal recovery after a post-pool-add failure. |
chain/manager.go |
Adds RemoveV2PoolTransactions to drop txns (and dependents via revalidation) and notify listeners. |
chain/pool_test.go |
Adds tests for removal semantics and that removal frees conflicting inputs for reuse. |
.changeset/clean_up_orphaned_transaction_hat_wasnt_broadcast_during_failed_formationrenewrefresh.md |
Documents the behavioral change in a changeset. |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 4
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f3fe179f3b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // pool, along with any pooled transactions that depend on them. It is intended | ||
| // for abandoning a locally-added transaction set that will not be broadcast. | ||
| // Transactions that are not in the pool are ignored. | ||
| func (m *Manager) RemoveV2PoolTransactions(ids []types.TransactionID) { |
There was a problem hiding this comment.
This is quite dangerous.. Not sure I like it. There’s no way to tell if the transaction has been broadcast so it’d be easy to create harder to debug issues
There was a problem hiding this comment.
What if we add an optional arg to BroadcastV2TransactionSet to prevent persistence of the transaction iff the broadcast failed and a ErrBroadcastFailed that we can check for.
To make sure that 1. we didn't broadcast and 2. we won't retry.
Because right now we have a similar issue on master. BroadcastV2TransactionSet might fail, we release the inputs but then we might retry later and double spend.
I ran into a bit of a loop with one of our hosts on Zeus today. The problem being a renewal failing due to an
internal errorand the likely candidate being theRenewContractcall failing.That causes the used outputs of the host to be released but the transaction remains in the transaction pool. So the renter can't reuse their outputs since the host's pool thinks they were double spent.
This leads to the following contract formation (triggered by the existing contract being stuck in the renew window) failing due to a double spent. Since the transaction is never broadcasted the indexer will never find out about that though.
This PR attempts to fix this by explicitly removing the transaction from the pool upon failure. Which should at least allow the indexer to form a new contract to replace the stuck one with.