Skip to content

Remove doomed attempts to reforward pinned requests#2449

Open
rousskov wants to merge 2 commits into
squid-cache:masterfrom
measurement-factory:OSD-7-survive-ftp-Relay-sendCommand-throw
Open

Remove doomed attempts to reforward pinned requests#2449
rousskov wants to merge 2 commits into
squid-cache:masterfrom
measurement-factory:OSD-7-survive-ftp-Relay-sendCommand-throw

Conversation

@rousskov

Copy link
Copy Markdown
Contributor
AsyncJob.cc(100) mustStop: Ftp::Relay will stop, reason: exception
...
FATAL: check failed: !request->pinnedConnection()
    exception location: FwdState.cc(1119) connectStart

Since February 2019 commit 3dde9e5, retries of "pinned" requests1
using a new Squid-to-server connections are meant for FTP and
connection-based authentication traffic only. Back then, we agreed that
such re-pinning is wrong2, and different mechanisms should be used to
handle forwarding errors in those cases, but we preserved the
corresponding functionality in fear of breaking "working" cases3. On
that preserved code path, request->pinnedConnection() is true at
pinnedCanRetry() check time.

Since September 2019 commit daf8070, connectStart() throws if
request->pinnedConnection() is true, effectively breaking retries for
nearly all preserved cases mentioned above! The only pinned retries that
could hypothetically continue working after that commit are those where
ConnStateData job had been destroyed between pinnedCanRetry() and
connectStart(), and those rare cases are not interesting since
ConnStateData destruction ought to abort request forwarding anyway.

Both 2019 commits are present in v5.0.1 and beyond. We are not aware of
any associated problems. Thus, our fears were excessive, and we can
remove code that preserves problematic functionality. If we discover a
need to retry pinned FTP or connection-authenticated requests via a new
connection, we will add mechanisms to support that functionality.

Removing this code also helps avoid FATAL errors observed during recent
FTP work.

CONNECT tunnels

Existing tunnel.cc code ignores request->flags.pinned when deciding
whether to retry a failed tunneling attempt. This change does not affect
the corresponding "check pinned connections" TODO in that tunneling
code. FWIW, that TODO is limited to tunnelStart(); switchToTunnel()
cases already do not retry due to a "committed to server" retry ban on
their code path. It is likely that pinned retries should be disabled in
tunneled cases as well, but that adjustment deserves a dedicated
analysis/change.

Footnotes

  1. Here, a "pinned" request is, roughly speaking, a request sent by
    Squid over a Squid-to-server connection that was kept/provided by a
    client-to-Squid connection manager (ConnStateData). See usePinned().

  2. "automatically re-opening a PINNED connection from the middleware
    is invalid in todays networks" at
    https://github.com/squid-cache/squid/pull/351#discussion_r250015937

  3. "keep re-pinning as is" at
    https://github.com/squid-cache/squid/pull/351#discussion_r253757986

rousskov added 2 commits June 22, 2026 15:41
... to better tolerate Ftp::Relay::sendCommand() and similar exceptions.

    ERROR: Squid BUG: assurance failed: ...
        exception location: FtpClient.cc(835) writeCommand
    AsyncJob.cc(145) callException: assurance failed ...
    AsyncJob.cc(100) mustStop: Ftp::Relay will stop, reason: exception
    ...
    FATAL: check failed: !request->pinnedConnection()
        exception location: FwdState.cc(1119) connectStart

When branch-added Ftp::Client::writeCommand() assertions throw, the
worker dies instead of just killing the affected ftp_port transaction.

`JobDialer::dial()` catches the BUG exception, as expected, but the
worker still dies because `Client::swanSong()` cleanup code calls
`FwdState::handleUnregisteredServerEnd()` which attempts to retry the
failed FTP transaction despite true `request->pinnedConnection()`, which
triggers another, this time uncaught, exception in `connectStart()`.

Also documented suspicions that `FwdState::pinnedCanRetry()` should
_always_ return false and, hence, should be removed (instead of adding
more and more code/conditions when that method returns false).

----

Cherry-picked osd-7-sanitize-ftp-command commit becd5b4 that was
reverted on that branch, so that we can use a dedicated branch to finish
this work.
Since February 2019 commit 3dde9e5, retries of "pinned" requests[^1]
using a new Squid-to-server connections are meant for FTP and
connection-authentication traffic only. Back then, we agreed that such
re-pinning is wrong[^2], and different mechanisms should be used to
handle forwarding errors in those cases, but we preserved the
corresponding functionality in fear of breaking "working" cases[^3]. On
that preserved code path, `request->pinnedConnection()` is true at
pinnedCanRetry() check time.

Since September 2019 commit daf8070, connectStart() throws if
`request->pinnedConnection()` is true, effectively breaking retries for
nearly all preserved cases mentioned above. The only pinned retries that
could hypothetically continue working after that commit are those where
ConnStateData job had been destroyed between pinnedCanRetry() and
connectStart(), and those rare cases are not interesting since
ConnStateData destruction ought to abort request forwarding anyway.

Both 2019 commits are present in v5.0.1 and beyond. We are not aware of
any associated problems. Thus, our fears were excessive, and we can
remove code that preserves problematic functionality. If we discover a
serious need to retry FTP or connection-authenticated requests via a new
connection, we will add proper mechanisms to support that functionality.

Removing this code also helps avoid FATAL errors observed during recent
FTP work.

CONNECT tunnels
---------------

Existing tunnel.cc code ignores `request->flags.pinned` when deciding
whether to retry a failed tunneling attempt. This change does not affect
the corresponding "check pinned connections" TODO in that tunneling
code. FWIW, that TODO is limited to tunnelStart(); switchToTunnel()
cases already do not retry due to a "committed to server" retry ban on
their code path. It is likely that pinned retries should be disabled in
tunneled cases as well, but that adjustment deserves a dedicated
analysis/change.

[^1]: Here, a "pinned" request is, roughly speaking, a request sent by
Squid over a Squid-to-server connection that was kept/provided by a
client-to-Squid connection manager (ConnStateData). See usePinned().

[^2]: "automatically re-opening a PINNED connection from the middleware
is invalid in todays networks" at
squid-cache#351 (comment)

[^3]: "keep re-pinning as is" at
squid-cache#351 (comment)

@rousskov rousskov left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR stems from recent commit 51d1ddb work, but it is not required for that commit changes to function properly: That commit does not cause the FATAL error quoted in this PR description. That error was observed in an earlier variation of that code.

Said that, I suspect that similar errors will sooner or later resurface as more code throws on errors. And even if that never happens, it is a good idea to remove this not-working and complex/dangerous code anyway.

Disclaimer: I tested FTP but not connection-based authentication cases. Removed code does not distinguishes the two categories though -- it ought to hit the same !request->pinnedConnection() assertion in FwdState::connectStart() AFAICT.

P.S. We are working on one more 51d1ddb followup PR.

@rousskov rousskov added S-waiting-for-more-reviewers needs a reviewer and/or a second opinion S-could-use-an-approval An approval may speed this PR merger (but is not required) labels Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-could-use-an-approval An approval may speed this PR merger (but is not required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant