Skip to content

fix: handle ACM AdminSetup silent-close, Setup fallback, and TLS flag…#1295

Open
nbmaiti wants to merge 4 commits into
mainfrom
lms_all_amt_tls
Open

fix: handle ACM AdminSetup silent-close, Setup fallback, and TLS flag…#1295
nbmaiti wants to merge 4 commits into
mainfrom
lms_all_amt_tls

Conversation

@nbmaiti
Copy link
Copy Markdown
Contributor

@nbmaiti nbmaiti commented Apr 29, 2026

All AMT and TLS are working.

Detail explanation of change in wiki

Summary

Fixes ACM activation edge cases and payload propagation:

  • Treats AdminSetup silent-close (empty LMS response) as success with synthesized WS-MAN response.
  • Adds Setup → AdminSetup fallback on ReturnValue=1 (NotSupported).
  • Resets connection before CommitChanges and improves 401 diagnostics.
  • Propagates localTlsEnforced and controlMode through remote payload flow.
  • Adds targeted tests for response completeness and TLS flag serialization.

Validation

  • Remote run: 5/5 cycles passed, local 5/5, remote 5/5.
  • Profiles: acm_p 5/5 (avg activate 43s), ccm_p 5/5 (avg activate 39s).
  • AMT versions validated: 16.1.27, 18.1.18, 19.0.5, 20.0.5, 21.0.2.

Test Result

On
Git Branch:        next
Git Commit:        3195b2019dd0b79539964e1b08c230880fabb578
Git Subject:       chore(release): 3.0.0-beta.23 [skip ci]

Index | Mode | AMT Ver | ccm_local | Remote | Overall | acm_p (S/Act/Deact) | ccm_p (S/Act/Deact)
-------+------+---------+-----------+--------+---------+---------------------+---------------------
1 | lms | 16.1.27 | PASS | FAIL | FAIL | FAIL / 20s / 1s | FAIL / 1s / 1s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
2 | lms | 18.1.18 | PASS | FAIL | FAIL | FAIL / 20s / 2s | FAIL / 3s / 2s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
3 | lms | 19.0.5 | PASS | FAIL | FAIL | FAIL / 20s / 9s | FAIL / 22s / 9s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
4 | lms | 20.0.5 | PASS | FAIL | FAIL | FAIL / 19s / 9s | FAIL / 19s / 9s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
5 | lms | 21.0.2 | PASS | FAIL | FAIL | FAIL / 20s / 9s | FAIL / 20s / 9s

On
Git Branch:        lms_all_amt_tls
Git Commit:        93bf7ba3ac3dc22a200d00f3cddfb7d9f8530988

Index | Mode | AMT Ver | ccm_local | Remote | Overall | acm_p (S/Act/Deact) | ccm_p (S/Act/Deact)
-------+------+---------+-----------+--------+---------+---------------------+---------------------
1 | lms | 16.1.27 | PASS | PASS | PASS | PASS / 25s / 1s | PASS / 23s / 1s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
2 | lms | 18.1.18 | PASS | PASS | PASS | PASS / 28s / 2s | PASS / 26s / 2s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
3 | lms | 19.0.5 | PASS | PASS | PASS | PASS / 53s / 3s | PASS / 49s / 3s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
4 | lms | 20.0.5 | PASS | PASS | PASS | PASS / 48s / 2s | PASS / 44s / 2s
-------+------+---------+-----------+--------+---------+---------------------+---------------------
5 | lms | 21.0.2 | PASS | PASS | PASS | PASS / 61s / 4s | PASS / 54s / 4s

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves AMT provisioning reliability (especially for TLS/LMS flows) by handling AdminSetup silent socket closes, adding a Setup→AdminSetup fallback path, and propagating a “local TLS enforced” flag through CLI → RPS payloads.

Changes:

  • Add localTlsEnforced to the RPS message payload and thread the flag through command contexts and RPS requests.
  • Update the RPS executor to (a) synthesize a success response when AdminSetup returns no data, (b) attempt AdminSetup fallback when Setup returns NotSupported, and (c) add targeted diagnostics for CommitChanges 401s.
  • Rework LMS Listen() to read until a complete HTTP response (chunked / Content-Length) and add tests for the HTTP completeness check.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pkg/utils/constants.go Adds LMSReadIdleTimeout used by LMS read-loop behavior.
internal/rps/message.go Adds LocalTLSEnforced to JSON payload sent to RPS.
internal/rps/message_test.go Adds coverage ensuring localTlsEnforced is always present in request JSON.
internal/rps/executor.go Adds AdminSetup silent-close handling, Setup→AdminSetup fallback, CommitChanges auth reset/diagnostics, and helper parsers/builders.
internal/rps/executor_test.go New test file (currently empty) intended for executor coverage.
internal/lm/service.go Updates LMS Listen() to buffer until “complete” HTTP response; adds read-deadline logic and HTTP completeness helper.
internal/lm/service_test.go Adds unit tests for isCompleteHTTPResponse.
internal/commands/shared.go Adds LocalTLSEnforced to shared command context.
internal/commands/deactivate.go Propagates TLS/control-mode flags into remote deactivation RPS requests.
internal/commands/activate/remote.go Propagates LocalTlsEnforced into activation RPS request.
internal/commands/activate/activate.go Ensures context is updated with control-mode / local TLS enforcement before remote activation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/lm/service.go Outdated
Comment thread internal/lm/service.go
Comment thread internal/lm/service.go Outdated
Comment thread internal/rps/executor_test.go Outdated
Comment thread internal/rps/executor_test.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/rps/executor.go
Comment thread internal/rps/executor_test.go Outdated
@rsdmike rsdmike changed the base branch from next to main May 1, 2026 20:57
@nbmaiti nbmaiti force-pushed the lms_all_amt_tls branch from 63edf8b to 32d404d Compare May 5, 2026 15:30
@nbmaiti nbmaiti requested a review from Copilot May 5, 2026 15:38
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/lm/service.go Outdated
Comment thread internal/lm/service.go Outdated
Comment thread internal/lm/service.go Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

Unit tests (Go)

1 104 tests  +11   1 101 ✅ +8   4s ⏱️ -1s
   25 suites ± 0       3 💤 +3 
    1 files   ± 0       0 ❌ ±0 

Results for commit c3c5532. ± Comparison against base commit 6fdfa08.

♻️ This comment has been updated with latest results.

@nbmaiti nbmaiti force-pushed the lms_all_amt_tls branch from 9ff4960 to c3c5532 Compare May 5, 2026 17:43
Comment thread internal/certs/lmsTls.go Fixed
@nbmaiti nbmaiti marked this pull request as ready for review May 5, 2026 18:04
@nbmaiti nbmaiti force-pushed the lms_all_amt_tls branch 2 times, most recently from a1ddef5 to aaaa919 Compare May 5, 2026 19:04
Copy link
Copy Markdown
Member

@rsdmike rsdmike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few important things happening here that should not be. RPC-go should never be manipulating or inspecting wsman messages or making decisions off of them for RPS flows. That all has to be removed. Additionally there is duplicate flags now -- we already handle TLSEnforced, not sure why it has been added twice. Theres an empty cert file that needs to be removed. The control mode should already be available via AMTBaseCmd for activate/deactivate/configure . As it stands the version you tested with beta.23 is over two weeks old. You should be using main always -- same with RPS. I do not see the same failures. Please try again ensuring you have the latest code, and then lets take each failure and take a look at whats going on so we can address each systematically. Thanks !

nbmaiti added 4 commits May 18, 2026 18:05
 is enforced

When the device has LocalTLSEnforced set, RPS owns the TLS session end-to-end
through its TLSTunnelManager. rpc-go must forward raw TCP bytes; layering its
own TLS client on top causes a TLS-in-TLS handshake that AMT rejects.

* lm/service.go: add a 'tlsTunnel' flag on LMSConnection with SetTLSTunnelMode
  and ActivateTLSTunnelMode helpers so callers can mark the connection as a
  raw-bytes tunnel without needing to know the field layout.
* rps/executor.go: in NewExecutor, when LocalTlsEnforced is true, build the LMS
  connection with tunnel mode enabled and mark the executor as already
  switched. Also auto-switch on the first MethodTLSData if RPS skipped
  port_switch (v2.x protocol), and have handlePortSwitch dial plain TCP +
  tag the new connection as tunnel mode.

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
 ReturnValue errors
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

AMT's IPS_HostBasedSetupService/AdminSetup, AMT_SetupAndConfigurationService/
Unprovision, and /transfer/Delete close the LMS socket without sending an HTTP
response when they succeed. RPS interprets the dangling read as a failure and
aborts the activation. Additionally, in TLS-tunnel mode TLS 1.3 has handshake
rounds that legitimately produce zero AMT-side bytes — the Listen loop must
not treat those quiet rounds as a fatal close.

* lm/service.go: rework Listen() to always deliver the read buffer (including
  empty), gate the typed ErrLMSReadTimeoutNoData signal on tlsTunnel mode so
  non-tunnel callers don't see noise on a benign silent close, and let
  non-tunnel mode loop on timeout-no-data until bytes appear or an error
  fires.
* rps/executor.go: classify the outgoing payload via isAdminSetupRequest. On
  an empty response: synthesize a SOAP success in LME mode, HECI-verify
  control mode in LMS mode before synthesizing, and skip synthesis in tunnel
  mode (RPS expects raw tunnel bytes, not XML). On a non-empty Setup/
  AdminSetup response, parse <g:ReturnValue> and surface a typed error for
  non-zero values (rv=1 NotSupported is common when CCM is disabled).

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
 response
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two end-of-conversation problems left rpc-go and RPS waiting on each other:

* CommitChanges sometimes returned '401 Unauthorized' because the existing
  digest-auth context from prior Setup ops still had a stale nonce. Closing
  the LMS connection before the CommitChanges request forces a fresh auth
  negotiation. When 401 still comes back, surface it as a typed error
  instead of letting RPS time out.
* In TLS-tunnel mode, AMT issues a TCP close (FIN/RST) right after the
  final WSMAN response (CommitChanges, last cert op, etc.). RPS waits for
  a MethodConnectionReset signal before emitting terminal success; without
  it the websocket sits idle until RPS times out at ~60s and closes with
  code 1006 — even though the device is already provisioned.

* lm/service.go: add PeekClose(peekTimeout) that does a bounded read to
  detect EOF/RST and returns any trailing bytes (e.g. TLS close_notify)
  so the caller can forward them before signalling RPS.
* rps/executor.go: add isCommitChangesRequest classification + close the
  LMS connection up front for CommitChanges; surface 401 Unauthorized in
  the CommitChanges path as a typed error; after each tunnel response,
  PeekClose for 150ms and emit MethodConnectionReset on close. Also add
  isProvisionedAfterUnexpectedRPSClose: when the websocket drops without
  terminal status in tunnel mode, verify via HECI that the device is
  actually provisioned before declaring failure.

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
 latency
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two timing knobs were paying a multi-second tax on every activation:

* lm/service.go: in Listen()'s TLS branch the subsequent-read timeout was set
  to 2s. Once AMT starts streaming bytes for a response the remaining packets
  arrive within milliseconds, so the second timeout fires on every single
  round-trip — for an ACM/CCM flow with ~50 round-trips this added ~100s of
  pure idle. Drop it to 100ms; keep the first-byte timeout at 2s so AMT still
  has time to start producing a TLS record.
* commands/base.go: GetControlMode retried on HECI busy with a flat 4s
  backoff for up to 4 attempts, so a transient busy at startup cost 12s
  before the command could begin. Use exponential backoff starting at 500ms,
  capped at 4s; the worst-case 3-attempt total drops to 3.5s and a typical
  one-retry case to 500ms.

* rps/executor.go: cosmetic-only — expand the empty-tunnel-response branch
  doc comment to describe RPS's Phase-2 expectations and add a //nolint
  rationale for the best-effort connection_reset Send. No behavior change.

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread internal/rps/executor.go
Comment on lines +89 to +90
tlsTunnelActive: config.LocalTlsEnforced,
switchedToTunnel: config.LocalTlsEnforced, // already in plain-TCP tunnel mode from the start
Comment thread internal/rps/executor.go
e.isCommitChanges = isCommitChangesRequest(msg.Payload)
if e.isCommitChanges {
log.Debug("CommitChanges detected - closing any existing connections to force fresh auth")
e.localManagement.Close()
Comment thread internal/rps/executor.go
Comment on lines +738 to +743
soapBody := "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<a:Envelope xmlns:a=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:b=\"http://schemas.xmlsoap.org/ws/2004/08/addressing\" xmlns:c=\"http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd\" xmlns:g=\"http://intel.com/wbem/wscim/1/ips-schema/1/IPS_HostBasedSetupService\">" +
"<a:Header>" +
"<b:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</b:To>" +
"<b:RelatesTo>" + messageID + "</b:RelatesTo>" +
"<b:Action a:mustUnderstand=\"true\">http://intel.com/wbem/wscim/1/ips-schema/1/IPS_HostBasedSetupService/" + actionName + "Response</b:Action>" +
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants