fix: lme timeout fix optimization #1220
c3d6cc3 to 732e6cf
}
// Do not pre-create WSMAN for local activation here.
// LocalActivateCmd sets up its own local WSMAN transport, and doing both
// can trigger an extra LME/APF initialize cycle.
Move this content into the PR description.
	return utils.ActivationFailedControlMode
}

func (service *LocalActivationService) getGeneralSettingsWithRetry() (general.Response, error) {
Can this retry logic be implemented in the GoWSMAN package instead? See this issue: device-management-toolkit/go-wsman-messages#656
Yes. For now we can keep it here; a TODO has been added.
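A minimal sketch of the kind of bounded retry discussed in this thread, written generically so it could sit in either package. Everything in it is hypothetical: `retry`, `isTransient`, and `errBusy` are illustrative names rather than rpc-go or go-wsman-messages APIs, and matching on the substring "busy" is only an assumption about how transient MEI/LME errors surface.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// errBusy is a hypothetical stand-in for a transient MEI/LME busy error.
var errBusy = errors.New("mei device busy")

// isTransient reports whether err looks retryable. The substring match is an
// assumption for illustration; real code would more likely check typed errors.
func isTransient(err error) bool {
	return err != nil && strings.Contains(err.Error(), "busy")
}

// retry calls op up to attempts times, sleeping delay between tries. It
// returns early on success or on the first non-transient error.
func retry(attempts int, delay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		lastErr = op()
		if lastErr == nil {
			return nil
		}
		if !isTransient(lastErr) {
			return lastErr
		}
		if i < attempts-1 {
			time.Sleep(delay) // no sleep after the final attempt
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errBusy // simulate two busy responses before success
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Keeping the transient-error check in one predicate means the call site in the activation flow would not need to change if the retry later moves into the WSMAN layer.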
}

bin_buf = apf.Process(result, lme.Session)
bin_buf = lme.processWithLocalTimerOverride(result)
Could you please explain in the PR description what the behavior was before and after this change?
LMSDialerTimeout     = 5    // seconds
HeciReadTimeout      = 30   // seconds
HeciRetryDelay       = 3000 // milliseconds
LMSConnectionTimeout = 6    // seconds
Many timer values have been changed here; please provide context for these changes.
Wiki link added to the PR description.
732e6cf to 9628a38
Pull request overview
This PR aims to reduce non-LMS (LME/MEI) activation/deactivation latency by tightening timeouts and adding targeted retry/timeout handling around LME channel establishment and WSMAN operations.
Changes:
- Reduced several global timeout/backoff constants to shorten LME/LMS wait cycles.
- Added LME connect retry + explicit timeouts while waiting for APF channel-open confirmation and AMT responses.
- Adjusted local activation flow to avoid extra LME/APF initialization and added URL normalization + tests for placeholder URLs.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pkg/utils/constants.go | Lowers global timeout/backoff defaults used across LMS/LME networking and MEI reads. |
| internal/rps/executor.go | Adds LME connect retry and introduces explicit timers for channel-open confirmation and response waiting. |
| internal/local/amt/wsman.go | Caches LMS probe result and reuses local transport for non-TLS mode when LMS is unavailable. |
| internal/local/amt/localTransport.go | Adds MEI-busy retry and timeout handling for local WSMAN transport channel-open and responses. |
| internal/lm/engine.go | Overrides LME APF channel data timer behavior to flush sooner after APF_CHANNEL_DATA. |
| internal/commands/activate/local.go | Retries GetGeneralSettings on transient MEI/LME busy-like errors. |
| internal/commands/activate/activate.go | Normalizes URL input earlier and avoids pre-creating WSMAN for local activation to reduce extra LME cycles. |
| internal/commands/activate/activate_test.go | Adds validation tests for placeholder/whitespace URLs in local CCM scenarios. |
time optimize lme access Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
cb9fc15 to aaa0566
fixed lint and copilot issues Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
aaa0566 to 2efae28
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
channelOpenTimeout := time.Duration(utils.LMETimerTimeout) * time.Second
if channelOpenTimeout <= 0 || channelOpenTimeout > utils.AMTResponseTimeout*time.Second {
	channelOpenTimeout = utils.AMTResponseTimeout * time.Second
}

channelOpenTimer := time.After(channelOpenTimeout)

channelOpenDone := make(chan struct{})
channelOpenTimer := time.After(...) creates an un-stoppable timer. In an activation loop this can leave many pending timers alive until they fire, even when the channel opens quickly (similar to the earlier response timer issue). Prefer time.NewTimer with Stop() + channel drain (as done for responseTimer) so the timer can be cancelled on the fast path.
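The suggestion above could be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `waitForOpen`, `clampTimeout`, and `errChannelOpenTimeout` are hypothetical names, and the `done` channel stands in for `channelOpenDone`. The non-blocking drain after `Stop()` is the conservative pattern for older Go releases; as of Go 1.23 unreferenced timers are collected without `Stop()` and the drain is no longer required, but it remains harmless.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errChannelOpenTimeout is a hypothetical sentinel error for this sketch.
var errChannelOpenTimeout = errors.New("timed out waiting for channel open")

// clampTimeout returns d, or limit when d is non-positive or larger than
// limit. It mirrors the bounds check applied to channelOpenTimeout.
func clampTimeout(d, limit time.Duration) time.Duration {
	if d <= 0 || d > limit {
		return limit
	}
	return d
}

// waitForOpen blocks until done is closed or timeout elapses. The timer is
// stopped on return so the fast path does not leave it pending.
func waitForOpen(done <-chan struct{}, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer func() {
		if !timer.Stop() {
			// Timer already fired; drain without blocking in case the
			// value was not consumed by the select below.
			select {
			case <-timer.C:
			default:
			}
		}
	}()

	select {
	case <-done:
		return nil
	case <-timer.C:
		return errChannelOpenTimeout
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond) // simulate a quick channel-open confirmation
		close(done)
	}()

	timeout := clampTimeout(0, 2*time.Second) // a zero value falls back to the limit
	fmt.Println("wait result:", waitForOpen(done, timeout))
}
```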
@@ -81,7 +83,6 @@ func (cmd *ActivateCmd) Validate() error {
	if localIntent && cmd.URL != "" {
		lowerURL := strings.ToLower(cmd.URL)
		if strings.HasPrefix(lowerURL, "http://") || strings.HasPrefix(lowerURL, "https://") {
			log.Warn("Both --url and local activation flags detected; proceeding with local activation via http://")
			// Clear URL so we don't trigger HTTP profile fullflow during local runs (prevents recursion)
The comment describing local-vs-remote precedence for HTTP(S) URLs doesn’t match the implemented behavior: the code clears an HTTP(S) --url when local intent is detected (local flags win), but the comment says to keep the URL and ignore local flags. Please update the comment (or the logic) so the documented precedence matches what Validate() actually enforces.
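To make the precedence described in this comment concrete, here is a small sketch of the "local flags win" rule as the review describes it. `resolveURL` is a hypothetical helper, not a function in rpc-go, and the whitespace trimming is an assumption based on the URL normalization mentioned in the overview.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveURL is a hypothetical helper illustrating the precedence rule:
// when local activation is intended and the URL is HTTP(S), the URL is
// cleared so the local flags take effect. Other URLs pass through trimmed.
func resolveURL(localIntent bool, rawURL string) string {
	url := strings.TrimSpace(rawURL) // assumed normalization step
	lower := strings.ToLower(url)
	isHTTP := strings.HasPrefix(lower, "http://") || strings.HasPrefix(lower, "https://")

	if localIntent && isHTTP {
		return "" // local flags win; drop the HTTP(S) URL
	}
	return url
}

func main() {
	fmt.Printf("%q\n", resolveURL(true, " https://rps.example.com "))
	fmt.Printf("%q\n", resolveURL(false, "https://rps.example.com"))
	fmt.Printf("%q\n", resolveURL(true, "wss://rps.example.com/activate"))
}
```

Whichever direction the precedence ends up going, stating it in one small function (or one accurate comment) keeps the documented and enforced behavior from drifting apart.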
test lme time optimization
Detail of the change
https://github.com/device-management-toolkit/rpc-go/wiki/lme-optimization
Fixes: 1209