Skip to content

fix: lme timeout fix optimization#1220

Draft
nbmaiti wants to merge 2 commits into
mainfrom
lme_timeout_fix_optimization
Draft

fix: lme timeout fix optimization#1220
nbmaiti wants to merge 2 commits into
mainfrom
lme_timeout_fix_optimization

Conversation

@nbmaiti
Copy link
Copy Markdown
Contributor

@nbmaiti nbmaiti commented Mar 17, 2026

test lme time optimization

  1. remove extra pre-create WSMAN LocalActivateCmd , it triggers extra LME/APF intialize cycle.
  2. Optimize timeout times
  3. APF wsman msg wait for 3 sec(hardcoded) , overridden
  4. small cleanup of http/wss link during actvation/deactivation

Detail of the change
https://github.com/device-management-toolkit/rpc-go/wiki/lme-optimization

Fixes: 1209

@nbmaiti nbmaiti changed the base branch from main to next March 17, 2026 20:10
@nbmaiti nbmaiti changed the title Lme timeout fix optimization fix: lme timeout fix optimization Mar 18, 2026
@nbmaiti nbmaiti force-pushed the lme_timeout_fix_optimization branch 2 times, most recently from c3d6cc3 to 732e6cf Compare March 19, 2026 09:46
}
// Do not pre-create WSMAN for local activation here.
// LocalActivateCmd sets up its own local WSMAN transport, and doing both
// can trigger an extra LME/APF initialize cycle.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this content as part of the PR description

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return utils.ActivationFailedControlMode
}

func (service *LocalActivationService) getGeneralSettingsWithRetry() (general.Response, error) {
Copy link
Copy Markdown
Contributor

@sudhir-intc sudhir-intc Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this retry logic be implemented in the GoWSMAN package. Check this issue: device-management-toolkit/go-wsman-messages#656

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, for now we can keep it, todo added

Comment thread internal/lm/engine.go
}

bin_buf = apf.Process(result, lme.Session)
bin_buf = lme.processWithLocalTimerOverride(result)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change could you please explain as part of the PR description what was the behavior before and after

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO added

Comment thread pkg/utils/constants.go
LMSDialerTimeout = 5 // seconds
HeciReadTimeout = 30 // seconds
HeciRetryDelay = 3000 // milliseconds
LMSConnectionTimeout = 6 // seconds
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of timer values have been changed here, please provide the context w.r.t this change

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wiki link added as pr description

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce non-LMS (LME/MEI) activation/deactivation latency by tightening timeouts and adding targeted retry/timeout handling around LME channel establishment and WSMAN operations.

Changes:

  • Reduced several global timeout/backoff constants to shorten LME/LMS wait cycles.
  • Added LME connect retry + explicit timeouts while waiting for APF channel-open confirmation and AMT responses.
  • Adjusted local activation flow to avoid extra LME/APF initialization and added URL normalization + tests for placeholder URLs.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pkg/utils/constants.go Lowers global timeout/backoff defaults used across LMS/LME networking and MEI reads.
internal/rps/executor.go Adds LME connect retry and introduces explicit timers for channel-open confirmation and response waiting.
internal/local/amt/wsman.go Caches LMS probe result and reuses local transport for non-TLS mode when LMS is unavailable.
internal/local/amt/localTransport.go Adds MEI-busy retry and timeout handling for local WSMAN transport channel-open and responses.
internal/lm/engine.go Overrides LME APF channel data timer behavior to flush sooner after APF_CHANNEL_DATA.
internal/commands/activate/local.go Retries GetGeneralSettings on transient MEI/LME busy-like errors.
internal/commands/activate/activate.go Normalizes URL input earlier and avoids pre-creating WSMAN for local activation to reduce extra LME cycles.
internal/commands/activate/activate_test.go Adds validation tests for placeholder/whitespace URLs in local CCM scenarios.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/rps/executor.go
Comment thread internal/rps/executor.go Outdated
Comment thread internal/local/amt/localTransport.go Outdated
Comment thread internal/local/amt/localTransport.go
Comment thread pkg/utils/constants.go
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

time optimize lme accss

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
@nbmaiti nbmaiti force-pushed the lme_timeout_fix_optimization branch 2 times, most recently from cb9fc15 to aaa0566 Compare March 23, 2026 12:43
fixed lint and copilot issues

Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com>
@nbmaiti nbmaiti force-pushed the lme_timeout_fix_optimization branch from aaa0566 to 2efae28 Compare March 23, 2026 12:52
@nbmaiti nbmaiti requested a review from Copilot March 23, 2026 12:53
@nbmaiti nbmaiti requested a review from sudhir-intc March 23, 2026 12:55
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/rps/executor.go
Comment on lines +192 to +199
channelOpenTimeout := time.Duration(utils.LMETimerTimeout) * time.Second
if channelOpenTimeout <= 0 || channelOpenTimeout > utils.AMTResponseTimeout*time.Second {
channelOpenTimeout = utils.AMTResponseTimeout * time.Second
}

channelOpenTimer := time.After(channelOpenTimeout)

channelOpenDone := make(chan struct{})
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

channelOpenTimer := time.After(...) creates an un-stoppable timer. In an activation loop this can leave many pending timers alive until they fire, even when the channel opens quickly (similar to the earlier response timer issue). Prefer time.NewTimer with Stop() + channel drain (as done for responseTimer) so the timer can be cancelled on the fast path.

Copilot uses AI. Check for mistakes.
Comment on lines 80 to 86
@@ -81,7 +83,6 @@ func (cmd *ActivateCmd) Validate() error {
if localIntent && cmd.URL != "" {
lowerURL := strings.ToLower(cmd.URL)
if strings.HasPrefix(lowerURL, "http://") || strings.HasPrefix(lowerURL, "https://") {
log.Warn("Both --url and local activation flags detected; proceeding with local activation via http://")
// Clear URL so we don't trigger HTTP profile fullflow during local runs (prevents recursion)
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment describing local-vs-remote precedence for HTTP(S) URLs doesn’t match the implemented behavior: the code clears an HTTP(S) --url when local intent is detected (local flags win), but the comment says to keep the URL and ignore local flags. Please update the comment (or the logic) so the documented precedence matches what Validate() actually enforces.

Copilot uses AI. Check for mistakes.
@rsdmike rsdmike changed the base branch from next to main May 1, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AMT provision using nonLMS /LME way takes lots of time

3 participants