Skip to content

Make LS checkers time out after being idle, segment based on use (take 2)#3435

Open
jakebailey wants to merge 1 commit intomainfrom
jabaile/restore-3363
Open

Make LS checkers time out after being idle, segment based on use (take 2)#3435
jakebailey wants to merge 1 commit intomainfrom
jabaile/restore-3363

Conversation

@jakebailey
Copy link
Copy Markdown
Member

This brings back #3363, which I think should be safe after #3429?

Copilot AI review requested due to automatic review settings April 17, 2026 21:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reintroduces an idle-expiring checker pool for the language server, with separate checker “lifetimes” (diagnostics vs temporary queries vs API) to reduce contention, improve cache locality, and preserve stable identity for API handles. This fits into the project system’s program lifecycle management and the LSP/API entrypoints that acquire type checkers.

Changes:

  • Add core.CheckerLifetime on context.Context and route checker acquisition by lifetime (Temporary/Diagnostics/API).
  • Replace the project checker pool implementation with a semaphore + idle-timeout cleanup design, plus Discard() to stop background cleanup when programs are superseded/unreferenced.
  • Add extensive checkerpool tests using testing/synctest, and plumb checker pool options through SessionOptions.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/project/snapshot.go Discards checker pools when a program becomes unreferenced (to stop background cleanup).
internal/project/session.go Adds CheckerPoolOptions to session initialization options.
internal/project/projectcollectionbuilder.go Discards old pools on program updates; rebinds project.checkerPool from Program.
internal/project/project.go Uses SessionOptions.CheckerPoolOptions when creating programs/checker pools; simplifies CreateProgramResult.
internal/project/checkerpool.go New checker pool implementation: lifetimes, semaphores, idle cleanup timer, Discard().
internal/project/checkerpool_test.go New comprehensive tests for routing, affinity, contention, cleanup, discard behavior.
internal/lsp/server.go Marks document diagnostic requests as CheckerLifetimeDiagnostics.
internal/core/context.go Adds checker lifetime context plumbing.
internal/compiler/program.go Exposes Program.GetCheckerPool() for project-side introspection.
internal/api/session.go Routes API type-checker usage to CheckerLifetimeAPI; tags API diagnostics as diagnostics lifetime.

Comment on lines +406 to 430
// cleanupIdleCheckers disposes checkers that have been idle for longer than
// the idle timeout. The API checker is separate and never idle-cleaned.
func (p *checkerPool) cleanupIdleCheckers() {
p.mu.Lock()
defer p.mu.Unlock()
now := time.Now()
for i := range p.checkers {
c := p.checkers[i]
if c == nil || p.heldBy[i] != "" {
continue
}
if p.lastReleased[i].IsZero() {
continue
}
idle := now.Sub(p.lastReleased[i])
if idle >= p.opts.IdleTimeout {
p.log(fmt.Sprintf("checkerpool: Disposing idle checker %d (idle %v)", i, idle))
p.disposeCheckerLocked(i, c)
}
}
// Reschedule for any remaining idle-but-not-yet-expired checkers.
// scheduleCleanupLocked will Reset the existing timer rather than
// creating a new one, avoiding goroutine leaks.
p.scheduleCleanupLocked()
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanupIdleCheckers runs even after Discard() and unconditionally calls scheduleCleanupLocked(). Because time.Timer.Stop() doesn’t guarantee the callback won’t run concurrently, the cleanup callback can still fire after Discard() and re-arm the timer / dispose checkers on a discarded pool. Add a p.discarded guard at the start of cleanupIdleCheckers (and/or inside scheduleCleanupLocked) to no-op and avoid rescheduling when discarded.

Copilot uses AI. Check for mistakes.
Comment on lines +486 to 501
// Discard signals that this pool's program has been replaced. The pool
// remains functional but stops its idle-cleanup timer so that checkers
// are not disposed until the pool is GC'd. The API checker is unaffected
// since it is never idle-cleaned.
func (p *checkerPool) Discard() {
p.mu.Lock()
defer p.mu.Unlock()
if p.discarded {
return // already discarded
}
return true
}

func (p *CheckerPool) createCheckerLocked() (*checker.Checker, int) {
for i, existing := range p.checkers {
if existing == nil {
checker, _ := checker.NewChecker(p.program)
p.checkers[i] = checker
return checker, i
}
p.log("checkerpool: Discarding pool, stopping idle cleanup")
p.discarded = true
if p.cleanupTimer != nil {
p.cleanupTimer.Stop()
p.cleanupTimer = nil
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discard() stops and nils cleanupTimer, but the timer’s callback may already be running or may run immediately after Stop() returns false. Without an additional discarded check in the callback, the pool can end up keeping itself alive (via a newly scheduled timer) even though it was discarded. Consider setting discarded=true before stopping the timer and ensuring the callback clears cleanupTimer and does not reschedule when discarded.

Copilot uses AI. Check for mistakes.
Comment on lines +303 to 323
func (p *checkerPool) getPersistentChecker() (*checker.Checker, func()) {
p.persistentSem <- struct{}{}
p.mu.Lock()

if p.persistentChecker == nil {
p.log("checkerpool: Creating persistent checker")
c, _ := checker.NewChecker(p.program)
p.persistentChecker = c
}
return checker, index

c := p.persistentChecker
p.persistentHeld = true
p.mu.Unlock()

return c, sync.OnceFunc(func() {
p.mu.Lock()
p.persistentHeld = false
p.mu.Unlock()
<-p.persistentSem
})
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The persistent/API checker release path doesn’t dispose a checker that has been canceled. checker.Checker panics on reuse once WasCanceled() is true (see checkNotCanceled()), so the next API acquisition can crash the server. Mirror the normal release behavior here: if the persistent checker WasCanceled(), clear it (so the next request recreates it) rather than keeping the canceled instance.

Copilot uses AI. Check for mistakes.
Comment on lines +360 to +369
// registerRequestCleanup uses context.AfterFunc to delete the request
// association when the request context is done. This prevents the map
// from growing unboundedly with completed request IDs.
// Must be called with p.mu held; the cleanup runs asynchronously.
func (p *checkerPool) registerRequestCleanup(ctx context.Context, requestID string) {
context.AfterFunc(ctx, func() {
p.mu.Lock()
defer p.mu.Unlock()
delete(p.requestAssociations, requestID)
})
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registerRequestCleanup only deletes requestAssociations when ctx is done. If a caller uses WithRequestID on a context that never cancels (e.g. context.Background() in some tests/utilities), requestAssociations can grow without bound for the life of the pool. Consider skipping request affinity when ctx.Done() == nil, or adding a fallback cleanup strategy (e.g. delete on release when Done is nil, or enforce a max size/TTL).

Copilot uses AI. Check for mistakes.
Comment on lines +522 to +523
// Mark its checker pool as discarded so idle checkers are disposed
// immediately rather than waiting for its idle timer.
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says Discard() will make idle checkers be disposed immediately, but checkerPool.Discard() actually stops the idle-cleanup timer and keeps checkers alive until the pool is GC’d. Please update this comment to reflect the actual behavior (stopping the timer to avoid keeping the pool alive / to allow GC).

Suggested change
// Mark its checker pool as discarded so idle checkers are disposed
// immediately rather than waiting for its idle timer.
// Mark its checker pool as discarded so its idle-cleanup timer stops
// keeping the pool alive, allowing the pool and any idle checkers it
// still references to be reclaimed when the pool is garbage-collected.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants