
feat: reuse Sessions correctly in BasicCrawler#3605

Open
barjin wants to merge 3 commits into refactor/proxy-configuration-drop-request from feat/session-reuse-in-basic-crawler

Conversation


@barjin barjin commented Apr 27, 2026

Makes BasicCrawler (and its subclasses) reuse Session instances according to the sessionReuseStrategy. Before, the BasicCrawler would create a single-use Session for each Request unless instructed otherwise (e.g., via request.sessionId).

Refactors small bits around the Session lifecycle in BasicCrawler.

Closes #3595


Depends on #3599
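The behavioral change described above can be sketched with a minimal, self-contained model. The `Session` and `SessionPool` classes below are simplified stand-ins mirroring the names in the PR description, not Crawlee's actual implementation; the old behavior corresponds to a fresh single-use session per request, the new one to handing out sessions from a bounded pool:

```typescript
// Illustrative model of the old vs. new session handling in BasicCrawler.
// These classes are hypothetical stand-ins, not the real Crawlee code.

class Session {
    usageCount = 0;
    constructor(readonly id: string, readonly maxUsageCount: number) {}
    isUsable(): boolean {
        return this.usageCount < this.maxUsageCount;
    }
    markUsed(): void {
        this.usageCount++;
    }
}

class SessionPool {
    private sessions: Session[] = [];
    private counter = 0;
    constructor(private maxPoolSize: number) {}

    // Creates new sessions only until maxPoolSize is reached;
    // after that, existing usable sessions are reused (round-robin here).
    getSession(): Session {
        if (this.sessions.length < this.maxPoolSize) {
            const session = new Session(`session_${this.counter++}`, 50);
            this.sessions.push(session);
            return session;
        }
        const usable = this.sessions.filter((s) => s.isUsable());
        return usable[this.counter++ % usable.length];
    }
}

// Old behavior: a fresh single-use session for every request.
const singleUse = Array.from({ length: 3 }, (_, i) => new Session(`single_${i}`, 1));

// New behavior: four requests share the two sessions in the pool.
const pool = new SessionPool(2);
const ids = Array.from({ length: 4 }, () => {
    const s = pool.getSession();
    s.markUsed();
    return s.id;
});
console.log(ids); // only 2 distinct session ids across 4 requests
```

With a single-use session, one `markUsed()` exhausts it; the pooled sessions survive many requests, which is what makes blocking-aware session rotation meaningful.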


Copilot AI left a comment


Pull request overview

This PR changes BasicCrawler’s session lifecycle so that sessions are no longer forced into single-use, and introduces proxy assignment at session creation time to keep proxyInfo consistent with the Session instance.

Changes:

  • Replace per-request single-use session creation with SessionPool.getSession() to allow session reuse.
  • When proxyConfiguration is used, assign proxyInfo when creating sessions in the pool and expose it via context.
  • Add a test asserting proxyInfo is present on both the Session and the handler context.
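The second bullet, assigning proxyInfo when the session is created, can be sketched as follows. All names here (`ProxyInfo`, `newProxyInfo`, `createSession`) are hypothetical stand-ins echoing the PR's description of a custom `createSessionFunction`; the real wiring lives in `basic-crawler.ts`:

```typescript
// Sketch: choose the proxy once, at session-creation time, and store it
// on the session so proxyInfo stays consistent for the session's lifetime.
// Hypothetical stand-ins, not the actual Crawlee implementation.

interface ProxyInfo {
    url: string;
    sessionId: string;
}

class Session {
    userData: { proxyInfo?: ProxyInfo } = {};
    constructor(readonly id: string) {}
}

// Stand-in for deriving a session-bound proxy from a proxy configuration.
function newProxyInfo(sessionId: string): ProxyInfo {
    return { url: `http://proxy.example.com:8000?session=${sessionId}`, sessionId };
}

// A createSessionFunction-style hook: the proxy is assigned at creation,
// so every request using this session sees the same proxyInfo.
function createSession(id: string): Session {
    const session = new Session(id);
    session.userData.proxyInfo = newProxyInfo(id);
    return session;
}

const session = createSession('session_42');
console.log(session.userData.proxyInfo?.sessionId); // 'session_42'
```

The point of the design is that proxy selection happens once per session rather than once per request, so a reused session never silently switches proxies mid-lifetime.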

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

  • packages/basic-crawler/src/internals/basic-crawler.ts — Switch to session reuse via getSession() and introduce a custom createSessionFunction that assigns proxyInfo.
  • test/core/crawlers/basic_crawler.test.ts — Add coverage for proxyConfiguration producing proxyInfo on Session and in the handler context.



Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.



@barjin barjin requested review from janbuchar and l2ysho April 28, 2026 07:39

@l2ysho l2ysho left a comment


Thanks @barjin for this. After the hours I spent debugging the remote-browser approach, this PR makes perfect sense. I do not see any blocking issues.
