Clean up codebase indexing parity, fix behavior when you turn off codebase indexing #11254

Open
moirahuang wants to merge 5 commits into master from moira/disable-codebase-context

Conversation

@moirahuang
Contributor

@moirahuang moirahuang commented May 19, 2026

Description

Fix remote codebase indexing parity with local Codebase Context behavior:

  • Share Codebase Context / auto-indexing policy helpers between local and remote paths.
  • Gate remote search/status/manual index requests on Codebase Context being enabled.
  • Stop remote SearchCodebase from implicitly starting indexing for NotIndexed repos.
  • Clear client-side remote index state and best-effort drop known remote indexes when Codebase Context is disabled.
  • Scope remote codebase context entries to the active remote host.
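As a rough illustration of the gating described in the list above, here is a minimal sketch of a policy helper shared by the local and remote paths. The names `Settings`, `Surface`, `may_use_codebase_context`, `should_auto_index`, and `remote_search` are hypothetical stand-ins chosen for this example, not the identifiers used in the PR.

```rust
/// Hypothetical snapshot of the settings the policy depends on.
#[derive(Clone, Copy, Debug)]
pub struct Settings {
    pub codebase_context_enabled: bool,
    pub auto_indexing_enabled: bool,
    /// Assumed feature flag that turns remote indexing on at all.
    pub remote_indexing_flag: bool,
}

/// Which code path is asking the policy question.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Surface {
    Local,
    Remote,
}

/// Single source of truth: both surfaces require Codebase Context to be on.
pub fn may_use_codebase_context(surface: Surface, s: Settings) -> bool {
    match surface {
        Surface::Local => s.codebase_context_enabled,
        Surface::Remote => s.remote_indexing_flag && s.codebase_context_enabled,
    }
}

/// Auto-indexing additionally requires the auto-indexing setting.
pub fn should_auto_index(surface: Surface, s: Settings) -> bool {
    may_use_codebase_context(surface, s) && s.auto_indexing_enabled
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IndexStatus {
    NotIndexed,
    Indexing,
    Ready,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SearchOutcome {
    Disabled,
    NotIndexed,
    Searched,
}

/// Remote search is gated on the shared policy and never starts indexing
/// as a side effect: a repo that is not ready is only reported as such.
pub fn remote_search(s: Settings, status: IndexStatus) -> SearchOutcome {
    if !may_use_codebase_context(Surface::Remote, s) {
        return SearchOutcome::Disabled;
    }
    match status {
        IndexStatus::Ready => SearchOutcome::Searched,
        IndexStatus::NotIndexed | IndexStatus::Indexing => SearchOutcome::NotIndexed,
    }
}
```

With this shape, turning Codebase Context off disables remote search, status, and manual index requests through one check instead of per-call conditions.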

Testing

Added unit tests + locally tested

  • I have manually tested my changes locally with ./script/run

Screenshots / Videos

https://www.loom.com/share/47167d614c704928978ae3ec7dfb924b

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

CHANGELOG-NONE

Co-Authored-By: Oz <oz-agent@warp.dev>

Conversation: https://staging.warp.dev/conversation/06ff18ae-3bf1-4c50-9d31-4f5daa10e0de
Run: https://oz.staging.warp.dev/runs/019e3d90-2fd2-7c00-bd7e-9f79f109f938
Plans:

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label May 19, 2026
@moirahuang moirahuang marked this pull request as ready for review May 19, 2026 03:19
@oz-for-oss
Contributor

oz-for-oss Bot commented May 19, 2026

@moirahuang

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@moirahuang moirahuang changed the title from "Fix remote Codebase Context parity" to "Clean up codebase indexing parity, fix behavior when you turn off codebase indexing" May 19, 2026
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Overview

This PR centralizes Codebase Context policy, gates remote indexing/search on the shared enablement, scopes remote codebase context by host, and refactors remote search shaping to share local fragment-building logic.

Concerns

  • Remote search now reads each candidate file as a whole file with the default file-read limit before fragment extraction. That limit is lower than the Codebase Context indexable file size, so results from larger indexed files can be silently dropped.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

})
})
.collect(),
max_file_bytes: None,
Contributor

⚠️ [IMPORTANT] The whole-file request keeps the default 1MB per-file read limit, but Codebase Context indexes files up to 3MB. Any fragment from a larger indexed file is returned as truncated or omitted, then fails build_fragments_from_file_contents, so remote search silently drops valid results. Request enough bytes to cover the index limit or fetch bounded ranges around each fragment before reranking.
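One way to read the first suggestion is sketched below. The 1MB default and 3MB index limit come from the comment above; the constant names, the `WholeFileRequest` struct, and the helper functions are assumptions made for illustration, and the real request type may look different.

```rust
/// Default per-file read limit cited in the review (1MB).
pub const DEFAULT_FILE_READ_LIMIT_BYTES: u64 = 1024 * 1024;
/// Largest file size Codebase Context indexes, per the review (3MB).
pub const INDEXABLE_FILE_LIMIT_BYTES: u64 = 3 * 1024 * 1024;

/// Hypothetical whole-file read request. `None` is assumed to mean
/// "fall back to the default read limit".
#[derive(Debug)]
pub struct WholeFileRequest {
    pub path: String,
    pub max_file_bytes: Option<u64>,
}

/// The limit that actually applies to a request.
pub fn effective_limit(req: &WholeFileRequest) -> u64 {
    req.max_file_bytes.unwrap_or(DEFAULT_FILE_READ_LIMIT_BYTES)
}

/// Build a request for remote search that covers every indexable file,
/// instead of inheriting the smaller default.
pub fn whole_file_request_for_search(path: &str) -> WholeFileRequest {
    WholeFileRequest {
        path: path.to_string(),
        max_file_bytes: Some(INDEXABLE_FILE_LIMIT_BYTES),
    }
}

/// True when a file of `file_len` bytes would be returned untruncated.
pub fn can_read_in_full(req: &WholeFileRequest, file_len: u64) -> bool {
    file_len <= effective_limit(req)
}
```

Under these assumptions, a 2MB indexed file is readable in full with the search-specific request but not with `max_file_bytes: None`. Fetching bounded ranges around each fragment, the second suggestion, would avoid reading whole files at all.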

@moirahuang moirahuang requested a review from kevinyang372 May 19, 2026 03:25
Co-Authored-By: Oz <oz-agent@warp.dev>
Comment thread app/src/ai/codebase_context_policy.rs Outdated
&& codebase_context_enabled
}

#[cfg(test)]
Member

These unit tests seem unnecessary if they just verify a written-out logic condition.

Comment thread app/src/ai/codebase_context_policy.rs Outdated
#[cfg(not(target_family = "wasm"))]
use crate::features::FeatureFlag;

#[cfg(not(target_family = "wasm"))]
Member

Is this module just carrying this one function? If we actually want a separate module for all the policy conditions, I would consider moving is_auto_indexing_enabled here as well.

Contributor Author

or i can just move the remote_codebase_indexing_enabled logic out and remove the module entirely

}

let remote_paths = self.active_git_repo_paths_needing_auto_index();
emit_auto_index_requested_telemetry(
Member

I just noticed emit_auto_index_requested_telemetry is only emitted after the early return for !should_auto_index_codebase(CodebaseAutoIndexingSurface::Remote, ctx) -- this means it's not really accurate, right?

Contributor Author

i'm not fully following this comment. we want emit_auto_index_requested_telemetry to fire when we actually auto index. so don't we want it to emit only after we know that we aren't early returning?

Member

Ah I see -- I thought we were emitting it whenever it changed
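The ordering settled on in this thread can be sketched roughly as follows. `TelemetryLog` and `request_auto_index` are hypothetical names for this example, and the boolean argument stands in for the `should_auto_index_codebase` check.

```rust
/// Hypothetical telemetry sink: records one entry per "auto index
/// requested" event, holding the number of paths in that request.
#[derive(Debug, Default)]
pub struct TelemetryLog {
    pub auto_index_requested_events: Vec<usize>,
}

/// Returns the paths that will be auto-indexed. Telemetry is recorded
/// only after the policy gate, so an event means indexing was requested.
pub fn request_auto_index(
    should_auto_index: bool,
    paths_needing_index: Vec<String>,
    telemetry: &mut TelemetryLog,
) -> Vec<String> {
    if !should_auto_index {
        // Early return: nothing is indexed, so nothing is reported.
        return Vec::new();
    }
    telemetry
        .auto_index_requested_events
        .push(paths_needing_index.len());
    paths_needing_index
}
```

Emitting before the gate would instead count every evaluation, including the ones where auto-indexing is disabled.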


fn clear_remote_codebase_indexing_state(&mut self) -> Vec<RemotePath> {
let remote_paths = self.statuses.keys().cloned().collect::<Vec<_>>();
self.statuses.clear();
Member

nit: We should be able to take ownership of statuses since we are clearing it anyway -- this avoids the extra clone above
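A minimal sketch of that suggestion, assuming `statuses` is a `HashMap` keyed by `RemotePath` (the snippet above does not show the field's type, and `Status` and `RemoteIndexState` here are placeholders):

```rust
use std::collections::HashMap;

/// Placeholder for the real path type.
#[derive(Clone, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct RemotePath(pub String);

/// Placeholder for the real per-repo status type.
#[derive(Debug)]
pub enum Status {
    Indexing,
    Ready,
}

#[derive(Debug, Default)]
pub struct RemoteIndexState {
    pub statuses: HashMap<RemotePath, Status>,
}

impl RemoteIndexState {
    /// Swap the map out for an empty one and consume it, so the keys are
    /// moved into the result rather than cloned.
    pub fn clear_remote_codebase_indexing_state(&mut self) -> Vec<RemotePath> {
        std::mem::take(&mut self.statuses).into_keys().collect()
    }
}
```

`self.statuses.drain().map(|(path, _)| path).collect()` would behave the same way while keeping the map's allocated capacity for reuse.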

fragments: Vec<Fragment>,
context_lines: usize,
) -> HashSet<CodeContextLocation> {
// Map to collect fragments by file path
Member

Erm why the refactor here?

Contributor Author

there was duplicate logic with the remote path, so this consolidates it

}

#[cfg(test)]
mod tests {
Member

What are these unit tests actually verifying?

Contributor Author

to make sure that the remote metadata format is correct but i can simplify

Member

Can we keep them in a separate test file rather than in the middle of the actual app logic?

if metadata.is_empty() {
return Ok(SearchCodebaseResult::Success { files: vec![] });
}
let metadata = metadata
Member

It's a bit hard for me to grok what these refactors are for

Contributor Author

it's meant to convert from remote metadata to FragmentMetadata so we can reuse more local paths but i can try to improve the readability of this
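For readers of the thread, the kind of conversion being described might look like the sketch below. Every field and type shape here is hypothetical: the real remote metadata and `FragmentMetadata` definitions are not shown in this conversation, so this only illustrates the idea of normalizing remote results into the local type.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical shape of a remote search hit.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RemoteFragmentMetadata {
    pub repo_relative_path: String,
    pub start_byte: u64,
    pub end_byte: u64,
}

/// Hypothetical stand-in for the local `FragmentMetadata` type.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FragmentMetadata {
    pub absolute_path: String,
    pub byte_range: Range<usize>,
}

/// Convert one remote hit into the local representation so the same
/// fragment-building code can run on both. Returns `None` for inverted
/// ranges or offsets that do not fit in `usize`.
pub fn to_fragment_metadata(
    repo_root: &str,
    remote: RemoteFragmentMetadata,
) -> Option<FragmentMetadata> {
    if remote.start_byte > remote.end_byte {
        return None;
    }
    let start = usize::try_from(remote.start_byte).ok()?;
    let end = usize::try_from(remote.end_byte).ok()?;
    Some(FragmentMetadata {
        absolute_path: format!(
            "{}/{}",
            repo_root.trim_end_matches('/'),
            remote.repo_relative_path
        ),
        byte_range: start..end,
    })
}
```

Keeping the conversion in one small function makes the boundary between remote-specific data and shared local logic easier to see in review.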

@moirahuang moirahuang requested a review from kevinyang372 May 19, 2026 19:12
Member

@kevinyang372 kevinyang372 left a comment

I didn't read too much into the details, assuming most of the refactoring is moving logic around. But lmk if there are actually pieces that are worth calling out

}

#[cfg(test)]
mod tests {
Member

Can we keep them in a separate test file rather than in the middle of the actual app logic?

2 participants