Clean up codebase indexing parity, fix behavior when you turn off codebase indexing#11254
Clean up codebase indexing parity, fix behavior when you turn off codebase indexing#11254moirahuang wants to merge 5 commits into
Conversation
Co-Authored-By: Oz <oz-agent@warp.dev>
…nd remote, pass host for codebase context
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR centralizes Codebase Context policy, gates remote indexing/search on the shared enablement, scopes remote codebase context by host, and refactors remote search shaping to share local fragment-building logic.
Concerns
- Remote search now reads each candidate file as a whole file with the default file-read limit before fragment extraction. That limit is lower than the Codebase Context indexable file size, so results from larger indexed files can be silently dropped.
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| }) | ||
| }) | ||
| .collect(), | ||
| max_file_bytes: None, |
There was a problem hiding this comment.
build_fragments_from_file_contents, so remote search silently drops valid results. Request enough bytes to cover the index limit or fetch bounded ranges around each fragment before reranking.
Co-Authored-By: Oz <oz-agent@warp.dev>
| && codebase_context_enabled | ||
| } | ||
|
|
||
| #[cfg(test)] |
There was a problem hiding this comment.
These unit tests seem unnecessary if they are just verifying a written out logic condition
| #[cfg(not(target_family = "wasm"))] | ||
| use crate::features::FeatureFlag; | ||
|
|
||
| #[cfg(not(target_family = "wasm"))] |
There was a problem hiding this comment.
Is this function just carrying this one method? If we actually want a separate module for all the policy conditions I would consider moving is_auto_indexing_enabled here as well
There was a problem hiding this comment.
or i can just move the remote_codebase_indexing_enabled logic out and remove the module entirely
| } | ||
|
|
||
| let remote_paths = self.active_git_repo_paths_needing_auto_index(); | ||
| emit_auto_index_requested_telemetry( |
There was a problem hiding this comment.
I just noticed emit_auto_index_requested_telemetry is only emitted after the early return for !should_auto_index_codebase(CodebaseAutoIndexingSurface::Remote, ctx) -- this means its not really accurate right?
There was a problem hiding this comment.
i'm not fully following this comment. we want emit_auto_index_requested_telemetry to fire when we actually auto index. so don't we want it to emit only after we know that we aren't early returning?
There was a problem hiding this comment.
Ah I see -- I thought we are emitting it whenever it changed
|
|
||
| fn clear_remote_codebase_indexing_state(&mut self) -> Vec<RemotePath> { | ||
| let remote_paths = self.statuses.keys().cloned().collect::<Vec<_>>(); | ||
| self.statuses.clear(); |
There was a problem hiding this comment.
nit: We should be able to take ownership of statuses since we are clearing it anyway -- this avoids the extra clone above
| fragments: Vec<Fragment>, | ||
| context_lines: usize, | ||
| ) -> HashSet<CodeContextLocation> { | ||
| // Map to collect fragments by file path |
There was a problem hiding this comment.
Erm why the refactor here?
There was a problem hiding this comment.
there was duplicate logic w remote so it's consolidating it
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
What are these unit tests actually verifying?
There was a problem hiding this comment.
to make sure that the remote metadata format is correct but i can simplify
There was a problem hiding this comment.
Can we keep them in a separate test file rather than in the middle of the actual app logic?
| if metadata.is_empty() { | ||
| return Ok(SearchCodebaseResult::Success { files: vec![] }); | ||
| } | ||
| let metadata = metadata |
There was a problem hiding this comment.
It's a bit hard for me to grok what these refactors are for
There was a problem hiding this comment.
it's meant to convert from remote metadata to FragmentMetadata so we can reuse more local paths but i can try to improve the readability of this
kevinyang372
left a comment
There was a problem hiding this comment.
I didn't read too much into the detail assuming most of the refactoring is moving logic around. But lmk if there is actually pieces that are worth calling out
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
Can we keep them in a separate test file rather than in the middle of the actual app logic?
Description
Fix remote codebase indexing parity with local Codebase Context behavior:
SearchCodebasefrom implicitly starting indexing forNotIndexedrepos.Testing
Added unit tests + locally tested
./script/runScreenshots / Videos
https://www.loom.com/share/47167d614c704928978ae3ec7dfb924b
Agent Mode
CHANGELOG-NONE
Co-Authored-By: Oz oz-agent@warp.dev
Conversation: https://staging.warp.dev/conversation/06ff18ae-3bf1-4c50-9d31-4f5daa10e0de
Run: https://oz.staging.warp.dev/runs/019e3d90-2fd2-7c00-bd7e-9f79f109f938
Plans:
This PR was generated with Oz.