Skip to content

Diff zoekt git settings #635

Open
xvandish wants to merge 2 commits intosourcegraph:mainfrom
xvandish:diff-zoekt-git-settings-for-upstream
Open

Diff zoekt git settings #635
xvandish wants to merge 2 commits intosourcegraph:mainfrom
xvandish:diff-zoekt-git-settings-for-upstream

Conversation

@xvandish
Copy link
Contributor

@xvandish xvandish commented Aug 2, 2023

This is an optimization on top of what I introduced in #600.

Rather than always updating the zoekt settings for every repo, which can be at least 10 git calls per repo (10 is the current number of zoekt settings we store), we instead get all zoekt-* settings with one call, and then only make calls to update the changed values. I remember testing, and the code handled new and deleted zoekt-* keys fine. This is just a constant speedup, but it really limits the number of git calls and IO we perform, especially with larger sets of repos.

We also make another change, in that if the zoekt-settings are updated, we return the repoDest which triggers a re-index right away.

Copy link
Member

@keegancsmith keegancsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code looks a little tricky. Maybe you want to take this opportunity to switch to using go-git to do this? Look at the implementation of setFetch to see how you could do this.

outBuf := &bytes.Buffer{}
errBuf := &bytes.Buffer{}
cmd.Stdout = outBuf
cmd.Stderr = errBuf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't use errBuf

Comment on lines +82 to +85
settingsToUpdate, err := getZoektSettingsToUpdate(repoDest, settings, keys)
if err != nil {
return false, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this fails do you maybe wanna fallback to just updating everything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure that makes sense, good idea

@xvandish
Copy link
Contributor Author

xvandish commented Aug 3, 2023

this code looks a little tricky. Maybe you want to take this opportunity to switch to using go-git to do this? Look at the implementation of setFetch to see how you could do this.

Sure, I'll give it a shot! thanks for taking a look

@jtibshirani
Copy link
Contributor

@xvandish are you interested in taking this PR forward? It still seems like a helpful optimization!

@xvandish
Copy link
Contributor Author

@xvandish are you interested in taking this PR forward? It still seems like a helpful optimization!

Hi @jtibshirani ! Sorry for the delay, yes, still interested! I'll try to update this week -

keegancsmith added a commit that referenced this pull request Mar 20, 2026
Port the old idea from #635 onto current main so mirror updates stop rewriting every zoekt config value on every pass. The original patch shelled out to git config and never landed; moving the update path to go-git keeps it aligned with the rest of clone.go, makes it straightforward to detect real config changes, and lets CloneRepo return the repo path only when an existing clone actually needs reindexing.

While touching the port, also cover the stale-metadata case the old PR still missed. Empty settings now remove old zoekt.* keys instead of leaving outdated values behind, and remote.origin.url still stays in sync for existing clones.

Test Plan: go test ./gitindex
keegancsmith added a commit that referenced this pull request Mar 20, 2026
The #635 port accidentally introduced remote.origin.url rewrites for existing clones. That is a behavior change from the original flow and not required for the metadata-diff optimization.

This keeps the update path focused on zoekt settings only, while preserving the return-value signal that triggers reindex when metadata actually changes. The integration test now validates metadata mutation and no-op behavior without expecting remote URL churn.

Test Plan: go test ./...

Amp-Thread-ID: https://ampcode.com/threads/T-019d0c05-51b0-77ee-a1dc-cb60ce14712d
Co-authored-by: Amp <amp@ampcode.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants