Conversation
have CloneRepo return the repo so it gets re-indexed
keegancsmith
left a comment
There was a problem hiding this comment.
this code looks a little tricky. Maybe you want to take this opportunity to switch to using go-git to do this? Look at the implementation of setFetch to see how you could do this.
| outBuf := &bytes.Buffer{} | ||
| errBuf := &bytes.Buffer{} | ||
| cmd.Stdout = outBuf | ||
| cmd.Stderr = errBuf |
| settingsToUpdate, err := getZoektSettingsToUpdate(repoDest, settings, keys) | ||
| if err != nil { | ||
| return false, err | ||
| } |
There was a problem hiding this comment.
if this fails do you maybe wanna fallback to just updating everything?
There was a problem hiding this comment.
Yeah sure that makes sense, good idea
Sure, I'll give it a shot! thanks for taking a look |
|
@xvandish are you interested in taking this PR forward? It still seems like a helpful optimization! |
Hi @jtibshirani ! Sorry for the delay, yes, still interested! I'll try to update this week - |
Port the old idea from #635 onto current main so mirror updates stop rewriting every zoekt config value on every pass. The original patch shelled out to git config and never landed; moving the update path to go-git keeps it aligned with the rest of clone.go, makes it straightforward to detect real config changes, and lets CloneRepo return the repo path only when an existing clone actually needs reindexing. While touching the port, also cover the stale-metadata case the old PR still missed. Empty settings now remove old zoekt.* keys instead of leaving outdated values behind, and remote.origin.url still stays in sync for existing clones. Test Plan: go test ./gitindex
The #635 port accidentally introduced remote.origin.url rewrites for existing clones. That is a behavior change from the original flow and not required for the metadata-diff optimization. This keeps the update path focused on zoekt settings only, while preserving the return-value signal that triggers reindex when metadata actually changes. The integration test now validates metadata mutation and no-op behavior without expecting remote URL churn. Test Plan: go test ./... Amp-Thread-ID: https://ampcode.com/threads/T-019d0c05-51b0-77ee-a1dc-cb60ce14712d Co-authored-by: Amp <amp@ampcode.com>
This is an optimization on top of what I introduced in #600.
Rather than always updating the zoekt settings for every repo, which can be at least 10 git calls per repo (10 is the current number of zoekt settings we store), we instead get all
zoekt-*settings with one call, and then only make calls to update the changed values. I remember testing, and the code handled new and deletedzoekt-*keys fine. This is just a constant speedup, but it really limits the number of git calls and IO we perform, especially with larger sets of repos.We also make another change, in that if the zoekt-settings are updated, we return the
repoDestwhich triggers a re-index right away.