zoekt-indexserver: Per-Entry Branch Config #1011

Draft
dharesign wants to merge 1 commit into sourcegraph:main from dharesign:indexserver-branch-support
@dharesign
Contributor

Allow specifying which branches to index per config entry via the Branches field (comma-separated, e.g. "main,release-*", default HEAD) and BranchPrefix (e.g. "refs/tags/" to index tags instead of branches).

These values are persisted as zoekt.branches-to-index and zoekt.branch-prefix in each repo's git config by the mirror commands, then read at index time and passed to zoekt-git-index as -branches and -prefix flags.

Fixes #432.
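
As a rough illustration of the persistence step described above, the following sketch builds the `git config` invocations a mirror command might run for one entry. `ConfigEntry` and `gitConfigArgs` are hypothetical names chosen for this example, not the PR's actual code; only the two config keys and the two field names come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// ConfigEntry is a hypothetical stand-in for one mirror config entry.
// Only the two fields described in this PR are modeled.
type ConfigEntry struct {
	Branches     string // comma-separated, e.g. "main,release-*"; empty means HEAD
	BranchPrefix string // e.g. "refs/tags/"; empty means the indexer default
}

// gitConfigArgs returns one `git` argument list per value that should be
// persisted in the config of the repo at gitDir. Empty fields are skipped,
// so a repo without per-entry settings keeps an untouched git config.
func gitConfigArgs(gitDir string, e ConfigEntry) [][]string {
	var cmds [][]string
	if b := strings.TrimSpace(e.Branches); b != "" {
		cmds = append(cmds, []string{"--git-dir", gitDir, "config", "zoekt.branches-to-index", b})
	}
	if p := strings.TrimSpace(e.BranchPrefix); p != "" {
		cmds = append(cmds, []string{"--git-dir", gitDir, "config", "zoekt.branch-prefix", p})
	}
	return cmds
}

func main() {
	// The path below is a made-up example location.
	e := ConfigEntry{Branches: "main,release-*", BranchPrefix: "refs/tags/"}
	for _, args := range gitConfigArgs("/data/repos/example.git", e) {
		fmt.Println("git", strings.Join(args, " "))
	}
}
```

Each returned slice could be handed to `exec.Command("git", args...)`; the sketch only prints the commands so that it runs without a repository on disk.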

Member

@keegancsmith keegancsmith left a comment

Requesting changes, but not sure if my comments are feasible.

Not the biggest fan of needing to update so many mirror commands for this, but currently this is the most obvious way to do this. I don't think we have a nice way to map indexserver config <-> cloned repo.

Maybe a more general field in the zoekt config, read only by zoekt-indexserver, would make this code a bit cleaner and make it easier to add future features like this to zoekt-indexserver?

Comment on lines +167 to +179
	// Read zoekt.branches-to-index from the repo's git config. If set, pass
	// it to zoekt-git-index as -branches along with -allow_missing_branches.
	// The value is a comma-separated list of branch names (default: HEAD).
	if branchOut, err := exec.Command("git", "--git-dir", dir, "config", "zoekt.branches-to-index").Output(); err == nil {
		if branches := strings.TrimSpace(string(branchOut)); branches != "" {
			args = append(args, "-branches", branches, "-allow_missing_branches")
		}
	}
	if prefixOut, err := exec.Command("git", "--git-dir", dir, "config", "zoekt.branch-prefix").Output(); err == nil {
		if prefix := strings.TrimSpace(string(prefixOut)); prefix != "" {
			args = append(args, "-prefix", prefix)
		}
	}
Member

This feels like it violates a layering we had before: only zoekt-git-index would read the git config, not indexserver. Would it be possible to instead read these values in the git indexer and, if they are set, have them override branches-to-index or branch-prefix?

@dharesign
Contributor Author

How about modifying the pendingRepos channel in zoekt-indexserver to be a struct that has the mirror_config item that resulted in that repo. indexPendingRepo can then read from the config directly without having to put anything in the git config.
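
To make the idea concrete, here is a minimal sketch of what such a struct and channel could look like. `pendingRepo`, `MirrorEntry`, and `indexArgs` are names invented for this example and are not taken from zoekt-indexserver; the flags are the ones this PR already passes to zoekt-git-index, and passing the repo directory as the final argument is an assumption.

```go
package main

import "fmt"

// MirrorEntry is a hypothetical subset of one mirror_config item.
type MirrorEntry struct {
	Branches     string // comma-separated branch names; empty means HEAD
	BranchPrefix string // e.g. "refs/tags/"
}

// pendingRepo pairs a cloned repo directory with the config entry that
// produced it. The type and field names are assumptions for illustration.
type pendingRepo struct {
	Dir   string
	Entry MirrorEntry
}

// indexArgs builds zoekt-git-index arguments for one pending repo, taking
// the branch settings from the entry instead of from the repo's git config.
func indexArgs(p pendingRepo) []string {
	var args []string
	if p.Entry.Branches != "" {
		args = append(args, "-branches", p.Entry.Branches, "-allow_missing_branches")
	}
	if p.Entry.BranchPrefix != "" {
		args = append(args, "-prefix", p.Entry.BranchPrefix)
	}
	// Assumed here: the repo directory is the trailing positional argument.
	return append(args, p.Dir)
}

func main() {
	// The channel carries the struct rather than a bare directory string.
	pendingRepos := make(chan pendingRepo, 1)
	pendingRepos <- pendingRepo{
		Dir:   "/data/repos/example.git", // made-up path
		Entry: MirrorEntry{Branches: "main,release-*"},
	}
	close(pendingRepos)

	for p := range pendingRepos {
		fmt.Println(indexArgs(p))
	}
}
```

This only covers repos that enter the channel together with their entry; a repo discovered by scanning the file system would arrive with an empty `Entry`.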

@dharesign
Contributor Author

> How about modifying the pendingRepos channel in zoekt-indexserver to be a struct that has the mirror_config item that resulted in that repo. indexPendingRepo can then read from the config directly without having to put anything in the git config.

This is easy enough to do initially. However, once the repos have been cloned, periodicFetch finds the repositories by scanning the file system, and it's not immediately obvious how to tie those back to the mirror_config.

Additionally, it looks like if you remove items from mirror_config, nothing cleans up the already-cloned repos? So it's possible for the repos on disk to have no corresponding entry in the mirror_config file.

@dharesign dharesign marked this pull request as draft April 10, 2026 18:04
@dharesign
Contributor Author

https://github.com/sourcegraph/zoekt/compare/main...dharesign:zoekt:indexserver-branch-support-take2?expand=1 shows how the code would look for the initial clone. TBD how to handle periodicFetch.

@keegancsmith
Member

@dharesign Yeah I see how the current architecture causes issues now. It's quite nice that periodicFetch can just rely on git state.

Unclear what a good architecture is here now. My initial thought is this, what do you think? zoekt-git-index decides which branches to index by checking the following sources in order. It uses the first one it finds, with no merging:

  1. command line flag
  2. git config
  3. HEAD

Then it does make sense for indexserver to persist this state.
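
A minimal sketch of that precedence order, under a few assumptions: `resolveBranches` and `splitNonEmpty` are hypothetical helpers, an empty string stands for "not set", and both inputs are comma-separated lists. In a real command the "flag was set" check would need to distinguish an explicit flag from its default value, which this sketch does not model.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveBranches returns the branch list from the first non-empty source:
// flagValue (command line), then configValue (git config), then HEAD.
// Sources are never merged; the first one that yields branches wins.
func resolveBranches(flagValue, configValue string) []string {
	for _, v := range []string{flagValue, configValue} {
		if branches := splitNonEmpty(v); len(branches) > 0 {
			return branches
		}
	}
	return []string{"HEAD"}
}

// splitNonEmpty splits a comma-separated list, trims surrounding
// whitespace, and drops blank items.
func splitNonEmpty(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	fmt.Println(resolveBranches("dev", "main,release-*")) // flag wins
	fmt.Println(resolveBranches("", "main,release-*"))    // falls back to git config
	fmt.Println(resolveBranches("", ""))                  // falls back to HEAD
}
```

With this shape, indexserver would only need to write the git config value, and zoekt-git-index would remain the single reader of it.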

Finally, git already has the concept of branch specific config. I wonder if we should instead mark each branch we should index that way? That might be complicating things too much though.

Development

Successfully merging this pull request may close these issues.

zoekt-indexserver: Support Per-Repository Branch Selection