Skip to content

CBG-5153: Implement resync using CBGT for a single node#8057

Merged
RIT3shSapata merged 24 commits intomainfrom
CBG-5153
Mar 23, 2026
Merged

CBG-5153: Implement resync using CBGT for a single node#8057
RIT3shSapata merged 24 commits intomainfrom
CBG-5153

Conversation

@RIT3shSapata
Copy link
Copy Markdown
Contributor

@RIT3shSapata RIT3shSapata commented Feb 5, 2026

CBG-5153

Describe your PR here...

  • Use bullet points if there's more than one thing changed

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

  • Link upstream PRs
  • Update Go module dependencies when merged

Integration Tests

@RIT3shSapata RIT3shSapata self-assigned this Feb 5, 2026
Copy link
Copy Markdown
Collaborator

@torcolvin torcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments before I look more, I think the test is passing right now because it starts the non distributed resync.

Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp_test.go Outdated
Comment thread db/background_mgr_resync_dcp_test.go Outdated
Comment thread base/dcp_sharded.go Outdated
Copy link
Copy Markdown
Collaborator

@torcolvin torcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave a non exhaustive review for any cases where import was used for code, but I don't think I found all cases. Definitely give a look through the code pathways to see if you can find other locations.

Comment thread base/dcp_common.go Outdated
Comment thread base/dcp_sharded.go Outdated
Comment thread base/dcp_sharded.go Outdated
Comment thread base/dcp_sharded.go Outdated
// StartShardedDCPFeed initializes and starts a CBGT Manager targeting the provided bucket.
// dbName is used to define a unique path name for local file storage of pindex files
func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg) (*CbgtContext, error) {
func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg, resyncIndex bool) (*CbgtContext, error) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could take DestType instead of a boolean parameter, which would be generally preferred since it would be more reasonable. The name DestType might not be appropriate, but it could be ShardedDCPFeedType instead?

It might be worth figuring out below what is actually needed to do to split on the feed type and whether to pass these arguments into this function or pass a type parameter. Review this comment after looking through the rest of the code and the comments.

Comment thread base/dcp_sharded.go Outdated
Comment thread db/import_pindex.go Outdated
Comment thread db/import_pindex.go Outdated
Comment thread db/import_pindex.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/import_pindex.go Outdated
@torcolvin torcolvin mentioned this pull request Feb 18, 2026
4 tasks
Comment thread .github/workflows/ci.yml Outdated
Comment thread base/dcp_sharded.go Outdated
Comment thread base/dcp_sharded.go Outdated

// Given a dbName, generate a unique and length-constrained index name for CBGT to use as part of their DCP name.
func GenerateIndexName(dbName string) string {
func GenerateIndexName(dbName string, feedID string) string {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can pull this into a separate ticket but I think that this value needs to be <200 characters and we need to shorten this.

We can't break code that might work for import but as we make the index name longer, we want to make sure that this code works if there is a long database name.

// standardMetadataID returns either the dbName or a base64 encoded SHA256 hash of the dbName, whichever is shorter.
is an example of how we handle this for a metadataID

I think creating a separate ticket is a good plan for this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I mentioned in my comment earlier, I don't think it will ever exceed 34:
#8057 (comment)
we can pair up and discuss it further

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you file a separate ticket for this and resolve this when you are done?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread base/dcp_sharded.go Outdated
// the only index defined, and the name is safe. In that case, continue using legacy index name
// to avoid restarting the import processing from zero
func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string) (safeIndexName, previousUUID string) {
func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string, feedType ShardedDCPFeedType, feedID string) (safeIndexName, previousUUID string) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can pass feedType to both functions here, and through the callstack.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understand this, could you please elaborate more?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pass just feedType ShardedDCPFeedType? I think that the feedID should actually always be the same for resync, and doesn't need to be unique per run.

That was my mistake in the earlier specification.

Comment thread db/background_mgr_resync_dcp_test.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread base/constants_syncdocs.go Outdated
Comment thread base/dcp_sharded.go Outdated

// Given a dbName, generate a unique and length-constrained index name for CBGT to use as part of their DCP name.
func GenerateIndexName(dbName string) string {
func GenerateIndexName(dbName string, feedID string) string {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you file a separate ticket for this and resolve this when you are done?

Comment thread base/dcp_sharded.go Outdated
Comment thread base/dcp_sharded_test.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/database.go Outdated
@RIT3shSapata RIT3shSapata marked this pull request as ready for review March 16, 2026 15:20
Copilot AI review requested due to automatic review settings March 16, 2026 15:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This draft PR extends Sync Gateway’s sharded DCP/CBGT infrastructure to support running resync via CBGT (in addition to the existing import feed), including new feed-type aware naming/keying and associated test updates.

Changes:

  • Introduces sharded DCP feed typing (import vs resync) and updates CBGT index naming / dest-key generation to incorporate feed identity.
  • Registers CBGT pindex implementations for both import and resync index types, and updates import feed startup to use the new APIs.
  • Adds initial “distributed resync” wiring in the resync DCP background manager and adjusts/adds unit tests around index naming and resync behavior.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
rest/adminapitest/admin_api_test.go Updates admin API tests to use new dest-key generation for import feeds.
db/pindex.go Renames/expands pindex impl registration to cover both import and resync index types.
db/import_listener.go Updates import feed dest key + StartShardedDCPFeed call signature to pass feed type + feed ID.
db/database.go Updates database init to call the renamed pindex registration function.
db/background_mgr_resync_dcp.go Adds “distributed” (CBGT-based) resync startup path.
db/background_mgr_resync_dcp_test.go Extends resync DCP tests to exercise both non-distributed and distributed modes; adds helper waiters.
base/heartbeat.go Exports CouchbaseHeartBeater type (renaming from unexported) and updates method receivers accordingly.
base/heartbeat_test.go Adjusts tests to match exported CouchbaseHeartBeater type.
base/dcp_sharded.go Adds resync CBGT index type const, feed type enum, feedID-aware index naming, and generalized DestKey; updates CBGT index creation APIs.
base/dcp_sharded_test.go Updates tests for new GenerateIndexName/DestKey signatures and adds resync DestKey coverage.
base/dcp_common_test.go Updates index-name test to pass feedID.
base/dcp_test.go Updates CBGT index creation tests for new feed parameters and naming.
base/constants_syncdocs.go Adds metadata key prefixes + helpers for resync heartbeater/cfg.
base/constants_syncdocs_test.go Adds (but currently does not assert) expected resync prefix values.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread db/background_mgr_resync_dcp_test.go
Comment thread db/background_mgr_resync_dcp_test.go Outdated
Comment thread base/dcp_test.go Outdated
Comment thread db/background_mgr_resync_dcp.go
Comment thread base/dcp_sharded.go Outdated
Comment on lines 205 to 207
legacyIndexName := GenerateLegacyIndexName(dbName)

indexUUID, _ := getCBGTIndexUUID(c.Manager, indexName)
Comment thread db/background_mgr_resync_dcp.go Outdated

numpartitions := db.Options.ImportOptions.ImportPartitions
resyncCbgtContext, err := base.StartShardedDCPFeed(loggingCtx, db.Name, db.Options.GroupID, db.UUID, resyncHB, bucket,
scopeName, collectionNamesByScope[scopeName], numpartitions, resyncCfg, base.ResyncShardedDCPFeedType, clientOptions.CheckpointPrefix)
Comment thread db/pindex.go Outdated
Comment on lines +36 to +44
pIndexType := indexType + configGroup
base.InfofCtx(ctx, base.KeyDCP, "Registering PindexImplType for %s", pIndexType)
cbgt.RegisterPIndexImplType(pIndexType,
&cbgt.PIndexImplType{
New: getNewPIndexImplType(ctx),
Open: openPIndexImpl,
OpenEx: getOpenExPIndexImpl(ctx),
Description: "general/syncGateway-import " +
" - import processing for shared bucket access",
Comment thread base/constants_syncdocs.go Outdated
MetaKeyUserEmailPrefix // "useremail:"
MetaKeySessionPrefix // "session:"
MetaKeyResyncHeartBeaterPrefix // "resync_hb:"
MetaKeyResyncCfgPrefix // "resync_cfg
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks correct from copilot.

Comment thread base/constants_syncdocs_test.go Outdated
Comment on lines 143 to 145
resyncHeatbeatPrefix string
resyncCfgPrefix string
}{
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is correct I think.

Comment thread db/background_mgr_resync_dcp_test.go
Comment thread base/constants_syncdocs.go Outdated
Comment thread base/dcp_sharded.go Outdated
Comment thread base/dcp_sharded.go Outdated
Comment thread db/background_mgr_resync_dcp_test.go
Comment thread db/pindex.go Outdated
" - import processing for shared bucket access",
})
for _, indexType := range []string{base.CBGTIndexTypeSyncGatewayImport, base.CBGTIndexTypeSyncGatewayResync} {
pIndexType := indexType + configGroup
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to use config group for resync.

@torcolvin torcolvin mentioned this pull request Mar 17, 2026
4 tasks
commit 725b533
Author: Tor Colvin <tor.colvin@couchbase.com>
Date:   Thu Mar 19 10:56:25 2026 -0400

    CBG-5192 add some log fixes for RTE errors (#8129)

    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

commit 99040a0
Author: Tor Colvin <tor.colvin@couchbase.com>
Date:   Wed Mar 18 13:52:07 2026 -0400

    Create sharded dcp options struct (#8112)

commit f78dcc5
Author: Tor Colvin <tor.colvin@couchbase.com>
Date:   Tue Mar 17 17:10:49 2026 -0400

    Add factory folder to gitignore (#8114)

commit 90a39e6
Author: Tor Colvin <tor.colvin@couchbase.com>
Date:   Mon Mar 16 10:46:50 2026 -0400

    Setup 4.0.4/3.3.4 builds (#8106)

commit 50f8ed8
Author: Ben Brooks <ben.brooks@couchbase.com>
Date:   Mon Mar 16 14:16:43 2026 +0000

    CBG-5203: Silently handle PING BLIP messages for cbl-js (#8101)

    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

commit ca1a6a9
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Mar 16 09:15:29 2026 -0400

    Bump golang.org/x/net from 0.51.0 to 0.52.0 (#8105)

    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit e2e5a9c
Author: Ben Brooks <ben.brooks@couchbase.com>
Date:   Fri Mar 13 13:46:01 2026 +0000

    Test: Add a timeout in ...Delta...BypassRevCache...WhenInFlightRevChanged (#8102)

commit c7da947
Author: Gregory Newman-Smith <109068393+gregns1@users.noreply.github.com>
Date:   Fri Mar 13 13:45:49 2026 +0000

    CBG-4651: change _online/_offline endpoints to work for both persistent and non persistent config (#8093)
Merge branch 'main' of github.com:couchbase/sync_gateway into CBG-5153
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements a CBGT-backed “distributed” DCP path for resync (single-node), alongside refactors to generalize CBGT index naming and dest-key generation for multiple feed types (import vs resync).

Changes:

  • Add CBGT resync feed type support (index naming, dest keys, pindex registration) and wire distributed resync startup via StartShardedDCPFeed.
  • Introduce new metadata key prefixes for resync heartbeater/cfg documents.
  • Update affected unit/integration tests to use the new generalized helpers and cover resync variants.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
rest/adminapitest/admin_api_test.go Update tests to use new generalized DestKey(...) helper.
db/pindex.go Register pindex impl types for import and resync.
db/import_listener.go Switch import feed to use generalized DestKey + GenerateCBGTIndexName.
db/database.go Rename call site to RegisterPindexImpl.
db/background_mgr_resync_dcp.go Add distributed (CBGT-based) resync startup path.
db/background_mgr_resync_dcp_test.go Extend resync tests to exercise distributed vs non-distributed paths; adjust wait helpers.
base/heartbeat.go Export Couchbase heartbeater type (CouchbaseHeartBeater) and update method receivers.
base/heartbeat_test.go Align tests with exported heartbeater type.
base/dcp_sharded.go Add resync index type constant; generalize index-name and dest-key generation.
base/dcp_sharded_test.go Update dest-key tests to cover import vs resync feed types.
base/dcp_common_test.go Validate CBGT index-name length for both import and resync.
base/dcp_test.go Update CBGT index tests for generalized index-name generator; add resync cases.
base/constants_syncdocs.go Add resync heartbeat/cfg metadata prefixes.
base/constants_syncdocs_test.go Add assertions for new resync metadata prefixes.
Comments suppressed due to low confidence (1)

db/background_mgr_resync_dcp.go:305

  • In distributed mode, doneChan is never set (remains nil), so the <-doneChan select case is disabled and the resync run will never reach the completion path on its own; it will block until the terminator is closed. If distributed resync is intended to be finite (like the non-distributed one-shot), wire up a completion signal from the CBGT feed and/or explicitly stop the CBGT manager when processing is complete so BackgroundManager can transition to Completed.
	select {
	case <-doneChan:
		base.InfofCtx(ctx, base.KeyAll, "[%s] Finished running sync function. %d/%d docs changed", resyncLoggingID, r.DocsChanged.Value(), r.DocsProcessed.Value())
		err = dcpClient.Close()

Comment thread base/dcp_test.go
Comment thread db/background_mgr_resync_dcp_test.go Outdated
Comment thread db/pindex.go Outdated
Comment thread db/background_mgr_resync_dcp.go
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread base/dcp_sharded.go
Comment thread db/background_mgr_resync_dcp_test.go
Comment thread db/pindex.go
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread base/constants_syncdocs.go
Comment thread base/heartbeat.go Outdated
@RIT3shSapata RIT3shSapata requested a review from torcolvin March 23, 2026 10:23
@torcolvin
Copy link
Copy Markdown
Collaborator

One more change I just thought of - I think I want to name the checkpoint names differently so that we can see the difference.

@RIT3shSapata RIT3shSapata merged commit 32b2167 into main Mar 23, 2026
48 checks passed
@RIT3shSapata RIT3shSapata deleted the CBG-5153 branch March 23, 2026 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants