CBG-5153: Implement resync using CBGT for a single node#8057
CBG-5153: Implement resync using CBGT for a single node#8057RIT3shSapata merged 24 commits intomainfrom
Conversation
torcolvin
left a comment
There was a problem hiding this comment.
A few comments before I look more, I think the test is passing right now because it starts the non distributed resync.
torcolvin
left a comment
There was a problem hiding this comment.
I gave a non exhaustive review for any cases where import was used for code, but I don't think I found all cases. Definitely give a look through the code pathways to see if you can find other locations.
| // StartShardedDCPFeed initializes and starts a CBGT Manager targeting the provided bucket. | ||
| // dbName is used to define a unique path name for local file storage of pindex files | ||
| func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg) (*CbgtContext, error) { | ||
| func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg, resyncIndex bool) (*CbgtContext, error) { |
There was a problem hiding this comment.
This could take DestType instead of a boolean parameter, which would be generally preferred since it would be more reasonable. The name DestType might not be appropriate, but it could be ShardedDCPFeedType instead?
It might be worth figuring out below what is actually needed to do to split on the feed type and whether to pass these arguments into this function or pass a type parameter. Review this comment after looking through the rest of the code and the comments.
|
|
||
| // Given a dbName, generate a unique and length-constrained index name for CBGT to use as part of their DCP name. | ||
| func GenerateIndexName(dbName string) string { | ||
| func GenerateIndexName(dbName string, feedID string) string { |
There was a problem hiding this comment.
We can pull this into a separate ticket but I think that this value needs to be <200 characters and we need to shorten this.
We can't break code that might work for import but as we make the index name longer, we want to make sure that this code works if there is a long database name.
sync_gateway/rest/config_manager.go
Line 832 in aa83a21
I think creating a separate ticket is a good plan for this.
There was a problem hiding this comment.
Like I mentioned in my comment earlier, I don't think it will ever exceed 34:
#8057 (comment)
we can pair up and discuss it further
There was a problem hiding this comment.
Can you file a separate ticket for this and resolve this when you are done?
There was a problem hiding this comment.
| // the only index defined, and the name is safe. In that case, continue using legacy index name | ||
| // to avoid restarting the import processing from zero | ||
| func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string) (safeIndexName, previousUUID string) { | ||
| func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string, feedType ShardedDCPFeedType, feedID string) (safeIndexName, previousUUID string) { |
There was a problem hiding this comment.
I think we can pass feedType to both functions here, and through the callstack.
There was a problem hiding this comment.
I did not understand this, could you please elaborate more?
There was a problem hiding this comment.
Can you pass just feedType ShardedDCPFeedType? I think that the feedID should actually always be the same for resync, and doesn't need to be unique per run.
That was my mistake in the earlier specification.
|
|
||
| // Given a dbName, generate a unique and length-constrained index name for CBGT to use as part of their DCP name. | ||
| func GenerateIndexName(dbName string) string { | ||
| func GenerateIndexName(dbName string, feedID string) string { |
There was a problem hiding this comment.
Can you file a separate ticket for this and resolve this when you are done?
There was a problem hiding this comment.
Pull request overview
This draft PR extends Sync Gateway’s sharded DCP/CBGT infrastructure to support running resync via CBGT (in addition to the existing import feed), including new feed-type aware naming/keying and associated test updates.
Changes:
- Introduces sharded DCP feed typing (import vs resync) and updates CBGT index naming / dest-key generation to incorporate feed identity.
- Registers CBGT pindex implementations for both import and resync index types, and updates import feed startup to use the new APIs.
- Adds initial “distributed resync” wiring in the resync DCP background manager and adjusts/adds unit tests around index naming and resync behavior.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| rest/adminapitest/admin_api_test.go | Updates admin API tests to use new dest-key generation for import feeds. |
| db/pindex.go | Renames/expands pindex impl registration to cover both import and resync index types. |
| db/import_listener.go | Updates import feed dest key + StartShardedDCPFeed call signature to pass feed type + feed ID. |
| db/database.go | Updates database init to call the renamed pindex registration function. |
| db/background_mgr_resync_dcp.go | Adds “distributed” (CBGT-based) resync startup path. |
| db/background_mgr_resync_dcp_test.go | Extends resync DCP tests to exercise both non-distributed and distributed modes; adds helper waiters. |
| base/heartbeat.go | Exports CouchbaseHeartBeater type (renaming from unexported) and updates method receivers accordingly. |
| base/heartbeat_test.go | Adjusts tests to match exported CouchbaseHeartBeater type. |
| base/dcp_sharded.go | Adds resync CBGT index type const, feed type enum, feedID-aware index naming, and generalized DestKey; updates CBGT index creation APIs. |
| base/dcp_sharded_test.go | Updates tests for new GenerateIndexName/DestKey signatures and adds resync DestKey coverage. |
| base/dcp_common_test.go | Updates index-name test to pass feedID. |
| base/dcp_test.go | Updates CBGT index creation tests for new feed parameters and naming. |
| base/constants_syncdocs.go | Adds metadata key prefixes + helpers for resync heartbeater/cfg. |
| base/constants_syncdocs_test.go | Adds (but currently does not assert) expected resync prefix values. |
You can also share your feedback on Copilot code review. Take the survey.
| legacyIndexName := GenerateLegacyIndexName(dbName) | ||
|
|
||
| indexUUID, _ := getCBGTIndexUUID(c.Manager, indexName) |
|
|
||
| numpartitions := db.Options.ImportOptions.ImportPartitions | ||
| resyncCbgtContext, err := base.StartShardedDCPFeed(loggingCtx, db.Name, db.Options.GroupID, db.UUID, resyncHB, bucket, | ||
| scopeName, collectionNamesByScope[scopeName], numpartitions, resyncCfg, base.ResyncShardedDCPFeedType, clientOptions.CheckpointPrefix) |
| pIndexType := indexType + configGroup | ||
| base.InfofCtx(ctx, base.KeyDCP, "Registering PindexImplType for %s", pIndexType) | ||
| cbgt.RegisterPIndexImplType(pIndexType, | ||
| &cbgt.PIndexImplType{ | ||
| New: getNewPIndexImplType(ctx), | ||
| Open: openPIndexImpl, | ||
| OpenEx: getOpenExPIndexImpl(ctx), | ||
| Description: "general/syncGateway-import " + | ||
| " - import processing for shared bucket access", |
| MetaKeyUserEmailPrefix // "useremail:" | ||
| MetaKeySessionPrefix // "session:" | ||
| MetaKeyResyncHeartBeaterPrefix // "resync_hb:" | ||
| MetaKeyResyncCfgPrefix // "resync_cfg |
There was a problem hiding this comment.
This comment looks correct from copilot.
| resyncHeatbeatPrefix string | ||
| resyncCfgPrefix string | ||
| }{ |
There was a problem hiding this comment.
This comment is correct I think.
| " - import processing for shared bucket access", | ||
| }) | ||
| for _, indexType := range []string{base.CBGTIndexTypeSyncGatewayImport, base.CBGTIndexTypeSyncGatewayResync} { | ||
| pIndexType := indexType + configGroup |
There was a problem hiding this comment.
We don't want to use config group for resync.
commit 725b533 Author: Tor Colvin <tor.colvin@couchbase.com> Date: Thu Mar 19 10:56:25 2026 -0400 CBG-5192 add some log fixes for RTE errors (#8129) Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> commit 99040a0 Author: Tor Colvin <tor.colvin@couchbase.com> Date: Wed Mar 18 13:52:07 2026 -0400 Create sharded dcp options struct (#8112) commit f78dcc5 Author: Tor Colvin <tor.colvin@couchbase.com> Date: Tue Mar 17 17:10:49 2026 -0400 Add factory folder to gitignore (#8114) commit 90a39e6 Author: Tor Colvin <tor.colvin@couchbase.com> Date: Mon Mar 16 10:46:50 2026 -0400 Setup 4.0.4/3.3.4 builds (#8106) commit 50f8ed8 Author: Ben Brooks <ben.brooks@couchbase.com> Date: Mon Mar 16 14:16:43 2026 +0000 CBG-5203: Silently handle PING BLIP messages for cbl-js (#8101) Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> commit ca1a6a9 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon Mar 16 09:15:29 2026 -0400 Bump golang.org/x/net from 0.51.0 to 0.52.0 (#8105) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit e2e5a9c Author: Ben Brooks <ben.brooks@couchbase.com> Date: Fri Mar 13 13:46:01 2026 +0000 Test: Add a timeout in ...Delta...BypassRevCache...WhenInFlightRevChanged (#8102) commit c7da947 Author: Gregory Newman-Smith <109068393+gregns1@users.noreply.github.com> Date: Fri Mar 13 13:45:49 2026 +0000 CBG-4651: change _online/_offline endpoints to work for both persistent and non persistent config (#8093) Merge branch 'main' of github.com:couchbase/sync_gateway into CBG-5153
There was a problem hiding this comment.
Pull request overview
Implements a CBGT-backed “distributed” DCP path for resync (single-node), alongside refactors to generalize CBGT index naming and dest-key generation for multiple feed types (import vs resync).
Changes:
- Add CBGT resync feed type support (index naming, dest keys, pindex registration) and wire distributed resync startup via
StartShardedDCPFeed. - Introduce new metadata key prefixes for resync heartbeater/cfg documents.
- Update affected unit/integration tests to use the new generalized helpers and cover resync variants.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| rest/adminapitest/admin_api_test.go | Update tests to use new generalized DestKey(...) helper. |
| db/pindex.go | Register pindex impl types for import and resync. |
| db/import_listener.go | Switch import feed to use generalized DestKey + GenerateCBGTIndexName. |
| db/database.go | Rename call site to RegisterPindexImpl. |
| db/background_mgr_resync_dcp.go | Add distributed (CBGT-based) resync startup path. |
| db/background_mgr_resync_dcp_test.go | Extend resync tests to exercise distributed vs non-distributed paths; adjust wait helpers. |
| base/heartbeat.go | Export Couchbase heartbeater type (CouchbaseHeartBeater) and update method receivers. |
| base/heartbeat_test.go | Align tests with exported heartbeater type. |
| base/dcp_sharded.go | Add resync index type constant; generalize index-name and dest-key generation. |
| base/dcp_sharded_test.go | Update dest-key tests to cover import vs resync feed types. |
| base/dcp_common_test.go | Validate CBGT index-name length for both import and resync. |
| base/dcp_test.go | Update CBGT index tests for generalized index-name generator; add resync cases. |
| base/constants_syncdocs.go | Add resync heartbeat/cfg metadata prefixes. |
| base/constants_syncdocs_test.go | Add assertions for new resync metadata prefixes. |
Comments suppressed due to low confidence (1)
db/background_mgr_resync_dcp.go:305
- In distributed mode,
doneChanis never set (remains nil), so the<-doneChanselect case is disabled and the resync run will never reach the completion path on its own; it will block until the terminator is closed. If distributed resync is intended to be finite (like the non-distributed one-shot), wire up a completion signal from the CBGT feed and/or explicitly stop the CBGT manager when processing is complete so BackgroundManager can transition to Completed.
select {
case <-doneChan:
base.InfofCtx(ctx, base.KeyAll, "[%s] Finished running sync function. %d/%d docs changed", resyncLoggingID, r.DocsChanged.Value(), r.DocsProcessed.Value())
err = dcpClient.Close()
|
One more change I just thought of - I think I want to name the checkpoint names differently so that we can see the difference. |
CBG-5153
Describe your PR here...
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiDependencies (if applicable)
Integration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/386/ (failed)GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/398/GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/399/GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/400/