CASSSIDECAR-409 Adding gossip based safety check to Live Migration data copy task endpoint #321

Open
nvharikrishna wants to merge 2 commits into apache:trunk from nvharikrishna:409-lm-data-copy-safety-check-trunk

@nvharikrishna

Making this change as part of CEP-40.

CASSSIDECAR-409 Add gossip-based safety checks to Live Migration data copy endpoint

Summary of changes:

  • Added a gossip-based safety check for the live migration data copy task
    • Data copy task is rejected if Sidecar is able to connect to the destination JMX port
    • Data copy task fails if the source is not found in gossip
    • Data copy task fails if the destination is found in gossip (that is, the destination has already been started)
  • Introduced configuration for gossip checks under the live_migration section of the Sidecar configuration
  • Updated the live migration data copy request to accept optional values for the gossip-based checks
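
To make the order of the three checks concrete, here is a minimal sketch in plain Java. It is not the code in this PR: the class name `DataCopySafetyCheck`, the method `rejectionReason`, and the representation of gossip state as a set of address strings are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the three safety checks; not the Sidecar implementation. */
final class DataCopySafetyCheck
{
    /**
     * Returns a rejection reason, or an empty Optional when the data copy task may proceed.
     *
     * @param destinationJmxReachable whether a JMX connection to the destination succeeded
     * @param gossipMembers           addresses currently present in gossip (assumed representation)
     * @param source                  address of the source instance
     * @param destination             address of the destination instance
     */
    static Optional<String> rejectionReason(boolean destinationJmxReachable,
                                            Set<String> gossipMembers,
                                            String source,
                                            String destination)
    {
        // Check 1: a reachable JMX port suggests Cassandra is running on the destination.
        if (destinationJmxReachable)
        {
            return Optional.of("Destination JMX port is reachable");
        }
        // Check 2: the source must be a known member of the ring.
        if (!gossipMembers.contains(source))
        {
            return Optional.of("Source " + source + " not found in gossip");
        }
        // Check 3: the destination must not have joined gossip already.
        if (gossipMembers.contains(destination))
        {
            return Optional.of("Destination " + destination + " found in gossip");
        }
        return Optional.empty();
    }

    public static void main(String[] args)
    {
        Set<String> gossip = Set.of("10.0.0.1", "10.0.0.2");
        System.out.println(rejectionReason(false, gossip, "10.0.0.1", "10.0.0.9"));
        System.out.println(rejectionReason(false, gossip, "10.0.0.1", "10.0.0.2"));
    }
}
```

Returning a reason rather than a boolean keeps the rejection message close to the check that produced it, which is convenient when the result is turned into an HTTP error response.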

@nvharikrishna force-pushed the 409-lm-data-copy-safety-check-trunk branch from 5709a4b to 4a5daf8 on February 20, 2026 16:59

@frankgh left a comment

Looks good so far. I have a few minor comments.

*/
private Future<Void> verifyCassandraNotRunning(InstanceMetadata localInstance)
{
    return executorPools.internal().executeBlocking(() -> {

This does not need to run on a blocking executor. There is no I/O happening here; the check is an in-memory check. Which raises the question of whether you want to do an online check here instead of relying on a cached value.
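
A minimal sketch of the first point, assuming the running state is available as a cached in-memory value: the check can return an already-completed future instead of dispatching to a worker pool. The sketch uses the JDK's `CompletableFuture` so it is self-contained; in Vert.x code the analogous calls would be `Future.succeededFuture()` and `Future.failedFuture(...)`. The class name `InMemoryCheck` and the `BooleanSupplier` parameter are hypothetical and do not appear in the PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: an in-memory check completed on the calling thread. */
final class InMemoryCheck
{
    /**
     * @param cachedIsRunning supplies the cached "Cassandra is running" flag (assumed to be in memory)
     * @return a future that is already completed when this method returns
     */
    static CompletableFuture<Void> verifyNotRunning(BooleanSupplier cachedIsRunning)
    {
        // Reading a cached flag does no I/O, so no executor hand-off is needed.
        if (cachedIsRunning.getAsBoolean())
        {
            return CompletableFuture.failedFuture(
                new IllegalStateException("Cassandra appears to be running on the destination"));
        }
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args)
    {
        System.out.println(verifyNotRunning(() -> false).isCompletedExceptionally());
        System.out.println(verifyNotRunning(() -> true).isCompletedExceptionally());
    }
}
```

If the check were changed to an online probe (for example, opening a JMX connection), it would perform I/O, and running it on a blocking executor would then be appropriate.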

private GossipInfoResponse.GossipInfo findGossipInfo(GossipInfoResponse gossipResponse,
                                                     String instance) throws UnknownHostException
{
    InstanceMetadata metadata = instancesMetadata.instanceFromHost(instance);

Is this instance managed by the local Sidecar? instancesMetadata is only available for instances managed by the local Sidecar process.
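
One way to make the concern concrete, as a sketch only: a lookup that treats "not managed by this Sidecar" as an explicit outcome, so the caller has to decide what to do for a remote host. The class `LocalInstances`, its map of host to instance id, and the method `findLocal` are hypothetical stand-ins and are not the Sidecar `InstancesMetadata` API.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a registry of instances managed by the local Sidecar. */
final class LocalInstances
{
    private final Map<String, Integer> instanceIdByHost;

    LocalInstances(Map<String, Integer> instanceIdByHost)
    {
        this.instanceIdByHost = Map.copyOf(instanceIdByHost);
    }

    /** Returns the local instance id, or empty when the host is not managed locally. */
    Optional<Integer> findLocal(String host)
    {
        return Optional.ofNullable(instanceIdByHost.get(host));
    }

    public static void main(String[] args)
    {
        LocalInstances local = new LocalInstances(Map.of("10.0.0.9", 1));
        // The migration source is typically served by a different Sidecar process,
        // so the lookup for it comes back empty here.
        System.out.println(local.findLocal("10.0.0.1").isPresent());
        System.out.println(local.findLocal("10.0.0.9").isPresent());
    }
}
```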

+ ". It must be greater than zero when specified.");
}

if (gossipFetchMaxRetries != null && gossipFetchMaxRetries <= 0)

Should we also have a guardrail around an upper bound here? I'm thinking of a malicious payload that could make the service retry for a long time.
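
A bounded validation along these lines might look like the sketch below. The cap value, the constant name `MAX_GOSSIP_FETCH_RETRIES`, and the class `RetryLimits` are assumptions for illustration; the PR does not define an upper bound.

```java
/** Hypothetical sketch of request validation with both a lower and an upper bound. */
final class RetryLimits
{
    // Assumed cap; a real limit would more likely come from the Sidecar configuration.
    static final int MAX_GOSSIP_FETCH_RETRIES = 10;

    /**
     * Validates the optional retry count supplied in a request.
     *
     * @param gossipFetchMaxRetries the requested value, or null when not specified
     * @throws IllegalArgumentException when the value is outside [1, MAX_GOSSIP_FETCH_RETRIES]
     */
    static void validateMaxRetries(Integer gossipFetchMaxRetries)
    {
        if (gossipFetchMaxRetries == null)
        {
            return; // optional field: the server-side default applies
        }
        if (gossipFetchMaxRetries <= 0 || gossipFetchMaxRetries > MAX_GOSSIP_FETCH_RETRIES)
        {
            throw new IllegalArgumentException("Invalid gossipFetchMaxRetries " + gossipFetchMaxRetries
                                               + ". It must be between 1 and " + MAX_GOSSIP_FETCH_RETRIES
                                               + " when specified.");
        }
    }

    public static void main(String[] args)
    {
        validateMaxRetries(null); // accepted: not specified
        validateMaxRetries(3);    // accepted: within bounds
        try
        {
            validateMaxRetries(1_000_000);
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

Capping the retry count alone does not bound total time if the retry interval is also request-controlled, so the same kind of bound would apply to any interval or timeout field.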

2 participants