Skip to content

CASSANALYTICS-60: CDC support for Cassandra 5.0 commit logs#175

Open
lukasz-antoniak wants to merge 2 commits intoapache:trunkfrom
lukasz-antoniak:CASSANALYTICS-60
Open

CASSANALYTICS-60: CDC support for Cassandra 5.0 commit logs#175
lukasz-antoniak wants to merge 2 commits intoapache:trunkfrom
lukasz-antoniak:CASSANALYTICS-60

Conversation

@lukasz-antoniak
Copy link
Member

Fixes Analytics reader of Cassandra 5.0 commit logs (part of CASSANALYTICS-60).

Using environment variable CASSANDRA_VERSION=5.0.0 (or 4.0.0), developers can run CDC unit tests with different Cassandra versions.

Refactoring all tests to @ParameterizedTest requires more efforts, and will be performed in follow-up PR. Static variables like CdcTests.BRIDGE are references from multiple places in the code.

protected final KafkaStats kafkaStats;

public KafkaPublisher(TopicSupplier topicSupplier,
public KafkaPublisher(String version,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API change between Sidecar and Analytics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for calling it out.
Thinking from API design perspective, how about passing CassandraVersion, instead of String version value? It, then, ensures the version is always valid. Sidecar anyways has access to CassandraVersion. It would be clear for Sidecar to create the version object.

* @param <T>
*/
public abstract class RangeTombstoneBuilder<T>
public interface RangeTombstoneBuilder<T>
Copy link
Member Author

@lukasz-antoniak lukasz-antoniak Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classloader clash when FourZeroRangeTombstoneBuilder is loaded by bridge classloader and RangeTombstoneBuilder (abstract parent class) by application classloader.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add the note in code so it stays as interface.

protected final KafkaStats kafkaStats;

public KafkaPublisher(TopicSupplier topicSupplier,
public KafkaPublisher(String version,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for calling it out.
Thinking from API design perspective, how about passing CassandraVersion, instead of String version value? It, then, ensures the version is always valid. Sidecar anyways has access to CassandraVersion. It would be clear for Sidecar to create the version object.

}
}

public static <T> T executeActionOnBridgeClassLoader(@NotNull CassandraVersion version, Throwing.Function<ClassLoader, T> action)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add @VisibleForTesting

* @param <T>
*/
public abstract class RangeTombstoneBuilder<T>
public interface RangeTombstoneBuilder<T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add the note in code so it stays as interface.

Copy link
Contributor

@yifan-c yifan-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed the PR is marked as Draft. I will re-review if there are new changes.
Please share the CI link too.

@lukasz-antoniak
Copy link
Member Author

@yifan-c, refactored all CDC tests to @ParameterizedTest and replaced static bridge references. Opening the PR. I will not squash commits to make it easier to review. Once positive review is given, I will squash, rebase and add change log.

@lukasz-antoniak lukasz-antoniak marked this pull request as ready for review March 9, 2026 09:11
Copy link
Contributor

@yifan-c yifan-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor comments. newer commits looks good overall.

*/

package org.apache.cassandra.cdc;
package org.apache.cassandra.cdc.test;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the test package used for? Seems to group the test foundation classes, but CdcTester is not in this new package.

*/
public class CdcBridgeTestInjector implements InvocationInterceptor
{
public void interceptTestTemplateMethod(Invocation<Void> invocation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add @Override

* Setups all fields of {@code CdcTestBase} based on {@code CassandraVersion} method parameter
* before each test method execution.
*/
public class CdcBridgeTestInjector implements InvocationInterceptor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

this.position = (int) log.maxOffset();
}
// TODO(lantoniak): Does not this change break markers in the stored state and prevents commit log from re-read?
// if (statusTracker.shouldContinue())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change form commit 34c6c38 broke trunk as well. @jyothsnakonisa, can you also have a look?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is being fixed in PR #179.

Patched by Lukasz Antoniak; Reviewed by Yifan Cai for CASSANALYTICS-60
Copy link
Contributor

@jyothsnakonisa jyothsnakonisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, some minor comments. I have local cdc setup, lets merge it after I test it with that setup on a 5.0 cassandra version

Comment on lines +35 to +38
RangeTombstone buildTombstone(List<Value> start, boolean isStartInclusive, List<Value> end, boolean isEndInclusive);
boolean canBuild();
RangeTombstone build();
void add(T marker);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add documentation to these interface methods?

Copy link
Member Author

@lukasz-antoniak lukasz-antoniak Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are little self-explanatory and not sure if JavaDoc will add much more value. Do you have something on your mind? This PR has changed abstract class to interface and those methods did not have explicit JavaDoc before.

Comment on lines +27 to +28
RangeTombstone rangeTombstone;
boolean expectOpen = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be make them protected?

Comment on lines +63 to +66
public abstract CommitLogInstance createCommitLogInstance(Path path);

public abstract TableIdLookup internalTableIdLookup();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add java docs?


public CommitLogInstance createCommitLogInstance(Path path)
{
return new FourZeroCommitLog(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per my understanding commit log implementation has not changed between 4.0 and 5.0 but just double checking that returning FourZeroCommitLog is correct or not

Comment on lines +92 to +104
return new TableIdLookup()
{
@Nullable
public UUID lookup(String keyspace, String table) throws NoSuchElementException
{
TableMetadata tm = Schema.instance.getTableMetadata(keyspace, table);
if (tm == null)
{
throw new NoSuchElementException();
}
return tm.id.asUUID();
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are creating a new instance of TableIdLookup for every invocation of this method, instead can you have a single instance and return it on every call?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

DatabaseDescriptor.setCommitLogSyncGroupWindow(30);
DatabaseDescriptor.setCommitLogSegmentSize(commitLogSegmentSize);
DatabaseDescriptor.getRawConfig().commitlog_total_space = new DataStorageSpec.IntMebibytesBound(1024);
DatabaseDescriptor.setCommitLogWriteDiskAccessMode(Config.DiskAccessMode.direct);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can u check if it's okay to hardcode it, curious if this changes based on hardware

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not seem to have an option to skip this setting. Only mmap and direct are supported based on 5.0.5 code. direct seemed like a safer option. AbstractCommitLogSegmentManager#createSegmentBuilder uses other modes for encrypted or compressed commit logs automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants