CASSANALYTICS-60: CDC support for Cassandra 5.0 commit logs#175
CASSANALYTICS-60: CDC support for Cassandra 5.0 commit logs#175lukasz-antoniak wants to merge 2 commits intoapache:trunkfrom
Conversation
| protected final KafkaStats kafkaStats; | ||
|
|
||
| public KafkaPublisher(TopicSupplier topicSupplier, | ||
| public KafkaPublisher(String version, |
There was a problem hiding this comment.
API change between Sidecar and Analytics.
There was a problem hiding this comment.
Thanks for calling it out.
Thinking from API design perspective, how about passing CassandraVersion, instead of String version value? It, then, ensures the version is always valid. Sidecar anyways has access to CassandraVersion. It would be clear for Sidecar to create the version object.
| * @param <T> | ||
| */ | ||
| public abstract class RangeTombstoneBuilder<T> | ||
| public interface RangeTombstoneBuilder<T> |
There was a problem hiding this comment.
Classloader clash when FourZeroRangeTombstoneBuilder is loaded by bridge classloader and RangeTombstoneBuilder (abstract parent class) by application classloader.
There was a problem hiding this comment.
maybe add the note in code so it stays as interface.
| protected final KafkaStats kafkaStats; | ||
|
|
||
| public KafkaPublisher(TopicSupplier topicSupplier, | ||
| public KafkaPublisher(String version, |
There was a problem hiding this comment.
Thanks for calling it out.
Thinking from API design perspective, how about passing CassandraVersion, instead of String version value? It, then, ensures the version is always valid. Sidecar anyways has access to CassandraVersion. It would be clear for Sidecar to create the version object.
| } | ||
| } | ||
|
|
||
| public static <T> T executeActionOnBridgeClassLoader(@NotNull CassandraVersion version, Throwing.Function<ClassLoader, T> action) |
There was a problem hiding this comment.
nit: add @VisibleForTesting
| * @param <T> | ||
| */ | ||
| public abstract class RangeTombstoneBuilder<T> | ||
| public interface RangeTombstoneBuilder<T> |
There was a problem hiding this comment.
maybe add the note in code so it stays as interface.
yifan-c
left a comment
There was a problem hiding this comment.
Noticed the PR is marked as Draft. I will re-review if there are new changes.
Please share the CI link too.
|
@yifan-c, refactored all CDC tests to |
yifan-c
left a comment
There was a problem hiding this comment.
some minor comments. newer commits looks good overall.
| */ | ||
|
|
||
| package org.apache.cassandra.cdc; | ||
| package org.apache.cassandra.cdc.test; |
There was a problem hiding this comment.
What is the test package used for? Seems to group the test foundation classes, but CdcTester is not in this new package.
| */ | ||
| public class CdcBridgeTestInjector implements InvocationInterceptor | ||
| { | ||
| public void interceptTestTemplateMethod(Invocation<Void> invocation, |
| * Setups all fields of {@code CdcTestBase} based on {@code CassandraVersion} method parameter | ||
| * before each test method execution. | ||
| */ | ||
| public class CdcBridgeTestInjector implements InvocationInterceptor |
db23c2f to
b5b31a4
Compare
| this.position = (int) log.maxOffset(); | ||
| } | ||
| // TODO(lantoniak): Does not this change break markers in the stored state and prevents commit log from re-read? | ||
| // if (statusTracker.shouldContinue()) |
There was a problem hiding this comment.
I think this change form commit 34c6c38 broke trunk as well. @jyothsnakonisa, can you also have a look?
Patched by Lukasz Antoniak; Reviewed by Yifan Cai for CASSANALYTICS-60
b5b31a4 to
fffdf31
Compare
jyothsnakonisa
left a comment
There was a problem hiding this comment.
Looks good overall, some minor comments. I have local cdc setup, lets merge it after I test it with that setup on a 5.0 cassandra version
| RangeTombstone buildTombstone(List<Value> start, boolean isStartInclusive, List<Value> end, boolean isEndInclusive); | ||
| boolean canBuild(); | ||
| RangeTombstone build(); | ||
| void add(T marker); |
There was a problem hiding this comment.
Can you add documentation to these interface methods?
There was a problem hiding this comment.
They are little self-explanatory and not sure if JavaDoc will add much more value. Do you have something on your mind? This PR has changed abstract class to interface and those methods did not have explicit JavaDoc before.
| RangeTombstone rangeTombstone; | ||
| boolean expectOpen = true; |
There was a problem hiding this comment.
May be make them protected?
| public abstract CommitLogInstance createCommitLogInstance(Path path); | ||
|
|
||
| public abstract TableIdLookup internalTableIdLookup(); | ||
|
|
There was a problem hiding this comment.
Can you please add java docs?
|
|
||
| public CommitLogInstance createCommitLogInstance(Path path) | ||
| { | ||
| return new FourZeroCommitLog(path); |
There was a problem hiding this comment.
As per my understanding commit log implementation has not changed between 4.0 and 5.0 but just double checking that returning FourZeroCommitLog is correct or not
| return new TableIdLookup() | ||
| { | ||
| @Nullable | ||
| public UUID lookup(String keyspace, String table) throws NoSuchElementException | ||
| { | ||
| TableMetadata tm = Schema.instance.getTableMetadata(keyspace, table); | ||
| if (tm == null) | ||
| { | ||
| throw new NoSuchElementException(); | ||
| } | ||
| return tm.id.asUUID(); | ||
| } | ||
| }; |
There was a problem hiding this comment.
You are creating a new instance of TableIdLookup for every invocation of this method, instead can you have a single instance and return it on every call?
| DatabaseDescriptor.setCommitLogSyncGroupWindow(30); | ||
| DatabaseDescriptor.setCommitLogSegmentSize(commitLogSegmentSize); | ||
| DatabaseDescriptor.getRawConfig().commitlog_total_space = new DataStorageSpec.IntMebibytesBound(1024); | ||
| DatabaseDescriptor.setCommitLogWriteDiskAccessMode(Config.DiskAccessMode.direct); |
There was a problem hiding this comment.
Can u check if it's okay to hardcode it, curious if this changes based on hardware
There was a problem hiding this comment.
We do not seem to have an option to skip this setting. Only mmap and direct are supported based on 5.0.5 code. direct seemed like a safer option. AbstractCommitLogSegmentManager#createSegmentBuilder uses other modes for encrypted or compressed commit logs automatically.
Fixes Analytics reader of Cassandra 5.0 commit logs (part of CASSANALYTICS-60).
Using environment variable
CASSANDRA_VERSION=5.0.0(or4.0.0), developers can run CDC unit tests with different Cassandra versions.Refactoring all tests to
@ParameterizedTestrequires more efforts, and will be performed in follow-up PR. Static variables likeCdcTests.BRIDGEare references from multiple places in the code.