Make LogicalTableConfig serialiser / deserialiser pluggable#17678
Make LogicalTableConfig serialiser / deserialiser pluggable#17678Jackie-Jiang merged 19 commits intoapache:masterfrom
Conversation
Move fromZNRecord/toZNRecord methods from TableConfigSerDeUtils and LogicalTableConfigUtils into the config classes themselves. This makes it easier to extend these classes — subclasses can override serialization/deserialization behavior directly. - Add helix-core dependency to pinot-spi for ZNRecord access - Add fromZNRecord()/toZNRecord() to TableConfig and LogicalTableConfig - Delete TableConfigSerDeUtils (all callers updated) - Remove fromZNRecord/toZNRecord from LogicalTableConfigUtils (validation stays) - Update all callers across 15 files to use the new methods directly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #17678 +/- ##
============================================
- Coverage 63.21% 63.21% -0.01%
Complexity 1454 1454
============================================
Files 3176 3179 +3
Lines 191002 191054 +52
Branches 29204 29210 +6
============================================
+ Hits 120742 120770 +28
- Misses 60847 60858 +11
- Partials 9413 9426 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Moves ZNRecord serialization/deserialization responsibilities into TableConfig and LogicalTableConfig so serialization can be co-located with the config types (and toZNRecord() can be overridden by subclasses), removing the old utility-based SerDe entry points.
Changes:
- Added
fromZNRecord()/toZNRecord()toorg.apache.pinot.spi.config.table.TableConfigandorg.apache.pinot.spi.data.LogicalTableConfig - Removed
TableConfigSerDeUtilsand removed SerDe methods fromLogicalTableConfigUtils(kept validation) - Updated callers across tools, controller, common caches, and tests to use the new APIs
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/ValidateConfigCommand.java | Switches table config parsing to TableConfig.fromZNRecord() |
| pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/MoveReplicaGroup.java | Switches table config parsing to TableConfig.fromZNRecord() |
| pinot-tools/src/main/java/org/apache/pinot/tools/UpdateSegmentState.java | Switches table config parsing to TableConfig.fromZNRecord() |
| pinot-spi/src/main/java/org/apache/pinot/spi/data/LogicalTableConfig.java | Introduces fromZNRecord()/toZNRecord() for logical table configs |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java | Introduces fromZNRecord()/toZNRecord() for table configs |
| pinot-spi/pom.xml | Adds Helix dependency needed by ZNRecord APIs now in SPI |
| pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/test/java/.../RealtimeToOfflineSegmentsTaskExecutorTest.java | Updates tests to use tableConfig.toZNRecord() |
| pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/test/java/.../PurgeTaskExecutorTest.java | Updates tests to use tableConfig.toZNRecord() |
| pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/test/java/.../MergeRollupTaskExecutorTest.java | Updates tests to use tableConfig.toZNRecord() |
| pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkDimensionTableOverhead.java | Updates perf benchmark to use tableConfig.toZNRecord() |
| pinot-core/src/test/java/org/apache/pinot/core/data/manager/offline/DimensionTableDataManagerTest.java | Updates tests to use tableConfig.toZNRecord() |
| pinot-controller/src/test/java/org/apache/pinot/controller/api/TableSizeReaderTest.java | Updates tests to use tableConfig.toZNRecord() |
| pinot-controller/src/main/java/org/apache/pinot/controller/util/TableRetentionValidator.java | Switches table config parsing to TableConfig.fromZNRecord() |
| pinot-common/src/test/java/org/apache/pinot/common/utils/config/TableConfigSerDeUtilsTest.java | Repoints SerDe tests to TableConfig.fromZNRecord()/toZNRecord() |
| pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableConfigSerDeUtils.java | Deletes table config utility SerDe |
| pinot-common/src/main/java/org/apache/pinot/common/utils/LogicalTableConfigUtils.java | Removes logical table utility SerDe while keeping validation logic |
| pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java | Switches to new SerDe APIs for table/logical-table configs |
| pinot-common/src/main/java/org/apache/pinot/common/config/provider/ZkTableCache.java | Switches to new SerDe APIs for table/logical-table configs |
| pinot-common/src/main/java/org/apache/pinot/common/config/provider/LogicalTableMetadataCache.java | Switches to new SerDe APIs for table/logical-table configs |
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.helix</groupId> | ||
| <artifactId>helix-core</artifactId> |
There was a problem hiding this comment.
Adding helix-core to pinot-spi makes Helix a transitive dependency for SPI consumers, which increases coupling and may be undesirable for a SPI layer. If keeping SPI lightweight is a goal, consider isolating ZNRecord SerDe into a separate module (e.g., a Helix-specific SPI extension) or exposing a Helix-agnostic representation (Map/JSON) from SPI while keeping ZNRecord conversions in pinot-common.
| <artifactId>helix-core</artifactId> | |
| <artifactId>helix-core</artifactId> | |
| <optional>true</optional> |
There was a problem hiding this comment.
IIRC we don't want to introduce Helix as a dependency to the SPI layer
cc - @Jackie-Jiang
There was a problem hiding this comment.
+1. And this is the main reason why we have a separate TableConfigSerDeUtils in pinot-common.
In order to override the ser/de behavior of them, you should make the ser/de component pluggable, but keep it within pinot-common to not pollute pinot-spi
There was a problem hiding this comment.
Thanks updated the PR.
I've added a config record object which is very similar to the data structure of ZNRecord. Technically it's coupled to ZNRecord, but there is no Helix dependency now. This is now a much simpler change than any alternative I could think of.
There was a problem hiding this comment.
Updated the PR to make serialization/deserialization pluggable.
Avoided TableConfig for now and will do that in a seperate PR
|
TableRebalanceIntegrationTest test failures look to be intermittent failures as the test is passing locally and is unrelated to this change |
Move fromZNRecord/toZNRecord methods from TableConfigSerDeUtils and LogicalTableConfigUtils into the config classes themselves. This makes it easier to extend these classes — subclasses can override serialization/deserialization behavior directly. - Add helix-core dependency to pinot-spi for ZNRecord access - Add fromZNRecord()/toZNRecord() to TableConfig and LogicalTableConfig - Delete TableConfigSerDeUtils (all callers updated) - Remove fromZNRecord/toZNRecord from LogicalTableConfigUtils (validation stays) - Update all callers across 15 files to use the new methods directly - Add LogicalTableConfigSerDeTest for ZNRecord round-trip coverage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.helix</groupId> | ||
| <artifactId>helix-core</artifactId> |
There was a problem hiding this comment.
+1. And this is the main reason why we have a separate TableConfigSerDeUtils in pinot-common.
In order to override the ser/de behavior of them, you should make the ser/de component pluggable, but keep it within pinot-common to not pollute pinot-spi
Replace ZNRecord-based SerDe on TableConfig and LogicalTableConfig with a helix-agnostic ConfigRecord POJO in pinot-spi. Bridge utilities in pinot-common (TableConfigSerDeUtils, LogicalTableConfigUtils) handle ZNRecord conversion, keeping helix-core out of the SPI layer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
The test failures are due to the docker environment issue. This is unrealted to this PR. |
abhishekbafna
left a comment
There was a problem hiding this comment.
Added a comment to consider before merging.
| if (best == null) { | ||
| throw new RuntimeException("No implementation of LogicalTableConfigSerDe found"); | ||
| } |
There was a problem hiding this comment.
Instead of the exception, it can init with the default SerDe which is implemented in the PR. What is the advantage of exception over default initialisation?
There was a problem hiding this comment.
exception fails fast and lets us know that for some reason there is no matching instance.
| import org.apache.pinot.spi.data.LogicalTableConfig; | ||
|
|
||
|
|
||
| /** |
There was a problem hiding this comment.
(minor) Consider using markdown style of javadoc, which is easier to read.
Same for other javadocs
| /** | ||
| * Deserializes a {@link LogicalTableConfig} from the given {@link ZNRecord}. | ||
| */ | ||
| LogicalTableConfig fromZNRecord(ZNRecord znRecord) |
There was a problem hiding this comment.
We should also include a serialize interface (i.e. ZNRecord toZNRecord(LogicalTableConfig logicalTableConfig))
| * Config classes in pinot-spi use this for serialization/deserialization, while bridge utilities | ||
| * in pinot-common handle conversion between ConfigRecord and ZNRecord. | ||
| */ | ||
| public class ConfigRecord { |
There was a problem hiding this comment.
Let's not introduce this class. We don't need this layer of abstraction, which can introduce unnecessary overhead
There was a problem hiding this comment.
I don't think we can manage without this class, because we want the serialisation code toConfigRecord to be extendable for classes that extend LogicalTableConfig.
Without having the Helix dependency in the SPI layer, I don't think we can do that easily.
Another way is if we avoid using simple fields and just store the LogicalTableConfig as a pure JSON, then it can work, but since that's not how TableConfig or LogicalTableConfig is stored currently, it will be a backwards incompatible change.
There was a problem hiding this comment.
Also if we're using simple fields and need to translate class attributes into field names, I really think it makes sense to have the logic in the same class so that its easier to maintain.
There was a problem hiding this comment.
removed ConfigRecord and moved refactoring to DefaultLogicalTableConfigSerDe
The PR introduces a pluggable serialization/deserialization framework for LogicalTableConfig, enabling LogicalTable to be extended based on different use cases