[FLINK-39291][API / Type Serialization System] FlinkScalaKryoInstantiator InstantiatorStrategy fix#27812
Open
Myracle wants to merge 1 commit intoapache:masterfrom
Conversation
51837a3 to
6aadbdc
Compare
Collaborator
Contributor
Author
|
@flinkbot run azure |
…ator InstantiatorStrategy fix
6aadbdc to
a4ba7a5
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What is the purpose of the change
FlinkScalaKryoInstantiator.newKryo()sets the KryoInstantiatorStrategyto a pureStdInstantiatorStrategy(Objenesis), which creates all object instances by bypassing constructors entirely. This is inconsistent with the fallback path inKryoSerializer.getKryoInstance(), which usesDefaultInstantiatorStrategy(tries no-arg constructors first) withStdInstantiatorStrategyas a fallback.When
flink-table-api-scalais on the classpath,KryoSerializerloadsFlinkScalaKryoInstantiatorvia reflection and uses the Kryo instance it creates. Any class that relies on its no-arg constructor to initialize internal state will then fail during deserialization, because the constructor is never invoked. A concrete example is Apache Iceberg'sSerializableByteBufferMap, which initializes its internalwrappedmap (Map<Integer, ByteBuffer>) in the no-arg constructor. When deserialized via Kryo'sMapSerializer,MapSerializer.create()callskryo.newInstance(), which bypasses the constructor underStdInstantiatorStrategy, leavingwrapped = null. Subsequently,MapSerializer.read()callsmap.put(key, value), which delegates towrapped.put()— resulting in aNullPointerException.This pull request aligns
FlinkScalaKryoInstantiator'sInstantiatorStrategywith the default behavior inKryoSerializer.getKryoInstance()— usingDefaultInstantiatorStrategyas the primary strategy andStdInstantiatorStrategyas the fallback — so that no-arg constructors are properly invoked when available.Brief change log
FlinkScalaKryoInstantiator.newKryo()to useDefaultInstantiatorStrategywithStdInstantiatorStrategyas a fallback, instead of a pureStdInstantiatorStrategy. This ensures Kryo first attempts to invoke no-arg constructors (via reflection) and only falls back to Objenesis (bypassing constructors) when no suitable constructor is found.FlinkScalaKryoInstantiatorTestto verify:InstantiatorStrategyis correctly configured asDefaultInstantiatorStrategywithStdInstantiatorStrategyfallback.KryoSerializer(which loadsFlinkScalaKryoInstantiatorvia reflection) produces a Kryo instance with the correct strategy.Verifying this change
This change added tests and can be verified as follows:
FlinkScalaKryoInstantiatorTest.testInstantiatorStrategyIsDefaultWithFallback()that verifies the Kryo instance usesDefaultInstantiatorStrategyas the primary strategy withStdInstantiatorStrategyas the fallback.FlinkScalaKryoInstantiatorTest.testClassWithNoArgConstructorIsProperlyInitialized()that verifieskryo.newInstance()invokes the no-arg constructor and properly initializes fields (simulating Iceberg'sSerializableByteBufferMapscenario).FlinkScalaKryoInstantiatorTest.testSerializationRoundTripWithMapField()that performs an end-to-end serialize/deserialize round-trip for an object with a Map field initialized in the constructor, reproducing the exact NPE failure scenario.FlinkScalaKryoInstantiatorTest.testClassWithoutNoArgConstructorUsesObjenesisFallback()that verifies classes without no-arg constructors can still be instantiated via the Objenesis fallback strategy.FlinkScalaKryoInstantiatorTest.testKryoSerializerUsesCorrectStrategy()that verifiesKryoSerializer(which loadsFlinkScalaKryoInstantiatorvia reflection when it's on the classpath) produces a Kryo instance with the correctInstantiatorStrategy.Does this pull request potentially affect one of the following parts:
@Public(Evolving): noInstantiatorStrategyused byFlinkScalaKryoInstantiator, ensuring no-arg constructors are invoked during Kryo deserialization whenflink-table-api-scalais on the classpath)DefaultInstantiatorStrategyadds a negligible reflection-based constructor lookup before falling back to Objenesis; for classes with registered custom serializers,newInstanceis not called by Kryo at all)Documentation