-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49789][SQL] Handling of generic parameter with bounds while creating encoders #48252
base: master
Are you sure you want to change the base?
Conversation
…e/SPARK-46679 CDPD-58844. Upgrade janino to 3.1.10 Change-Id: I8744bb020e5fedcc0e9e4bc08c556c98a80406ba Sync workflow files | Triggered by Kitchen/RE-github-workflows Sync workflow files | Triggered by Kitchen/RE-github-workflows CDPD-58844. Upgrade janino to 3.1.10 Change-Id: I8744bb020e5fedcc0e9e4bc08c556c98a80406ba Sync workflow files | Triggered by Kitchen/RE-github-workflows Sync workflow files | Triggered by Kitchen/RE-github-workflows
…ng exception in creating encoder
…anges. added tests. cleanup of tests
…anges. added tests. cleanup of tests
…ests for edge cases. Flattened the recursive call
…ests for edge cases. Flattened the recursive call
…lient Connect, as per review feedback
@hvanhovell It was brought to my notice by @squito that they are 2 different issues. I am also adding the review comments put in the previous PR ( which I will be closing) hvanhovell 2 days ago done hvanhovell 2 days ago done hvanhovell 2 days ago done hvanhovell 2 days ago Unifying it will other java serialziation code, is something which I intend to do in my refactoring.. Author
I am hesitant at this point to further complicate the code, unless you all think that its worthwhile to do it in this PR. |
…ders for generic type and parameterized type
…irectly serializable generic type
…irectly serializable generic type
…precated method causing build failure
What changes were proposed in this pull request?
If a bean has generic types with bounds ( eg T <: SomeClass>) , as getters/setters, then depending upon the nature of the bounds, if it is java Serializable, KryoSerializable or a UDT Type, then appropriate encoder is created. If the bound is of first two types, then the data is represented as a BinaryType, while if the bound is of UDT type then schema / behaviour follows UDT Type.
Following things are considered while fixing the issue:
Since the concrete class type of the generic parameter is not available, it is not possible to create instance of the class ( during deser), if the bound represents any other type than the 3 mentioned above.
Because a UDT class can appear anywhwere in the bound's hierarchy, all the super classes of the bound ( including the bound) is considered and checked . To create the encoder the preference is UDTType followed by JavaSerializer or KryoSerializer, whichever shows first.
The majority of the code change in JavaTypeInference is a boolean check , to ignore any data type match when considering bound, except UDT and Type Variable ( type variable is included because T <: S and say S <: Serializable).
Following cases are considered which are sort of boundary cases:
If the generic bean is of type
`
Bean[T <: UDTClass] {
@BeanProperty var udt: T = _
}
Then the UDTEncoder will be created for the field
But if the Bean is of type
Bean[T <: UDTDerivedClass] {
@BeanProperty var udt: T = _
}
where UDTDerivedClass <: UDTClass
Then a JavaSerializable encoder will be created , even though the class hierarchy of UDTDerivedClass contains UDTClass. The reason being that concrete instance created by UDTType would be of UDTClass which is not assignable to
UDTDerivedClass
`
similarly for non generic bean class having UDTDerivedClass as bean property will also use Java Serialization encoder. ( added test for the same). The reason for JavaSerializationEncoder is same as that for Generic one.
Why are the changes needed?
To fix the regression in spark 3.3 onwards, where the bean having a generic type as return value, throws EncoderNotFoundException.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added bug tests..
Was this patch authored or co-authored using generative AI tooling?
No