-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-8453: Add XSD Support to XML Reader (Part 1) #2824
Conversation
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/xsd/DrillXSDSchemaUtils.java
Show resolved
Hide resolved
contrib/format-xml/src/test/java/org/apache/drill/exec/store/xml/xsd/TestXSDSchema.java
Outdated
Show resolved
Hide resolved
Sorry bogged down. Will review soon. |
I'm ok with merging this. It's still a bit of a work-in-progress (hence the Part 1) Some TODOs in here are mine. I do intend to get to them, but no reason to hold up this change set for that. I highly recommend that you squash these 15 commits together into one coherent commit rather than commit all 15 as is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@jnturton Are we good to go? |
@mbeckerle We always squash commits for Drill PRs :-) |
The test was not constructing the right expectedSchema.
Test enhanced to use attributes and global attributeGroup definition and reference as well as local attribute declarations. Does not detect, nor handle, the fact that an attribute is allowed to have the exact same name as an element in XML Schema. Currently does not distinguish optional/required attributes (they're all addNullable), and does not implement XSD default/fixed. Note that attribute definitions in XSD come last in the element definition lexically; however, they come out first in Drill field order. This is consistent with the way XML text looks where the element tag carries attributes first, in the opening tag, then child elements follow after the opening tag.
Added test of complex element with only attributes. Added numerous assertion checks to be sure invariants are being maintained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@jnturton I fixed imports. |
--------- Co-authored-by: Michael Beckerle <[email protected]>
DRILL-8453: Add XSD Support to XML Reader (Part 1)
Description
This PR is a part of a series to add better support for reading XML data to Drill. One of the main challenges is that XML data does not have a way of inferring data types, nor does it have a way of detecting arrays.
The only way to do this really well is to have a schema. Some XML files link a schema definition file to the data. This PR adds the capability for Drill to map XSD schema files into Drill schemas.
The current plan is as follows:
Documentation
No user facing changes.
Testing
Added new unit tests.