ENH: Add support for writing more ExtensionArray/PyArrow types #257
This PR improves the ExtensionArray writing support from #232 to handle other array types (including PyArrow ones).
Before it, such fields were often silently and unexpectedly cast to `object` dtype and then to `OFTString`.
Instead of casting the data array to a basic numpy dtype and then using the data array dtype to infer the OGR type in the Cython code, the high-level code now infers the equivalent numpy dtype and passes it into Cython, leaving the ExtensionArray data as-is. As a side benefit, field data arrays are now never copied during writing.
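To illustrate the idea of inferring a dtype without touching the data, here is a minimal sketch. It is not the code from this PR: the helper name `equivalent_numpy_dtype` is hypothetical, and it assumes the extension dtype exposes a `numpy_dtype` attribute (as pandas masked dtypes such as `Int64` do), falling back to `object` otherwise.

```python
import numpy as np
import pandas as pd


def equivalent_numpy_dtype(values):
    """Return a numpy dtype describing `values` without converting the data.

    Hypothetical helper for illustration only; not part of pyogrio.
    """
    dtype = values.dtype
    # Plain numpy-backed columns already carry a numpy dtype.
    if isinstance(dtype, np.dtype):
        return dtype
    # Assumption: the extension dtype advertises its numpy equivalent
    # through a `numpy_dtype` attribute (true for pandas masked dtypes).
    numpy_dtype = getattr(dtype, "numpy_dtype", None)
    if isinstance(numpy_dtype, np.dtype):
        return numpy_dtype
    # Nothing better is known, so report the generic object dtype.
    return np.dtype("object")


ints = pd.Series([1, None, 3], dtype="Int64")
inferred = equivalent_numpy_dtype(ints)
```

Only the dtype is inspected here, so `ints` keeps its `Int64` extension dtype and its missing value; the inferred dtype can be passed along separately from the data.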
Adding `DTYPE_OGR_FIELD_TYPES["string"]` allows the `infer_field_types` function to handle the unambiguous string data case, where no additional type inference is needed.
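The effect of a dedicated `"string"` entry can be sketched with a toy lookup table. The table contents and the `lookup_field_type` helper below are assumptions made for illustration; they do not reproduce the actual `DTYPE_OGR_FIELD_TYPES` mapping or the `infer_field_types` implementation.

```python
# Toy mapping from dtype names to OGR field type names.
# Assumption: this is NOT the real DTYPE_OGR_FIELD_TYPES table, only a stand-in.
TOY_FIELD_TYPES = {
    "int32": "OFTInteger",
    "int64": "OFTInteger64",
    "float64": "OFTReal",
    "string": "OFTString",  # unambiguous: no value inspection required
}


def lookup_field_type(dtype_name):
    """Return the OGR field type name, or None if further inference is needed."""
    return TOY_FIELD_TYPES.get(dtype_name)


string_type = lookup_field_type("string")
object_type = lookup_field_type("object")
```

In this sketch a `"string"` column resolves directly from the table, while an `"object"` column returns `None` and would still need its values inspected.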