timestamp errors working with Delta Lake tables (and presumably Parquet?) #6396
I originally had an issue that prevented writing Delta Lake tables with timestamps due to the nanosecond thing, fixed by #6386. Before that PR, I was working around it by casting the timestamp column to a string and then casting it back to timestamp. This still caused some issues, so I was using […] With that, things seem fine: I can write the table without issue and then read it back:

import ibis
ibis.options.interactive = True
t = ibis.read_delta("/path/to/delta")
t.count()

But when doing something more complex (that scans the entire table?):

import datetime
from ibis import _
t.filter(_.timestamp >= (datetime.datetime.now() - datetime.timedelta(days=30))).count()

I get an error indicating that some of the timestamps aren't the same as the others?

Throwing this here for SEO in case anyone else hits this; will answer after posting.
Replies: 1 comment
Solving this led me here: https://stackoverflow.com/questions/59682833/pyarrow-lib-arrowinvalid-casting-from-timestampns-to-timestampms-would-los
and here: https://delta-io.github.io/delta-rs/python/api_reference.html#writing-deltatables
with good examples in here: https://github.com/delta-io/delta-rs/blob/5dc89b389d830cf72f68ace54adab85c81c26a69/python/tests/test_writer.py#L462-L488

The solution is to do the following on writing:

import ibis
from pyarrow.dataset import ParquetFileFormat
t = ...  # get t however
t.to_delta(
    "path/to/delta",
    mode="overwrite",
    file_options=ParquetFileFormat().make_write_options().update(coerce_timestamps="us"),
)

It was a little hard to figure out how to use the …
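For anyone wondering what coercing timestamps to microseconds amounts to, and why a cast between resolutions can be rejected as lossy, here is a rough sketch in plain Python. It is not how pyarrow or delta-rs implement it: `coerce_ns_to_us`, `us_to_datetime`, and the `allow_truncation` flag are hypothetical names made up for illustration, and it assumes timestamps are stored as integer counts since the Unix epoch. The error wording only loosely mirrors the message from the Stack Overflow link above.

```python
from datetime import datetime, timedelta, timezone

NS_PER_US = 1_000  # nanoseconds in one microsecond
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def coerce_ns_to_us(ns_values, allow_truncation=False):
    """Convert integer nanosecond timestamps to integer microseconds.

    Hypothetical helper: raises ValueError when a value carries
    sub-microsecond digits and truncation has not been allowed.
    """
    us_values = []
    for ns in ns_values:
        us, remainder = divmod(ns, NS_PER_US)
        if remainder and not allow_truncation:
            raise ValueError(f"casting {ns} ns to us would lose data")
        us_values.append(us)
    return us_values


def us_to_datetime(us):
    """Interpret microseconds since the Unix epoch as a UTC datetime."""
    return EPOCH + timedelta(microseconds=us)


if __name__ == "__main__":
    # Values that are whole microseconds convert without loss.
    print(coerce_ns_to_us([1_000, 2_000_000]))
    # A value with leftover nanoseconds is rejected unless truncation is allowed.
    try:
        coerce_ns_to_us([1_001])
    except ValueError as exc:
        print("error:", exc)
    print(coerce_ns_to_us([1_001], allow_truncation=True))
```

The point of the sketch is only that every value in a column ends up at the same resolution after coercion, which is what the `coerce_timestamps="us"` write option is asking for.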