Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-5487] Proper semantics for TIMESTAMP WITH LOCAL TIME ZONE #205

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wnob
Copy link
Contributor

@wnob wnob commented Jan 19, 2023

This is a work-in-progress as there are still some tests failing, but I believe the proper semantics are now codified in AvaticaResultSetConversionsTest. Was hoping to get some early feedback to see if I was barking up the wrong tree with these changes.

@@ -77,7 +78,9 @@ public AvaticaResultSet(AvaticaStatement statement,
ResultSetMetaData resultSetMetaData,
TimeZone timeZone,
Meta.Frame firstFrame) throws SQLException {
super(timeZone);
// Initialize the parent ArrayFactory with the UTC time zone because otherwise an array of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you just tore down a Chesterton's fence.

Copy link
Contributor Author

@wnob wnob Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't refute that with 100% certainty, but I will point out:

  • Currently, at head, if you just make this 1 line change to pass UTC_ZONE to the super constructor (here, the superclass being ArrayFactoryImpl) and run all the tests, none of them fail.
  • Alternatively, if you keep this 1 line as is, but go into AvaticaResultSetConversionTest and delete the timeZone property for the connection, which is currently set to "GMT", several getString() and getArray() tests start to fail. The getString() failures are self-explanatory, but the getArray() failures are tricky. They're double-adjusting the timestamps:
    • Once when they call getObject() (which invokes getTimestamp() with the connection's default calendar, which was previously GMT but is now the system default).
    • Then, in ArrayImpl.equalContents() the arrays are converted to result sets for traversal via Array.getResultSet() , which uses the same time zone as the "factory" result set and eventually invokes getObject() again, thus applying the time zone offset twice.

So, there's certainly an existing bug in the implementation of Array.getResultSet() that only manifests when the array contains timestamps and when the connection default time zone is not GMT, and this was not covered by tests. By removing the explicit GMT timeZone for these tests, I'm covering this case, and this seemed like the best solution. It's just hardcoding a UTC time zone for the ArrayFactoryImpl parent of a ResultSet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there's certainly an existing bug in the implementation of Array.getResultSet() that only manifests
when the array contains timestamps and when the connection default time zone is not GMT, and this
was not covered by tests

I don't know of any such bugs. But your changes do seem to make things worse - by ensuring that the nested AvaticaResultSet set does not get the right timeZone.

If there's a bug, log it.

*
* <p>Note that, when a TIMESTAMP is adjusted to a calendar, the offset is subtracted.
* Here, on the other hand, to adjust the string to the calendar (which only happens for type
* TIMESTAMP WITH LOCAL TIME ZONE), the offset is added. These are meant to be inverse operations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after this change the method name is no longer appropriate.

the purpose of this method is to access a SQL TIMESTAMP value as a string

you have added logic to access a SQL TIMESTAMP WITH LOCAL TIME ZONE value as a string. and paved over the old functionality, which we still need.

Copy link
Contributor Author

@wnob wnob Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private method, and all existing invocations of it pass it a null Calendar. If you open Avatica at head in IntelliJ, it suggests inlining the constant parameter.

With my change, a TIMESTAMP WITH LOCAL TIME ZONE getter might pass it a non-null Calendar, and that's the only time it could be non-null. So, existing behavior cannot possibly be impacted by switching that minus to a plus.

Note that both a regular TIMESTAMP and a TIMESTAMP WITH LOCAL TIME ZONE are represented in JDBC as TIMESTAMP (since there is no JDBC type for the latter). I chose to re-use the existing timestamp getter with the addition of this new fixedInstant parameter since it seemed like a straightforward addition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private method, and all existing invocations of it pass it a null Calendar. If you open Avatica at head in IntelliJ, it suggests inlining the constant parameter.

Ah, I missed that. I guess the calendar parameter had some use in the distant past.

Still, don't overuse the 'no tests broke' or 'intellij suggested a refactoring' argument. By that argument you can justify renaming a variable called 'black' to 'white' - intellij and tests don't care about class and variable names - but it doesn't account for the confusion you create when you don't make the code self-explanatory.

Note that both a regular TIMESTAMP and a TIMESTAMP WITH LOCAL TIME ZONE are represented in JDBC as TIMESTAMP

There are two separate decisions to make: how to represent the type, and how to represent the value.

It seems that there is a way to represent the type: type = TIMESTAMP and typeName = "TIMESTAMP_WITH_LOCAL_TIME_ZONE" or something like that.

How have you decided to represent the value? It's worth calling out explicitly.

I chose to re-use the existing timestamp getter with the addition of this new fixedInstant parameter since it seemed like a straightforward addition.

It's not totally straightforward. Difference in interpretation is subtle. And adding a boolean field and an extra couple of ifs to some very simple classes does make them significantly more complex.

So I would actually create a class TimestampLtzAccessor (maybe it would extend TimestampAccessor, maybe not) to make everything explicit.

@julianhyde
Copy link
Contributor

julianhyde commented Jan 20, 2023

If it's not too much trouble, could you refactor the tests as a commit before the fix? It's disconcerting to see lots of test changes in a bug fix; I'd prefer to just see additions. The commit message cold be 'Refactor XxxTest'

I think that splitting accessor classes will help you achieve the 'hygiene' I referred to in https://issues.apache.org/jira/browse/CALCITE-5488.

The commit message should be the bug summary, i.e. '[CALCITE-5487] Support TIMESTAMP WITH LOCAL TIME ZONE in ResultSet'. Leaving the jira case number off makes it more difficult for the reviewer.

With those changes, I think we should be good.

@wnob wnob marked this pull request as draft January 30, 2023 19:48
@F21 F21 force-pushed the main branch 2 times, most recently from b97b819 to 4c0999b Compare November 28, 2023 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants