-
Notifications
You must be signed in to change notification settings - Fork 302
Improve handling of PostgreSQL intervals #1604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of PostgreSQL intervals #1604
Conversation
|
|
||
| instance PGTF.ToField PgInterval where | ||
| toField (PgInterval t) = PGTF.toField t | ||
| toField = PGTF.toField . fromMaybe (error "PgInterval.toField") . pgIntervalToInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and toPersistValue) should perhaps saturate/clamp the interval rather than throwing an impure exception. For example if you try to insert PgInterval 9223372036855, the existing toPersistValue method will generate "9223372036855s", which Postgres will complain about:
SqlError
{ sqlState = "22015"
, sqlExecStatus = FatalError
, sqlErrorMsg = "interval field value out of range: \"9223372036855s\""
, sqlErrorDetail = ""
, sqlErrorHint = ""
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, that example is what happens with the current version of Persistent if you try to insert a PgInterval that's too big for PostgreSQL to handle. With these changes, an impure exception would be thrown from Haskell instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the current behaviour is not ideal, but is it necessary to change the To/FromField instances for PgInterval here? I'd prefer that we leave it as-is if we can (especially if the aim is for downstream users to stop using PgInterval), since it's possible that there are downstream users who are relying on the error being thrown at the database level in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I think it is necessary to change the instance. The builtinGetters are type directed, so I'm forced to decode a PostgreSQL interval into a Haskell Interval. I could introduce a type that's essentially Either PgInterval Interval, but that feels like it's not worth the effort. I can't imagine that someone is relying on the behavior of PgInterval throwing a SqlError when it's too big.
| -- | Represent Postgres interval using NominalDiffTime | ||
| -- | ||
| -- @since 2.11.0.0 | ||
| newtype PgInterval = PgInterval {getPgInterval :: NominalDiffTime} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest that longer term the PgInterval type should be deprecated and ultimately removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - can we add a comment here explaining why it's best avoided and that it might be deprecated in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, never mind, I changed my mind - I think it's a valid use case to say that you want to eg store intervals in the database that should always be convertable to NominalDiffTime so that you can interact with them using that type on the Haskell side. I think we should call out the fact that the conversion isn't always possible in the docs for this type, but I also think there are probably lots of use cases where this tradeoff is acceptable and the risk of accidentally creating intervals which can't be converted to NominalDiffTime is low - the fact that this is the type that actually exists in this library right now and has done for years is evidence of this, I think.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
| restore-keys: | | ||
| ${{ runner.os }}-${{ matrix.ghc }}-${{ hashFiles('cabal.project.freeze') }} | ||
| ${{ runner.os }}-${{ matrix.ghc }}- | ||
| - run: cabal v2-build all --disable-optimization --only-dependencies $CONFIG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this because I was seeing an error:
Cannot select only the dependencies (as requested by the '--only-dependencies' flag), the package persistent-2.17.1.0 is required by a dependency of one of the other targets.
I think that happened because postgresql-simple-interval depends on persistent, which is obviously part of this project. So Cabal can't build only the dependencies, since that would require also building persistent.
|
Here are some examples of behavior before and after these changes: Click to reveal extensions and imports.:set -XOverloadedStrings
:set -XTypeApplications
import Control.Monad.Logger as L
import Database.Persist.Sql as S
import Database.Persist.Postgresql as P
import Database.PostgreSQL.Simple.Interval as I
L.runStderrLoggingT (P.withPostgresqlConn "user=postgres" (\ c -> S.runSqlConn (S.rawSql @(S.Single P.PgInterval) "..." [...]) c))-- PgInterval can't represent days.
rawSql "select '1 day'::interval" []
-- before:
-- *** Exception: ConversionFailed {errSQLType = "interval", errSQLTableOid = column number 0 is out of range 0..-1
-- Nothing, errSQLField = "column number 0 is out of range 0..-1
-- ", errHaskellType = "PgInterval", errMessage = ":: Failed reading: satisfyWith"}
-- after:
-- *** Exception: user error (invalid interval)-- PgInterval can't represent months.
rawSql "select '1 month'::interval'" []
-- before:
-- *** Exception: ConversionFailed {errSQLType = "interval", errSQLTableOid = column number 0 is out of range 0..-1
-- Nothing, errSQLField = "column number 0 is out of range 0..-1
-- ", errHaskellType = "PgInterval", errMessage = ":: Failed reading: satisfyWith"}
-- after:
-- *** Exception: user error (invalid interval)-- PgInterval loses precision.
rawSql "select ?::interval" [toPersistValue $ PgInterval 0.0000009]
-- before:
-- [Single {unSingle = PgInterval {getPgInterval = 0.000001s}}]
-- after:
-- [Single {unSingle = PgInterval {getPgInterval = 0s}}]-- PgInterval overflows.
rawSql "select ?::interval" [toPersistValue $ PgInterval 9223372036854.775808]
-- before:
-- *** Exception: SqlError {sqlState = "22015", sqlExecStatus = FatalError, sqlErrorMsg = "interval field value out of range: \"9223372036854.775808s\"", sqlErrorDetail = "", sqlErrorHint = ""}
-- after:
-- *** Exception: PgInterval.toPersistValueAnd for completeness, here's how the new -- Interval can represent days.
rawSql "select '1 day'::interval" []
-- [Single {unSingle = MkInterval {months = 0, days = 1, microseconds = 0}}]
-- Interval can represent months.
rawSql "select '1 month'::interval" []
-- [Single {unSingle = MkInterval {months = 1, days = 0, microseconds = 0}}]
-- Interval cannot lose precision (correct by construction).
rawSql "select ?::interval" [toPersistValue $ MkInterval 0 0 1]
-- [Single {unSingle = MkInterval {months = 0, days = 0, microseconds = 1}}]
-- Interval cannot overflow (correct by construction).
rawSql "select ?::interval" [toPersistValue $ MkInterval 0 0 9223372036854775807]
-- [Single {unSingle = MkInterval {months = 0, days = 0, microseconds = 9223372036854775807}}] |
| instance PersistFieldSql PgInterval where | ||
| sqlType _ = SqlOther "interval" | ||
|
|
||
| pgIntervalToInterval :: PgInterval -> Maybe Interval.Interval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reproduce the relevant part of your PR comment here, and also for the function in the other direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, rather than having these here, I think I would prefer that postgres-simple-interval provide conversions to and from NominalDiffTime - in the case where you need to deal with an interval in both the database and in Haskell code, it's going to be a lot more comfortable to work with NominalDiffTime on the Haskell side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just opened a PR for adding these conversions to postgresql-simple-interval: MercuryTechnologies/postgresql-simple-interval#24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may not be great for an interval to be backed by a NominalDiffTime - the syntax allows, for example, INTERVAL '1 month' where the exact NominalDiffTime this corresponds to would differ based on both operation and starting month. Same for INTERVAL '1 year' though that is down to leap seconds and leap days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, that's the whole point of this PR! Although for posterity it's interval '1 day' and interval '1 month' that the existing PgInterval can't handle. interval '1 year' is the same as interval '12 months'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a one-day interval on the postgres side ever different from nominalDay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh lol you answered on the other thread
| intervalToPgInterval :: Interval.Interval -> Maybe PgInterval | ||
| intervalToPgInterval interval | ||
| | Interval.months interval /= 0 = Nothing | ||
| | Interval.days interval /= 0 = Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels to me like it should be possible to handle nonzero days values here - can we essentially do days * 86400 + seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but that has the potential to be incorrect in the presence of leap seconds. Since the current PgInterval doesn't handle days at all, I figured there's no reason to do a best effort conversion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, ty
|
|
||
| instance PGTF.ToField PgInterval where | ||
| toField (PgInterval t) = PGTF.toField t | ||
| toField = PGTF.toField . fromMaybe (error "PgInterval.toField") . pgIntervalToInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the current behaviour is not ideal, but is it necessary to change the To/FromField instances for PgInterval here? I'd prefer that we leave it as-is if we can (especially if the aim is for downstream users to stop using PgInterval), since it's possible that there are downstream users who are relying on the error being thrown at the database level in this case.
| -- | Represent Postgres interval using NominalDiffTime | ||
| -- | ||
| -- @since 2.11.0.0 | ||
| newtype PgInterval = PgInterval {getPgInterval :: NominalDiffTime} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - can we add a comment here explaining why it's best avoided and that it might be deprecated in the future?
hdgarrood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I have the commit bit on this repo but the code looks good to me now. Thanks! I think all that's left is a changelog update?
| prop "interval round trips" $ \(m, d, u) -> runConnAssert $ do | ||
| let | ||
| expected = | ||
| IntervalDb . Interval.MkInterval m d $ | ||
| clamp (-microsecondLimit) microsecondLimit u | ||
| key <- insert expected | ||
| actual <- getJust key | ||
| liftIO $ actual `shouldBe` expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also the round trip tests in postgresql-simple-interval's suite: https://github.com/MercuryTechnologies/postgresql-simple-interval/blob/7d58fbe45b6a2306d4ae025e71bfa196b1d8254f/source/test-suite/Main.hs#L428-L429
| * Changed the representation of intervals to use the `Interval` type from [the `postgresql-simple-interval` package](https://hackage.haskell.org/package/postgresql-simple-interval). | ||
| This changes the behavior of `PgInterval` for very small and very large values. | ||
| * Previously `PgInterval 0.000_000_9` would be rounded to `0.000_001` seconds, but now it is truncated to 0 seconds. | ||
| * Previously `PgInterval 9_223_372_036_854.775_808` would overflow and throw a SQL error, but now it saturates to `9_223_372_036_854.775_807` seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true for all values greater than 9_223_372_036_854.775_807? ie "this is the maximum number of seconds that PgInterval can now represent?"
Hm, no, that's me being confused. Previously, you could construct an interval that was huge in Haskell, and you would error in SQL when you try to write it. Now, instead of a thrown exception, you get a clamped value, which is a problem if you were dealing with (consults calculator) >=292,271 years and relying on an exception to prevent you from doing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Previously if you tried to insert PgInterval 9223372036854.775808, you'd get a SQL error.
postgres=# select interval '9223372036854.775808 seconds';
ERROR: interval field value out of range: "9223372036854.775808 seconds"
LINE 1: select interval '9223372036854.775808 seconds';
^| * [#1604](https://github.com/yesodweb/persistent/pull/1604) | ||
| * Changed the representation of intervals to use the `Interval` type from [the `postgresql-simple-interval` package](https://hackage.haskell.org/package/postgresql-simple-interval). | ||
| This changes the behavior of `PgInterval` for very small and very large values. | ||
| * Previously `PgInterval 0.000_000_9` would be rounded to `0.000_001` seconds, but now it is truncated to 0 seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this Haskell behavior or database behavior? And is this something that would be observable on write or read?
From reading the interval docs, I'm guessing that you mean that we'd round PgInterval up when writing to the database, and that the database cannot represent a value like 0.000_000_9 in the first place. So values of 500-999 picoseconds are going to behave differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer is a little complicated. postgresql-simple-interval uses realToFrac to convert from Pico to Micro (source). That truncates extra precision:
ghci> realToFrac (0.0000009 :: Pico) :: Micro
0.000000I'm ... not totally sure what PostgreSQL does, actually. I thought it parsed the input as a double and then rounded, but I'm not seeing that behavior experimentally:
-- 2025-09-09 10:43:48.239 CDT [50979] LOG: starting PostgreSQL 17.6 on aarch64-apple-darwin24.6.0, compiled by clang version 19.1.7, 64-bit
-- psql (17.6)
-- good, rounds down
select interval '0.0000004 seconds'; -- 00:00:00
-- good, rounds up
select interval '0.0000006 seconds'; -- 00:00:00.000001
-- good, rounds half to even
select interval '0.0000005 seconds'; -- 00:00:00
-- bad, rounds half to odd
select interval '0.0000015 seconds'; -- 00:00:00.000001
-- bad and weird, inconsistent with previous
select interval '0.00000151 seconds'; -- 00:00:00.000002
select interval '1 second' * 0.0000015; --00:00:00.000002Co-authored-by: Matt Parsons <[email protected]>
|
tagged and released, thank you! |
Persistent uses the
PgIntervaltype to handle PostgreSQLintervals. Unfortunately it's not a good fit for that. On the Haskell side,PgIntervalis just a wrapper aroundNominalDiffTime, which represents some amount of picoseconds. On the PostgreSQL side, anintervalhas 32 bits of precision for months and days together with 64 bits of precision for microseconds. So converting betweenPgIntervalandintervalis not total in either direction.I created a new
Intervaltype in mypostgresql-simple-intervallibrary. It accurately models PostgreSQL'sintervaltype and provides most of the instances and functions that you'd want. https://hackage-content.haskell.org/package/postgresql-simple-interval-0.2025.8.27/docs/Database-PostgreSQL-Simple-Interval.html#t:IntervalThis pull request updates
persistent-postgresqlto use theIntervaltype frompostgresql-simple-intervalrather thanPgInterval. It leavesPgIntervalfor backwards compatibility, but I would recommend deprecating and eventually removing it.These issues and PRs are related: