You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@dlebauer You don't need to mention constraints that are already inherent in the data types being used ("all *id fields are integers"). That said, there may be cases where the wrong data type has been used. (For example, variables.min and variables.max are varchar, not numeric.)
The NOT NULL constraints for the non-textual columns were dealt with in GH issue #200. The one exception (of the NOT-NULL constraints specified in the google spreadsheet) is the column started_at. So I've added a non-NULL constraint for it to my draft migration and also specified it to be <= NOW().
While created_at and updated_at are always UTC (or should be), the comments on started_at and finished_at refer to 'machine time'. Thus, we'll assume the constraint should be started_at <= NOW()::timestamp where the explicit cast to timestamp, while not necessary, makes clear we are comparing started_at with the current local or machine time.
@dlebauer All the runs created in March and April have a NULL in the started_at column. Is this an error, or should we forget about having a not-NULL constraint on this column?
The migration value_constraints_batch_1 was included in release 4.2. This ensures that runs.outdir, runs.outprefix, and runs.setting are all non-null and have a default value of ''.
@dlebauer Remaining to do (or not) are value constraints on the columns of type timestamp—namely start_time, finish_time, started_at, and finished_at. Here are constraints that I wanted to add to a previous migration but had to comment out because existing data violates them:
-- ALTER TABLE runs ALTER COLUMN started_at SET NOT NULL;-- ALTER TABLE runs ADD CONSTRAINT valid_run_start_time CHECK (started_at <= NOW()::timestamp); -- NOW()::timestamp makes clear we are using local machine time-- ALTER TABLE runs ADD CONSTRAINT consistent_run_start_and_end_times CHECK (started_at <= finished_at AND finished_at <= NOW()::timestamp);
As an example, on ebi_production, running
select*from runs where not started_at <= finished_at;
returns 7 rows, 4 of which were created in the last two weeks.
Another issue is what timezone is being assumed for these times (see also issue #257). For existing data, in some cases it seems to be Illinois time and in other cases it seems to be UTC. Perhaps this is not important.
Time zone is critical if you put any constraints on started_at and finished_at otherwise this will completely destroy the sync system. All times should be UTC. We decided that a few years ago so hopefully any Central time records are old. There should be no constraints on the absolute values of start_time and finish_time. The relative constraint of finish being after start seems ok
The text was updated successfully, but these errors were encountered:
@gsrohde commented on Mon Feb 16 2015
Details
At the least, all textual columns will be set to NOT NULL.
@dlebauer commented on Tue Feb 17 2015
make sure to look at not-null value constraints defined in the google spreadsheet.
At least, all *id fields are integers, all times are ... times. created_at, updated_at should be > 2010-01-01; etc.
@gsrohde commented on Wed Feb 18 2015
@dlebauer You don't need to mention constraints that are already inherent in the data types being used ("all *id fields are integers"). That said, there may be cases where the wrong data type has been used. (For example, variables.min and variables.max are varchar, not numeric.)
@gsrohde commented on Wed Feb 18 2015
The NOT NULL constraints for the non-textual columns were dealt with in GH issue #200. The one exception (of the NOT-NULL constraints specified in the google spreadsheet) is the column
started_at
. So I've added a non-NULL constraint for it to my draft migration and also specified it to be <= NOW().@gsrohde commented on Mon Mar 16 2015
While created_at and updated_at are always UTC (or should be), the comments on started_at and finished_at refer to 'machine time'. Thus, we'll assume the constraint should be
started_at <= NOW()::timestamp
where the explicit cast to timestamp, while not necessary, makes clear we are comparingstarted_at
with the current local or machine time.@gsrohde commented on Wed May 20 2015
@dlebauer All the runs created in March and April have a NULL in the
started_at
column. Is this an error, or should we forget about having a not-NULL constraint on this column?@dlebauer commented on Wed May 20 2015
Delete these rows, add constraint, and then we will fix pecan
On Wed, May 20, 2015 at 2:02 PM Scott Rohde [email protected]
wrote:
@gsrohde commented on Mon Dec 07 2015
The migration
value_constraints_batch_1
was included in release 4.2. This ensures thatruns.outdir
,runs.outprefix
, andruns.setting
are all non-null and have a default value of''
.@gsrohde commented on Mon Dec 07 2015
@dlebauer See pending (unchecked) item above and decide whether to ignore, table (put in backlog), or fix now.
@dlebauer commented on Tue Jul 12 2016
@gsrohde it looks like this constraint is ready to be added. Is that correct?
@gsrohde commented on Wed Aug 03 2016
@dlebauer Remaining to do (or not) are value constraints on the columns of type timestamp—namely
start_time
,finish_time
,started_at
, andfinished_at
. Here are constraints that I wanted to add to a previous migration but had to comment out because existing data violates them:As an example, on ebi_production, running
returns 7 rows, 4 of which were created in the last two weeks.
Another issue is what timezone is being assumed for these times (see also issue #257). For existing data, in some cases it seems to be Illinois time and in other cases it seems to be UTC. Perhaps this is not important.
@dlebauer commented on Wed Aug 03 2016
@robkooper @mdietze can you address Scott's questions above?
@mdietze commented on Wed Aug 03 2016
Time zone is critical if you put any constraints on started_at and finished_at otherwise this will completely destroy the sync system. All times should be UTC. We decided that a few years ago so hopefully any Central time records are old. There should be no constraints on the absolute values of start_time and finish_time. The relative constraint of finish being after start seems ok
The text was updated successfully, but these errors were encountered: