Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runs value constraints: needed cleanup and decisions #5

Open
1 of 3 tasks
dlebauer opened this issue Sep 18, 2018 · 0 comments
Open
1 of 3 tasks

runs value constraints: needed cleanup and decisions #5

dlebauer opened this issue Sep 18, 2018 · 0 comments

Comments

@dlebauer
Copy link
Member

@gsrohde commented on Mon Feb 16 2015

  • Decide what constraints are possible
  • Decide on precise form of started_at constraint (pending resolution of #257) [Resolved for now by assuming local machine time; see note]
  • Add them.

Details

At the least, all textual columns will be set to NOT NULL.


@dlebauer commented on Tue Feb 17 2015

make sure to look at not-null value constraints defined in the google spreadsheet.

At least, all *id fields are integers, all times are ... times. created_at, updated_at should be > 2010-01-01; etc.


@gsrohde commented on Wed Feb 18 2015

@dlebauer You don't need to mention constraints that are already inherent in the data types being used ("all *id fields are integers"). That said, there may be cases where the wrong data type has been used. (For example, variables.min and variables.max are varchar, not numeric.)


@gsrohde commented on Wed Feb 18 2015

The NOT NULL constraints for the non-textual columns were dealt with in GH issue #200. The one exception (of the NOT-NULL constraints specified in the google spreadsheet) is the column started_at. So I've added a non-NULL constraint for it to my draft migration and also specified it to be <= NOW().


@gsrohde commented on Mon Mar 16 2015

While created_at and updated_at are always UTC (or should be), the comments on started_at and finished_at refer to 'machine time'. Thus, we'll assume the constraint should be started_at <= NOW()::timestamp where the explicit cast to timestamp, while not necessary, makes clear we are comparing started_at with the current local or machine time.


@gsrohde commented on Wed May 20 2015

@dlebauer All the runs created in March and April have a NULL in the started_at column. Is this an error, or should we forget about having a not-NULL constraint on this column?


@dlebauer commented on Wed May 20 2015

Delete these rows, add constraint, and then we will fix pecan
On Wed, May 20, 2015 at 2:02 PM Scott Rohde [email protected]
wrote:

Assigned #245 PecanProject/bety#245 to
@dlebauer https://github.com/dlebauer.


Reply to this email directly or view it on GitHub
PecanProject/bety#245 (comment).


@gsrohde commented on Mon Dec 07 2015

The migration value_constraints_batch_1 was included in release 4.2. This ensures that runs.outdir, runs.outprefix, and runs.setting are all non-null and have a default value of ''.


@gsrohde commented on Mon Dec 07 2015

@dlebauer See pending (unchecked) item above and decide whether to ignore, table (put in backlog), or fix now.


@dlebauer commented on Tue Jul 12 2016

@gsrohde it looks like this constraint is ready to be added. Is that correct?


@gsrohde commented on Wed Aug 03 2016

@dlebauer Remaining to do (or not) are value constraints on the columns of type timestamp—namely start_time, finish_time, started_at, and finished_at. Here are constraints that I wanted to add to a previous migration but had to comment out because existing data violates them:

-- ALTER TABLE runs ALTER COLUMN started_at SET NOT NULL;
-- ALTER TABLE runs ADD CONSTRAINT valid_run_start_time CHECK (started_at <= NOW()::timestamp); -- NOW()::timestamp makes clear we are using local machine time
-- ALTER TABLE runs ADD CONSTRAINT consistent_run_start_and_end_times CHECK (started_at <= finished_at AND finished_at <= NOW()::timestamp);

As an example, on ebi_production, running

select * from runs where not started_at <= finished_at;

returns 7 rows, 4 of which were created in the last two weeks.

Another issue is what timezone is being assumed for these times (see also issue #257). For existing data, in some cases it seems to be Illinois time and in other cases it seems to be UTC. Perhaps this is not important.


@dlebauer commented on Wed Aug 03 2016

@robkooper @mdietze can you address Scott's questions above?


@mdietze commented on Wed Aug 03 2016

Time zone is critical if you put any constraints on started_at and finished_at otherwise this will completely destroy the sync system. All times should be UTC. We decided that a few years ago so hopefully any Central time records are old. There should be no constraints on the absolute values of start_time and finish_time. The relative constraint of finish being after start seems ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant