
Upload Sugarcane Validation Data #363

Closed
dlebauer opened this issue Nov 2, 2015 · 17 comments

dlebauer commented Nov 2, 2015

Replaces redmine issue 1940

Upload data used to calibrate / validate Jaiswal et al in prep


dlebauer commented Nov 2, 2015

@djaiswal to upload these data I'll need to enter the dates of harvest as managements for the different ratoons. can you identify these? I've updated the dataset to be consistent with the database here: https://docs.google.com/spreadsheets/d/1dLCR3gh0F4lP9ph_JFQoLgLEC1i4NSbiq9z42eztnzs/edit?usp=sharing

Also note that I've converted all units except yield to g/m2

@dlebauer dlebauer assigned dlebauer and gsrohde and unassigned djaiswal and dlebauer May 6, 2016

dlebauer commented May 6, 2016


gsrohde commented May 9, 2016

@dlebauer The problem with empty entries for some trait values in some rows is not hard to fix for the bulk upload. It's easier at this point than trying to use the API. I should also set the dateloc to 5 (day) and the timeloc to 9 (no info), which the bulk upload doesn't currently do.

One minor problem is that the dates are treated as if they are midnight Illinois time and then converted to UTC when inserted into the database using the offset in effect at the time of upload (5 or 6 hours). The date could possibly appear to be a day off, though in this case probably wouldn't since Brazil time is close to Central time.
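A minimal sketch of the behavior described above, using Python's `zoneinfo` (the zone names and the date are illustrative, not taken from the actual upload code):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A date with no time is treated as midnight in the server's local zone
# (assumed here to be America/Chicago), then converted to UTC on insert.
local = datetime(2016, 5, 9, 0, 0, tzinfo=ZoneInfo("America/Chicago"))
utc = local.astimezone(ZoneInfo("UTC"))

# Midnight Central Daylight Time becomes 05:00 UTC on the same calendar
# day, so the date survives here; a zone ahead of UTC would roll over.
print(utc.isoformat())  # 2016-05-09T05:00:00+00:00
```

Because Brazil's offset is close to Central time, the stored UTC timestamp stays on the intended calendar date in this case.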

If you want this done right away, the easiest thing for me to do is to temporarily connect my "fixed" version of BETYdb directly to the ebi_production database rather than wait until I do a formal release with the revised bulk upload code.

I'm assuming you only want me to upload the data given in your most recent comment. Let me know if otherwise.

@gsrohde gsrohde assigned dlebauer and unassigned gsrohde May 10, 2016
@dlebauer

@gsrohde yes, please only the two files:

SugarcaneData_Brazil_RB72454.csv - alldata_for_upload (8).csv.zip

Sugarcane Data - Bulk Upload (3).zip

The default when no time is provided should be noon local time.

@dlebauer dlebauer assigned gsrohde and dlebauer and unassigned dlebauer and gsrohde May 25, 2016
@dlebauer

I've uploaded SugarcaneData_Brazil_RB72454.csv - alldata_for_upload (8).csv.zip. I'm working on reformatting the second file: https://docs.google.com/spreadsheets/d/1ThszYYwhlIsBA5c1YxD-YiatXiJFTgMF4J7SSMkpriU/edit#gid=13.

@dlebauer

@gsrohde how should I format an upload when multiple variables have values of n and SE? I thought it was trait1, trait1 n, trait1 SE, trait2, trait2 n, trait2 SE. See attached.
Sugarcane Data - zhao bulk upload (1).txt


gsrohde commented May 25, 2016

@dlebauer

As stated at https://www.authorea.com/users/5574/articles/6800/_show_article:

Note that if you have more than one trait variable column, each trait will get the same values of n and SE. There is currently no way to use different sample size and standard error values for different trait variables. Also, the n and SE values for any associated covariates will be set to NULL. (Eventually, we may enable associating differing values of n and SE to different trait variables and covariates. In this case, we might add columns [trait variable 1] n and [trait variable 1] SE, etc. or [covariate 1] n and [covariate 1] SE, prefixing the usual column heading with a variable name to indicate which variable the sample size and standard error value is to be associated with.)
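A minimal sketch of what that implies for the upload file (column and variable names here are placeholders, not the actual BETYdb variable names): there is a single shared n and SE pair, not one per trait.

```csv
sitename,treatment,trait_a,trait_b,n,SE
SiteX,control,12.3,4.5,3,0.8
SiteX,fertilized,15.1,5.2,3,1.1
```

Both trait_a and trait_b in each row receive n and SE from the same two columns.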

(By the way, it would be nice to get the GitBook server set up correctly so we can start pointing to those versions of the documentation as the "official" ones.)

@dlebauer

What needs to be done to get the GitBook server set up? Do you mean hosting it on a local server?


gsrohde commented May 26, 2016

No, I mean on the GitBook site.

The "BETYdb: Data Access" book seems fine: it displays properly and syncs properly with GitHub as far as I can tell.

For the other three "Pecan" books, when I click the "Read" button, I get the introduction, but in the left pane where the table of contents should be, I just see the word "apache". As for syncing, I don't know whether or not these books are syncing properly with GitHub.

I was able to fix one problem: When I tried to edit the "Data Entry Workflow" book on the GitBook site, the table of contents panel would never finish loading. I fixed this by removing the commented-out line in the book.json file. It seems you can't use // style comment delimiters there. (I was hoping this would fix the TOC in Read mode as well, but it didn't seem to.)
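To illustrate why the commented-out line had to be removed (the field names below are hypothetical, not the actual book.json contents): book.json is plain JSON, which has no comment syntax, so a `//` line makes the whole file unparseable.

```python
import json

# A book.json-style file with a '//' "comment" line is invalid JSON.
broken = """{
  "title": "Data Entry Workflow",
  // "plugins": ["exclude"]
  "structure": {"readme": "README.md"}
}"""

try:
    json.loads(broken)
    parsed = True
except json.JSONDecodeError:
    parsed = False  # the '//' line breaks parsing

print(parsed)  # False: delete the line entirely instead of commenting it out
```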

@dlebauer


gsrohde commented May 27, 2016

Yeah, they look okay now to me also, though I'm on a different computer. Maybe it was some sort of browser caching issue.

@dlebauer

TODO:

  • upload lehrer, zhao, and two menzier sheets (these require the ability to upload multiple traits with n and SE, or else an entity_id) (@gsrohde will the API upload support either of these?)
  • @djaiswal add treatments to betydb and corresponding rows in liu, ludlow, Kawamitsu sheets

data are here https://docs.google.com/spreadsheets/d/1ThszYYwhlIsBA5c1YxD-YiatXiJFTgMF4J7SSMkpriU/edit#gid=1724902509


djaiswal commented Sep 22, 2016

@dlebauer
The trait data (spreadsheet names ending with "bulk upload") look okay to upload; they were already checked for quality control by undergrads some time ago.

However, I still do not see the original sugarcane yield data from Brazil for cultivar RB72454 in BETYdb. In earlier messages on this issue you mentioned that it had already been uploaded. Could you please clarify?

@dlebauer

@djaiswal Many of those sheets still require treatment names and site names, and cultivars if they are available.

Then I can upload them and

  • if the data can be made public, I can make them available without logging in
  • if they have been checked, I can make them available without checking the box to 'include unchecked' on the search page (assuming this is where you were looking)


djaiswal commented Sep 22, 2016

@dlebauer

The most recent CSV you prepared contains filled columns for treatment, site names, and cultivar (SugarcaneData_Brazil_RB72454.csv.-.alldata_for_upload.8.csv.zip).

Can you please briefly look at each of the three columns you mentioned? If anything is not in the necessary format, I can go back to the BETYdb documentation and change it accordingly.

This can be made public as it is data from other published literature.

Yes, the file that I shared with you originally at the beginning of this issue is checked.

I was searching after logging in to the BETYdb system.

@dlebauer

Issue moved to PecanProject/bety-data #10 via ZenHub
