Upload Sugarcane Validation Data #10

Open

dlebauer opened this issue Sep 18, 2018

Comments


@dlebauer commented on Mon Nov 02 2015

Replaces redmine issue 1940

Upload the data used to calibrate/validate Jaiswal et al. (in prep).


@dlebauer commented on Mon Nov 02 2015

@djaiswal To upload these data I'll need to enter the harvest dates as managements for the different ratoons. Can you identify these? I've updated the dataset to be consistent with the database here: https://docs.google.com/spreadsheets/d/1dLCR3gh0F4lP9ph_JFQoLgLEC1i4NSbiq9z42eztnzs/edit?usp=sharing

Also note that I've converted all units except yield to g/m2
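
For reference, a minimal sketch of that kind of conversion, assuming the source biomass values were reported in Mg/ha (an assumption; the original units are not shown in this thread):

```python
# Sketch only: convert an areal biomass value from Mg/ha to g/m2.
# Assumes the source unit was Mg/ha, which is not confirmed in this thread.

G_PER_MG = 1e6    # grams per megagram (metric ton)
M2_PER_HA = 1e4   # square meters per hectare

def mg_per_ha_to_g_per_m2(value_mg_ha: float) -> float:
    """1 Mg/ha = 1e6 g / 1e4 m2 = 100 g/m2."""
    return value_mg_ha * G_PER_MG / M2_PER_HA

print(mg_per_ha_to_g_per_m2(25.0))  # 2500.0
```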


@dlebauer commented on Fri May 06 2016

Yield: https://docs.google.com/spreadsheets/d/1Rzw1PKfiM99KiEw0SvPDxE1bcdwBf4ZWq5olj1VeD8A/edit#gid=795358067

Traits: https://docs.google.com/spreadsheets/d/1ThszYYwhlIsBA5c1YxD-YiatXiJFTgMF4J7SSMkpriU/edit?pref=2&pli=1#gid=13


@dlebauer commented on Fri May 06 2016

@gsrohde please upload
SugarcaneData_Brazil_RB72454.csv - alldata_for_upload (8).csv.zip


@gsrohde commented on Mon May 09 2016

@dlebauer The problem with empty entries for some trait values in some rows is not hard to fix for the bulk upload. It's easier at this point than trying to use the API. I should also set the dateloc to 5 (day) and the timeloc to 9 (no info), which the bulk upload doesn't currently do.

One minor problem is that the dates are treated as if they are midnight Illinois time and then converted to UTC when inserted into the database using the offset in effect at the time of upload (5 or 6 hours). The date could possibly appear to be a day off, though in this case probably wouldn't since Brazil time is close to Central time.

If you want this done right away, the easiest thing for me to do is to temporarily connect my "fixed" version of BETYdb directly to the ebi_production database rather than wait until I do a formal release with the revised bulk upload code.

I'm assuming you only want me to upload the data given in your most recent comment. Let me know if otherwise.


@dlebauer commented on Fri May 13 2016

@gsrohde yes, please only the two files:

SugarcaneData_Brazil_RB72454.csv - alldata_for_upload (8).csv.zip

Sugarcane Data - Bulk Upload (3).zip

The default when no time is provided should be noon local time.
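
A minimal sketch of the day-shift concern, using only the Python standard library (illustrative only; this is not BETYdb's actual upload code):

```python
# Illustrative only: a date with no time, interpreted as local midnight,
# can land on the previous calendar day when read back in a more-westerly
# timezone, while local noon stays on the same day for any offset within ±12 h.
from datetime import datetime
from zoneinfo import ZoneInfo

central = ZoneInfo("America/Chicago")
utc = ZoneInfo("UTC")

midnight = datetime(2016, 5, 6, 0, 0, tzinfo=central).astimezone(utc)
noon = datetime(2016, 5, 6, 12, 0, tzinfo=central).astimezone(utc)

print(midnight)  # 2016-05-06 05:00:00+00:00
print(noon)      # 2016-05-06 17:00:00+00:00

pacific = ZoneInfo("America/Los_Angeles")
print(midnight.astimezone(pacific).date())  # 2016-05-05  (shifted a day)
print(noon.astimezone(pacific).date())      # 2016-05-06  (unchanged)
```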


@dlebauer commented on Wed May 25 2016

I've uploaded SugarcaneData_Brazil_RB72454.csv - alldata_for_upload (8).csv.zip. I'm working on reformatting the second file: https://docs.google.com/spreadsheets/d/1ThszYYwhlIsBA5c1YxD-YiatXiJFTgMF4J7SSMkpriU/edit#gid=13


@dlebauer commented on Wed May 25 2016

@gsrohde how should I format an upload when multiple variables have values of n and SE? I thought it was trait1, trait1 n, trait1 SE, trait2, trait2 n, trait2 SE. See attached.
Sugarcane Data - zhao bulk upload (1).txt


@gsrohde commented on Wed May 25 2016

@dlebauer

As stated at https://www.authorea.com/users/5574/articles/6800/_show_article:

Note that if you have more than one trait variable column, each trait will get the same values of n and SE. There is currently no way to use different sample size and standard error values for different trait variables. Also, the n and SE values for any associated covariates will be set to NULL. (Eventually, we may enable associating differing values of n and SE to different trait variables and covariates. In this case, we might add columns [trait variable 1] n and [trait variable 1] SE, etc. or [covariate 1] n and [covariate 1] SE, prefixing the usual column heading with a variable name to indicate which variable the sample size and standard error value is to be associated with.)

(By the way, it would be nice to get the GitBook server set up correctly so we can start pointing to those versions of the documentation as the "official" ones.)
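
To illustrate the column layout described in the quoted documentation (one shared n column and one shared SE column applied to every trait column), here is a hypothetical sketch; the trait names, values, and exact header spellings are invented and should be checked against the BETYdb bulk-upload documentation:

```python
import csv

# Hypothetical rows: two trait columns share a single n and SE column, as the
# quoted documentation describes. Trait names and values are made up; verify
# the required header spellings against the bulk-upload docs.
rows = [
    {"site": "Example Site", "date": "2016-05-06",
     "LAI": "4.2", "SLA": "18.5", "n": "3", "SE": "0.4"},
    {"site": "Example Site", "date": "2016-06-06",
     "LAI": "5.1", "SLA": "17.9", "n": "3", "SE": "0.5"},
]

with open("bulk_upload_example.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["site", "date", "LAI", "SLA", "n", "SE"])
    writer.writeheader()
    writer.writerows(rows)
```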


@dlebauer commented on Wed May 25 2016

what needs to be done to get the gitbook server set up - do you mean host
it on a local server?


@gsrohde commented on Thu May 26 2016

No, I mean on the GitBook site.

The "BETYdb: Data Access" book seems fine: it displays properly and syncs properly with GitHub as far as I can tell.

For the other three "Pecan" books, when I click the "Read" button, I get the introduction, but in the left pane where the table of contents should be, I just see the word "apache". As for syncing, I don't know whether or not these books are syncing properly with GitHub.

I was able to fix one problem: When I tried to edit the "Data Entry Workflow" book on the GitBook site, the table of contents panel would never finish loading. I fixed this by removing the commented-out line in the book.json file. It seems you can't use // style comment delimiters there. (I was hoping this would fix the TOC in Read mode as well, but it didn't seem to.)
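
A quick sketch of why that happens: standard JSON has no comment syntax, so a // line makes the whole file unparsable (shown here with Python's json module; GitBook's own parser may report the failure differently):

```python
import json

good = '{"title": "Data Entry Workflow"}'
bad = '{"title": "Data Entry Workflow"} // old entry kept for reference'

json.loads(good)  # parses fine
try:
    json.loads(bad)
except json.JSONDecodeError as err:
    print("invalid JSON:", err)  # the // comment makes the document invalid
```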


@dlebauer commented on Thu May 26 2016

@gsrohde I'm not sure what you mean; all of these appear okay to me (or maybe you've fixed them?): https://pecan.gitbooks.io/betydb-documentation/content/ or https://pecan.gitbooks.io/pecan-documentation/content/ or https://pecan.gitbooks.io/betydbdoc-dataentry/content/


@gsrohde commented on Fri May 27 2016

Yeah, they look okay now to me also, though I'm on a different computer. Maybe it was some sort of browser caching issue.


@dlebauer commented on Fri May 27 2016

TODO:

  • upload lehrer, zhao, and two menzier sheets (these require the ability to upload multiple traits w/ n, SE, or else to have entity_id) (@gsrohde will the API upload support either of these?)
  • @djaiswal add treatments to betydb and corresponding rows in liu, ludlow, Kawamitsu sheets

Data are here: https://docs.google.com/spreadsheets/d/1ThszYYwhlIsBA5c1YxD-YiatXiJFTgMF4J7SSMkpriU/edit#gid=1724902509


@djaiswal commented on Thu Sep 22 2016

@dlebauer
The trait data (the spreadsheet whose name ends with "bulk upload") look okay to upload as-is; they were already quality-checked by undergrads some time ago.

However, I still do not see the original sugarcane yield data from Brazil for cultivar RB72454 in BETYdb. In an earlier message on this issue, you mentioned that they had already been uploaded. Could you please clarify?



@dlebauer commented on Thu Sep 22 2016

@djaiswal Many of those sheets still require treatment names and site names, and cultivars if they are available.

Then I can upload them, and:

  • if the data can be made public, I can make them available without logging in
  • if they have been checked, I can make them available without checking the box to 'include unchecked' on the search page (assuming this is where you were looking)

@djaiswal commented on Thu Sep 22 2016

@dlebauer

The most recent CSV you prepared (SugarcaneData_Brazil_RB72454.csv.-.alldata_for_upload.8.csv.zip) contains filled columns for treatment, site name, and cultivar.

Can you please briefly look at each of the three columns that you mentioned? If they are not in the required format, I can go back to the BETYdb documentation and change them accordingly.

These data can be made public, as they come from published literature.

Yes, the file that I shared with you originally at the beginning of this issue is checked.

I was searching after logging in to the BETYdb system.
