diff --git a/sdrf-metabolomics/README.adoc b/sdrf-metabolomics/README.adoc new file mode 100644 index 000000000..56fb63acd --- /dev/null +++ b/sdrf-metabolomics/README.adoc @@ -0,0 +1,396 @@ += Sample and Data Relationship Format for Metabolomics (SDRF-Metabolomics) +:sectnums: +:toc: left +:doctype: book +//only works on some backends, not HTML +:showcomments: +//use style like Section 1 when referencing within the document. +:xrefstyle: short +:figure-caption: Figure +:pdf-page-size: A4 + +//GitHub specific settings +ifdef::env-github[] +:tip-caption: :bulb: +:note-caption: :information_source: +:important-caption: :heavy_exclamation_mark: +:caution-caption: :fire: +:warning-caption: :warning: +endif::[] + +== Status of this document + + +This document outlines the proposed adaptation of the established Sample and Data Relationship File (SDRF) - proteomics to SDRF - metabolomics data. Distribution is unlimited. + +Version Draft - this is a draft of version 1.0 + +== Abstract + +Metabolomics explores small molecules, called metabolites, within cells, fluids, tissues, or organisms on a large scale. Together, these molecules and their interactions form the metabolome in a biological system. Just like genomics studies DNA and transcriptomics studies RNA, metabolomics examines the substances involved in metabolism, which are affected by genetics and the environment. Metabolomics stands out because it directly shows the biochemical activity and condition of cells or tissues through metabolite levels, making it the most accurate way to understand molecular traits. + +This document presents a specification for a sample metadata annotation of metabolomics experiments. + + +=== Requirements + +The SDRF-Metabolomic format describes the sample characteristics and the relationships between samples and data files included in a dataset. The information in SDRF files is organised so that it follows the natural flow of a metabolomics experiment. The main requirements to be fulfilled for SDRF-Metabolomics format are: + +- The SDRF file is a tab-delimited format where each ROW corresponds to a relationship between a Sample and a Data file +- Each column MUST correspond to an attribute/property of the Sample or the Data file. +- Each value in each cell MUST be the property for a given Sample or Data file. +- The file MUST begin with columns describing the samples of origin and continue with the data files. +- Support for handling unknown values/characteristics. + +=== Issues to be addressed + +The main issues to be addressed by the SDRF-Metabolomics are: + +- It MUST be able to represent the sample metadata and the data files generated by the instruments or the analyses. +- It MUST be able to represent the experimental design including the way samples and data have been collected. +- It should also be compatible with ISA-TAB, used in this field + +[[ontologies-supported]] +== Ontologies/Controlled Vocabularies Supported + +The list of ontologies/controlled vocabularies (CV) supported are: + +- PSI Mass Spectrometry CV (PSI-MS) +- Experimental Factor Ontology (EFO). +- Unimod protein modification database for mass spectrometry +- PSI-MOD CV (PSI-MOD) +- Cell line ontology +- Drosophila anatomy ontology +- Cell ontology +- Plant ontology +- Uber-anatomy ontology +- Zebrafish anatomy and development ontology +- Zebrafish developmental stages ontology +- Plant Environment Ontology +- FlyBase Developmental Ontology +- Rat Strain Ontology +- Chemical Entities of Biological Interest Ontology +- NCBI organismal classification +- PATO - the Phenotype and Trait Ontology +- PRIDE Controlled Vocabulary (CV) + +[[sdrf-file-format]] +== SDRF-Metabolomics file format + +The SDRF-Metabolomics file format describes the sample characteristics and the relationships between samples and data files. The file format is a tab-delimited one where each ROW corresponds to a relationship between a Sample and a Data file, each column corresponds to an attribute/property of the Sample and the value in each cell is the specific value of the property for a given Sample (**Figure 2**). + +[#img-sunset] +image::https://github.com/bigbio/proteomics-metadata-standard/raw/master/sdrf-proteomics/images/sdrf-nutshell.png[] + +**Figure 2**: SDRF-Proteomics in a nutshell. The file format is a tab-delimited one where columns are properties of the sample, the data file or the variables under study. The rows are the samples of origin and the cells are the values for one property in a specific sample. + +=== SDRF-Metabolomics format rules + +There are general scenarios/use cases that are addressed by the following rules: + +- **Unknown values**: In some cases, the column is mandatory in the format but for some samples the corresponding value is unknown. In those cases, users SHOULD use ‘not available’. +- **Not Applicable values**: In some cases, the column is mandatory but for some samples the corresponding value is not applicable. In those cases, users SHOULD use ‘not applicable’. +- **Case sensitivity**: By specification the SDRF is case insensitive, but we RECOMMEND using lowercase characters throughout all the text (Column names and values). +- **Spaces**: By specification the SDRF is case sensitive to spaces (sourcename != source name). +- **Column order**: The SDRF MUST start with the source name column (accession/name of the sample of origin), then all the sample characteristics; followed by the assay name corresponding to the MS run. Finally, after the assay name all the comments (properties of the data file generated). +- **Extension**: The extension of the SDRF should be .tsv or .txt. + + +[[sdrf-file-standarization]] +=== SDRF-Metabolomics values + +The value for each property (e.g. characteristics, comment) corresponding to each sample can be represented in multiple ways. + +- Free Text (Human readable): In the free text representation, the value is provided as text without Ontology support (e.g. colon or providing accession numbers). This is only RECOMMENDED when the text inserted in the table is the exact name of an ontology/CV term in EFO. If the term is not in EFO, other ontologies can be used. + +|=== +| source name | characteristics[organism] + +| sample 1 |homo sapiens +| sample 2 |homo sapiens +|=== + +- Ontology url (Computer readable): Users can provide the corresponding URI (Uniform Resource Identifier) of the ontology/CV term as a value. This is recommended for enriched files where the user does not want to use intermediate tools to map from free text to ontology/CV terms. + +|=== +| source name | characteristics[organism] + +| Sample 1 |http://purl.obolibrary.org/obo/NCBITaxon_9606 +| Sample 2 |http://purl.obolibrary.org/obo/NCBITaxon_9606 +|=== + +- Key=value representation (Human and Computer readable): The current representation aims to provide a mechanism to represent the complete information of the ontology/CV term including Accession, Name and other additional properties. In the key=value pair representation the Value of the property is represented as an Object with multiple properties, where the key is one of the properties of the object and the value is the corresponding value for the particular key. An example of key value pairs is post-translational modification <> + + NT=Glu->pyro-Glu; MT=fixed; PP=Anywhere; AC=Unimod:27; TA=E + +== SDRF-Metabolomics: Samples metadata + +The Sample metadata has different Categories/Headings to organize all the attributes/ column headers of a given sample. Each Sample contains a _source name_ (accession) and a set of _characteristics_. Any metabolomics sample MUST contain the following characteristics: + +- *source name*: Unique sample name (it can be present multiple times if the same sample is used several times in the same dataset) +- *characteristics[organism]*: The organism of the Sample of origin. +- *characteristics[disease]*: The disease under study in the Sample. +- *characteristics[organism part]*: The part of organism's anatomy or substance arising from an organism from which the biomaterial was derived (e.g. liver) +- *characteristics[cell type]*: A cell type is a distinct morphological or functional form of cell. Examples are epithelial, glial etc. + +Example: + +|=== +| source name | characteristics[organism] | characteristics[organism part] | characteristics[disease] | characteristics[cell type] + +|sample_treat | homo sapiens | liver | liver cancer | not available +|sample_control | homo sapiens | liver | liver cancer | not available +|=== + +NOTE: Additional characteristics can be added depending on the type of the experiment and sample. The TO BE ADDED AT LATER STAGE defines a set of templates and checklists of properties that should be provided depending on the metabolomics experiment. + +Some important notes: + +- Each characteristics name in the column header SHOULD be a CV term from the EFO ontology. For example, the header _characteristics[organism]_ corresponds to the ontology term Organism. + +- Multiple values (columns) for the same characteristics term are allowed in SDRF-Proteomics. However, it is RECOMMENDED not to use the same column in the same file. If you have multiple phenotypes, you can specify what it refers to or use another more specific term, e.g. "immunophenotype". + +[[from-sample-data]] +== SDRF-Metabolomics: Data files metadata + +The connection between the Samples to the Data files is done by using a series of properties and attributes. The use of comment is mainly aimed at differentiating sample properties from the data properties. It matches a given sample to the corresponding file(s). + +The order of the columns is important, _assay name_ SHOULD always be located before the comments. It is RECOMMENDED to put the last column as _comment[data file]_. The following properties MUST be provided for each data file: + +- **assay name**: Examples of assay names are: “run 1”, “run_fraction_1_2”. +- **comment[fraction identifier]**: The fraction identifier allows to record the number of a given fraction. The fraction identifier corresponds to this ontology term. It MUST start from 1 and if the experiment is not fractionated, 1 MUST be used for each assay. +- **comment[label]**: label describes the label applied to each Sample (if any). In case of multiplex experiments such as TMT, SILAC, and/or ITRAQ the corresponding label SHOULD be added. For Label-free experiments the label free sample term MUST be used <>. +- **comment[data file]**: The data file provides the name of the raw file generated by the instrument. The data files can be instrument raw files. +- **comment[instrument]**: Instrument model used to capture the sample <>. + +Example: + +|=== +| | assay name | comment[fraction identifier] | comment[instrument]| comment[data file] +|sample 1| run 1 | 1 | NT=Waters Xevo TQ | CBA1-BA.raw +|sample 1| run 2 | 2 | NT=Waters Xevo TQ | CBA2-BA.raw +|=== + +[[Analytical Platform]] +To be added later + +[[Seperation method]] +To be added later + +[[Detection/Identification(instrument)]] +==== Type and Model of the instrument + +The model of the mass spectrometer SHOULD be specified as _comment[instrument]_. Possible values are listed under https://www.ebi.ac.uk/ols/ontologies/ms/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMS_1000031&viewMode=All&siblings=false[instrument model term]. + + +== SDRF-Proteomics study variables + +The variable/property under study SHOULD be highlighted using the factor value category. For example, the _factor value[tissue]_ is used when the user wants to compare expression across different tissues. You can add Multiple variables under study by providing multiple factor values. + +|=== +|factor value | :zero: | 0..* | “factor value” columns SHOULD indicate which experimental factor/variable is used as the hypothesis to perform the data analysis. The “factor value” columns SHOULD occur after all characteristics and the attributes of the samples. | factor value[phenotype] +|=== + +[[conventions]] +== SDRF-Metabolomics conventions + +Conventions define how to encode some particular information in the file format in specific use cases. Conventions define a set of new columns that are needed to represent a particular use case or experiment type (e.g. phosphorylation dataset). In addition, conventions define how some specific free-text columns (value that are not defined as ontology terms) should be written. Conventions are compiled from the metabolomics community using TO BE ADDED LATER or pull-request and will be added to updated versions of this specification document in the future. + +In the convention section <>, the columns are described and defined, while in the section use cases and templates <> the columns needed to describe a use case are specified. + +=== How to encode age + +One of the characteristics of a patient sample can be the age of an individual. It is RECOMMENDED to provide the age in the following format: {X}Y{X}M{X}D. Some valid examples are: + +- 40Y (forty years) +- 40Y5M (forty years and 5 months) +- 40Y5M2D (forty years, 5 months, and 2 days) + +When needed, weeks can also be used: 8W (eight weeks) + +Age interval: + +Sometimes the sample does not have an exact age but a range of age. To annotate an age range the following standard is RECOMMENDED: + + 40Y-85Y + +This means that the subject (sample) is between 40 and 85 years old. Other temporal information can be encoded similarly. + +[[pooled-samples]] +=== Pooled samples + +When multiple samples are pooled into one, the general approach is to annotate them separately, abiding by the general rule: one row stands for one sample-to-file relationship. In this case, multiple rows are created for the corresponding data file. + +|=== +|source name |characteristics[pooled sample] | assay name | comment[data file] + +| sample 1 | not pooled | run 1 | file01.raw +| sample 2 | not pooled | run 1 | file01.raw +| sample 10 | SN=sample 1,sample 2, ... sample 9| run 1 | file01.raw +|=== + +`SN` stands for source names and lists `source name` fields of samples that are annotated in the same file and *used in the same experiment*. + +Another possible value for _characteristics[pooled sample]_ is a string `pooled` for cases when it is known that a sample is pooled but the individual samples cannot be annotated. + +=== Derived samples (such as patient-derived xenografts) + +In cancer research, patient-derived xenografts (PDX) are commonly used. In those, the patient’s tumor is transplanted into another organism, usually a mouse. In these cases, the metadata, such as age and sex MUST refer to the original patient and not the mouse. + +PDX samples SHOULD be annotated by using the column name _characteristics[xenograft]_. The value should then describe the growth condition, such as ‘pancreatic cancer cells grown in nude mice’. + +For experiments where both the PDX and the original tumor are measured, the PDX entry SHOULD reference the respective tumor sample’s source name in the _characteristics[source name]_ column. Non-PDX samples SHOULD contain the “not applicable” value in the _characteristics[xenograft]_ and the characteristics[source name] column. Both tumor and PDX samples SHOULD reference the patient using the characteristics[individual] column. This column SHOULD contain some sort of patient identifier. + +=== Spiked-in samples + +There are multiple scenarios when a sample is spiked with additional analytes. Peptides, proteins, or mixtures can be added to the sample as controlled amounts to provide a standard or ground truth for quantification, or for retention time alignment, etc. + +To include information about the spiked compounds, use _characteristics[spiked compound]_. The information is provided in key-value pairs. Here are the keys and values that SHOULD be provided: + +|=== +|Key | Meaning | Examples | Peptide | Protein | Mixture | Other + +|SP | Species | Escherichia coli K-12 | :zero: | :zero: | :zero: | :zero: +|CT | Compound type | protein, peptide, mixture, other | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: +|QY | Quantity (molar or mass) | 10 mg, 20 nmol | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: +|PS | Peptide sequence | PEPTIDESEQ |:white_check_mark: | | | +|AC | Uniprot Accession | A9WZ33 | | :white_check_mark: | | +|CN | Compound name | `iRT mixture`, `substance name` | | :zero: | :zero: | :zero: +|CV | Compound vendor | `in-house` or vendor name | :zero: | :zero: | :white_check_mark: | :zero: +|CS | Compound specification URI | `http://vendor.web.site/specs/coomercial-kit.xlsx` | :zero: | :zero: | :zero: | :zero: +|CF | Compound formula | `C2H2O` | | | | :zero: +|=== + +In addition to specifying the component and its quantity, the injected mass of the main sample SHOULD be specified as _characteristics[mass]_. + +An example of SDRF-Metabolomics for a sample spiked with a peptide would be: + +|=== +|characteristics[mass] | charateristics[spiked compound] +|1 ug | CT=peptide;PS=PEPTIDESEQ;QY=10 fmol +|=== + +For multiple spiked components, the column _characteristics[spiked compound]_ may be repeated. + +If the spiked component is another biological sample (e.g. __E. coli__ lysate spiked into human sample), then the spiked component MUST be annotated in its own row. Both components of the sample SHOULD have `characteristics[mass]` specified. Inclusion of _characteristics[spiked compound]_ is optional in this case; if provided, it SHOULD be the string `spiked` for the spiked sample. + +=== Normal and healthy samples + +Samples from healthy patients or individuals normally appear in manuscripts and annotations as healthy or normal. We RECOMMEND using the word “normal” mapped to term PATO_0000461 that is in EFO: normal PATO term. Example: + +|=== +| source name | characteristics[organism] | characteristics[organism part] | characteristics[phenotype] | characteristics[compound] | factor value[phenotype] + +|sample_treat | homo sapiens | Whole Organism | necrotic tissue | drug A | necrotic tissue +|sample_control | homo sapiens | Whole Organism | normal | none | normal +|=== + +=== Encoding sample technical and biological replicates + +Different measurements of the same biological sample are often categorized as (i) Technical or (ii) Biological replicates, based on whether they are (i) matched on all variables, e.g. same sample and same protocol; or (ii) different samples matched on explanatory variable(s), e.g. different patients receiving a placebo, in a placebo vs. drug trial. Technical and biological replicates have different levels of independence, which must be taken into account during data interpretation. + +For a given experiment, there are different levels to which samples can be matched - e.g. same sample, sample protocol, covariates - the definition of technical replicate can therefore vary based on the number of variables included. In addition, an experiment might be used in multiple models with different explanatory variable(s), and biological replicates in one model would not be replicates in another. Therefore, Technical vs. Biological considerations, while sometimes relevant to analytical and statistical interpretation, fall beyond the scope of the SDRF-Proteomics format. However, data providers are encouraged to provide any identifier - e.g. Biological_replicate_1, Technical_replicate_2 - that would help linking the samples to their analytical and statistical analysis as comments. A good starting point for the SDRF-Proteomics specification is the following: + +**technical replicate**: It is defined as repeated measurements of the same sample that represent independent measures of the random noise associated with protocols or equipment [4]. + +In MS-based proteomics a technical replicate can be, for example, doing the full sample preparation from extraction to MS multiple times to control variability in the instrument and sample preparation. Another valid example would be to replicate only one part of the analytical method, for example, run the sample twice on the LC-MS/MS. technical replicates indicate if measurements are scientifically robust or noisy, and how large the measured effect must be to stand out above that noise. + +In the following example, only if the technical replicate column is provided, one can distinguish quantitative values of the same fraction but different technical replicates. + +|=== +| source name | assay name | comment[label] | comment[fraction identifier] | comment[technical replicate] | comment[data file] +| Sample 1 | run 1 | label free sample | 1 | 1 | 000261_C05_P0001563_A00_B00K_F1_TR1.RAW +| Sample 1 | run 2 | label free sample | 2 | 1 | 000261_C05_P0001563_A00_B00K_F2_TR1.RAW +| Sample 1 | run 3 | label free sample | 1 | 2 | 000261_C05_P0001563_A00_B00K_F1_TR2.RAW +| Sample 1 | run 4 | label free sample | 2 | 2 | 000261_C05_P0001563_A00_B00K_F2_TR2.RAW +|=== + +The _comment[technical replicate]_ column is MANDATORY. Please fill it with 1 if technical replicates are not performed in a study. + +**Biological replicate**: parallel measurements of biologically distinct samples that capture biological variation, which may itself be a subject of study or a source of noise. Biological replicates address if and how widely the results of an experiment can be generalized. For example, repeating a particular assay with independently generated samples, individuals or samples derived from various cell types, tissue types, or organisms, to see if similar results can be observed. Context is critical, and appropriate biological replicates will indicate whether an experimental effect is sustainable under a different set of biological variables or an anomaly itself. + +In SDRF-Proteomics biological replicates can be annotated using _characteristics[biological replicate]_ and it is MANDATORY. Please fill it with 1 if biological replicates are not performed in a study. + +Some examples with explicit annotation of the biological replicates can be found here: + +- https://github.com/bigbio/proteomics-metadata-standard/blob/c3a56b076ef381280dfcb0140d2520126ace53ff/annotated-projects/PXD006401/sdrf.tsv + +[[sample-prep]] +=== Sample preparation properties + +In order to encode sample preparation details, we strongly RECOMMEND specifying the following parameters. + +- **comment [depletion]**: The removal of specific components of a complex mixture of proteins or peptides based on some specific property of those components. The values of the columns will be `no depletion` or `depletion`. In the case of depletion `depleted fraction` of `bound fraction` can be specified. + +- **comment [reduction reagent]**: The chemical reagent that is used to break disulfide bonds in proteins. The values of the column are under the term https://www.ebi.ac.uk/ols/ontologies/pride/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPRIDE_0000607&viewMode=All&siblings=false[reduction reagent]. For example, DTT. + +- **comment [alkylation reagent]**: The alkylation reagent that is used to covalently modify cysteine SH-groups after reduction, preventing them from forming unwanted novel disulfide bonds. The values of the column are under the term https://www.ebi.ac.uk/ols/ontologies/pride/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPRIDE_0000598&viewMode=All&siblings=false[alkylation reagent]. For example, IAA. + +- **comment [fractionation method]**: The fraction method used to separate the sample. The values of this term can be read under PRIDE ontology term https://www.ebi.ac.uk/ols/ontologies/pride/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPRIDE_0000550[Fractionation method]. For example, Off-gel electrophoresis. + +[[fragment-proper]] +=== MS/MS properties + +- **comment[collision energy]**: Collision energy can be added as non-normalized (10000 eV) or normalized (1000 NCE) value. + +- **comment[dissociation method]**: This property will provide information about the fragmentation method, like HCD, CID. The values of the column are under the term https://www.ebi.ac.uk/ols/ontologies/ms/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMS_1000044&viewMode=All&siblings=false[dissociation method]. + +[[raw-file-uri]] +=== RAW file URI + +We RECOMMEND to include the public URI of the file if available. For example for ProteomeXchange datasets the URI from the FTP can be provided: + +|=== +| |... |comment[file uri] + +|sample 1| ... |https://ftp.ebi.ac.uk/pride-archive/2017/09/PXD005946/000261_C05_P0001563_A00_B00K_R1.RAW +|=== + +[[multiple-projects]] +=== Multiple projects into one annotation file + +Curators can decide to annotate multiple ProteomeXchange datasets into one large SDRF-Proteomics file for reanalysis purposes. If that is the case, it is RECOMMENDED to use the comment[proteomexchange accession number] to differentiate between different datasets. + +[[use-cases]] +== SDRF-Proteomics use cases representation (templates) + +Please visit the following document to read about SDRF-Proteomics use cases, templates and https://github.com/bigbio/proteomics-metadata-standard/blob/master/templates/README.adoc[checklists]. + +[[example-annotated-datasets]] +== Examples of annotated datasets + +|=== +|Dataset Type | ProteomeXchange / Pubmed Accession | SDRF URL +|Label-free | PXD008934 | https://github.com/bigbio/proteomics-metadata-standard/tree/master/annotated-projects/PXD008934 +|TMT | PXD017710 | https://github.com/bigbio/proteomics-metadata-standard/tree/master/annotated-projects/PXD017710 + +|=== + +== Ongoing use case discussions + +We have created a file in GitHub https://github.com/bigbio/proteomics-metadata-standard/blob/master/sdrf-proteomics/use-cases-under-development.adoc[Ongoing use case discussions] where we aggregate all the ongoing discussions about the format. + +== Intellectual Property Statement + +The PSI takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Copies of claims of rights made available for publication and any assurances of licenses to be made available or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the PSI Chair. + +The PSI invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this recommendation. Please address the information to the PSI Chair (see contacts information at PSI website). + +== Copyright Notice + +Copyright (C) Proteomics Standards Initiative (2020). All Rights Reserved. + +This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without the restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the PSI or other organizations, except as needed for the purpose of developing Proteomics Recommendations in which case the procedures for copyrights defined in the PSI Document process must be followed, or as required to translate it into languages other than English. + +The limited permissions granted above are perpetual and will not be revoked by the PSI or its successors or assigns. + +This document and the information contained herein is provided on an "AS IS" basis and THE PROTEOMICS STANDARDS INITIATIVE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." + +== References + + +- [1] Y. Perez-Riverol, S. European Bioinformatics Community for Mass, Toward a Sample Metadata Standard in Public Proteomics Repositories, J Proteome Res 19(10) (2020) 3906-3909. +- [2] A. Gonzalez-Beltran, E. Maguire, S.A. Sansone, P. Rocca-Serra, linkedISA: semantic representation of ISA-Tab experimental metadata, BMC Bioinformatics 15 Suppl 14 (2014) S4. +- [3] T.F. Rayner, P. Rocca-Serra, P.T. Spellman, H.C. Causton, A. Farne, E. Holloway, R.A. Irizarry, J. Liu, D.S. Maier, M. Miller, K. Petersen, J. Quackenbush, G. Sherlock, C.J. Stoeckert, Jr., J. White, P.L. Whetzel, F. Wymore, H. Parkinson, U. Sarkans, C.A. Ball, A. Brazma, A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB, BMC Bioinformatics 7 (2006) 489. +- [4] P. Blainey, M. Krzywinski, N. Altman, Points of significance: replication, Nat Methods 11(9) (2014) 879-80. +