Plotting the distribution of taxa
================
Sur Herrera Paredes
- 2022-05-24
+ 2022-06-06

+ - [Introduction](#introduction)
+ - [Getting ready](#getting-ready)
- [Read data](#read-data)
- [Basic barplot](#basic-barplot)
- [Adding sample metadata](#adding-sample-metadata)
@@ -14,6 +16,61 @@ Sur Herrera Paredes
- [Extra excercises](#extra-excercises)
- [Session info](#session-info)

+ # Introduction
+
+ In this extended example I go through every step to produce a relative
+ abundance barplot that represents bacterial communities living in
+ individual hosts.
+
+ Bacterial communities are everywhere, and when we characterize them it
+ is important to describe their overall taxonomic structure, as that gives
+ us clues as to what types of functions might be performed by the
+ community. At the same time, it is important to show the variability of
+ these communities, and thus it is useful to plot them at the lowest
+ aggregation level possible.
+
+ In this example, I use data from a large experiment that was published
+ [here](https://www.nature.com/articles/nature11237). In that experiment,
+ we planted individual *Arabidopsis thaliana* plants in individual pots.
+ The pots each had one of two types of natural soil, each from two
+ different seasons. We planted eight different accessions in each of
+ those soils, and plants were harvested at two developmental stages.
+ Additionally, we had unplanted, soil-only pots; these are the “soil”
+ samples in the example. For each individual plant, we harvested two
+ fractions (E and R): one, which we called the Endophytic Compartment
+ (E), corresponds to the interior of the root after removing the outer
+ cell wall, and another, which we called the Rhizosphere (R), is the soil
+ within 1 mm of the plant root. So **E** samples contain bacteria inside
+ the root, and **R** samples contain bacteria immediately surrounding the
+ root.
+
+ Plant-bacteria interactions in the root are incredibly important because
+ the root is both the gut and the brain of the plant. Microbes there can
+ benefit from the products of plant photosynthesis as sources of
+ nutrition, and can also provide chemistry that the plant couldn’t
+ perform by itself. However, the story is more complicated because
+ microbial competition and the plant immune system also provide a fertile
+ evolutionary environment for antagonistic interactions.
+
+ Ultimately, understanding and being able to manipulate plant-bacteria
+ interactions has far-reaching implications, as hunger is one of the most
+ pressing problems of humanity, with 800 million people living in hunger.
+ Agriculture is our only sustainable tool against hunger, and even though
+ it currently employs a quarter of the world’s population, it has not been
+ enough to tackle this challenge.
+
+ # Getting ready
+
+ First you need to get the data. If you haven’t already, check the
+ [README](https://github.com/surh/scip_barplot/blob/master/README.md)
+ file of the workshop’s GitHub repository. It is also recommended
+ that you watch the YouTube
+ [video](https://www.youtube.com/watch?v=siIoupAnILk) that runs through
+ the example. Finally, you will need to install the `tidyverse` package.
+
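If the `tidyverse` is not installed yet, something along these lines should work. This is only a minimal sketch: it assumes you have internet access and that the CRAN cloud mirror is acceptable for you; any other CRAN mirror would do.

``` r
# Install the tidyverse only if it is not already available.
# The mirror URL is an assumption; replace it with your preferred CRAN mirror.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse", repos = "https://cloud.r-project.org")
}
```

Checking with `requireNamespace` first avoids reinstalling the package every time the script runs.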
+ Once you have everything you need, start an R session and load the
+ tidyverse package:
+
``` r
library(tidyverse)
```
@@ -31,6 +88,9 @@ library(tidyverse)

# Read data

+ First read the OTU table. You may need to change the file path to
+ wherever you downloaded the files on your machine.
+
``` r
Tab <- read_tsv("data/rhizo/otu_table.tsv")
```
## # D416 <dbl>, D417 <dbl>, D418 <dbl>, D419 <dbl>, D420 <dbl>, D421 <dbl>,
## # D422 <dbl>, D423 <dbl>, D424 <dbl>, D425 <dbl>, D426 <dbl>, D427 <dbl>, …

+ The code above reads the file into a `tibble`, which is a type of
+ `data.frame` that has some neat additional properties. You don’t need to
+ concern yourself too much with the differences.
+
+ The code above also produces a warning, indicating that `read_tsv` tried
+ to guess the type of data in each column of the table. It guessed
+ correctly, but you should always specify the expected column types with
+ the option `col_types` (use `?read_tsv` for additional details).
+
``` r
Tab <- read_tsv("data/rhizo/otu_table.tsv",
                col_types = cols(otu_id = col_character(),
# Basic barplot

We need to think back to the original figure and reformat our data to
- have one column for the x-axis and another for the y-axis
+ have one column for the x-axis and another for the y-axis. This is a
+ requirement for `ggplot2`. We can do that with `pivot_longer`, a
+ function of the `tidyverse`.

``` r
Tab %>%
@@ -122,6 +193,9 @@ Tab %>%
## 10 OTU_14834 D196 1
## # … with 8,891 more rows

+ In the code above the options `names_to` and `values_to` indicate the
+ names of the new columns in the new tibble.
+
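To see what these two options do in isolation, here is a small sketch on a made-up table. The tibble `toy`, its counts, and the new column names `sample` and `count` are hypothetical and chosen only for illustration; they are not taken from the workshop data.

``` r
library(tidyverse)

# Toy OTU table with made-up counts: one row per OTU, one column per sample.
toy <- tibble(otu_id = c("OTU_1", "OTU_2"),
              S1 = c(10, 0),
              S2 = c(3, 7))

# Every column except otu_id is stacked into two new columns:
# the old column names go into "sample" (names_to) and
# the cell values go into "count" (values_to).
toy_long <- toy %>%
  pivot_longer(-otu_id, names_to = "sample", values_to = "count")

toy_long
```

The result has one row per OTU and sample combination, which is the one-column-per-axis layout that `ggplot2` expects.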
Let's create a smaller subset of the data to make some basic plots

``` r
```

![](extended_example_files/figure-gfm/unnamed-chunk-22-1.png)<!-- -->
- \#\# Excercise
+ \## Excercise

Use `scale_color_manual` to manually select a good set of colors for
this plot
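
As a starting hint rather than a solution, the sketch below shows how `scale_color_manual` takes a named vector of colors. The data, the `fraction` variable, and the hex colors are all made up for illustration; the variable mapped to color in the workshop plot may be different.

``` r
library(tidyverse)

# Toy data with made-up values; "fraction" stands in for whatever
# variable is mapped to color in the real plot.
toy <- tibble(x = c(1, 2, 3, 4),
              y = c(2, 4, 1, 3),
              fraction = c("E", "E", "R", "R"))

# Named vector: names are the levels of the variable, values are the colors.
my_colors <- c(E = "#1b9e77", R = "#d95f02")

p <- ggplot(toy, aes(x = x, y = y, color = fraction)) +
  geom_point(size = 3) +
  scale_color_manual(values = my_colors)
p
```

Naming the vector ties each color to a level explicitly, so the assignment does not depend on the order of the levels.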
@@ -574,7 +648,7 @@ ggsave("rhizo_phylo_distribution.png", p1, width = 8, height = 4)

# Extra excercises

- Look at the files at [data/hmp\_v13](data/hmp_v13) which contain much
+ Look at the files at [data/hmp_v13](data/hmp_v13) which contain much
bigger data tables generated from the Human Microbiome Project (HMP).

Can you make similar plots illustrating the bacterial taxonomic