---
title: "Intro to R Workshop: Session 3"
#author: "Yuxiao Wang"
date: "April 13, 2017"
output: ioslides_presentation
subtitle: UCI Data Science Initiative
---
```{r, include=FALSE, echo=FALSE, warning=FALSE, error=FALSE, message=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
## Session 3 - Agenda
1. Statistical Distributions in R
## Statistical Distributions in R:
+ R has many built-in statistical distributions
+ e.g., binomial, Poisson, normal, chi-squared, ...
+ Each distribution in R has four functions:
+ These functions begin with a "d", "p", "q", or "r" and are followed by the name of the distribution
+ ```ddist()```: gives the density of the distribution (the probability mass, for discrete distributions)
+ ```rdist()```: generates random numbers from the distribution
+ ```qdist()```: gives the quantile function of the distribution (the inverse of the CDF)
+ ```pdist()```: gives the cumulative distribution function (CDF)
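
## Statistical Distributions in R:

+ A minimal sketch of the naming convention, using the Poisson distribution as an illustrative choice
+ The rate `lambda = 3` and the seed are arbitrary assumptions made for this example, not values from the workshop

```{r echo=TRUE}
lambda <- 3            # assumed rate, for illustration only
dpois(2, lambda)       # "d": Pr[X = 2]
ppois(2, lambda)       # "p": Pr[X <= 2]
qpois(0.5, lambda)     # "q": smallest x with Pr[X <= x] >= 0.5
set.seed(1)            # arbitrary seed so the draws are reproducible
rpois(5, lambda)       # "r": 5 random draws
```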
## Discrete Distribution: Binomial
+ Consider tossing a coin 10 times
+ Assuming a fair coin, the number of heads in the 10 tosses follows a binomial distribution with size 10 and probability 0.5
+ Let's calculate the probability of getting exactly five heads using the function ```dbinom()```
```{r echo=TRUE}
str(dbinom) # binomial probability mass func
dbinom(5, 10, 0.5) # Pr[X = 5] = ?
```
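
## Discrete Distribution: Binomial

+ As a sanity check, the same probability can be computed by hand from the binomial formula $\Pr[X = k] = \binom{n}{k} p^k (1 - p)^{n - k}$
+ The variable names `n`, `k`, `p`, and `by_hand` below are chosen only for this sketch

```{r echo=TRUE}
n <- 10; k <- 5; p <- 0.5
by_hand <- choose(n, k) * p^k * (1 - p)^(n - k)  # binomial pmf formula
by_hand
all.equal(by_hand, dbinom(k, n, p))              # do the two values agree?
```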
## Discrete Distribution: Binomial
+ Next, let's calculate the probability of getting 5 or fewer heads using the function ```pbinom()```
```{r echo=TRUE}
str(pbinom) # binomial CDF
pbinom(5, 10, 0.5) # Pr[X <= 5] = ?
```
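
## Discrete Distribution: Binomial

+ For a discrete distribution, the CDF is the running sum of the probability mass function, so summing ```dbinom()``` over 0 to 5 should reproduce the ```pbinom()``` value
+ The sketch also uses the `lower.tail` argument to get the complementary probability; the names `cdf_sum` and `upper` are illustrative

```{r echo=TRUE}
cdf_sum <- sum(dbinom(0:5, 10, 0.5))             # Pr[X = 0] + ... + Pr[X = 5]
cdf_sum
all.equal(cdf_sum, pbinom(5, 10, 0.5))           # do the two values agree?
upper <- pbinom(5, 10, 0.5, lower.tail = FALSE)  # Pr[X > 5]
upper
```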
## Discrete Distribution: Binomial
+ Now, suppose we are given the probability 0.75 and want the smallest number of heads whose CDF is at least that value. We can find it using ```qbinom()``` (note that this is the inverse of ```pbinom()```)
```{r echo=TRUE}
str(qbinom) # binomial quantile func
qbinom(0.75, 10, 0.5) # smallest x s.t. Pr[X <= x] >= 0.75
```
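
## Discrete Distribution: Binomial

+ Because the binomial is discrete, its CDF jumps from one value to the next and usually never equals the requested probability exactly
+ A short check, with `x` as an illustrative name, that ```qbinom()``` returns the smallest count whose CDF reaches 0.75

```{r echo=TRUE}
x <- qbinom(0.75, 10, 0.5)
x
pbinom(x - 1, 10, 0.5)   # CDF one step lower: still below 0.75
pbinom(x, 10, 0.5)       # CDF at x: at or above 0.75
```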
## Discrete Distribution: Binomial
+ Finally, let's generate 20 independent samples from a binomial(10, 0.5) distribution. This is equivalent to repeating the 10-toss experiment 20 times and counting the number of heads each time.
```{r echo=TRUE}
str(rbinom) # binomial random number generator
rbinom(20, 10, 0.5) # 20 independent samples from binomial(10, 0.5)
```
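
## Discrete Distribution: Binomial

+ Random draws change on every run unless a seed is set with ```set.seed()```
+ A sketch with a larger sample, where the seed `2017` and the sample size `10000` are arbitrary assumptions: the sample mean should land near $10 \times 0.5 = 5$

```{r echo=TRUE}
set.seed(2017)                    # arbitrary seed, for reproducibility only
draws <- rbinom(10000, 10, 0.5)   # 10000 simulated head counts
mean(draws)                       # should be close to 5
head(draws)                       # first few simulated counts
```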
## Continuous Distribution: Standard Normal
+ Calculate the value of the probability density function at $X = 0$
```{r echo=TRUE}
str(dnorm) # normal pdf
dnorm(x = 0, mean = 0, sd = 1)
```
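
## Continuous Distribution: Standard Normal

+ The standard normal density at 0 is $1 / \sqrt{2\pi}$, which gives a quick check on ```dnorm()```
+ For a continuous distribution, a density value is not a probability and can exceed 1; the `sd = 0.1` below is an arbitrary value chosen to show this

```{r echo=TRUE}
1 / sqrt(2 * pi)               # density at 0 from the formula
dnorm(0)                       # mean = 0 and sd = 1 are the defaults
dnorm(0, mean = 0, sd = 0.1)   # narrower curve: density above 1
```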
## Continuous Distribution: Standard Normal
+ Calculate the probability that $X \leq 0$
```{r echo=TRUE}
str(pnorm) # normal CDF
pnorm(0, mean = 0, sd = 1) # Pr[X <= 0] = ?
```
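
## Continuous Distribution: Standard Normal

+ The probability of an interval is the difference of two CDF values
+ A minimal sketch, with the endpoints $\pm 1.96$ chosen for illustration and `lower`, `upper`, `prob` as illustrative names

```{r echo=TRUE}
lower <- -1.96; upper <- 1.96         # illustrative endpoints
prob <- pnorm(upper) - pnorm(lower)   # Pr[lower < X <= upper]
prob                                  # roughly 0.95
```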
## Continuous Distribution: Standard Normal
+ Find the value for which the CDF = 0.975
```{r echo=TRUE}
str(qnorm) # normal quantile func
qnorm(0.975, mean = 0, sd = 1) # Pr[X <= ?] = 0.975
```
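
## Continuous Distribution: Standard Normal

+ Since ```qnorm()``` is the inverse of ```pnorm()```, feeding the quantile back into the CDF should recover the original probability
+ The name `z` is chosen only for this sketch; the last line relies on the symmetry of the normal distribution around its mean

```{r echo=TRUE}
z <- qnorm(0.975)
z              # roughly 1.96
pnorm(z)       # recovers 0.975
qnorm(0.025)   # lower counterpart, equal to -z by symmetry
```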
## Continuous Distribution: Standard Normal
+ Generate 10 independent random numbers from a standard normal distribution
```{r echo=TRUE}
str(rnorm) # normal random number generator
rnorm(10, mean = 0, sd = 1)
```
##
Let's try plotting a normal curve (more on plotting later)
```{r echo=TRUE, fig.height = 4.5, fig.align='center'}
x <- seq(from = -3, to = 3, by = 0.05)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l")
```
## Break Time