---
title: "Working with dates and times"
author: "Taavi Päll"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Loading Libraries
```{r}
library(tidyverse)
library(lubridate)
```
In this session, we'll explore the **lubridate** R package created by Garrett Grolemund and Hadley Wickham.
You can read more about **lubridate** in R4DS: http://r4ds.had.co.nz/dates-and-times.html
For more help:
```{r}
?lubridate
```
Also look for the package vignette, for example with `browseVignettes("lubridate")`.
According to the package authors, 'lubridate has a consistent, memorable syntax, that makes working with dates fun instead of frustrating.'
If you've ever worked with dates in R you know what this means.
Dates and times are displayed differently depending on locale and time zone. To view our current settings:
```{r}
Sys.getlocale("LC_TIME")
Sys.Date()
Sys.time()
Sys.timezone()
```
**lubridate** contains many useful functions.
Type **help(package = lubridate)** to bring up an overview of the package, including the package DESCRIPTION, a list of available functions, and a link to the official package vignette.
The `today()` function returns today’s date. Let's store the result in a new variable called `this_day`.
```{r}
this_day <- today()
this_day
```
## Extract information from date
There are three components to this date.
In order, they are year, month, and day.
We can extract any of these components using the `year()`, `month()`, or `day()` function, respectively.
Let's try them on `this_day`.
```{r}
y <- year(this_day)
m <- month(this_day)
d <- day(this_day)
rbind(y, m, d)
```
We can also get the day of the week from this_day using the `wday()` function.
It will be represented as a number, such that 1 = Sunday, 2 = Monday, 3 = Tuesday, etc.
Similarly, we can extract the day of the month with `mday()` and the day of the year with `yday()`.
```{r}
w <- wday(this_day)
m <- mday(this_day)
y <- yday(this_day)
rbind(w, m, y)
```
We can add a second argument, `label = TRUE`, to display the name of the weekday (represented as an ordered factor).
```{r}
wday(this_day, label = TRUE)
```
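Because `today()` changes every day, the output of the chunks above differs from run to run. As a minimal sketch, the same accessors can be tried on a fixed date. The date and the name `fixed_day` are chosen here purely for illustration, and the weekday number assumes lubridate's default week start (Sunday).

```{r}
library(lubridate)

# A fixed example date (an assumption for illustration, not part of the lesson data)
fixed_day <- ymd("2017-10-11")

year(fixed_day)                 # 2017
month(fixed_day)                # 10
day(fixed_day)                  # 11
wday(fixed_day)                 # 4, i.e. Wednesday when weeks start on Sunday
yday(fixed_day)                 # 284
month(fixed_day, label = TRUE)  # month as an ordered factor, like wday(label = TRUE)
```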
In addition to handling dates, lubridate is great for working with date and time combinations, referred to as date-times.
The `now()` function returns the date-time representing this exact moment in time.
We store the result in a variable called this_moment.
```{r}
this_moment <- now()
this_moment
class(this_moment)
```
Just like with dates, we can extract the year, month, day, or day of week. However, we can also use `hour()`, `minute()`, and `second()` to extract specific time information.
```{r}
h <- hour(this_moment)
m <- minute(this_moment)
s <- second(this_moment)
rbind(h, m, s)
```
## Parsing dates
`today()` and `now()` provide neatly formatted date-time information.
When working with dates and times 'in the wild', this won't always (and perhaps rarely will) be the case.
Fortunately, lubridate offers a variety of functions for parsing date-times.
These functions take the form of `ymd()`, `dmy()`, `hms()`, `ymd_hms()`, etc., where each letter in the name of the function stands for the location of years (y), months (m), days (d), hours (h), minutes (m), and/or seconds (s) in the date-time being read in.
To see how these functions work, we try `ymd("1989-05-17")`.
We must surround the date with quotes.
```{r}
my_date <- ymd("1989-05-17")
my_date
# mdy("January 31st, 2017")
```
It looks the same as the input, but below the surface an important change takes place when lubridate parses a date.
```{r}
class(my_date)
```
So `ymd()` took a character string as input and returned an object of class Date, which is how R stores calendar dates internally. If a `tz` argument is supplied, `ymd()` returns a POSIXct date-time instead; we'll discuss time zones later in the lesson.
"1989-05-17" is a fairly standard format, but lubridate is smart enough to figure out many different date-time formats. We use `ymd()` to parse "1989 May 17".
```{r}
ymd("1989 May 17")
```
Despite being formatted differently, the last two dates had the year first, then the month, then the day.
Hence, we used `ymd()` to parse them.
What is the appropriate function for parsing "March 12, 1975"?
```{r}
mdy("March 12, 1975")
```
We can even throw something funky at it and lubridate will often know the right thing to do.
Let's parse 25081985, which is supposed to represent the 25th day of August 1985.
Note that we are actually parsing a numeric value here, not a character string, so leave off the quotes.
```{r}
dmy(25081985)
```
But be careful, it's not magic.
Let's try `ymd("192012")` to see what happens when we give it something more ambiguous.
We surround the number with quotes again, just to be consistent with the way most dates are represented (as character strings).
```{r}
ymd("192012")
dmy("192012")
```
We got a warning message because it was unclear which date we wanted.
When in doubt, it's best to be more explicit.
Let's repeat the same command, but add two dashes OR two forward slashes to “192012” so that it's clear we want January 2, 1920.
```{r}
# or ymd("1920-1-2")
ymd("1920/1/2")
```
## Parsing date-times
In addition to dates, we can parse date-times. Let's create a character string called `dt1` and parse it with `ymd_hms()`.
```{r}
dt1 <- '2017-10-11 18:05:02'
dt1
```
```{r}
train <- ymd_hms(dt1)
train
class(train)
```
What if we have a time, but no date? Let's use the appropriate lubridate function to parse "03:22:14" (hh:mm:ss).
```{r}
tim <- hms("03:22:14")
tim
class(tim)
```
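The object returned by `hms()` is a Period, not a date-time. As a rough sketch of what that means in practice, its components can be read back with the same accessors as before, and it can be added to a date-time. The name `ex_period` and the date used below are illustrative assumptions.

```{r}
library(lubridate)

# Hypothetical example value; the name `ex_period` is not used elsewhere in the lesson
ex_period <- hms("03:22:14")

hour(ex_period)               # 3
minute(ex_period)             # 22
second(ex_period)             # 14
period_to_seconds(ex_period)  # 12134 = 3 * 3600 + 22 * 60 + 14

# Adding the period to midnight UTC of an (illustrative) day gives a full date-time
ex_midnight <- ymd("2017-10-11", tz = "UTC")
ex_midnight + ex_period       # 2017-10-11 03:22:14 UTC
```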
**lubridate** is also capable of handling vectors of dates, which is particularly helpful when you need to parse an entire column of data.
```{r}
dt2 <- c("2014-05-14", "2014-09-22", "2014-07-11")
```
```{r}
ymd(dt2)
```
The elements of a vector may even be formatted differently, as long as the order of year, month, and day stays the same:
```{r}
dt3 <- c("2014-05-14", "2014-09-22", "2014 July 11")
ymd(dt3)
```
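`ymd()` copes with different formats only while the component order stays year, month, day. When the order itself varies within a vector, one option is `parse_date_time()`, which accepts several candidate orders. The sketch below uses a made-up input vector (`ex_mixed`) and assumes that each element matches only one of the listed orders.

```{r}
library(lubridate)

# Hypothetical mixed-order input: first element is year-month-day, second is month/day/year
ex_mixed <- c("2014-05-14", "09/22/2014")

# Candidate orders are given explicitly; the result is POSIXct in UTC
ex_parsed <- parse_date_time(ex_mixed, orders = c("ymd", "mdy"))
ex_parsed

# Drop the time part again if only the dates are needed
as_date(ex_parsed)
```

Listing the orders explicitly keeps the parsing rules visible in the code, which helps when a column turns out to contain genuinely ambiguous values.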
## Update date-time
The `update()` function allows us to update one or more components of a date-time. For example, let’s say the current time is 08:34:55 (hh:mm:ss).
Let's update `this_moment` to the new time using `update(this_moment, hours = 8, minutes = 34, seconds = 55)`.
```{r}
this_moment
update(this_moment, hours = 8, minutes = 34, seconds = 55)
```
To keep a change we have to reassign the result of `update()` to `this_moment`. Here we set the time from the `h`, `m`, and `s` values extracted earlier.
```{r}
## update time using previously created objects
this_moment <- update(this_moment, hours = h, minutes = m, seconds = s)
this_moment
```
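Since `this_moment` depends on when the document is run, here is a small sketch of `update()` on a fixed date-time. The value and the names `ex_moment` and `ex_updated` are illustrative assumptions; the point is that `update()` returns a modified copy and leaves its input alone.

```{r}
library(lubridate)

# Fixed example date-time (illustrative; `ex_moment` is a hypothetical name)
ex_moment <- ymd_hms("2017-10-11 18:05:02")

# update() returns a modified copy ...
ex_updated <- update(ex_moment, hours = 8, minutes = 34, seconds = 55)
ex_updated   # 2017-10-11 08:34:55 UTC

# ... while the original object is unchanged
ex_moment    # 2017-10-11 18:05:02 UTC
```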
## DST and time zones and arithmetic operations
- Now, let's pretend you are in Helsinki and you are planning to visit a friend in Hong Kong.
- You seem to have misplaced your itinerary, but you know that your Air China flight departs Helsinki at 16:50 the day after tomorrow.
- You also know that your flight is scheduled to arrive in Hong Kong exactly 17 hours and 35 minutes after departure.
Let's reconstruct your itinerary from what you can remember, starting with the full date and time of your departure.
We will approach this by finding the current date in Helsinki, adding 2 full days, then setting the time to 16:50.
To find the current date in Helsinki, we'll use the `now()` function again.
This time, however, we'll specify the time zone that we want: "Europe/Helsinki".
For a complete list of valid time zones for use with **lubridate**, check out the following Wikipedia page: [http://en.wikipedia.org/wiki/List_of_tz_database_time_zones](http://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
```{r}
hel <- now("Europe/Helsinki")
hel
```
Your flight is the day after tomorrow (in Helsinki time), so we want to add two days to hel.
One nice aspect of **lubridate** is that it allows you to use arithmetic operators on dates and times.
In this case, we'd like to add two days to `hel`, so we can use the following expression: `hel + days(2)`
```{r}
depart <- hel + days(2)
depart
```
So now depart contains the date of the day after tomorrow.
We use `update()` to set the correct hours (16) and minutes (50) in `depart`.
Reassign the result to depart.
```{r}
depart <- update(depart, hours = 16, minutes = 50)
depart
```
Your friend wants to know what time she should pick you up from the airport in Hong Kong.
Now that we have the exact date and time of your departure from Helsinki, we can figure out the exact time of your arrival in Hong Kong.
The first step is to add 17 hours and 35 minutes to your departure time.
Recall that hel + days(2) added two days to the current time in Helsinki.
Use the same approach to add 17 hours and 35 minutes to the date-time stored in depart.
```{r}
arrive <- depart + hours(17) + minutes(35)
class(hours(17))
arrive
```
The `arrive` variable contains the time that it will be in Helsinki when you arrive in Hong Kong.
What we really want to know is what time it will be in Hong Kong when you arrive, so that your friend knows when to meet you.
The `with_tz()` function returns a date-time as it would appear in another time zone. We use `with_tz()` to convert `arrive` to the “Asia/Hong_Kong” time zone.
```{r}
arrive <- with_tz(arrive, "Asia/Hong_Kong")
arrive
```
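To make the conversion easier to check by hand, the sketch below repeats the itinerary with a fixed, made-up departure date (all `ex_` names are hypothetical). It also contrasts `with_tz()` with `force_tz()`, which keeps the clock reading but changes the underlying instant.

```{r}
library(lubridate)

# Hypothetical fixed departure, so the result does not depend on when the code is run
ex_depart <- ymd_hms("2017-10-13 16:50:00", tz = "Europe/Helsinki")
ex_arrive <- ex_depart + hours(17) + minutes(35)
ex_arrive                              # 2017-10-14 10:25:00 EEST

# with_tz() keeps the instant and changes only how it is displayed
ex_arrive_hk <- with_tz(ex_arrive, "Asia/Hong_Kong")
ex_arrive_hk                           # 2017-10-14 15:25:00 HKT
ex_arrive_hk == ex_arrive              # TRUE: same moment in time

# force_tz() keeps the clock reading and changes the instant instead
ex_forced <- force_tz(ex_arrive, "Asia/Hong_Kong")
ex_forced                              # 2017-10-14 10:25:00 HKT
```

For the airport pickup we want `with_tz()`: the plane lands at one specific instant, and only its local representation differs between Helsinki and Hong Kong.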
## Date-time interval
Fast forward to your arrival in Hong Kong.
You and your friend have just met at the airport and you realize that the last time you were together was in Singapore on June 17, 2008.
Naturally, you'd like to know exactly how long it has been.
We only need to use the appropriate **lubridate** function to parse "June 17, 2008".
This time, however, we should specify an extra argument, `tz = "Asia/Singapore"`.
```{r}
last_time <- mdy("6 17 2008", tz = "Asia/Singapore")
last_time
```
We can use `interval()` to create an interval that spans from `last_time` to `arrive`.
```{r}
how_long <- interval(last_time, arrive)
how_long
class(how_long)
```
Now use `as.period()` to see how long it’s been.
```{r}
as.period(how_long)
```
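Because `arrive` depends on the current date, the period printed above changes from day to day. The following sketch uses two fixed, made-up end points (stored under the hypothetical name `ex_span`) to show how one interval can be viewed as a period, as a count of whole years, or as a duration.

```{r}
library(lubridate)

# Hypothetical fixed end points so the result is reproducible
ex_span <- interval(ymd("2008-06-17"), ymd("2017-10-14"))

as.period(ex_span)     # 9 years, 3 months and 27 days
ex_span %/% years(1)   # 9 complete years
as.duration(ex_span)   # the same span as an exact number of seconds
```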
What about daylight saving time (DST)?
Because durations represent an exact number of seconds, adding a duration across a DST transition can shift the clock time and give an unexpected result:
```{r}
one_pm <- ymd_hms("2016-03-12 13:00:00", tz = "America/New_York")
one_pm
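# DST starts on 2016-03-13 in New York, so that day has only 23 hours
# and adding exactly 24 hours lands one clock hour later
one_pm + ddays(1)
# no DST transition between Oct 31 and Nov 1 in Tallinn, so the clock time is unchanged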
one_pm <- ymd_hms("2017-10-31 13:00:00", tz = "Europe/Tallinn")
one_pm
one_pm + ddays(1)
```
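One way around this is to add a period (`days(1)`) instead of a duration (`ddays(1)`). The sketch below reuses the New York example under a different, hypothetical name (`ex_one_pm`) so that the two results can be compared side by side.

```{r}
library(lubridate)

# Same instant as in the chunk above, stored under a different (hypothetical) name
ex_one_pm <- ymd_hms("2016-03-12 13:00:00", tz = "America/New_York")

ex_one_pm + ddays(1)  # duration: exactly 86400 seconds later, 14:00 on March 13
ex_one_pm + days(1)   # period: same clock time on the next day, 13:00 on March 13
```

Durations suit physical elapsed time, while periods suit calendar-style statements such as "the same time tomorrow".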
## How old are you?
```{r}
my_birthday <- "1974-01-05"
age <- today() - ymd(my_birthday) # age as difftime
```
```{r}
as.period(age) / as.period(years(1))
```
Or, using an interval, which accounts for the actual calendar between the two dates:
```{r}
interval(ymd(my_birthday), today()) / years(1)
```
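Both results above change daily because they depend on `today()`. As a reproducible sketch, the same calculation can be done against a fixed reference date; the reference date and the `ex_` names are assumptions made for illustration.

```{r}
library(lubridate)

# Hypothetical fixed reference date instead of today(), so the result is reproducible
ex_birth <- ymd("1974-01-05")
ex_ref   <- ymd("2017-10-11")

ex_age <- interval(ex_birth, ex_ref)
time_length(ex_age, "years")   # age in fractional years
ex_age %/% years(1)            # 43 completed years
```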
## Reference
Adapted from https://rpubs.com/davoodastaraky/lubridate