Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #2302 New parameter, group_var added #2362

Merged
merged 30 commits into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
b4fa466
#2302 New argument, group_var added
jerryekohe Mar 5, 2024
0e43eef
#2302 Fixed spelcheck and styler error
jerryekohe Mar 6, 2024
3b29697
#2302 Fixed lintr and roxygen related issues
jerryekohe Mar 6, 2024
d08c5ee
#2302 Fixed styler issue
jerryekohe Mar 7, 2024
35069bd
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
ashachakma Mar 7, 2024
bcde79b
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 Mar 13, 2024
fbfbf41
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 Mar 16, 2024
98f310b
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 Mar 16, 2024
900885b
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 Apr 19, 2024
d4ee6c6
Resolve merge conflict
jerryekohe May 23, 2024
1ad6c83
#2302 Updated new global variables, added `lag()` in NAMESPACE
jerryekohe May 24, 2024
df1afb1
#2302 Fixed lintr and styler issue
jerryekohe May 24, 2024
500bc90
#2302 Fixed issue related to roxygen check
jerryekohe May 24, 2024
f3526cd
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 May 24, 2024
574f32f
#2302 header updated with group_var details and temporary variables a…
jerryekohe May 27, 2024
9d3545c
Merge branch '2302_enhance_derive_var_trtemfl' of github.com:pharmave…
jerryekohe May 27, 2024
e4b77c2
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
bms63 May 28, 2024
ecf5317
Update R/derive_var_trtemfl.R
kaz462 Jun 1, 2024
5998b30
Update R/derive_var_trtemfl.R
kaz462 Jun 1, 2024
f165dfa
Update R/derive_var_trtemfl.R
kaz462 Jun 1, 2024
f3d62c5
Update R/derive_var_trtemfl.R
kaz462 Jun 1, 2024
4facbe7
#2302: add subject_keys, update check for initial_intensity
Jun 1, 2024
3f78ebf
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
Jun 1, 2024
b83f866
#2302: update new AE derivation/get_new_tmp_var
Jun 1, 2024
b9908a4
#2302: devtools::document()
Jun 1, 2024
68e0bb6
#2302: upversion roxygen2
Jun 1, 2024
a3a0153
Merge branch 'main' into 2302_enhance_derive_var_trtemfl
Jun 3, 2024
0542e1f
#2302: update group_var/subject_key description, check, and example
Jun 4, 2024
70fdf3a
#2302 enhance_derive_var_trtemfl: update subject_keys argument
bundfussr Jun 4, 2024
4e9de5e
#2302 enhance_derive_var_trtemfl: fix example
bundfussr Jun 4, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,7 @@ importFrom(dplyr,full_join)
importFrom(dplyr,group_by)
importFrom(dplyr,group_by_at)
importFrom(dplyr,if_else)
importFrom(dplyr,lag)
importFrom(dplyr,mutate)
importFrom(dplyr,n)
importFrom(dplyr,n_distinct)
Expand Down
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@

## Updates of Existing Functions

- `group_var` (optional) parameter is added to `derive_var_trtemfl()` to derive `TRTEMFL` for AE data if the data are collected as one episode of AE with multiple lines. (#2302)

- Templates for ADPC, ADPPK and ADPP are updated to handle urine records. (#2392)

- `create_single_dose_dataset()` has been updated to error if the `lookup_table` contains duplicates. (#2247)
Expand Down
2 changes: 1 addition & 1 deletion R/admiral-package.R
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
#' desc distinct ends_with everything filter first full_join
#' group_by group_by_at if_else mutate n n_distinct na_if pull
#' rename rename_with row_number select semi_join slice starts_with
#' summarise summarise_all tibble tribble ungroup union
#' summarise summarise_all tibble tribble ungroup union lag
#' @importFrom hms as_hms
#' @importFrom lifecycle deprecate_warn deprecate_stop deprecated
#' @importFrom lubridate %--% as_datetime ceiling_date date days duration
Expand Down
248 changes: 206 additions & 42 deletions R/derive_var_trtemfl.R
Original file line number Diff line number Diff line change
Expand Up @@ -45,12 +45,12 @@
#'
#' @param initial_intensity Initial severity/intensity or toxicity
#'
#' This derivation assumes AE data collection method as single record per AE
#' with "initial" and "most extreme" severity/intensity recorded separately.
#' `initial_intensity` is ignored when `group_var` is specified.
#'
#' If the argument is specified, events which start before treatment start and
#' end after treatment start (or are ongoing) and worsened (i.e., the
#' intensity is greater than the initial intensity), are flagged.
#' If this argument is specified and `group_var` is `NULL`, events which start
#' before treatment start and end after treatment start (or are ongoing) and
#' worsened (i.e., the intensity is greater than the initial intensity), are
#' flagged.
#'
#' The values of the specified variable must be comparable with the usual
#' comparison operators. I.e., if the intensity is greater than the initial
Expand All @@ -72,6 +72,22 @@
#' *Permitted Values:* A symbol referring to a variable of the input dataset
#' or `NULL`
#'
#' @param group_var Grouping variable
#'
#' If the argument is specified, it assumes that AEs are recorded as one episode
#' of AE with multiple lines using a grouping variable.
#'
#' Events starting during treatment or before treatment and worsening afterward
#' are flagged. Once an AE record in a group is flagged, all subsequent records
#' in the treatment window are flagged regardless of severity.
#'
#' *Permitted Values:* A symbol referring to a variable of the input dataset
#' or `NULL`
#'
#' @param subject_keys Variables to uniquely identify a subject.
#'
#' A list of symbols created using `exprs()` is expected. This argument is only
#' used when `group_var` is specified.
#'
#' @details For the derivation of the new variable the following cases are
#' considered in this order. The first case which applies, defines the value
Expand All @@ -90,10 +106,15 @@
#' if `start_date` is on or after `trt_start_date` and `start_date` is on
#' or before `trt_end_date` + `end_window` days, it is set to `"Y"`,
#' - *event started before treatment and (possibly) worsened on treatment*:
#' - if `initial_intensity` and `intensity` is specified: if
#' `initial_intensity < intensity` and `start_date` is before
#' `trt_start_date` and `end_date` is on or after `trt_start_date` or
#' `end_date` is `NA`, it is set to `"Y"`.
#' - if `initial_intensity`, `intensity` is specified and `group_var` is not specified:
#' if `initial_intensity < intensity` and `start_date` is before `trt_start_date`
#' and `end_date` is on or after `trt_start_date` or `end_date` is `NA`, it
#' is set to `"Y"`;
#' - if `group_var` is specified:
#' if previous `intensity` < `intensity` and `start_date` is after `trt_start_date`
#' and `end_date` is on or after `trt_start_date` or `end_date` is `NA`, it
#' is set to `"Y"`;
#'
#' - Otherwise it is set to `NA_character_`.
#'
#' @return The input dataset with the variable specified by `new_var` added
Expand All @@ -109,7 +130,7 @@
#' library(dplyr, warn.conflicts = FALSE)
#' library(lubridate)
#'
#' adae <- expected <- tribble(
#' adae <- tribble(
#' ~USUBJID, ~ASTDTM, ~AENDTM, ~AEITOXGR, ~AETOXGR,
#' # before treatment
#' "1", "2021-12-13T20:15", "2021-12-15T12:45", "1", "1",
Expand Down Expand Up @@ -153,6 +174,54 @@
#' initial_intensity = AEITOXGR,
#' intensity = AETOXGR
#' ) %>% select(ASTDTM, AENDTM, AEITOXGR, AETOXGR, TRTEM2FL)
#'
#' adae2 <- tribble(
#' ~USUBJID, ~ASTDTM, ~AENDTM, ~AEITOXGR, ~AETOXGR, ~AEGRPID,
#' # before treatment
#' "1", "2021-12-13T20:15", "2021-12-15T12:45", "1", "1", "1",
#' "1", "2021-12-14T20:15", "2021-12-14T22:00", "1", "3", "1",
#' # starting before treatment and ending during treatment
#' "1", "2021-12-30T20:15", "2022-01-14T01:23", "3", "3", "2",
#' "1", "2022-01-05T20:00", "2022-06-01T11:00", "3", "1", "2",
#' "1", "2022-01-10T20:15", "2022-01-11T01:23", "3", "2", "2",
#' "1", "2022-01-13T20:15", "2022-03-01T01:23", "3", "1", "2",
#' # starting during treatment
#' "1", "2022-01-01T12:00", "2022-01-02T23:25", "4", "4", "3",
#'
#' # after treatment
#' "1", "2022-05-10T11:00", "2022-05-10T13:05", "2", "2", "4",
#' "1", "2022-05-10T12:00", "2022-05-10T13:05", "2", "2", "4",
#' "1", "2022-05-11T11:00", "2022-05-11T13:05", "2", "2", "4",
#' # missing dates
#' "1", "", "", "3", "4", "5",
#' "1", "2021-12-30T09:00", "", "3", "4", "5",
#' "1", "2021-12-30T11:00", "", "3", "3", "5",
#' "1", "", "2022-01-04T09:00", "3", "4", "5",
#' "1", "", "2021-12-24T19:00", "3", "4", "5",
#' "1", "", "2022-06-04T09:00", "3", "4", "5",
#' # without treatment
#' "2", "", "2021-12-03T12:00", "1", "2", "1",
#' "2", "2021-12-01T12:00", "2021-12-03T12:00", "1", "2", "2",
#' "2", "2021-12-06T18:00", "", "1", "2", "3"
#' ) %>%
#' mutate(
#' STUDYID = "ABC12345",
#' ASTDTM = ymd_hm(ASTDTM),
#' AENDTM = ymd_hm(AENDTM),
#' TRTSDTM = if_else(USUBJID == "1", ymd_hm("2022-01-01T01:01"), ymd_hms("")),
#' TRTEDTM = if_else(USUBJID == "1", ymd_hm("2022-04-30T23:59"), ymd_hms(""))
#' )

#' # derive TRTEMFL taking treatment end and worsening into account within a grouping variable
#' derive_var_trtemfl(
#' adae2,
#' new_var = TRTEMFL,
#' trt_end_date = TRTEDTM,
#' end_window = 10,
#' intensity = AETOXGR,
#' group_var = AEGRPID
#' ) %>% select(ASTDTM, AENDTM, AEITOXGR, AETOXGR, AEGRPID, TRTEMFL)

derive_var_trtemfl <- function(dataset,
new_var = TRTEMFL,
start_date = ASTDTM,
Expand All @@ -162,46 +231,79 @@ derive_var_trtemfl <- function(dataset,
end_window = NULL,
ignore_time_for_trt_end = TRUE,
initial_intensity = NULL,
intensity = NULL) {
intensity = NULL,
group_var = NULL,
subject_keys = get_admiral_option("subject_keys")) {
# Convert inputs to symbols
new_var <- assert_symbol(enexpr(new_var))
start_date <- assert_symbol(enexpr(start_date))
end_date <- assert_symbol(enexpr(end_date))
trt_start_date <- assert_symbol(enexpr(trt_start_date))
trt_end_date <- assert_symbol(enexpr(trt_end_date), optional = TRUE)
trt_end_date <-
assert_symbol(enexpr(trt_end_date), optional = TRUE)
assert_integer_scalar(end_window, subset = "non-negative", optional = TRUE)
assert_logical_scalar(ignore_time_for_trt_end)
initial_intensity <- assert_symbol(enexpr(initial_intensity), optional = TRUE)
initial_intensity <-
assert_symbol(enexpr(initial_intensity), optional = TRUE)
intensity <- assert_symbol(enexpr(intensity), optional = TRUE)
if (is.null(initial_intensity) && !is.null(intensity)) {
cli_abort(c(
"{.arg intensity} argument was specified but not {.arg initial_intensity}",
"Either both or none of them must be specified."
))
}
if (!is.null(initial_intensity) && is.null(intensity)) {
cli_abort(c(
"{.arg initial_intensity} argument was specified but not {.arg intensity}",
"Either both or none of them must be specified."
))
group_var <- assert_symbol(enexpr(group_var), optional = TRUE)

# group_var is not specified
# Check if both initial_intensity and intensity are provided
if (is.null(group_var)) {
if (is.null(initial_intensity) && !is.null(intensity)) {
cli_abort(c(
"{.arg intensity} argument was specified but not {.arg initial_intensity}",
"Either both or none of them must be specified."
))
}
if (!is.null(initial_intensity) && is.null(intensity)) {
cli_abort(c(
"{.arg initial_intensity} argument was specified but not {.arg intensity}",
"Either both or none of them must be specified."
))
}
# group_var is specified
} else {
if (!is.null(initial_intensity)) {
cli_warn(c(
"{.arg initial_intensity} argument is ignored when {.arg group_var} is specified",
"Please only specify one of them."
))
}
if (is.null(subject_keys)) {
cli_abort(c(
"{.arg group_var} argument was specified but not {.arg subject_keys}",
"{.arg subject_keys} argument must be provided when {.arg group_var} is specified."
))
}
assert_vars(subject_keys)
}
assert_data_frame(
dataset,
required_vars = expr_c(

# Assert required variables
required_vars <-
expr_c(
start_date,
end_date,
trt_start_date,
trt_end_date,
initial_intensity,
intensity
)
)
if (!is.null(group_var)) {
required_vars <- c(required_vars, group_var)
}
assert_data_frame(dataset, required_vars = required_vars)

# Assert date variables
assert_date_var(dataset, var = !!start_date)
assert_date_var(dataset, var = !!end_date)
assert_date_var(dataset, var = !!trt_start_date)
if (!is.null(trt_end_date)) {
assert_date_var(dataset, var = !!trt_end_date)
}

# end window condition
if (is.null(end_window)) {
end_cond <- expr(TRUE)
} else {
Expand All @@ -212,27 +314,89 @@ derive_var_trtemfl <- function(dataset,
))
}
if (ignore_time_for_trt_end) {
end_cond <- expr(is.na(!!trt_end_date) |
date(!!start_date) <= date(!!trt_end_date) + days(end_window))
end_cond <- expr(
(is.na(!!trt_end_date) |
date(!!start_date) <= date(!!trt_end_date) + days(end_window))
)
} else {
end_cond <- expr(is.na(!!trt_end_date) | !!start_date <= !!trt_end_date + days(end_window))
end_cond <-
expr(
(is.na(!!trt_end_date) |
!!start_date <= !!trt_end_date + days(end_window))
)
}
}


# new_ae_cond: Y - new AE, N - AE exists before trt_start_date
new_ae_cond <- get_new_tmp_var(dataset)
if (is.null(group_var)) {
dataset <- dataset %>%
mutate(
!!new_ae_cond := if_else(!!start_date >= !!trt_start_date, "Y", "N")
)
} else {
dataset <- dataset %>%
derive_vars_merged(
dataset_add = dataset,
by_vars = expr_c(subject_keys, group_var),
order = exprs(!!start_date),
new_vars = exprs(!!new_ae_cond := "N"),
filter_add = !!start_date < !!trt_start_date,
mode = "last"
) %>%
mutate(
!!new_ae_cond := if_else(is.na(!!new_ae_cond), "Y", "N")
)
}

if (is.null(intensity)) {
worsening_cond <- expr(FALSE)
} else {
worsening_cond <-
expr(!!start_date < !!trt_start_date &
(!!initial_intensity < !!intensity | is.na(!!initial_intensity) | is.na(!!intensity)))
if (is.null(group_var)) {
worsening_cond <-
expr(
!!start_date < !!trt_start_date &
(!!initial_intensity < !!intensity | is.na(!!initial_intensity) |
is.na(!!intensity))
)
} else {
prev_intensity <- get_new_tmp_var(dataset)
worsen_date <- get_new_tmp_var(dataset)

dataset <- dataset %>%
arrange(USUBJID, !!group_var, !!start_date) %>%
group_by(USUBJID, !!group_var) %>%
kaz462 marked this conversation as resolved.
Show resolved Hide resolved
mutate(
!!prev_intensity := lag(!!intensity),
!!worsen_date :=
case_when(
!is.na(!!start_date) & !is.na(!!trt_start_date) & !is.na(!!prev_intensity) &
!!start_date >= !!trt_start_date &
(!!intensity > !!prev_intensity) ~ !!start_date,
TRUE ~ NA
)
) %>%
fill(!!worsen_date, .direction = "down") %>%
ungroup()

worsening_cond <- expr(!is.na(!!worsen_date))
}
}

dataset %>%
mutate(!!new_var := case_when(
is.na(!!trt_start_date) ~ NA_character_,
!!end_date < !!trt_start_date ~ NA_character_,
is.na(!!start_date) ~ "Y",
!!start_date >= !!trt_start_date & !!end_cond ~ "Y",
!!worsening_cond ~ "Y"
))

# Derive TRTEMFL based on conditions

dataset <- dataset %>%
mutate(
!!new_var := case_when(
is.na(!!trt_start_date) ~ NA_character_,
!!end_date < !!trt_start_date ~ NA_character_,
is.na(!!start_date) ~ "Y",
!!new_ae_cond == "Y" & !!end_cond ~ "Y", # new AE
!!worsening_cond ~ "Y" # worsened AE
)
) %>%
# Remove temporary variable
remove_tmp_vars()
}
6 changes: 5 additions & 1 deletion R/globals.R
Original file line number Diff line number Diff line change
Expand Up @@ -129,5 +129,9 @@ globalVariables(c(
"atoxgr_criteria_ctcv4",
"DTYPE",
"where", # this entry should be moved to @importFrom tidyselect once we use tidyselect 1.2.0
"tmp_col_type"
"tmp_col_type",
"num_records",
"worsen_date",
"prev_intensity",
"srfl"
))
Loading
Loading