# Conceptual aspects of the analysis {#sec-conceptual}
The conceptual step involves ensuring compatibility between the proposed analysis and the decisions made in the planning phase, described in @sec-planning. In particular, this requires attention to the estimand and to whether the assumptions for causal inference can be met given the data available.
## Choosing the estimand
First, one needs to decide on the desired estimand. The choice of which portion of the population to target should be driven primarily by the intended use of the study, even if that use is not explicitly stated in the paper. For example, a study may examine the causal effect of the choice between traditional surgery and minimally invasive surgery on recovery time. The implied policies being compared are a policy in which traditional surgery is always used and a policy in which minimally invasive surgery is always used (even if the study does not intend to recommend such a sweeping policy). If a study is examining the effects of smoking on throat cancer, the implied policies being compared are a policy in which smokers continue smoking and one in which smokers no longer smoke. However, it may not be coherent to investigate a policy in which non-smokers are made to smoke.
The implied policy affects which estimand is chosen. For the first research question, the ATE might be of interest if there are few existing medical reasons to prefer one form of surgery to the other, and the ATO might be of interest if, for most patients, the choice is well understood, but there are some patients for whom a definitive choice is less clear (i.e., those at equipoise). For the second research question, which examines a harmful exposure, the ATT is of more interest because it corresponds to a realistic policy, i.e., getting smokers to quit. To help match the research question to the estimand, see @greiferChoosingEstimandWhen2021.
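
To make the difference between these estimands concrete, the following is a minimal sketch rather than a recommended analysis. It assumes a hypothetical population split into three strata, each with a made-up probability of receiving treatment and a made-up stratum-specific treatment effect, and it averages those effects over the population each estimand targets: everyone (ATE), the treated (ATT), or the overlap population, weighted by the product of the treatment probability and its complement (ATO). The names `STRATA`, `target_weight`, and `estimand_value` are illustrative only.

```python
# Hypothetical strata: (number of patients, P(treated) in the stratum,
# treatment effect in the stratum). All values are invented for illustration.
STRATA = [
    (100, 0.1, 1.0),
    (100, 0.5, 2.0),
    (100, 0.9, 4.0),
]


def target_weight(n, e, estimand):
    """Relative size of a stratum within the estimand's target population."""
    if estimand == "ATE":  # the full population
        return n
    if estimand == "ATT":  # only those who receive treatment
        return n * e
    if estimand == "ATO":  # the overlap population (those nearest equipoise)
        return n * e * (1 - e)
    raise ValueError(f"unknown estimand: {estimand}")


def estimand_value(strata, estimand):
    """Average the stratum-specific effects over the target population."""
    weights = [target_weight(n, e, estimand) for n, e, _ in strata]
    effects = [tau for _, _, tau in strata]
    return sum(w * tau for w, tau in zip(weights, effects)) / sum(weights)


for name in ("ATE", "ATT", "ATO"):
    print(name, round(estimand_value(STRATA, name), 3))
```

Because the effect differs across strata in this made-up example, the three estimands take different values even though the underlying data are identical. When the effect is the same for everyone, the three coincide and the choice matters less.
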
The target population of the estimand is not the only choice one must make. As previously described, one must also consider the time scale of the outcome, the scale on which the treatment effect is to be measured, how the treatment should be defined, etc. These choices are completely separate from the method of analysis but must be made beforehand so the analysis can proceed. They should be driven primarily by substantive concerns, e.g., the time scale at which the treatment is supposed to have an effect, the effect measure that will make the most sense for clinicians and stakeholders, the implied policy that is most realistic or informative, etc. In some cases, though, proceeding through the analysis will reveal that some of the chosen options cannot be upheld, and they can be respecified in a dynamic process that maximizes the utility of the resulting research while respecting the amount of information available in the data.
## Assessing assumptions
After selecting a quantity to target, one must assess the assumptions required for causal inference, described previously in @sec-planning. One must decide whether the data available are sufficient to satisfy the backdoor criterion, and, if so, which variables must be adjusted for and which must not be. For example, there may be a strong predictor of the outcome that is affected by the treatment; this variable should not be adjusted for because doing so would induce bias [@elwertEndogenousSelectionBias2014]. There may be a well-known confounder that is missing from the data because it was not collected in the database used; in that case, either it must be made clear that the resulting estimates have no valid causal interpretation (and therefore little utility for practice), or the research design can be changed to use a method, such as instrumental variables analysis, that relies on assumptions other than satisfaction of the backdoor criterion [@hernanInstrumentsCausalInference2006].
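
The warning about adjusting for a variable affected by the treatment can be illustrated with a small simulation. This is a sketch under assumed, made-up parameters, not an analysis from any real study: treatment `a` is randomized, a hypothetical post-treatment variable `m` is more likely under treatment, and the outcome `y` depends on both. In this setup the total effect of treatment is 1 + 2 × (0.8 − 0.2) = 2.2. The function names are illustrative only.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate (a, m, y) rows; every parameter here is hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = 1 if rng.random() < 0.5 else 0  # randomized treatment
        # m is affected by treatment: P(m = 1) is 0.8 if treated, 0.2 if not
        m = 1 if rng.random() < (0.8 if a else 0.2) else 0
        y = 1.0 * a + 2.0 * m + rng.gauss(0.0, 1.0)  # outcome
        rows.append((a, m, y))
    return rows


def diff_in_means(rows):
    """Mean outcome among the treated minus mean outcome among controls."""
    treated = [y for a, _, y in rows if a == 1]
    control = [y for a, _, y in rows if a == 0]
    return mean(treated) - mean(control)


def adjusted_for_m(rows):
    """Stratify on m and average the stratum-specific differences,
    weighting each stratum by its share of the sample."""
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[1] == level]
        total += len(stratum) / len(rows) * diff_in_means(stratum)
    return total


rows = simulate()
unadjusted = diff_in_means(rows)
adjusted = adjusted_for_m(rows)
print(f"unadjusted: {unadjusted:.2f}, adjusted for m: {adjusted:.2f}")
```

Because treatment is randomized here, the unadjusted difference in means should land near the total effect of 2.2, while the estimate that stratifies on `m` should land near 1.0, missing the part of the effect that operates through `m`. This sketch covers only the case where the post-treatment variable lies on the pathway from treatment to outcome; other structures can produce bias in other directions.
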
One must also assess whether consistency is met, i.e., whether there are no unmeasured versions of treatment. For example, if treatment is a binarized version of a truly continuous or multi-category variable, then the causal effect is not well defined and will not generalize to other categorization strategies. Using BMI as a treatment can incur this problem; defining treatment as having low vs. high BMI as determined by a cutoff assumes there is no difference between those with BMI just above the cutoff point and those with BMI far above the cutoff point. A similar problem can occur when using education as a treatment; education may be examined as having a college degree vs. not having a college degree, but each level includes a wide variety of amounts of education (e.g., the group of those having a college degree includes both people who graduated college and did no more schooling and those who went on to get PhDs, MDs, or MBAs). Instead, these variables might be considered as continuous treatments, with methods appropriate for that treatment type used.
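
The BMI example can be sketched numerically. The following assumes a hypothetical population with one person at each integer BMI from 18 to 40 and an invented, noise-free dose-response curve; neither comes from real data. It computes the "high vs. low BMI" contrast under two different cutoffs to show that the resulting number depends on where the cutoff is placed.

```python
from statistics import mean

# Hypothetical population: one person at each integer BMI from 18 to 40.
BMI_VALUES = range(18, 41)


def outcome(bmi):
    """Invented dose-response curve: the outcome rises faster at higher BMI."""
    return 0.1 * (bmi - 18) ** 2


def high_vs_low_contrast(cutoff):
    """Mean outcome for BMI at or above the cutoff minus mean outcome below it."""
    high = [outcome(b) for b in BMI_VALUES if b >= cutoff]
    low = [outcome(b) for b in BMI_VALUES if b < cutoff]
    return mean(high) - mean(low)


for cutoff in (25, 30):
    in_high_group = [outcome(b) for b in BMI_VALUES if b >= cutoff]
    print(
        f"cutoff {cutoff}: contrast = {high_vs_low_contrast(cutoff):.2f}, "
        f"outcomes within 'high' range from {min(in_high_group):.1f} "
        f"to {max(in_high_group):.1f}"
    )
```

The two cutoffs give different contrasts, and outcomes vary widely within the "high" group under either cutoff, so the binary comparison does not correspond to a single well-defined intervention. The contrast would also shift if the distribution of BMI within each group changed, even with the same curve and cutoff.
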
Once the assumptions are determined to be met, one can proceed with the analysis of the data.