Summary
When predictor_type = "binary", the wrappers effectively require binary_predictor_prevalence to be supplied. If it is omitted, the call eventually fails in generate_predictors() with an error that predictor_prop must be provided.
Failing on a missing prevalence is defensible, but the package contract is not settled: should binary predictor prevalence be mandatory, or should the wrappers provide a sensible default?
Current behavior
- simulate_binary(), simulate_continuous(), and simulate_survival() accept binary_predictor_prevalence = NULL.
- For binary predictors, generate_predictors() errors when the prevalence is missing.
- As a result, the public interface currently suggests the argument is optional, while the implementation effectively requires it for binary predictors.
Decision needed
Choose one of these directions:
- Keep prevalence required for binary predictors. Then validate early in the wrappers and give a clear user-facing error.
- Add a default prevalence. Then document the default clearly and assess how sensitive the estimated sample size is to that choice.
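For the first direction, early validation could look roughly like the sketch below. The helper name check_binary_prevalence() and the message wording are hypothetical, and the requirement that values lie strictly between 0 and 1 is an assumption about what counts as a valid prevalence. It accepts either a scalar or a vector, since the issue does not say which the wrappers expect.

```r
# Hypothetical helper, not part of the package: fail early with a message
# that names the wrapper argument rather than the internal predictor_prop.
check_binary_prevalence <- function(predictor_type, binary_predictor_prevalence) {
  # Nothing to validate unless binary predictors were requested.
  if (!identical(predictor_type, "binary")) {
    return(invisible(NULL))
  }
  if (is.null(binary_predictor_prevalence)) {
    stop(
      "`binary_predictor_prevalence` must be supplied when ",
      "`predictor_type = \"binary\"`.",
      call. = FALSE
    )
  }
  # Assumption: a valid prevalence is numeric and strictly inside (0, 1).
  if (!is.numeric(binary_predictor_prevalence) ||
      anyNA(binary_predictor_prevalence) ||
      any(binary_predictor_prevalence <= 0 | binary_predictor_prevalence >= 1)) {
    stop(
      "`binary_predictor_prevalence` must contain values strictly between 0 and 1.",
      call. = FALSE
    )
  }
  invisible(binary_predictor_prevalence)
}
```

Each wrapper would call this before doing any other work, so the user sees the error in terms of the argument they actually passed.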
If we add a default
The implementation should include:
- one explicit default prevalence value
- documentation explaining why that default was chosen
- tests covering omitted prevalence for all three wrappers with binary predictors
- a brief sensitivity check or rationale showing the default does not create misleading sample-size recommendations
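If the default route is chosen, the resolution step might be sketched as follows. The constant DEFAULT_BINARY_PREVALENCE, its value of 0.5, and the helper resolve_binary_prevalence() are all placeholders for illustration; the actual default is exactly what this issue asks to decide and justify.

```r
# Placeholder value for illustration only; not a recommendation.
DEFAULT_BINARY_PREVALENCE <- 0.5

# Hypothetical helper: fall back to the single documented default only when
# binary predictors are requested and no prevalence was supplied.
resolve_binary_prevalence <- function(predictor_type,
                                      binary_predictor_prevalence = NULL) {
  if (identical(predictor_type, "binary") && is.null(binary_predictor_prevalence)) {
    # Tell the user that a default was applied so it is never silent.
    message(
      "`binary_predictor_prevalence` not supplied; using default of ",
      DEFAULT_BINARY_PREVALENCE, "."
    )
    return(DEFAULT_BINARY_PREVALENCE)
  }
  binary_predictor_prevalence
}

# Base-R checks of the fallback behaviour.
stopifnot(identical(suppressMessages(resolve_binary_prevalence("binary")), 0.5))
stopifnot(identical(resolve_binary_prevalence("binary", 0.2), 0.2))
stopifnot(is.null(resolve_binary_prevalence("continuous")))
```

Keeping the default in one named constant means the documentation, the tests for the three wrappers, and the sensitivity check all refer to the same value.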
Goal
Make the interface consistent: either binary predictor prevalence is required and validated clearly, or it has a documented default with justification.