
Clarify or add default handling for binary predictor prevalence #79

@GForb


Summary

When predictor_type = "binary", the wrappers effectively require binary_predictor_prevalence to be supplied, even though their signatures default it to NULL. If it is omitted, the call eventually fails in generate_predictors() with an error that predictor_prop must be provided.

This behavior is valid, but the package contract is not settled: should binary predictor prevalence be mandatory, or should the wrappers provide a sensible default?

Current behavior

  • simulate_binary(), simulate_continuous(), and simulate_survival() accept binary_predictor_prevalence = NULL.
  • For binary predictors, generate_predictors() errors when the prevalence is missing.
  • As a result, the public interface currently suggests the argument is optional, while the implementation effectively requires it for binary predictors.

Decision needed

Choose one of these directions:

  1. Keep prevalence required for binary predictors.
    Then validate early in the wrappers and give a clear user-facing error.

  2. Add a default prevalence.
    Then document the default clearly and assess how sensitive estimated sample size is to that choice.
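
Both directions can be sketched with a single helper that each wrapper would call before reaching generate_predictors(). This is only an illustration: resolve_binary_prevalence() and its default argument are hypothetical names, not part of the package, and the sketch assumes the prevalence is a single proportion strictly between 0 and 1 (the package may allow one value per predictor).

```r
# Hypothetical helper, not part of the package described in this issue.
# default = NULL models option 1 (required); a numeric default models option 2.
resolve_binary_prevalence <- function(predictor_type,
                                      binary_predictor_prevalence = NULL,
                                      default = NULL) {
  # Only binary predictors need a prevalence; pass other types through.
  if (!identical(predictor_type, "binary")) {
    return(binary_predictor_prevalence)
  }

  if (is.null(binary_predictor_prevalence)) {
    if (is.null(default)) {
      # Option 1: fail early with a clear user-facing error.
      stop(
        "`binary_predictor_prevalence` must be supplied when ",
        "`predictor_type = \"binary\"`.",
        call. = FALSE
      )
    }
    # Option 2: fall back to the documented default.
    binary_predictor_prevalence <- default
  }

  # Assumption: a single proportion strictly between 0 and 1.
  if (!is.numeric(binary_predictor_prevalence) ||
      length(binary_predictor_prevalence) != 1L ||
      is.na(binary_predictor_prevalence) ||
      binary_predictor_prevalence <= 0 ||
      binary_predictor_prevalence >= 1) {
    stop(
      "`binary_predictor_prevalence` must be a single number ",
      "strictly between 0 and 1.",
      call. = FALSE
    )
  }

  binary_predictor_prevalence
}
```

Under option 1 the wrappers would call the helper with default = NULL, so a missing prevalence is reported at the public interface instead of surfacing as a predictor_prop error from generate_predictors(). Under option 2 they would pass whichever default is eventually chosen.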

If we add a default

The implementation should include:

  • one explicit default prevalence value
  • documentation explaining why that default was chosen
  • tests covering omitted prevalence for all three wrappers with binary predictors
  • a brief sensitivity check or rationale showing the default does not create misleading sample-size recommendations
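
As a starting point for the sensitivity check, the variance of a binary predictor with prevalence p is p * (1 - p), which peaks at p = 0.5 and shrinks quickly for rare predictors. The sketch below tabulates that quantity for a few illustrative prevalences. It is a rough proxy only, not the package's sample-size calculation; a proper check would rerun the three wrappers across the same grid and compare the resulting sample-size estimates.

```r
# Illustrative grid of candidate prevalences (values chosen for this sketch).
prevalence <- c(0.05, 0.1, 0.2, 0.3, 0.5)

# Variance of a Bernoulli predictor: p * (1 - p), maximal at p = 0.5.
predictor_variance <- prevalence * (1 - prevalence)

# Variance relative to the balanced case (p = 0.5, variance 0.25).
relative_variance <- predictor_variance / 0.25

sensitivity <- data.frame(prevalence, predictor_variance, relative_variance)
print(sensitivity)
```

If a default near 0.5 is chosen, this suggests it represents the most favourable case for predictor variability, so users with rare binary predictors could see optimistic sample-size recommendations unless the documentation flags it.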

Goal

Make the interface consistent: either binary predictor prevalence is required and validated clearly, or it has a documented default with justification.

Metadata

Labels: easy, future (For a future release)
