Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC: Modify single IV to accept nans #1296

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AshwinParanjape
Copy link

I modified the binary and continuous single IV case to see how it be made to work with partial data.
For binary IV, np.mean -> np.nanmean is sufficient
For continuous valued IV, I'm using np.ma (masked array) to mask the NaNs from covariance computations

Copy link

This PR is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale label Mar 14, 2025
@amit-sharma
Copy link
Member

thanks for adding this @AshwinParanjape (and for your patience--I somehow missed this PR).

Can you add a test case for this based on the referenced issue? Have also added an inline comment on informing the user about missing data.

Copy link
Member

@amit-sharma amit-sharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this fix. added a few comments.

y0_z = np.mean(data[self._target_estimand.outcome_variable[0]][instrument == 0])
x1_z = np.mean(data[self._target_estimand.treatment_variable[0]][instrument == 1])
x0_z = np.mean(data[self._target_estimand.treatment_variable[0]][instrument == 0])
y1_z = np.nanmean(data[self._target_estimand.outcome_variable[0]][instrument == 1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the motivation, but sometimes the user may not be aware of the missing data (especially when it may be due to a bug in their data processing code).
For this reason, can we issue a warning message as this nanmean operation is done?

the code structure can be: try np.mean operations, if it fails, then we issue a warning that "The dataset contains missing data, proceeding to ignore it", and then using nanmean as a fallback.
for warning, you can use, self.logger.warning("message")

@github-actions github-actions bot removed the stale label Mar 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants