Add QC `explained by Susie credible set in the region` #3464

d0choa · 2024-09-13T16:20:27Z

For the final credible set datasets, we need to be able to decide which credible sets are included in the outputs and which ones are discarded. For example, the same locus/region might have been fine-mapped through different methods, so we need an algorithm that decides which credible set will be included in the final set.

Because inclusion and validation both involve filtering rows in credible sets, we want to reuse this step to implement this logic. Also, parts of the validation (e.g., duplicate studyLocusId) would only make sense after the inclusion filter.

This ticket only covers one of the required QCs. PICS disambiguation would require a different QC flag.

QC - "Explained by SuSiE credible set in the region"

For every credible set
- if there is another credible set in the same STUDY and REGION as the credible set LEADVARIANT
  - if the lead does NOT have a flag reporting that the variant is not in the LD matrix
    - flag credible set as "Explained by SuSiE credible set in the region"
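A minimal plain-Python sketch of the rule above, for illustration only; it is not the gentropy implementation. The `CredibleSet` fields, the `QCReason` members and their strings, and `flag_explained_by_susie` are hypothetical names. It assumes that the "other" credible set must be SuSiE-derived (as the flag name suggests) and that "same region" means the lead variant falls inside the other credible set's region.

```python
from dataclasses import dataclass, field
from enum import Enum


class QCReason(Enum):
    # Hypothetical members; the real enum and its strings may differ.
    NOT_IN_LD_MATRIX = "Lead variant not found in LD matrix"
    EXPLAINED_BY_SUSIE = "Explained by SuSiE credible set in the region"


@dataclass
class CredibleSet:
    study_id: str
    chromosome: str
    lead_position: int
    region_start: int
    region_end: int
    method: str  # assumed labels, e.g. "SuSiE-inf" or "pics"
    qc: set = field(default_factory=set)


def flag_explained_by_susie(credible_sets):
    """Flag credible sets whose lead variant lies in a SuSiE region of the same study."""
    # Assumption: any method label starting with "susie" counts as SuSiE-derived.
    susie_sets = [cs for cs in credible_sets if cs.method.lower().startswith("susie")]
    for cs in credible_sets:
        # Skip leads already flagged as missing from the LD matrix.
        if QCReason.NOT_IN_LD_MATRIX in cs.qc:
            continue
        for other in susie_sets:
            if other is cs:
                continue  # a credible set cannot explain itself
            same_study = other.study_id == cs.study_id
            same_chromosome = other.chromosome == cs.chromosome
            lead_in_region = other.region_start <= cs.lead_position <= other.region_end
            if same_study and same_chromosome and lead_in_region:
                cs.qc.add(QCReason.EXPLAINED_BY_SUSIE)
                break
    return credible_sets
```

Read literally, the rule applies to every credible set, so two SuSiE credible sets sharing a region would flag each other in this sketch.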

A QC method needs to be added to StudyLocus, and the QC reason needs to be added to the enum.

The method must be called from the study_locus_validation step before checking the duplicated studyLocusIds.
The orchestration repo will then need an additional flag to run this validation.

d0choa · 2024-09-20T15:53:49Z

The implemented logic flags credible sets for later filtering if:
- any variant in the credible set overlaps with a locus fine-mapped using SuSiE
- the credible set is NOT based on SuSiE-Inf
- the credible set is NOT flagged as UNRESOLVED_LD (by LD clumping)
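The three conditions can be sketched as a single predicate. This is a plain-Python illustration, not the code in the linked pull request. It assumes the conditions are combined with AND, that overlap is checked within the same study (carried over from the ticket description), and that the method label is "SuSiE-inf". `CredibleSet`, `SusieLocus`, `overlaps` and `should_flag` are hypothetical names.

```python
from dataclasses import dataclass, field

UNRESOLVED_LD = "UNRESOLVED_LD"


@dataclass
class CredibleSet:
    study_id: str
    method: str  # assumed labels, e.g. "SuSiE-inf" or "pics"
    variants: set  # set of (chromosome, position) tuples
    qc: set = field(default_factory=set)


@dataclass(frozen=True)
class SusieLocus:
    study_id: str
    chromosome: str
    start: int
    end: int


def overlaps(cs, locus):
    """True if any variant of the credible set falls inside the SuSiE locus."""
    if cs.study_id != locus.study_id:  # assumption: same-study comparison only
        return False
    return any(
        chrom == locus.chromosome and locus.start <= pos <= locus.end
        for chrom, pos in cs.variants
    )


def should_flag(cs, susie_loci):
    """Apply the three conditions, assumed here to be combined with AND."""
    if cs.method.lower() == "susie-inf":  # SuSiE-based sets are never flagged
        return False
    if UNRESOLVED_LD in cs.qc:  # unresolved LD clumping results are left alone
        return False
    return any(overlaps(cs, locus) for locus in susie_loci)
```

Keeping the UNRESOLVED_LD check as its own early return makes it easy to drop or relax that condition when trying the filter on a realistic set of credible sets.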

I'm still trying to figure out the last condition regarding UNRESOLVED_LD. I'd like to see how things look after applying this filter to a realistic set of credible sets. We might have cases with an R^2 of 0.4 that are not in LDIndex, yet fine-mapping decides they are part of the credible set. We could generate duplicates in these not-so-extreme cases.

@addramir watch this space

d0choa added the Data (Relates to Open Targets data team) and Gentropy (Relates to the genetics ETL) labels Sep 13, 2024

addramir self-assigned this Sep 16, 2024

d0choa assigned d0choa and unassigned d0choa and addramir Sep 20, 2024

d0choa linked a pull request Sep 20, 2024 that will close this issue

feat: flag credible sets explained by SuSiE regions opentargets/gentropy#780

Open
