Improvements for single cell proteomics DIANN #593
Optimize and add single-cell DIANN example
Caution: Review failed. The pull request is closed.
📝 Walkthrough
Adds a new GitHub Actions job to test single-cell DIANN datasets, two docs/config entries for PXD053464, and refactors DIA utility peptide handling to use
Changes
Sequence Diagram(s) (omitted)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pmultiqc/modules/common/dia_utils.py`:
- Around line 229-231: The selection of "Proteotypic" into report_data is
unguarded and will KeyError if the column is missing; update the code around
report_data (where df is filtered) to check for "Proteotypic" in df.columns and
if absent add a default (e.g., a column indicating all entries are proteotypic
or a sensible default value) or filter accordingly, and emit a warning/log via
the module logger; ensure changes reference the existing df and report_data
variables so downstream code continues to work with the same schema.
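A minimal sketch of the guard this comment asks for, assuming a pandas DataFrame and a module-level logger. The function name `select_report_columns` and the column list `REQUIRED_COLUMNS` are hypothetical and only illustrate the pattern; the actual selection in `dia_utils.py` may use different columns. The default value of 1 follows the follow-up commit message in this PR.

```python
import logging

import pandas as pd

log = logging.getLogger(__name__)

# Hypothetical column list for illustration; the real selection may differ.
REQUIRED_COLUMNS = ["Run", "Modified.Sequence", "Proteotypic"]


def select_report_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return the report columns, adding a default 'Proteotypic' if it is missing."""
    if "Proteotypic" not in df.columns:
        log.warning(
            "'Proteotypic' column not found; treating all peptides as proteotypic (1)."
        )
        # assign() returns a new DataFrame, so the caller's df is not mutated.
        df = df.assign(Proteotypic=1)
    # Downstream code sees the same schema whether or not the column existed.
    return df[REQUIRED_COLUMNS].copy()
```

With this shape, `report_data = select_report_columns(df)` yields the same columns for reports with and without `Proteotypic`, so later steps need no further branching.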
🧹 Nitpick comments (1)
docs/README.md (1)
245-245: Consider linking directly to the single-cell ZIP for consistency.
Most rows point to an archive file for one-click download; this one links to a folder. Linking to `03_SingleCell_Searches.zip` would align with the rest of the table.
✏️ Suggested update
-| Single cell (DIA-NN) | Single cell dataset | [Single cell Example](https://pmultiqc.quantms.org/PXD053464/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD053464_disable_hoverinfo/multiqc_report.html)) | [PXD053464 folder](https://ftp.pride.ebi.ac.uk/pride/data/archive/2024/08/PXD053464/) |
+| Single cell (DIA-NN) | Single cell dataset | [Single cell Example](https://pmultiqc.quantms.org/PXD053464/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD053464_disable_hoverinfo/multiqc_report.html)) | [03_SingleCell_Searches.zip](https://ftp.pride.ebi.ac.uk/pride/data/archive/2024/08/PXD053464/03_SingleCell_Searches.zip) |
Make the code robust when the Proteotypic column is not present in the DataFrame by conditionally adding it. If missing, log a warning and default all peptides to proteotypic (value 1).
Fix syntax error in dataframe column selection
Pull Request
Description
Brief description of the changes made in this PR.
Type of Change
Summary by CodeRabbit
New Features
Documentation
Improvements
Chores