ypriverol (Member) commented Jan 22, 2026

Pull Request

Description

Brief description of the changes made in this PR.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Test addition/update
  • Updates to the dependencies have been made.

Summary by CodeRabbit

  • New Features

    • Added support for registering and analyzing single-cell DIA-NN datasets.
  • Documentation

    • Added example reports and config entries showcasing single-cell DIA-NN workflows.
  • Improvements

    • More accurate peptide and proteotypic handling in per-run statistics and visualizations.
  • Chores

    • Added end-to-end automated testing and artifact upload for single-cell DIANN datasets.

yueqixuan and others added 2 commits January 22, 2026 17:21
Optimize and add single-cell DIANN example
coderabbitai bot (Contributor) commented Jan 22, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Adds a new GitHub Actions job to test single-cell DIANN datasets, two docs/config entries for PXD053464, and refactors DIA utility peptide handling to use Modified.Sequence and Proteotypic fields.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CI workflow: /.github/workflows/python-app.yml | Adds test_single_cell_diann job: checks out code, sets up Python 3.10, installs package, downloads/unzips PXD053464 DIANN bundle, runs MultiQC with DIANN plugin, uploads results_single_cell_diann artifact. |
| Documentation & config: docs/README.md, docs/config.json | Adds two example report rows for "Single cell (DIA-NN)" and two docs/config.json project entries (PXD053464 and PXD053464_disable_hoverinfo) pointing to the same ZIP URL and file_type metadata. |
| DIA utilities refactor: pmultiqc/modules/common/dia_utils.py | Replaces internal sequence field usage with Modified.Sequence for peptide IDs, uses Proteotypic to determine unique peptides (with warning fallback), updates peptide/protein mapping and per-run statistics accordingly. |
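
The dia_utils.py change summarized in the table can be illustrated with a minimal pandas sketch. It assumes a DIA-NN report already loaded as a DataFrame with Run, Modified.Sequence, and Proteotypic columns. The function name per_run_peptide_stats and the output column names are hypothetical and are not taken from pmultiqc; the actual implementation may differ.

```python
import pandas as pd


def per_run_peptide_stats(report: pd.DataFrame) -> pd.DataFrame:
    """Count distinct and proteotypic peptides per run (hypothetical helper)."""
    # Several precursors (charge states) can share one modified peptide,
    # so keep a single row per (Run, Modified.Sequence) pair.
    peptides = report.drop_duplicates(subset=["Run", "Modified.Sequence"])

    # Proteotypic is assumed to be a 0/1 flag, so its sum is the number
    # of unique (proteotypic) peptides in each run.
    return peptides.groupby("Run").agg(
        peptides=("Modified.Sequence", "nunique"),
        proteotypic_peptides=("Proteotypic", "sum"),
    )


if __name__ == "__main__":
    example = pd.DataFrame(
        {
            "Run": ["A", "A", "A", "B"],
            "Modified.Sequence": ["PEPK", "PEPK", "AAAK", "PEPK"],
            "Proteotypic": [1, 1, 0, 1],
        }
    )
    print(per_run_peptide_stats(example))
```

Keying on Modified.Sequence rather than a stripped sequence means differently modified forms of the same peptide are counted separately, which is one plausible reading of the refactor described above.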

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

Review effort 4/5

Suggested reviewers

  • yueqixuan

Poem

🐰 Hopping through code with glee,
Single cells hum in harmony.
Proteotypic hops, sequences sing,
Tests and docs — a joyous spring.
Zip, unzip, report — carrots for me! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Improvements for single cell proteomics DIANN' directly relates to the main changes across the PR, which add comprehensive support for single-cell DIANN datasets including workflow jobs, documentation, configuration entries, and utility updates. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pmultiqc/modules/common/dia_utils.py`:
- Around line 229-231: The selection of "Proteotypic" into report_data is
unguarded and will KeyError if the column is missing; update the code around
report_data (where df is filtered) to check for "Proteotypic" in df.columns and
if absent add a default (e.g., a column indicating all entries are proteotypic
or a sensible default value) or filter accordingly, and emit a warning/log via
the module logger; ensure changes reference the existing df and report_data
variables so downstream code continues to work with the same schema.
🧹 Nitpick comments (1)
docs/README.md (1)

245-245: Consider linking directly to the single-cell ZIP for consistency.

Most rows point to an archive file for one-click download; this one links to a folder. Linking to 03_SingleCell_Searches.zip would align with the rest of the table.

✏️ Suggested update
-| Single cell (DIA-NN) | Single cell dataset | [Single cell Example](https://pmultiqc.quantms.org/PXD053464/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD053464_disable_hoverinfo/multiqc_report.html)) | [PXD053464 folder](https://ftp.pride.ebi.ac.uk/pride/data/archive/2024/08/PXD053464/) |
+| Single cell (DIA-NN) | Single cell dataset | [Single cell Example](https://pmultiqc.quantms.org/PXD053464/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD053464_disable_hoverinfo/multiqc_report.html)) | [03_SingleCell_Searches.zip](https://ftp.pride.ebi.ac.uk/pride/data/archive/2024/08/PXD053464/03_SingleCell_Searches.zip) |

claude and others added 2 commits January 22, 2026 14:46
  • Make the code robust when the Proteotypic column is not present in the DataFrame by conditionally adding it. If missing, log a warning and default all peptides to proteotypic (value 1).
  • Fix syntax error in dataframe column selection
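
The fallback described in the first commit message above could look roughly like the following sketch. The function name select_report_columns, the list of base columns, and the warning text are assumptions made for illustration; only the behavior (warn, then default Proteotypic to 1) comes from the commit message, and the merged code in dia_utils.py may be structured differently.

```python
import logging

import pandas as pd

log = logging.getLogger(__name__)


def select_report_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return the report columns, adding Proteotypic when it is absent."""
    # Hypothetical base selection; the real column list is not shown in the PR page.
    base_columns = ["Run", "Modified.Sequence", "Protein.Group"]

    if "Proteotypic" not in df.columns:
        log.warning(
            "Proteotypic column not found; treating all peptides as proteotypic."
        )
        # assign() returns a new DataFrame, so the caller's df is left untouched.
        df = df.assign(Proteotypic=1)

    # Downstream code sees the same schema whether or not the column existed.
    return df[base_columns + ["Proteotypic"]].copy()
```

Checking df.columns before selecting avoids the KeyError flagged in the review while keeping the schema of report_data stable for downstream code.
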
ypriverol merged commit d5e4d9f into main Jan 22, 2026
11 of 13 checks passed

4 participants