FIX sensitivity_specificity_support: correct specificity with sample_weight (#1180) by immu4989 · Pull Request #1181 · scikit-learn-contrib/imbalanced-learn

immu4989 · 2026-06-15T00:10:21Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

sensitivity_specificity_support computes an incorrect specificity when sample_weight is provided — the value can even exceed 1, which is impossible for a rate. Because specificity_score, geometric_mean_score, classification_report_imbalanced and the make_index_balanced_accuracy wrappers all delegate to it, the bug propagates through the whole metrics module.

Root cause. When sample_weight is given, tp_sum, pred_sum and true_sum are weighted sums, but the true-negative count was formed using the raw sample count:

```python
tn_sum = y_true.size - (pred_sum + true_sum - tp_sum)
```

Mixing a count (y_true.size) with weighted sums makes tn_sum wrong — it can go negative — and the downstream specificity = tn_sum / (tn_sum + pred_sum - tp_sum) is then wrong as well.

```python
import numpy as np
from imblearn.metrics import sensitivity_specificity_support

_, spe, _ = sensitivity_specificity_support(
    [0, 0, 1, 1], [0, 1, 1, 1], sample_weight=[1.0, 2.0, 3.0, 4.0], average=None
)
print(spe)
# master: [1.         1.66666667]   <- 1.667 is impossible
# fixed:  [1.         0.33333333]   <- TN=1, FP=2 -> 1/(1+2)
```

Fix. Use the total sample weight as the population size when sample_weight is provided, falling back to y_true.size otherwise. The unweighted path is unchanged (every term scales uniformly there, so it was already correct).
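The arithmetic behind the reproducer can be traced by hand. The following is a minimal pure-Python sketch, not the library code: `weighted_terms` and `specificity` are hypothetical helper names introduced here for illustration. It recomputes the weighted sums for class 1 and evaluates the same `tn_sum` and specificity formulas with both population sizes.

```python
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
sample_weight = [1.0, 2.0, 3.0, 4.0]


def weighted_terms(label):
    """Return the weighted (tp_sum, pred_sum, true_sum) for one class."""
    rows = list(zip(y_true, y_pred, sample_weight))
    tp_sum = sum(w for t, p, w in rows if t == label and p == label)
    pred_sum = sum(w for _, p, w in rows if p == label)
    true_sum = sum(w for t, _, w in rows if t == label)
    return tp_sum, pred_sum, true_sum


def specificity(label, population):
    """Apply the tn_sum and specificity formulas for a given population size."""
    tp_sum, pred_sum, true_sum = weighted_terms(label)
    tn_sum = population - (pred_sum + true_sum - tp_sum)
    return tn_sum, tn_sum / (tn_sum + pred_sum - tp_sum)


# Raw sample count (4) mixed with weighted sums: the behaviour on master.
buggy_tn, buggy_spe = specificity(1, population=len(y_true))
# Total sample weight (10.0) as the population size: the proposed fix.
fixed_tn, fixed_spe = specificity(1, population=sum(sample_weight))

print(buggy_tn, buggy_spe)  # tn_sum is negative, specificity exceeds 1
print(fixed_tn, fixed_spe)  # tn_sum is 1.0, specificity is 1/3
```

For class 1 the weighted sums are tp_sum = 7, pred_sum = 9 and true_sum = 7, so the raw count gives tn_sum = 4 - 9 = -5 while the total weight gives tn_sum = 10 - 9 = 1.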

Scope: a module-wide audit, single root cause

I audited every sample_weight-aware metric in imblearn.metrics against one invariant: an integer sample_weight must give the same result as physically repeating each sample that many times. The results:

| Metric | Before fix | After fix |
| --- | --- | --- |
| `sensitivity_specificity_support` (sensitivity) | ✅ correct | ✅ correct |
| `sensitivity_specificity_support` (specificity) | ❌ wrong (negative TN, values > 1) | ✅ correct |
| `specificity_score` | ❌ wrong (delegates) | ✅ correct |
| `geometric_mean_score` | ❌ wrong, incl. `nan` (delegates) | ✅ correct |
| `make_index_balanced_accuracy(...)` wrappers | ❌ wrong (delegates) | ✅ correct |
| `classification_report_imbalanced` | ❌ wrong specificity column (delegates) | ✅ correct |
| `macro_averaged_mean_absolute_error` | ✅ correct (independent path) | ✅ correct |

So the entire weighted-metric bug class traces to this one line; there are no other independent weighting bugs in the module.
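The repeat-equivalence invariant used in the audit can be illustrated without the library. This is a rough sketch under stated assumptions: `weighted_specificity` and `repeat` are hypothetical helpers that compute specificity directly from its definition, TN / (TN + FP), rather than calling imblearn, and the data is the reproducer from above with integer weights.

```python
import math


def weighted_specificity(y_true, y_pred, label, sample_weight=None):
    """Specificity TN / (TN + FP) for one class, computed from its definition."""
    if sample_weight is None:
        sample_weight = [1] * len(y_true)
    rows = list(zip(y_true, y_pred, sample_weight))
    tn = sum(w for t, p, w in rows if t != label and p != label)
    fp = sum(w for t, p, w in rows if t != label and p == label)
    # Assumes at least one negative sample for `label`, so tn + fp > 0.
    return tn / (tn + fp)


def repeat(values, counts):
    """Physically repeat each value according to its integer weight."""
    return [v for v, c in zip(values, counts) for _ in range(c)]


y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
weights = [1, 2, 3, 4]

for label in (0, 1):
    weighted = weighted_specificity(y_true, y_pred, label, sample_weight=weights)
    repeated = weighted_specificity(
        repeat(y_true, weights), repeat(y_pred, weights), label
    )
    # Integer weights must behave exactly like repeated samples.
    assert math.isclose(weighted, repeated)
    # A rate must stay inside [0, 1].
    assert 0.0 <= weighted <= 1.0
    print(label, weighted)
```

Any weighted metric that satisfies this check for integer weights agrees with the unweighted computation on the expanded data, which is why the invariant is a convenient oracle for the whole delegation chain.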

Tests

The existing test_geometric_mean_sample_weight parametrization asserted the weighted value produced under the buggy behaviour (0.333). I corrected it to the right value (0.609), verified by an independent hand-computation of the weighted confusion matrix (per-class specificity [0.667, 0.5], sensitivity [1.0, 0.5]).
Added test_sensitivity_specificity_support_sample_weight, the reproducer for #1180 ("[BUG] sensitivity_specificity_support returns wrong specificity with sample_weight (can exceed 1)"). It asserts that specificity ≤ 1 and that integer weights match repeated samples.
Added a parametrized test_metrics_sample_weight_repeat_equivalence that enforces the repeat-equivalence invariant across the whole delegation chain (sensitivity, specificity, specificity_score, geometric_mean_score, the IBA wrappers) and every average mode (None/macro/weighted/micro), plus a [0, 1] rate-bound check, for 28 cases in total. This locks the fix in across the module, not just where the bug originated.
Full metrics suite passes locally: pytest imblearn/metrics/ → 236 passed (scikit-learn 1.9.0).

…weight (scikit-learn-contrib#1180)

When sample_weight is provided, tp_sum/pred_sum/true_sum are weighted sums but the true-negative count was formed as y_true.size - (pred_sum + true_sum - tp_sum), mixing a raw sample count with weighted sums. This makes tn_sum wrong (it can go negative), so the resulting specificity is incorrect and can exceed 1. specificity_score, geometric_mean_score and classification_report_imbalanced all delegate here, so their weighted results were affected too.

Use the total sample weight as the population size when sample_weight is given, falling back to y_true.size otherwise (unweighted path unchanged).

The existing test_geometric_mean_sample_weight parametrization asserted the weighted value produced under the buggy behaviour (0.333); it is corrected to the right value (0.609). A dedicated non-regression test is added that checks specificity never exceeds 1 and that integer weights match repeated samples.

…t-learn-contrib#1180)

Audit of the metrics module showed the specificity weighting bug propagated through every metric that delegates to sensitivity_specificity_support: specificity_score, geometric_mean_score, and the make_index_balanced_accuracy wrappers were all affected, while sensitivity and macro_averaged_mean_absolute_error were already correct.

Add a parametrized test asserting that integer sample_weight equals physically repeating each sample, across all of these metrics and every average mode, plus the rate-bound [0, 1] check. This locks in the fix across the whole delegation chain rather than only the function where the bug originated.

immu4989 added 2 commits June 14, 2026 19:09

immu4989 wants to merge 2 commits into scikit-learn-contrib:master from immu4989:fix/specificity-sample-weight-1180
