Only mean centering PCA components #1271
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1271      +/-   ##
==========================================
- Coverage   89.86%   89.61%   -0.26%
==========================================
  Files          29       29
  Lines        4383     4392       +9
  Branches      725      727       +2
==========================================
- Hits         3939     3936       -3
- Misses        295      304       +9
- Partials      149      152       +3
```

☔ View full report in Codecov by Sentry.
More familiar with how PCA is used in the upstream sliding-window thermal denoising than how it's used in tedana, but I'll play out this train of thought and see if it's applicable.

Some will be familiar with the noise map estimate that comes out of the sliding-window noise level estimation. This can be strongly influenced by scanner B1- bias field correction, which makes images homogeneous in brightness within the brain but leaves the noise heteroscedastic. Each of those individual noise level estimates is, however, derived assuming that the noise level within the finite sliding-window patch is constant, which may be violated for strong bias fields / large patches. This can be addressed through use of a variance-stabilising transform, which rescales (and potentially non-linearly transforms) the data based on some prior estimate of the noise level map prior to PCA. My own denoising tool does this in an iterative fashion, using the noise estimate from the previous iteration to standardise variance in the current iteration. From a cursory inspection, I think you're trying to achieve a similar thing here, only that you're using the variance of the empirical data, which is likely not as suitable a reference magnitude for such a transformation as a principled noise level estimate. So permitting the propagation of a noise level estimate from a prior pipeline step through to inform this step may be an overall more robust approach. Where this is not implemented or unavailable, I don't have any fundamental objections to

One thing to watch out for is image voxels that are consistently zero-filled. I've recently seen data from multiple vendors where voxels outside the brain are filled with zeroes across all volumes, and this breaks PCA. Just be sure it's not the Z-transform dividing by zero and introducing NaNs before you even attempt PCA.

I also recall, when investigating demeaning & whitening for fixing MP-PCA issues, that it was somewhere advocated to do an iterative double-centering approach of demeaning across both axes multiple times, as otherwise there could be order-dependency issues.
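(Not from the PR, just a minimal sketch of the divide-by-zero concern above, assuming a voxels × time array and a hypothetical `safe_zscore` helper: constant or zero-filled voxels are left at zero rather than producing NaNs before PCA.)

```python
import numpy as np

def safe_zscore(data, axis=-1, eps=1e-8):
    """Z-score along `axis`, guarding against zero-variance voxels.

    Voxels that are constant across time (e.g. zero-filled outside the
    brain) would otherwise trigger a divide-by-zero and push NaNs into PCA.
    """
    mean = data.mean(axis=axis, keepdims=True)
    std = data.std(axis=axis, keepdims=True)
    safe_std = np.where(std > eps, std, 1.0)  # avoid dividing by ~0
    return np.where(std > eps, (data - mean) / safe_std, 0.0)

# Toy example: 1000 "voxels" x 200 timepoints, first 50 voxels all zero
data = np.random.randn(1000, 200)
data[:50] = 0.0
z = safe_zscore(data, axis=-1)
assert not np.isnan(z).any()
```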
I think the adaptive masking step should remove any voxels that are all zeroes, so we should be safe in this case.
The "variance normalize everything" just z-scores across the whole array with a single mean value and SD value. As far as I can recall, it doesn't have much effect since the mean across each column will be zero and the SD will be one from the previous z-score step. I'm not actually sure why we have it in the code. |
Ah OK. So having a first normalisation done independently per time series before anything else might be beneficial in the presence of a strong B1- bias field (whether intensity corrected or not), as it will help to level out differences in signal / noise level across space. I think normalising along the other axis in this case is going to correct for any temporal signal drift? But I'm skeptical about Z-transforming the entire data matrix in one go, i.e. in an axis-agnostic way.

I do have some limited recollection, seeded from this comment (unfortunately the link is dead), that normalising across each axis individually, then iterating over all axes again at least once, was beneficial for PCA. But I'm not having success finding documented justification for it. So unless someone can either find such justification, or is willing to try it out as a potential alternative solution to the problematic data that seeded the PR, I'm happy for the idea to be written off as out of scope.
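If anyone wants to try that out, here is a rough sketch of the iterative per-axis normalisation being described, assuming a voxels × time array; the helper name and iteration count are arbitrary, and this is not something tedana currently implements.

```python
import numpy as np

def iterative_axis_normalise(data, n_iter=3, eps=1e-8):
    """Alternately z-score along time (axis=1) and space (axis=0).

    Standardising one axis perturbs the other, so a single pass is
    order-dependent; a few alternating passes approximately standardise both.
    """
    out = np.asarray(data, dtype=float).copy()
    for _ in range(n_iter):
        for axis in (1, 0):
            out -= out.mean(axis=axis, keepdims=True)
            std = out.std(axis=axis, keepdims=True)
            out /= np.where(std > eps, std, 1.0)  # skip zero-variance slices
    return out
```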
As a brief update here, @marms7 and I are trying to figure out more quantitative tests of "betterness". We saw that mean centering without any variance scaling meant ICA nearly always converged (a good thing), but we weren't sure if the components were better. One metric we tested was a kappa-rho difference score. For every component, we calculated a kappa & rho value and the score was

Since ICA has multiple components, each with a KappaRho difference score, we also needed to figure out how to combine those values for each run. The figure shows taking the mean across components in green and the median in pink. Since each component represents a different amount of the overall variance of the data, cyan is the sum of each component's KappaRho difference score weighted by that component's variance explained. Particularly since we're testing ways to scale the variance of time series, this seems to be the most appropriate option, but we included the simpler values anyway. The solid lines are just mean centering the voxel time series, while the dashed lines also z-score (variance scale) the data.

The main observation is that, with just mean centering, the KappaRho difference scores decrease as the number of components increases. For z-scoring, with the variance-weighted sum, there's less of a relationship between the difference score and the number of components, which seems to be the desired outcome. Without variance weighting it also decreases, but that's because very-low-variance components have more mixed kappa & rho values, while also not contributing as much to the overall signal. There's a bit more investigation we plan to do on this, but we wanted to add an update to this PR.

In playing around with ICA convergence failures, @marms7 and I realized that not variance scaling the data pre- and post-PCA seems to remove almost all convergence failures and (n=1) might even give better final ICA components for denoising. This is a work in progress and we're still testing, but I wanted to share here in case others have thoughts.
Changes proposed in this pull request:
- `--mean_center_only` option that removes variance scaling and z-scoring of data
- `unit-variance` for whitening, to match the recommended default fastICA setting and to now match what we're doing in our single call to fastICA (@BahmanTahayori & @Lestropie: any reason this might be an issue with robustICA?) (a rough sketch follows below)
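A rough, hypothetical sketch of that combination using scikit-learn directly (not the tedana code path, and the array orientation is just for illustration): mean-center each time series without variance scaling, then let FastICA handle whitening with `whiten="unit-variance"`.

```python
import numpy as np
from sklearn.decomposition import FastICA

data = np.random.randn(1000, 200)  # toy samples x features array

# Mean centering only: no z-scoring / variance scaling of the time series
centered = data - data.mean(axis=1, keepdims=True)

# Whitening handled inside FastICA with the recommended default setting
ica = FastICA(n_components=20, whiten="unit-variance", max_iter=500, random_state=42)
sources = ica.fit_transform(centered)  # (n_samples, n_components)
mixing = ica.mixing_                   # (n_features, n_components)
```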