@eurunuela
Collaborator

Summary

This PR refactors the monolithic tedana_workflow() function (~800 lines) into a modular pipeline architecture with distinct stages, improving maintainability, testability, and memory efficiency.

Changes

  • New PipelineContext dataclass (pipeline_context.py): Centralizes all workflow state (configuration, data arrays, masks, decomposition results) in a single container, reducing parameter passing between functions
  • New pipeline stages module (pipeline_stages.py): Organizes the workflow into 9 logical stages with 20+ self-contained functions
  • Refactored tedana_workflow(): Now creates a PipelineContext and delegates to run_tedana_pipeline(), reducing the function from ~800 to ~30 lines
  • Memory management: Added explicit memory-management methods: clear_intermediate_data() for cleanup with garbage collection, and get_memory_usage() for reporting memory use
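
The idea behind the context object can be sketched as follows. This is a minimal illustration, not the code in pipeline_context.py: SketchContext, its fields, and the bytearray stand-ins for image arrays are all assumptions made for the example. Only the method names clear_intermediate_data() and get_memory_usage() come from the list above.

```python
import gc
from dataclasses import dataclass, field, fields
from typing import Dict, Optional


@dataclass
class SketchContext:
    """Hypothetical, simplified stand-in for the PR's PipelineContext."""

    out_dir: str
    echo_times: list = field(default_factory=list)
    # Large buffers; bytearray stands in for image arrays in this sketch.
    raw_data: Optional[bytearray] = None
    optcom_data: Optional[bytearray] = None
    mixing: Optional[bytearray] = None

    # Plain class attribute (not a dataclass field): names of the buffers
    # this sketch treats as intermediate and safe to drop.
    _INTERMEDIATE = ("raw_data",)

    def get_memory_usage(self) -> Dict[str, int]:
        """Return the size in bytes of every buffer currently held."""
        usage = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, bytearray):
                usage[f.name] = len(value)
        return usage

    def clear_intermediate_data(self) -> None:
        """Drop intermediate buffers and ask the collector to reclaim them."""
        for name in self._INTERMEDIATE:
            setattr(self, name, None)
        gc.collect()


# Illustrative values only.
ctx = SketchContext(out_dir="out", echo_times=[14.5, 38.5, 62.5])
ctx.raw_data = bytearray(1024)
ctx.optcom_data = bytearray(256)

before = ctx.get_memory_usage()
ctx.clear_intermediate_data()
after = ctx.get_memory_usage()
```

Keeping the list of intermediate fields in one place means a stage never has to know which buffers are safe to release; it only calls the cleanup method.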

Pipeline Stages

| Stage | Functions | Purpose |
|-------|-----------|---------|
| 1 | setup_output_directory, validate_inputs, initialize_component_selector, setup_io_generator | Setup & initialization |
| 2 | load_data, validate_tr, load_external_regressors, handle_precomputed_files | Data loading |
| 3 | create_masks | Adaptive mask creation |
| 4 | fit_decay_model | T2*/S0 estimation |
| 5 | compute_optimal_combination | Echo combination |
| 6 | perform_pca_decomposition, perform_ica_decomposition, compute_component_metrics, perform_component_selection, run_decomposition_with_restarts | PCA/ICA decomposition |
| 7 | save_component_outputs, apply_tedort, write_denoised_data, save_registry_and_metadata, finalize_report_text | Output generation |
| 8 | generate_reports | HTML/figure generation |
| 9 | cleanup | Memory cleanup & teardown |
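
The staged layout above amounts to running a fixed sequence of functions over one shared context. The sketch below shows that pattern only. The three function names are borrowed from the table, but their bodies, the dict-based context, and run_pipeline() are placeholders invented for the example and say nothing about how run_tedana_pipeline() is actually written.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Stage = Callable[[Context], None]


def load_data(ctx: Context) -> None:
    # Placeholder body: a real stage would read the echo images from disk.
    ctx["data"] = [1.0, 2.0, 3.0]


def create_masks(ctx: Context) -> None:
    # Placeholder body: keep values above an arbitrary threshold.
    ctx["mask"] = [value > 1.5 for value in ctx["data"]]


def cleanup(ctx: Context) -> None:
    # Placeholder body: drop the intermediate data once it is no longer needed.
    ctx.pop("data", None)


STAGES: List[Stage] = [load_data, create_masks, cleanup]


def run_pipeline(ctx: Context, stages: List[Stage] = STAGES) -> Context:
    """Run each stage in order, recording which ones have completed."""
    for stage in stages:
        stage(ctx)
        ctx.setdefault("completed", []).append(stage.__name__)
    return ctx


result = run_pipeline({})
```

Because every stage shares the same signature, reordering or skipping a stage is a change to the list rather than to the orchestrating function.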

Benefits

  • Maintainability: Each stage is a self-contained function with clear inputs/outputs
  • Testability: Individual stages can be tested in isolation
  • Memory efficiency: Explicit cleanup of intermediate data with garbage collection
  • API enhancement: tedana_workflow() now returns PipelineContext for programmatic access to results (component table, mixing matrix, selector, etc.)
  • Backward compatibility: Identical function signature, same CLI behavior, same output files
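
To make the testability point concrete, here is a hedged sketch of testing one stage in isolation. MiniContext and validate_tr_sketch() are hypothetical stand-ins; the real validate_tr stage may check different conditions and take a different context type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniContext:
    """Hypothetical context holding only what this stage needs."""

    tr: Optional[float] = None  # repetition time in seconds (assumed field)


def validate_tr_sketch(ctx: MiniContext) -> MiniContext:
    """Illustrative stage: reject a missing or non-positive TR."""
    if ctx.tr is None or ctx.tr <= 0:
        raise ValueError(f"Invalid TR: {ctx.tr!r}")
    return ctx


def test_accepts_positive_tr() -> None:
    assert validate_tr_sketch(MiniContext(tr=2.0)).tr == 2.0


def test_rejects_zero_tr() -> None:
    try:
        validate_tr_sketch(MiniContext(tr=0.0))
    except ValueError:
        return
    raise AssertionError("expected ValueError for a zero TR")


# Run the checks directly; a test runner would discover them by name instead.
test_accepts_positive_tr()
test_rejects_zero_tr()
```

The test builds a context by hand with a single field set, so no image files or earlier stages are needed to exercise the check.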

File Changes

| File | Lines | Description |
|------|-------|-------------|
| pipeline_context.py | +428 | New: PipelineContext dataclass and factory function |
| pipeline_stages.py | +1019 | New: Modular pipeline stage functions |
| tedana.py | -660 | Refactored to use new modular structure |
| __init__.py | +6 | Added new exports |

Test Plan

  • All 130 unit tests pass
  • Linting passes (flake8, black)
  • CLI help command works correctly
  • Module imports verified
  • Integration testing with real multi-echo fMRI data (pending)
  • Memory profiling to verify improvements (pending)
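
One possible way to approach the pending memory profiling is the standard-library tracemalloc module. The sketch below is an assumption about how such a check could look, not something this PR contains: measure_peak_bytes() and allocate() are hypothetical helpers, and tracemalloc only sees allocations made through Python's memory allocators, so figures for native libraries may be incomplete.

```python
import tracemalloc


def measure_peak_bytes(func, *args, **kwargs):
    """Call func and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def allocate(n_bytes):
    # Stand-in workload: allocate a buffer and report its length.
    buffer = bytearray(n_bytes)
    return len(buffer)


result, peak = measure_peak_bytes(allocate, 1_000_000)
```

Running the same helper against the workflow on main and on this branch, with identical inputs, would give a like-for-like comparison of peak usage.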

🤖 Generated with Claude Code

Introduced PipelineContext and modularized the tedana workflow into discrete pipeline stages for improved maintainability and memory management. Added new modules pipeline_context.py and pipeline_stages.py, updated __init__.py to expose new workflow API, and refactored tedana_workflow to use the new context and orchestration function.
@eurunuela added the "refactoring" label (issues proposing/requesting changes to the code which do not impact behavior) on Dec 18, 2025
@eurunuela
Collaborator Author

I have found Claude Opus 4.5 to be incredibly good at refactoring code in the past, so I thought I'd have a go at using it to modularize our tedana.py workflow. Let me know what you think.

@codecov

codecov bot commented Dec 18, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 75 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.41%. Comparing base (ac50072) to head (453be91).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| tedana/workflows/pipeline_stages.py | 86.22% | 32 Missing and 10 partials ⚠️ |
| tedana/workflows/pipeline_context.py | 75.91% | 31 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1280      +/-   ##
==========================================
- Coverage   89.86%   89.41%   -0.46%     
==========================================
  Files          29       31       +2     
  Lines        4383     4584     +201     
  Branches      725      748      +23     
==========================================
+ Hits         3939     4099     +160     
- Misses        295      334      +39     
- Partials      149      151       +2     
