@groberts-flex
Contributor

@groberts-flex groberts-flex commented Jan 5, 2026

In my test case, I am seeing about a 1.5x memory reduction, which is helpful for very large geometry groups that have large monitors and many frequency points.

Greptile Summary

Optimized memory usage in adjoint gradient postprocessing by replacing label-based array slicing (sel) with integer-based indexing (isel). The _slice_field_data function now detects contiguous frequency indices and creates zero-copy views instead of copies, achieving ~1.5x memory reduction for large geometry groups with extensive monitors and frequency points.

  • Refactored _slice_field_data in backward.py:96-148 to use get_indexer and check for contiguous indices before slicing
  • Added comprehensive performance test infrastructure with CPU and memory profiling capabilities
  • Created pytest markers (postprocess_adj_generate, postprocess_adj_profile) to control execution of performance tests that require simulation runs
  • Test dataset generator creates realistic workload (35×35 cylinder arrays) to measure optimization impact
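A minimal sketch of the contiguity check described above, assuming a plain `xarray.DataArray` with an `f` dimension. The helper name `slice_by_freq` and the toy data are hypothetical and are not the code in `backward.py`; the sketch only illustrates why a contiguous run of integer positions can be taken as a slice (a view) while a scattered selection needs a copy.

```python
import numpy as np
import xarray as xr


def slice_by_freq(data: xr.DataArray, freqs) -> xr.DataArray:
    """Select `freqs` along "f" by integer position (hypothetical helper)."""
    # Map frequency labels to integer positions; -1 marks a missing label.
    idx = data.get_index("f").get_indexer(np.atleast_1d(freqs))
    if idx.size == 0 or np.any(idx < 0):
        raise ValueError("Requested frequencies are not all present in the data.")
    # Consecutive positions can be expressed as a slice (basic indexing).
    if np.all(np.diff(idx) == 1):
        return data.isel(f=slice(int(idx[0]), int(idx[-1]) + 1))
    # Otherwise fall back to integer-array indexing, which copies.
    return data.isel(f=idx)


freqs = np.linspace(1.0e14, 2.0e14, 5)
field = xr.DataArray(
    np.arange(20.0).reshape(4, 5),
    coords={"x": np.arange(4), "f": freqs},
    dims=("x", "f"),
)

contiguous = slice_by_freq(field, freqs[1:4])        # positions 1, 2, 3
scattered = slice_by_freq(field, freqs[[0, 2, 4]])   # positions 0, 2, 4

# np.shares_memory is one way to check whether a result is a view.
print(np.shares_memory(contiguous.values, field.values))
print(np.shares_memory(scattered.values, field.values))
```

For NumPy-backed data, the first check should report shared memory and the second should not. Lazily loaded or Dask-backed arrays may behave differently.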

Confidence Score: 4/5

  • Safe to merge with minor style improvements suggested
  • The core memory optimization is sound and well-implemented with proper error handling. The contiguity check ensures zero-copy views when possible while falling back to copies when necessary. Performance testing infrastructure is comprehensive. Minor style issues include error message formatting and redundant length check.
  • No files require special attention - changes are focused and well-tested

Important Files Changed

| Filename | Overview |
| --- | --- |
| tidy3d/web/api/autograd/backward.py | Optimized _slice_field_data to use integer indexing (isel) with contiguity checks for memory-efficient array slicing, replacing label-based sel which created copies |
| tests/test_components/autograd/performance/test_shape_performance.py | Added performance test suite with CPU/memory profiling capabilities for postprocess_adj, using pytest markers to control test execution |
| tests/test_components/autograd/performance/postprocess_adj_utils.py | Added utilities for generating, persisting, and loading large-scale test datasets (35x35 cylinder arrays) for postprocess_adj performance testing |
| CHANGELOG.md | Added changelog entry documenting the memory performance improvement in postprocess_adj |
| pyproject.toml | Added pytest markers for controlling performance test execution: postprocess_adj_generate and postprocess_adj_profile |

Sequence Diagram

sequenceDiagram
    participant Caller as postprocess_adj caller
    participant PP as postprocess_adj
    participant Slice as _slice_field_data
    participant XR as xarray DataArray
    
    Caller->>PP: postprocess_adj(sim_data_adj, sim_data_orig, sim_data_fwd, sim_fields_keys)
    PP->>PP: Extract frequency chunks for processing
    loop For each frequency chunk
        PP->>Slice: _slice_field_data(field_components, freqs)
        Slice->>XR: get_index("f")
        XR-->>Slice: frequency index
        Slice->>Slice: get_indexer(freqs) → integer indices
        Slice->>Slice: Check if indices are contiguous (diff == 1)
        alt Contiguous indices
            Slice->>XR: isel(f=slice(start, stop))
            Note over XR: Zero-copy view (memory efficient)
            XR-->>Slice: View of data
        else Non-contiguous indices
            Slice->>XR: isel(f=array_of_indices)
            Note over XR: Copy required
            XR-->>Slice: Copy of data
        end
        Slice-->>PP: Sliced field data
        PP->>PP: Compute VJP for chunk
    end
    PP-->>Caller: AutogradFieldMap with gradients
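
The chunk loop in the diagram can be illustrated with a NumPy-only toy. Everything here is an assumption made for illustration: `toy_vjp` is a placeholder overlap sum rather than the real VJP, and the arrays stand in for forward and adjoint fields with frequency as the leading axis. The point is only that slicing a contiguous index range yields views, so temporaries scale with the chunk size rather than with the full frequency axis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq, n_pts = 12, 1000
# Stand-ins for forward and adjoint fields, frequency on the leading axis.
e_fwd = rng.standard_normal((n_freq, n_pts)) + 1j * rng.standard_normal((n_freq, n_pts))
e_adj = rng.standard_normal((n_freq, n_pts)) + 1j * rng.standard_normal((n_freq, n_pts))


def toy_vjp(fwd, adj):
    """Placeholder contribution: real part of the field overlap, summed over all axes."""
    return float(np.real(np.sum(fwd * adj)))


def chunked_vjp(fwd, adj, chunk_size):
    """Accumulate the placeholder contribution over contiguous frequency chunks."""
    total = 0.0
    for start in range(0, fwd.shape[0], chunk_size):
        stop = min(start + chunk_size, fwd.shape[0])
        # Basic slices along the leading axis are views of the original arrays.
        total += toy_vjp(fwd[start:stop], adj[start:stop])
    return total


full = toy_vjp(e_fwd, e_adj)
chunked = chunked_vjp(e_fwd, e_adj, chunk_size=5)
```

Because the placeholder is a plain sum over frequency, `full` and `chunked` agree up to floating-point rounding.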

Note

Improves memory efficiency and robustness of adjoint postprocessing.

  • Refactors postprocess_adj to use index-based frequency slicing: _slice_field_data now uses isel with validated slices; chunking switches from value-based sel to index slices; forward/adjoint datasets are sorted by f and frequency-aligned before processing
  • Updates tests to reflect index-based slicing and error behavior; adds a numerical test validating gradient consistency under varied frequency selections
  • Introduces performance profiling suite for postprocess_adj with dataset generation and CPU/memory profiling; adds pytest markers and config to control these runs
  • Updates CHANGELOG.md to note memory improvement
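
A small sketch of the sort-then-align step mentioned in the first bullet, assuming toy 1D `xarray.DataArray` objects. The variable names and the use of `np.intersect1d` are illustrative choices, not the approach taken in the PR. Once both arrays share an identical, sorted `f` coordinate, chunks can be taken as plain index ranges with `isel`.

```python
import numpy as np
import xarray as xr

# Toy forward and adjoint data with unsorted, partially overlapping frequencies.
fwd = xr.DataArray(np.arange(4.0), coords={"f": [3e14, 1e14, 2e14, 4e14]}, dims=("f",))
adj = xr.DataArray(10 * np.arange(3.0), coords={"f": [2e14, 3e14, 1e14]}, dims=("f",))

# Sort both along "f", then keep only the frequencies present in both.
fwd_sorted = fwd.sortby("f")
adj_sorted = adj.sortby("f")
common = np.intersect1d(fwd_sorted["f"].values, adj_sorted["f"].values)  # sorted, unique
fwd_aligned = fwd_sorted.sel(f=common)
adj_aligned = adj_sorted.sel(f=common)

# With identical, sorted "f" coordinates, chunks are plain index ranges.
chunk_size = 2
chunks = [
    slice(start, min(start + chunk_size, common.size))
    for start in range(0, common.size, chunk_size)
]
pairs = [(fwd_aligned.isel(f=s), adj_aligned.isel(f=s)) for s in chunks]
```

Each pair in `pairs` covers the same frequencies in the forward and adjoint data, so per-chunk processing no longer needs label lookups.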

Written by Cursor Bugbot for commit b93c56e.


@greptile-apps greptile-apps bot left a comment

6 files reviewed, 2 comments

@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch 2 times, most recently from a7121d1 to d6b5ab2 Compare January 6, 2026 19:45
@github-actions
Contributor

github-actions bot commented Jan 6, 2026

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

  • tidy3d/web/api/autograd/backward.py (100%)

Summary

  • Total: 34 lines
  • Missing: 0 lines
  • Coverage: 100%

@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch 2 times, most recently from 8346749 to f43988c Compare January 13, 2026 20:48
Contributor

@marcorudolphflex marcorudolphflex left a comment

Good catch and great to have that in.
Looks fine overall with some comments/questions left from my side.
I think it would be good to have a unit test for _slice_field_data that checks against the expected result and/or the legacy code. I saw some "weak" tests in test_frequency_coordinate_alignment, but they don't cover len(freqs) > 1.
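
For illustration only, a test of this kind might look like the sketch below. The helper `slice_isel` is a hypothetical stand-in for the index-based slicing, not the actual `_slice_field_data`, and the label-based `sel` call plays the role of the legacy behaviour. Both a contiguous and a scattered selection use more than one frequency.

```python
import numpy as np
import xarray as xr


def slice_isel(data, freqs):
    """Hypothetical stand-in for the index-based slicing under test."""
    idx = data.get_index("f").get_indexer(freqs)
    assert (idx >= 0).all(), "all requested frequencies must exist"
    return data.isel(f=idx)


def test_multi_freq_slice_matches_label_selection():
    freqs = np.linspace(1.0e14, 3.0e14, 6)
    data = xr.DataArray(
        np.random.default_rng(0).standard_normal((3, 6)),
        coords={"x": [0.0, 0.5, 1.0], "f": freqs},
        dims=("x", "f"),
    )
    # Cover a contiguous run and a scattered selection, both with len(freqs) > 1.
    for picked in (freqs[2:5], freqs[[0, 3, 5]]):
        expected = data.sel(f=picked)  # legacy, label-based selection
        xr.testing.assert_identical(slice_isel(data, picked), expected)
```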

Collaborator

@yaugenst-flex yaugenst-flex left a comment

LGTM! Can merge when Marco's comments are resolved.

@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch 2 times, most recently from eddc2db to 1effa96 Compare January 27, 2026 17:28
@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch from 1effa96 to ce57257 Compare January 27, 2026 21:03
@groberts-flex
Contributor Author

Thanks! @marcorudolphflex, let me know how the unresolved conversations sound to you. I also added a numerical test that should exercise the frequency slicing more thoroughly. Let me know if that looks good to you or if you want additional tests there. I ended up being able to simplify the frequency slicing by filtering the frequencies in the forward field.

@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch from ce57257 to ed4d120 Compare January 27, 2026 21:24
@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch from ed4d120 to e06df93 Compare January 27, 2026 23:27
@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch from e06df93 to 7c40948 Compare January 27, 2026 23:45

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

@groberts-flex groberts-flex force-pushed the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch from 7c40948 to b93c56e Compare January 28, 2026 01:25
@groberts-flex groberts-flex added this pull request to the merge queue Jan 28, 2026
Merged via the queue into develop with commit 83ba2e3 Jan 28, 2026
48 of 52 checks passed
@groberts-flex groberts-flex deleted the groberts-flex/FXC-4551-optimize-postprocess-adj-memory branch January 28, 2026 14:57

4 participants