-
Notifications
You must be signed in to change notification settings - Fork 70
optimize memory usage in postprocess_adj #3127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize memory usage in postprocess_adj #3127
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 files reviewed, 2 comments
a7121d1 to
d6b5ab2
Compare
Diff CoverageDiff: origin/develop...HEAD, staged and unstaged changes
Summary
|
8346749 to
f43988c
Compare
marcorudolphflex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch and great to have that in.
Looks fine overall with some comments/questions left from my side.
Think would not be bad to have an unit test on _slice_field_data against expectation and/or the legacy code? Saw some "weak" tests in test_frequency_coordinate_alignment, but they don't test on len(freqs) > 1.
tests/test_components/autograd/performance/test_shape_performance.py
Outdated
Show resolved
Hide resolved
tests/test_components/autograd/performance/postprocess_adj_utils.py
Outdated
Show resolved
Hide resolved
yaugenst-flex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! can merge when marco's comments are resolved
eddc2db to
1effa96
Compare
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Show resolved
Hide resolved
1effa96 to
ce57257
Compare
|
thanks! @marcorudolphflex let me know how the unresolved conversations sound to you. I also added an additional numerical test that should test the frequency slicing more - let me know if that looks good to you or if you want some additional tests there. I ended up being able to simplify the frequency slicing by filtering the frequencies in the forward field. |
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Outdated
Show resolved
Hide resolved
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Outdated
Show resolved
Hide resolved
ce57257 to
ed4d120
Compare
ed4d120 to
e06df93
Compare
e06df93 to
7c40948
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Outdated
Show resolved
Hide resolved
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Outdated
Show resolved
Hide resolved
tests/test_components/autograd/numerical/test_autograd_frequency_selection.py
Outdated
Show resolved
Hide resolved
7c40948 to
b93c56e
Compare
In my test case, I am seeing about a 1.5x memory reduction which is helpful for very large geometry groups that have large monitors and frequency points.
Greptile Summary
Optimized memory usage in adjoint gradient postprocessing by replacing label-based array slicing (
sel) with integer-based indexing (isel). The_slice_field_datafunction now detects contiguous frequency indices and creates zero-copy views instead of copies, achieving ~1.5x memory reduction for large geometry groups with extensive monitors and frequency points._slice_field_datainbackward.py:96-148to useget_indexerand check for contiguous indices before slicingpostprocess_adj_generate,postprocess_adj_profile) to control execution of performance tests that require simulation runsConfidence Score: 4/5
Important Files Changed
_slice_field_datato use integer indexing (isel) with contiguity checks for memory-efficient array slicing, replacing label-basedselwhich created copiespostprocess_adj, using pytest markers to control test executionpostprocess_adjperformance testingpostprocess_adjpostprocess_adj_generateandpostprocess_adj_profileSequence Diagram
sequenceDiagram participant Caller as postprocess_adj caller participant PP as postprocess_adj participant Slice as _slice_field_data participant XR as xarray DataArray Caller->>PP: postprocess_adj(sim_data_adj, sim_data_orig, sim_data_fwd, sim_fields_keys) PP->>PP: Extract frequency chunks for processing loop For each frequency chunk PP->>Slice: _slice_field_data(field_components, freqs) Slice->>XR: get_index("f") XR-->>Slice: frequency index Slice->>Slice: get_indexer(freqs) → integer indices Slice->>Slice: Check if indices are contiguous (diff == 1) alt Contiguous indices Slice->>XR: isel(f=slice(start, stop)) Note over XR: Zero-copy view (memory efficient) XR-->>Slice: View of data else Non-contiguous indices Slice->>XR: isel(f=array_of_indices) Note over XR: Copy required XR-->>Slice: Copy of data end Slice-->>PP: Sliced field data PP->>PP: Compute VJP for chunk end PP-->>Caller: AutogradFieldMap with gradientsNote
Improves memory efficiency and robustness of adjoint postprocessing.
postprocess_adjto use index-based frequency slicing:_slice_field_datanow usesiselwith validated slices; chunking switches from value-basedselto index slices; forward/adjoint datasets are sorted byfand frequency-aligned before processingpostprocess_adjwith dataset generation and CPU/memory profiling; adds pytest markers and config to control these runsCHANGELOG.mdto note memory improvementWritten by Cursor Bugbot for commit b93c56e. This will update automatically on new commits. Configure here.