New Analysis Class API #252

RHammond2 · 2023-08-08T21:38:01Z

This PR addresses a longstanding lack of consistency between the analysis classes, in particular the differences between what is passed through __init__() and what is passed through run(). The updated paradigm is as follows:

All run() and __init__() arguments can now be passed to __init__().
Only parameters that do not change which data are validated or checked for existence can be passed to run(). For example, the keyword argument wind_direction_col can be passed when initializing WakeLosses, but not to run(), because changing it changes the underlying data.
Changing reanalysis products is still allowed in run() because this is about the sensitivity to long term weather data, and not pertaining to the underlying wind power plant data.
Just as in v2, analysis classes do not operate on the PlantData object, but use the underlying data to create the permutations necessary for the focal analysis.

What this means is that run() can be used to iterate over permutations of an analysis on the same data. If we need to swap out any data, e.g. different wind direction columns or additional SCADA columns, that should be considered an entirely new analysis, and therefore a new analysis class object is required.
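The split between initialization-time and run-time arguments can be sketched as follows. This is a minimal illustration, not OpenOA's implementation: `SketchAnalysis`, its fields, the column name, and the toy calculation are all hypothetical. Only the rule that data-selecting arguments such as `wind_direction_col` belong to initialization comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class SketchAnalysis:
    """Hypothetical stand-in for an analysis class; not an OpenOA class."""

    data: dict                   # column name -> list of values
    wind_direction_col: str      # selects data, so it is fixed at initialization
    num_sim: int = 10            # tunable, so it may also be overridden in run()
    results: list = field(default_factory=list)

    def __post_init__(self):
        # Validation depends on which columns were requested, so it runs once here.
        if self.wind_direction_col not in self.data:
            raise KeyError(f"missing column: {self.wind_direction_col}")

    def run(self, num_sim=None):
        # run() only accepts parameters that leave the validated data untouched.
        if num_sim is not None:
            self.num_sim = num_sim
        values = self.data[self.wind_direction_col]
        column_mean = sum(values) / len(values)  # toy calculation
        self.results.append((self.num_sim, column_mean))
        return self.results[-1]


analysis = SketchAnalysis(
    data={"wind_dir_scada": [10.0, 20.0]},
    wind_direction_col="wind_dir_scada",
)
first = analysis.run()             # uses the initialization default
second = analysis.run(num_sim=50)  # same data, different permutation
```

In this sketch, pointing the analysis at a different column means constructing a second `SketchAnalysis`, which mirrors the "new analysis class object" requirement.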

Updated features:

openoa/schema/metadata.py:ANALYSIS_REQUIREMENTS contains all the available modifiers for running analyses. For example, there is now "WakeLosses-scada" and "WakeLosses-tower" to indicate which data will be used, depending on whether the wind direction comes from the SCADA or tower data.
analysis_class.plant is now a deep copy of the passed data object to ensure that no underlying data are updated. The only exception to analysis.plant remaining unchanged is PlantData.analysis_type, which is updated to include the analysis that will be performed.
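A rough sketch of how variant keys such as "WakeLosses-scada" could be resolved is shown below. The dictionary contents, the `requirement_key` helper, and the `wind_direction_source` parameter are hypothetical and chosen for illustration; the real mapping lives in openoa/schema/metadata.py and its structure may differ.

```python
# Hypothetical, heavily simplified stand-in for ANALYSIS_REQUIREMENTS.
# Only the two key names are taken from the PR description.
SKETCH_REQUIREMENTS = {
    "WakeLosses-scada": {"scada": ["wind_direction"], "tower": []},
    "WakeLosses-tower": {"scada": [], "tower": ["wind_direction"]},
}


def requirement_key(analysis: str, wind_direction_source: str) -> str:
    """Build and check the variant key for an analysis, e.g. "WakeLosses-scada"."""
    key = f"{analysis}-{wind_direction_source}"
    if key not in SKETCH_REQUIREMENTS:
        raise ValueError(f"unknown analysis variant: {key}")
    return key


key = requirement_key("WakeLosses", "tower")
required_tower_columns = SKETCH_REQUIREMENTS[key]["tower"]
```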

Additionally, this PR expands the analysis requirements settings per #241.


codecov-commenter · 2023-08-08T21:47:36Z

Codecov Report

Attention: 83 lines in your changes are missing coverage. Please review.

Comparison is base (2807556) 65.50% compared to head (7de60a8) 65.28%.

Additional details and impacted files

@@              Coverage Diff               @@
##           develop_v3     #252      +/-   ##
==============================================
- Coverage       65.50%   65.28%   -0.22%     
==============================================
  Files              29       29              
  Lines            4061     4197     +136     
==============================================
+ Hits             2660     2740      +80     
- Misses           1401     1457      +56

Files	Coverage Δ
openoa/analysis/eya_gap_analysis.py	`95.00% <100.00%> (+1.45%)`	⬆️
openoa/schema/metadata.py	`88.38% <100.00%> (+0.40%)`	⬆️
openoa/analysis/_analysis_validators.py	`71.05% <76.19%> (+2.63%)`	⬆️
openoa/analysis/electrical_losses.py	`77.08% <61.11%> (-2.77%)`	⬇️
openoa/analysis/turbine_long_term_gross_energy.py	`75.63% <72.41%> (-1.61%)`	⬇️
openoa/analysis/yaw_misalignment.py	`84.50% <85.71%> (-0.27%)`	⬇️
openoa/analysis/wake_losses.py	`86.97% <88.65%> (+0.11%)`	⬆️
openoa/analysis/aep.py	`84.78% <68.18%> (-1.20%)`	⬇️
openoa/schema/schema.py	`49.50% <0.00%> (-16.29%)`	⬇️


ejsimley

Hi Rob, I like the consistent format for initializing the analysis classes and specifying/changing parameters in the run methods! I have some minor questions and comments, but I think this is mostly ready to go. I will also update the docstrings for the wake losses class and/or change the default values for some of the arguments, since the defaults without UQ weren't always the average of the default range with UQ.

openoa/analysis/_analysis_validators.py

openoa/analysis/aep.py

ejsimley · 2023-08-31T17:18:28Z

openoa/analysis/aep.py

@@ -1003,7 +1098,10 @@ def run_AEP_monte_carlo(self):
            iav[n] = gross_lt_annual.std() / gross_lt_annual.mean()
            avail_pct[n] = avail_lt_losses
            curt_pct[n] = curt_lt_losses
-            lt_por_ratio[n] = (gross_lt.sum() / self._run.num_years_windiness) / gross_por.sum()
+            gross_por_sum = gross_por.sum()
+            if isinstance(gross_por_sum, pd.Series):


What was causing gross_por_sum to be a pd.Series sometimes?

In a future release, pandas will raise an error for the original usage when gross_por_sum is a single-element Series.
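For illustration, one way a `.sum()` call can return a Series rather than a float is when it is applied to a one-column DataFrame; whether that is what happened in aep.py is an assumption, since the thread does not say. The sketch below uses a hypothetical helper, `as_scalar`, to unwrap such a result explicitly with `.iloc[0]` instead of relying on implicit conversion.

```python
import pandas as pd


def as_scalar(value):
    """Return a plain scalar from a scalar or a single-element Series."""
    if isinstance(value, pd.Series):
        if len(value) != 1:
            raise ValueError(f"expected one element, got {len(value)}")
        return value.iloc[0]
    return value


# Summing a one-column DataFrame yields a one-element Series, not a float.
gross_por = pd.DataFrame({"energy": [1.0, 2.0, 3.0]})
gross_por_sum = as_scalar(gross_por.sum())
```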

ejsimley · 2023-08-31T17:24:29Z

openoa/analysis/aep.py

+        self,
+        num_sim: int,
+        reg_model: str = None,
+        reanalysis_products: list[str] = None,


It looks like changing reanalysis_products in run will work fine, but doesn't including this as an argument break the rule you mention that "Only parameters that do not change the nature of what data is validated/checked for existence can be passed to run()"?

That's a great point! I think I originally envisioned this being a subset of any that were originally provided, so if you add more, the method wouldn't work, but this was never actually implemented either.

I'm curious as to your thoughts here on whether it's an appropriate breaking of the paradigm or not. My own thoughts are that reanalysis data is already validated, so it's all there to be used, but at the same time it's still changing the nature of the analysis.
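The subset idea described above, which the reply notes was never implemented, could look roughly like this. The function name and signature are hypothetical, and the product names are only example strings.

```python
def resolve_reanalysis_products(validated, requested=None):
    """Return the products to use in run(), limited to those already validated."""
    if requested is None:
        # No override given: fall back to everything validated at initialization.
        return list(validated)
    unknown = [name for name in requested if name not in validated]
    if unknown:
        raise ValueError(f"not provided at initialization: {unknown}")
    return list(requested)


validated_products = ["era5", "merra2"]
subset = resolve_reanalysis_products(validated_products, ["era5"])
everything = resolve_reanalysis_products(validated_products)
```

Under this rule, run() could narrow the set of reanalysis products but never introduce one that was not validated at initialization.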

openoa/analysis/electrical_losses.py

openoa/analysis/turbine_long_term_gross_energy.py

ejsimley · 2023-08-31T19:22:40Z

openoa/analysis/turbine_long_term_gross_energy.py

@@ -159,19 +170,78 @@ def __attrs_post_init__(self):
        logger.info("Processing SCADA data into dictionaries by turbine (this can take a while)")
        self.sort_scada_by_turbine()

+    def finalize_reanalysis_products(self):


This seems to be a common function for some of the analysis classes. Not necessary, but could it be moved to the "_analysis_validators" module or a similar higher-level module?

We definitely could, and I think my main hangup is that it's not universally shared across the analysis classes, nor is it doing much to begin with, so I felt it was easier to just keep in the methods where it was needed. That said, I could be convinced if you think we should make another mixin class like with the FromDictMixin that manages the dictionary initialization processes.

@ejsimley I was thinking on this a bit more, and I created another tool in _analysis_validators to do this. Let me know what you think.

openoa/analysis/turbine_long_term_gross_energy.py

jordanperr · 2023-09-12T15:37:30Z

Two thoughts:

Thought 1: How will this new paradigm extend to the workflow where analysis classes can "pull" data from an underlying datastore? For example, you might want to use MonteCarloAEP schema to pull all data necessary for a MonteCarloAEP from an ENTR warehouse. But then after pulling all the data, you might want some way to "exclude" one of the columns, say, to compare its impact on the AEP.

One possibility would be to extract the underlying PlantData object and then create two new MonteCarloAEP analyses using the same PlantData. Would that work?

I'm trying to think through if there might be a benefit to preserving the ability to include/exclude data in run().

Thought 2: Are PlantData and Analysis classes immutable? They didn't use to be. Please verify that no option in the run() function will mutate the PlantData and Analysis class. Or if they do, it should be noted somehow. The user will expect any combination of PlantData and Analysis class to be "run" multiple times with different options.


RHammond2 · 2023-09-26T00:27:41Z

@jordanperr thanks for the comments! I updated the description of the PR, but just to put it all in one place:

Only analysis_type gets updated in PlantData from an analysis class, but the analysis.plant object is also created as a deep copy of the original, so you can create many new analysis classes to compare results of various modifications with ease.
Naturally, if you modify an analysis parameter, the analysis class itself will be updated, otherwise you'd not be able to rerun an analysis with a new variation of a parameter, but what isn't modified is the underlying data.
The only things excluded from the run() parameters are those that modify which data are validated, and therefore change the fundamental basis of the analysis, with the exception of reanalysis_products.
For pulling analysis requirements from a data store, we still don't explicitly support this outside of grabbing requirements schemas for the analysis types in openoa/schema/ or from ANALYSIS_REQUIREMENTS.

Let me know if I've missed anything here, or if you still have any reservations.
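The deep-copy behavior described in this reply can be sketched as below. `SketchPlant`, `CopyingAnalysis`, and the `drop_columns` argument are hypothetical stand-ins used to show the idea of comparing two analyses built from one data object; they are not OpenOA classes or parameters.

```python
import copy


class SketchPlant:
    """Hypothetical stand-in for PlantData."""

    def __init__(self, scada, analysis_type=None):
        self.scada = scada  # column name -> list of values
        self.analysis_type = list(analysis_type or [])


class CopyingAnalysis:
    """Hypothetical analysis that works on a private deep copy of the plant."""

    def __init__(self, plant, name, drop_columns=()):
        self.plant = copy.deepcopy(plant)
        # Only the copy records which analysis will be performed.
        self.plant.analysis_type.append(name)
        for column in drop_columns:
            self.plant.scada.pop(column, None)


plant = SketchPlant({"power": [1, 2], "temperature": [5, 6]})
baseline = CopyingAnalysis(plant, "MonteCarloAEP")
reduced = CopyingAnalysis(plant, "MonteCarloAEP", drop_columns=["temperature"])
```

Because each analysis owns its copy, excluding a column in one of them leaves both the original object and the other analysis untouched.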

RHammond2 added 19 commits August 3, 2023 14:20

add init parameters to run for aep

b01df55

handle None in rean prod and fix deprecation warning for 1-value series

2bc9946

update run docstring for new parameters and adjust ordering for class docstring

5d7987c

update changelog

04bd5de

support analysis schema variations

c3d9fbf

update changelog

1bac74b

fix issue with renamed naming convention and drop aep.

4a849e4

update wording of docstring

f612e69

update electrical losses with new analysis API framework

4a4de26

fix control flow bug to get tests passing

7b5f993

update naming convention for half-closed range validators

b3d55c5

fix spacing and docstring

6faa687

update renalysis product naming and convert to new analysis API

669f11c

fix merge conflict

5f85ea6

update wake analysis to new API and attribute naming conventions

e47b152

fix issues with wake losses not actually running bc of bad variable references

adccd04

fix typos

1f9ad8c

update wake losses to ensure the correct values are being tested compared to develop_v3

a962d31

remove print statements

683df38

RHammond2 added the v3 All updates and changes for OpenOA v3 label Aug 8, 2023

RHammond2 added this to the V3.0 milestone Aug 8, 2023

RHammond2 requested review from jordanperr and ejsimley August 8, 2023 21:38

RHammond2 mentioned this pull request Aug 17, 2023

Static yaw misalignment analysis method #249

Merged

ejsimley requested changes Sep 1, 2023


RHammond2 added 4 commits September 1, 2023 13:15

update typing

bd47a8c

update docstring

e49fb9e

fix typos

d51afb3

fix other typos

36ff84b

RHammond2 added 3 commits September 1, 2023 13:29

simplify control flow

c8ea2f3

fix bug in half-closed copypasta

bbed25c

fix merge conflicts

8f74ae5

RHammond2 added 15 commits September 22, 2023 14:17

use base data for tests every time

9bf1825

update plantdata validations

2066265

add new analysis classes and non-base cases into the schematics

255ddb3

update schema files for new settings

219921a

convert the yaw misalignment to the new api

7098619

add missing connectors for static yaw misalignment

4c56e1e

add attrs validator for PlantData and convert reanalysis check to single attrs validator

30b963a

Merge branch 'develop_v3' into enhancement/analysis_api

c3549f7

update the analysis requirements process for WakeLosses

00de16c

update valid tower test

e61a56d

update TowerMetaData.col_map to reflect changes

ab171bd

ensure the original plantdata object is not modified on analysis creation

820d027

update analysis requirements to reflect actual uses

d400c63

remove extra space

015f017

update the schemas and reference guide

0359601

RHammond2 added 8 commits September 26, 2023 12:38

update changelog with more details on new analysis api

18e3105

sunset 'M' for 'MS' due to lack of pandas support

8c163c2

remove unused function

57af10b

update Cubico for changes in OpenOA schema

c50365c

rerun example notebooks

42d90c2

update examples in the documentation

cefee1c

udpate schema files and tests for removed 'M' in place of 'MS'

7de60a8

default to using full data pipeline instead pre-cleansed data

f426744

RHammond2 merged commit 83ef14f into NREL:develop_v3 Sep 27, 2023
2 checks passed

RHammond2 deleted the enhancement/analysis_api branch September 27, 2023 20:42
