Automated EIA baseline comparison #554
- Introduced a GitHub Actions workflow for EIA update checks.
- Added a new script for comparing Scout microsegments with EIA AEO data.
- Updated documentation to include instructions for running the EIA update check.
- Enhanced .gitignore to exclude local environment variables.
- Included python-dotenv as a dependency for managing environment variables.
    for bldg in uv.bldg_class_translator.keys():
        for fuel in uv.fuel_type:
            for end_use in uv.end_use_translator.keys():
                filters = FilterStrings(bldg_class=bldg, fuel=fuel, end_use=end_use)
                compare_one_combination(mseg, filters, year, verbose, uv, api_key)

    # After all combinations, print summary information
    print_rollups()
    report_large_errors()
    report_zero_division_cases()
Crux of this script:
Go through each building class -> each fuel type -> each end use:
compare mseg vs. the EIA API.
Then print results and errors.
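A minimal sketch of that loop order, flattened with `itertools.product`. The list contents and the `compare` callback are hypothetical stand-ins for the `uv.*` lookups and `compare_one_combination`; only the building class -> fuel -> end use ordering comes from the diff.

```python
from itertools import product

# Hypothetical stand-ins for the uv.* lookups used in the script.
BLDG_CLASSES = ["residential", "commercial"]
FUELS = ["electricity", "natural gas"]
END_USES = ["heating", "cooling"]


def iter_combinations(bldg_classes, fuels, end_uses):
    """Yield (bldg, fuel, end_use) in the same order as three nested loops."""
    yield from product(bldg_classes, fuels, end_uses)


def run_all(compare):
    """Call compare() once per combination and return how many were run."""
    count = 0
    for bldg, fuel, end_use in iter_combinations(BLDG_CLASSES, FUELS, END_USES):
        compare(bldg, fuel, end_use)
        count += 1
    return count


seen = []
total = run_all(lambda b, f, e: seen.append((b, f, e)))
print(total, seen[0], seen[-1])
```

Summary reporting would then run once after the loop, as in the diff.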
    # 1. Aggregate JSON
    json_dict = recursive_aggregate(mseg, filters, uv)
This script also relies on recursion, in the same way as the EIA update scripts. Maybe a future PR could focus on unravelling this recursion for ease of understanding.
    # If EIA has data but the JSON aggregate is empty, treat this as a hard error.
    if json_is_empty and eia_dict:
        raise RuntimeError(
            "EIA data exists but JSON aggregate is empty for "
            f"{bldg} | {fuel} | {end_use} (series {series_id})."
        )
Currently, this branch is never hit. I left it as a hard error in case a JSON field is missing after a future EIA update.
    bldg = filters.bldg_class
    fuel = filters.fuel
    end_use = filters.end_use

    # Conditions used inside the ID string
    if bldg == "residential":
        condition_1 = uv.end_use_translator[end_use]
        condition_2 = "NA"
        if end_use == "heating" and fuel == "electricity":
            condition_3 = "hhd"  # special case for electric heating
        else:
            condition_3 = "NA"
    else:  # commercial
        # Commercial electricity is labeled as "Purchased Electricity" in API
        fuel_for_api = "Purchased Electricity" if fuel == "electricity" else fuel
        condition_1 = "NA"
        condition_2 = uv.end_use_translator[end_use]
        condition_3 = "NA"
        fuel = fuel_for_api

    eia_series_id = (
        f"cnsm_{condition_3}_"
        f"{uv.bldg_class_translator[bldg]}_"
        f"{condition_1}_"
        f"{uv.fuel_type_translator[fuel]}_"
        f"{condition_2}_usa_qbtu"
    )

    return eia_series_id
Secret sauce for EIA APIs for Scout.
Can you add a comment specifying what condition 1, 2, and 3 mean? Mostly so later on we don't need to go back to the EIA docs.
Added here 👍🏾
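For later readers, one possible reading of the three slots, inferred only from the diff above and not from the EIA docs: condition 1 carries the end-use code for residential series, condition 2 carries it for commercial series, and condition 3 is a qualifier that is only set for residential electric heating. A self-contained sketch under that reading follows; the code tables are made-up placeholders, not real EIA codes, and `build_series_id` is a hypothetical name.

```python
# Placeholder code tables: these values are invented for illustration and
# are NOT the real EIA codes held in uv.*_translator.
BLDG_CODES = {"residential": "RES", "commercial": "COM"}
FUEL_CODES = {"electricity": "ELEC", "Purchased Electricity": "PELEC",
              "natural gas": "GAS"}
END_USE_CODES = {"heating": "HEAT", "cooling": "COOL"}


def build_series_id(bldg, fuel, end_use):
    """Assemble a series ID string following the slot layout in the diff."""
    if bldg == "residential":
        condition_1 = END_USE_CODES[end_use]  # residential end-use slot
        condition_2 = "NA"                    # commercial end-use slot unused
        # Qualifier slot: only set for residential electric heating.
        is_electric_heat = end_use == "heating" and fuel == "electricity"
        condition_3 = "hhd" if is_electric_heat else "NA"
    else:  # commercial
        if fuel == "electricity":
            fuel = "Purchased Electricity"    # label used for commercial
        condition_1 = "NA"                    # residential end-use slot unused
        condition_2 = END_USE_CODES[end_use]  # commercial end-use slot
        condition_3 = "NA"
    return (
        f"cnsm_{condition_3}_{BLDG_CODES[bldg]}_{condition_1}_"
        f"{FUEL_CODES[fuel]}_{condition_2}_usa_qbtu"
    )


print(build_series_id("residential", "electricity", "heating"))
print(build_series_id("commercial", "electricity", "heating"))
```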
    if position is None:
        position = []

    energy_by_year: dict[str, float] = {}

    for key, value in data.items():
        if isinstance(value, dict) and key != "energy":
            # Keep going down one level
            sub_result = recursive_aggregate(value, filters, uv, position + [key])
            for yr, val in sub_result.items():
                energy_by_year[yr] = energy_by_year.get(yr, 0.0) + val
            continue

        if key != "energy":
            # Only interested in energy leaves
            continue

        path = position + [key]
        if len(path) < 5:
            # We expect: climate_zone / bldg_type / fuel / end_use / ... / energy
            continue

        _cz, bldg_type, fuel_name, eu_name, *rest = path
        subkey = rest[0] if rest else ""

        # Only consider the requested building class and fuel
        if bldg_type not in uv.all_bldg_types[filters.bldg_class]:
            continue
        if fuel_name != filters.fuel:
            continue

        # Decide whether to include this particular leaf based on end-use rules
        end_use = filters.end_use
        accept = False

        if eu_name == "other" and subkey in uv.other_end_uses and end_use == "other":
            accept = True
        elif eu_name == "ceiling fan" and end_use == "other":
            accept = True
        elif (
            eu_name == "other"
            and subkey in uv.separate_other_end_uses
            and end_use == subkey
        ):
            accept = True
        elif (
            eu_name in uv.heating_end_uses
            and end_use == "heating"
            and subkey == "supply"
        ):
            accept = True
        elif eu_name == "cooling" and end_use == "cooling" and subkey == "supply":
            accept = True
        elif eu_name in uv.remaining_end_uses and end_use == eu_name:
            accept = True
        elif (
            eu_name in ("other", "unspecified")
            and fuel_name != "electricity"
            and end_use == "other"
        ):
            accept = True
        elif (
            eu_name in ("MELs", "unspecified")
            and fuel_name == "electricity"
            and end_use == "other"
        ):
            accept = True
        elif eu_name == end_use and subkey == "energy":
            accept = True

        if not accept:
            continue

        # At this point we decided that this leaf contributes to our total
        for yr, val in value.items():
            energy_by_year[yr] = energy_by_year.get(yr, 0.0) + val
This function is how we unravel the mseg file. Again, this could become a non-recursive set of functions for much easier comprehension.
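As a rough idea of what a non-recursive version could look like, here is a sketch that walks the nested dict with an explicit stack and sums every accepted "energy" leaf by year. It is not the PR's implementation: the end-use rules are collapsed into a hypothetical `accept(path)` callback, and the sample data is invented.

```python
def aggregate_energy(data, accept=lambda path: True):
    """Sum all accepted 'energy' leaves by year using an explicit stack."""
    totals = {}
    stack = [((), data)]  # (path so far, dict to scan)
    while stack:
        path, node = stack.pop()
        for key, value in node.items():
            child_path = path + (key,)
            if key == "energy":
                # Energy leaves map year -> value; add them if accepted.
                if accept(child_path):
                    for yr, val in value.items():
                        totals[yr] = totals.get(yr, 0.0) + val
            elif isinstance(value, dict):
                # Defer the nested dict instead of recursing into it.
                stack.append((child_path, value))
    return totals


def supply_only(path):
    """Hypothetical rule: keep leaves shaped cz/bldg/fuel/end_use/supply/energy."""
    return len(path) > 4 and path[4] == "supply"


# Invented sample data shaped like the path described in the diff.
mseg = {
    "cz1": {"single family home": {"electricity": {"heating": {
        "supply": {"energy": {"2025": 1.0, "2030": 2.0}},
        "demand": {"energy": {"2025": 10.0}},
    }}}},
    "cz2": {"single family home": {"electricity": {"heating": {
        "supply": {"energy": {"2025": 0.5}},
    }}}},
}

totals = aggregate_energy(mseg, accept=supply_only)
print(totals)
```

Keeping the accept rule as a separate function means the traversal and the end-use logic can be read and tested independently.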
    MSEG_PATH = "scout/supporting_data/stock_energy_tech_data/mseg_res_com_cz.json"


    @dataclass(frozen=True)
TIL you can make a dataclass immutable with this decorator. Will keep in mind to check if it can be used more in the codebase/in ecm_prep.py or run.py.
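For reference, a small standalone demonstration of what `frozen=True` buys. The field names follow the `FilterStrings(...)` call in the diff; the `str` annotations are an assumption.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FilterStrings:
    # Field names taken from the diff; the types are assumed.
    bldg_class: str
    fuel: str
    end_use: str


filters = FilterStrings(bldg_class="residential", fuel="electricity",
                        end_use="heating")

try:
    filters.fuel = "natural gas"  # assignment is rejected on frozen instances
    mutated = True
except FrozenInstanceError:
    mutated = False

# Frozen dataclasses with the default eq=True are also hashable,
# so instances can serve as dict keys or set members.
lookup = {filters: "cached"}
print(mutated, lookup[FilterStrings("residential", "electricity", "heating")])
```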
        return api_key


    @on_exception(expo, Exception, max_tries=5)
Another cool decorator. If an exception is raised (like a timeout), the call is attempted up to 5 times in total, with exponentially increasing delays between attempts.
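To make the behaviour concrete, here is a standard-library sketch of a retry decorator with exponential delays. It is a stand-in written for illustration, not the decorator used in the PR: `retry_expo` and its parameters are hypothetical, and it omits the random jitter a real backoff library may add.

```python
import functools
import time


def retry_expo(max_tries=5, base=1.0, factor=2.0, sleep=time.sleep):
    """Retry the wrapped call on any Exception, waiting base * factor**n."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_tries:
                        raise  # out of attempts: let the error escape
                    sleep(base * factor ** (attempt - 1))
        return wrapper
    return decorator


delays = []          # records the waits instead of really sleeping
calls = {"n": 0}


@retry_expo(max_tries=5, sleep=delays.append)
def flaky():
    """Fail twice with a simulated timeout, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = flaky()
print(result, calls["n"], delays)
```

Passing `sleep` as a parameter keeps the example fast and makes the delay schedule easy to inspect.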
TODO: Done
    "security system",
    "portable electric spas",
    "smart speakers",
    "tablets",
Please tell me this is GPT hallucinating and this isn't actually a thing. Also smart speakers?!?!?
Actually a thing 😅
🎉
    dev = [
        "openpyxl",
        "flake8 >= 7.0",
        "tabulate",
I think this is still failing in the actions? Can you confirm that it is actually loading this dependency?
Confirmed
rHorsey
left a comment
Thanks so much Jeremy! Sadly for you I have one more request, which is that you add a way for this to actually fail fail. Basically, if things exceed / get worse than current levels I'd like this to throw a red x, not a green checkmark. Otherwise we're never going to notice if / when the comparisons get worse...
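One way this could be wired up, sketched under assumptions: compare each combination's percent error against a stored baseline and return a non-zero exit code when any of them gets worse, since a non-zero exit from a step is what turns a GitHub Actions run red. The function names, the tolerance, and the sample numbers are all hypothetical.

```python
import sys


def find_regressions(current_pct, baseline_pct, tolerance=0.5):
    """Return the combinations whose absolute error grew past the baseline."""
    regressions = {}
    for key, current in current_pct.items():
        allowed = abs(baseline_pct.get(key, 0.0)) + tolerance
        if abs(current) > allowed:
            regressions[key] = (current, allowed)
    return regressions


def exit_code(current_pct, baseline_pct, tolerance=0.5):
    """Report regressions on stderr and return 1 if there are any, else 0."""
    regressions = find_regressions(current_pct, baseline_pct, tolerance)
    for key, (current, allowed) in sorted(regressions.items()):
        print(f"REGRESSION {key}: {current:.1f}% exceeds {allowed:.1f}%",
              file=sys.stderr)
    return 1 if regressions else 0


# Invented numbers purely for illustration.
baseline = {("residential", "electricity", "heating"): 4.0}
ok_run = {("residential", "electricity", "heating"): 4.2}
bad_run = {("residential", "electricity", "heating"): 9.0}

print(exit_code(ok_run, baseline), exit_code(bad_run, baseline))
# A real entry point would finish with: sys.exit(exit_code(...))
```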
Like it love it! Thanks so much @jmythms !!! Many thanks also for commenting the EIA conditions - I wouldn't have guessed those in a hurry.
rHorsey
left a comment
🫡
This PR addresses #517 by: