@yingli-NREL (Collaborator) commented Nov 18, 2025

related issue: #535

This PR updates Scout's current thermal load component data to the latest ComStock data.
ComStock vs. current Scout data comparison:

Key notes:

  1. The current Scout data contains extreme values, both high (>1) and low (<-1).
  2. Heating Window Conduction Load: values are high in the current Scout data (mean absolute difference = 0.50). Some building types (e.g., large office) show component loads >1 in the current Scout data.
  3. Heating Lights: values are low in the current Scout data (mean absolute difference = 0.59). Some building types (e.g., food sales) show component loads <-1 in the current Scout data.
  4. Floor component load: all values are zero in ComStock.
  5. Heating Ground Floor shows the largest sign difference: 99% of samples have mismatched signs (only 1% agree). Values are high in the current Scout data (>0.5) but negative in the ComStock data.
  6. Heating Ground Floor, Heating Non-Electric Equipment, Heating Floor, Cooling Window Conduction Load, Cooling Wall, Cooling Ground Floor, Cooling Non-Electric Equipment, Cooling Ventilation: large sign differences observed; more than half of the samples have mismatched signs.
  7. Cooling loads generally show smaller absolute differences than heating loads.

How to interpret these plots:

  1. Color shows heating vs cooling.
  2. Number shows building type.
  3. Sign change happens in quadrants II and IV.
  4. A large distance from the diagonal line means there was a large difference between new and old data.
  5. Components are normalized values and can be negative or <1.
[Figure: component_comparision]

More stats:
HEATING

| Variable | Mean_Difference | MAE | Sign_Agree_% | Correlation |
|---|---|---|---|---|
| WIND_COND | -0.48 | 0.50 | 100 | 0.13 |
| WIND_SOL | 0.37 | 0.37 | 100 | 0.11 |
| ROOF | -0.15 | 0.16 | 100 | 0.28 |
| WALL | -0.33 | 0.36 | 100 | -0.07 |
| INFIL | -0.02 | 0.17 | 91 | 0.37 |
| PEOPLE | 0.09 | 0.10 | 100 | 0.12 |
| GRND | -0.23 | 0.23 | 1 | -0.73 |
| EQUIP_ELEC | 0.16 | 0.17 | 68 | 0.32 |
| EQUIP_NELEC | 0.09 | 0.09 | 27 | -0.13 |
| FLOOR | -0.06 | 0.06 | 39 | NaN |
| LIGHTS | 0.58 | 0.59 | 99 | 0.26 |
| VENT | -0.02 | 0.32 | 100 | 0.22 |

COOLING

| Variable | Mean_Difference | MAE | Sign_Agree_% | Correlation |
|---|---|---|---|---|
| WIND_COND | 0.12 | 0.12 | 14 | -0.36 |
| WIND_SOL | -0.28 | 0.28 | 100 | 0.55 |
| ROOF | 0.05 | 0.07 | 72 | 0.78 |
| WALL | 0.10 | 0.10 | 36 | 0.82 |
| INFIL | 0.04 | 0.05 | 69 | -0.06 |
| PEOPLE | 0.01 | 0.04 | 100 | 0.74 |
| GRND | 0.12 | 0.13 | 33 | -0.12 |
| EQUIP_ELEC | 0.01 | 0.10 | 100 | 0.51 |
| EQUIP_NELEC | -0.04 | 0.04 | 27 | 0.06 |
| FLOOR | 0.03 | 0.03 | 36 | NaN |
| LIGHTS | -0.46 | 0.46 | 100 | 0.54 |
| VENT | 0.29 | 0.29 | 6 | -0.32 |

Calculation methods:
Mean_Difference = (ComStock - ScoutCurrentData).mean()
MAE = (ComStock - ScoutCurrentData).abs().mean()
Sign_Agree_% = (sign(ComStock) == sign(ScoutCurrentData)).mean() * 100
Correlation = ComStock.corr(ScoutCurrentData)
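
For anyone who wants to reproduce the statistics, the four formulas can be sketched in pandas roughly as follows. This is a minimal sketch, not the script used for the tables: the function name `compare_component_loads` and the sample values are made up for illustration, and treating zero as its own sign (so a zero only agrees with another zero) is an assumption.

```python
import numpy as np
import pandas as pd


def compare_component_loads(comstock: pd.Series, scout: pd.Series) -> dict:
    """Summarize how ComStock values differ from the current Scout values."""
    diff = comstock - scout
    return {
        "Mean_Difference": diff.mean(),
        "MAE": diff.abs().mean(),
        # Share of samples whose signs match, as a percentage.
        # ASSUMPTION: np.sign maps 0 to 0, so a zero only "agrees" with a zero.
        "Sign_Agree_%": (np.sign(comstock) == np.sign(scout)).mean() * 100,
        # Pearson correlation; NaN when either series is constant.
        "Correlation": comstock.corr(scout),
    }


# Made-up illustrative values, not taken from the PR data
comstock = pd.Series([-0.2, -0.1, 0.3, 0.4])
scout = pd.Series([0.6, 0.5, 0.2, 0.1])
stats = compare_component_loads(comstock, scout)

# A constant series (e.g., all zeros) has no variance, so its correlation is NaN
all_zero = compare_component_loads(pd.Series([0.0] * 4), scout)
```

The last call shows why a component whose ComStock values are all zero, as reported for FLOOR, ends up with a NaN correlation.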

@yingli-NREL
Collaborator Author

Comment on lines 167 to 170
# Create NAREA column from calc.weighted.sqft..ft2
weight_column = df["calc.weighted.sqft..ft2"]
df["AREA"] = weight_column
print("Number of None in BLDG:", df["AREA"].isna().sum())
Collaborator

Nice! 🙂👍🏾

@jmythms (Collaborator) left a comment

We need to regenerate the mseg_res_com_cz, mseg_res_com_emm, and mseg_res_com_state files. Otherwise Scout will not pick up our updates and the results will not change.

How to do this:

  1. Run python scout/com_mseg.py
    This should output mseg_res_com_cdiv.json.

  2. Generate the final aggregated files:
    python scout/final_mseg_converter.py

    Select options 1,1 when prompted; this should output mseg_res_com_cz.json. Then run it again with options 1,2 and 1,3 to generate mseg_res_com_emm and mseg_res_com_state. All three files need to be updated in the repo.
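
If it helps, the regeneration steps above could be scripted roughly like this. It is only a sketch under assumptions: it assumes the converter reads its two menu answers from standard input, one per line, and that com_mseg.py needs no input. Neither is confirmed in this thread, so check the prompts before relying on it. The names `REGEN_STEPS` and `run_regen` are made up for illustration.

```python
import subprocess

# Commands and menu options as listed in the steps above.
# ASSUMPTION: each "1,N" option pair is answered as two stdin lines.
REGEN_STEPS = [
    (["python", "scout/com_mseg.py"], None),                  # -> mseg_res_com_cdiv.json
    (["python", "scout/final_mseg_converter.py"], "1\n1\n"),  # -> mseg_res_com_cz
    (["python", "scout/final_mseg_converter.py"], "1\n2\n"),  # -> mseg_res_com_emm
    (["python", "scout/final_mseg_converter.py"], "1\n3\n"),  # -> mseg_res_com_state
]


def run_regen(steps=REGEN_STEPS, runner=subprocess.run):
    """Run each step in order, stopping at the first command that fails."""
    for cmd, answers in steps:
        # check=True raises CalledProcessError on a non-zero exit code
        runner(cmd, input=answers, text=True, check=True)
```

Calling `run_regen()` from the repository root would run all four commands in sequence; the `runner` parameter exists only so the plan can be dry-run with a stand-in function.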

Otherwise, the changes look very nice, and great job generating the component loads file. Looking forward to seeing the results of the QA work.

Comment on lines 322 to 359
def add_missing_building_type(df):
    avg_cols = list(set(COMSTOCK_SEGMENT_TO_CATEGORY.values()))
    result_list = []

    # Iterate over CDIV × ENDUSE combinations
    for cdiv in df['CDIV'].unique():
        for enduse in df['ENDUSE'].unique():
            subset = df[(df['CDIV'] == cdiv) & (df['ENDUSE'] == enduse)]

            # Establish rows for "Assembly" building type as an average of the rows
            # for "Education", "Sm. Office", and "Merch./Service"
            assembly_avg = subset[subset['BLDG'].isin([2, 8, 9])][avg_cols].mean().round(4)
            # Establish rows for "Other" building type as an average of the rows
            # for "Lodging", "Lg. Office", and "Warehouse"
            other_avg = subset[subset['BLDG'].isin([6, 7, 10])][avg_cols].mean().round(4)
            # Establish rows for "food sales" building type as an average of the rows
            # for "food service" and "mercantile/service"
            food_sales_avg = subset[subset['BLDG'].isin([4, 9])][avg_cols].mean().round(4)

            for bldg in subset['BLDG'].unique():
                block = subset[subset['BLDG'] == bldg].copy()

                if bldg == 1:
                    for col in avg_cols:
                        block[col] = assembly_avg[col]
                elif bldg == 11:
                    for col in avg_cols:
                        block[col] = other_avg[col]
                elif bldg == 3:
                    for col in avg_cols:
                        block[col] = food_sales_avg[col]

                result_list.append(block)

    final_df = pd.concat(result_list, ignore_index=True)

    return final_df
Collaborator

I suggest adding a docstring to clarify the rationale for this function, so it’s easier to understand when we revisit this PR.
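
As a starting point, a docstring along these lines might work. The sketch below is a simplified rewrite for illustration only, not the PR's code: `AVG_COLS` is a hypothetical stand-in for the columns derived from `COMSTOCK_SEGMENT_TO_CATEGORY`, the `PROXY_SOURCES` mapping is copied from the code comments in the function, and the rationale line is left for the author because it is not stated in this thread. Unlike the original, this version keeps the input row order.

```python
import pandas as pd

# Hypothetical stand-in for list(set(COMSTOCK_SEGMENT_TO_CATEGORY.values()))
AVG_COLS = ["WALL", "ROOF"]

# Target BLDG code -> source BLDG codes to average (from the code comments)
PROXY_SOURCES = {
    1: [2, 8, 9],    # Assembly <- Education, Sm. Office, Merch./Service
    11: [6, 7, 10],  # Other <- Lodging, Lg. Office, Warehouse
    3: [4, 9],       # Food sales <- food service, mercantile/service
}


def add_missing_building_type(df, avg_cols=AVG_COLS):
    """Fill component loads for building types approximated by proxies.

    For each CDIV x ENDUSE combination, the rows for Assembly (1),
    Other (11), and Food sales (3) are overwritten with the mean of
    their source building types (see PROXY_SOURCES), rounded to 4
    decimals. All other rows are returned unchanged.

    Rationale: <to be filled in by the author>
    """
    out = df.copy()
    for (cdiv, enduse), subset in df.groupby(["CDIV", "ENDUSE"]):
        for target, sources in PROXY_SOURCES.items():
            mask = (
                (out["CDIV"] == cdiv)
                & (out["ENDUSE"] == enduse)
                & (out["BLDG"] == target)
            )
            if not mask.any():
                continue
            # Averages come from the untouched input, never from filled rows
            avg = subset.loc[subset["BLDG"].isin(sources), avg_cols].mean().round(4)
            for col in avg_cols:
                out.loc[mask, col] = avg[col]
    return out


# Made-up illustrative data: one CDIV x ENDUSE group
demo = pd.DataFrame({
    "CDIV": [1, 1, 1, 1],
    "ENDUSE": ["HEAT"] * 4,
    "BLDG": [1, 2, 8, 9],
    "WALL": [9.0, 0.1, 0.2, 0.3],
    "ROOF": [9.0, 0.3, 0.3, 0.6],
})
filled = add_missing_building_type(demo)
```

In the demo, the Assembly row (BLDG 1) takes the average of BLDG 2, 8, and 9, while the source rows keep their values.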

Comment on lines 165 to 167
# Create weight column
weight_column = df["weight"]
df["weight"] = weight_column
Collaborator

Did you encounter any rows in this column with no values like NAs, blank rows, etc?
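
A quick way to check could look like the sketch below. The helper name `count_missing` and the sample values are made up for illustration; in the PR it would be called on `df["weight"]`.

```python
import pandas as pd


def count_missing(column: pd.Series) -> dict:
    """Count NaN/None entries and blank or whitespace-only strings."""
    is_na = column.isna()
    # Blank strings are not NaN, so they need a separate check
    is_blank = column.map(lambda v: isinstance(v, str) and v.strip() == "")
    return {"na": int(is_na.sum()), "blank": int(is_blank.sum())}


# Made-up illustrative values, not taken from the PR data
weights = pd.Series([1.5, None, "", "  ", 2.0], dtype="object")
missing = count_missing(weights)
```

Counting blanks separately matters mainly when the column was read in as text; a purely numeric column would only ever show up in the `na` count.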
