Fixed bug--issue #92 and adds last sql run--issue #93 #94

Merged: 2 commits, Feb 16, 2025

shansen5
Collaborator

Fixed year matching for Greenhouse Gas Emissions and CWA Violations data sets. (Issue #92)
Added DataSet.last_sql to be able to see the last request. (Issue #93)
Included a test_programs.py that will try all data sets at least once.
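
For readers unfamiliar with the idea behind DataSet.last_sql, here is a minimal sketch of recording the most recent query on the object. The class name DataSetSketch, the fetch_by_state method, and the run_query callable are all hypothetical stand-ins for illustration; this is not the ECHO_modules implementation.

```python
class DataSetSketch:
    """Hypothetical stand-in for DataSet; not the ECHO_modules class."""

    def __init__(self, table_name, run_query):
        self.table_name = table_name
        # Assumption: run_query is any callable that takes SQL text and returns rows.
        self._run_query = run_query
        # SQL text of the most recent request, or None if nothing has run yet.
        self.last_sql = None

    def fetch_by_state(self, state):
        sql = f'select * from "{self.table_name}" where "FAC_STATE" = \'{state}\''
        # Record the query before running it so it is available even if the call fails.
        self.last_sql = sql
        return self._run_query(sql)


# Usage with a fake query runner that always returns no rows.
ds = DataSetSketch("SDWA_VIOLATIONS_MVIEW", run_query=lambda sql: [])
rows = ds.fetch_by_state("NJ")
print("Last query:", ds.last_sql)
```

Keeping the text on the instance means a test run that returns zero records can print the exact query that produced them.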

shansen5 requested a review from ericnost on February 16, 2025, 18:32
@ericnost
Member

Thank you @shansen5! This is incredible.

When I run the tests, I see the following issue:

running test on Combined Air Emissions
Data last modified: 2025-01-19 00:00:00
There were 0 program records found
Found 0 program records.
Last query: select * from "COMBINED_AIR_EMISSIONS_MVIEW" where "FAC_STATE" = 'CO'
The DataFrame is empty. There is nothing to map.

I suspect the issue with COMBINED_AIR_EMISSIONS_MVIEW is the same one that affected its subsets, the TOXIC and GHG EMISSIONS MVIEWs.

I also see these related SDWA issues:

running test on SDWA Public Water Systems
Data last modified: 2022-01-14 00:00:00
There were 0 program records found
Found 0 program records.
Last query: select * from "SDWA_PUBLIC_WATER_SYSTEMS_MVIEW" where "FAC_STATE" = 'NJ'
The DataFrame is empty. There is nothing to map.

running test on SDWA Violations
Data last modified: 2022-01-14 00:00:00
/usr/local/lib/python3.11/dist-packages/ECHO_modules/get_data.py:33: DtypeWarning: Columns (12) have mixed types. Specify dtype option on import or set low_memory=False.
  ds = pd.read_csv(data_location,encoding='iso-8859-1')
There were 0 program records found
Found 0 program records.
Last query: select * from "SDWA_VIOLATIONS_MVIEW" where "FAC_STATE" = 'NJ'
The DataFrame is empty. There is nothing to map.

running test on SDWA Serious Violators
Data last modified: 2022-01-14 00:00:00
There were 0 program records found
Found 0 program records.
Last query: select * from "SDWA_SERIOUS_VIOLATORS_MVIEW" where "FAC_STATE" = 'NJ'
The DataFrame is empty. There is nothing to map.

However, it may not be worth looking more closely at the SDWA datasets, as they are out of date anyway (see #65). We need to update the database schema to pull from the new zip files and tables EPA has created for SDWA.

@ericnost
Member

The COMBINED_AIR_EMISSIONS_MVIEW issue could perhaps be addressed with a check in _apply_date_filter(self, program_data, years=None), such as: if self.table_name == "COMBINED_AIR_EMISSIONS_MVIEW", apply the same handling as for the TRI and GHG echo_type.
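
A rough sketch of that kind of check, written as a standalone function rather than the real method. It assumes the TRI and GHG handling compares a numeric year column against the requested range, which this thread does not confirm. The function signature, the YEAR_COLUMN_TABLES set, and the column names in the usage lines are hypothetical.

```python
import pandas as pd

# Assumption: tables whose date field holds a plain year rather than a full date.
YEAR_COLUMN_TABLES = {"COMBINED_AIR_EMISSIONS_MVIEW"}


def apply_date_filter(program_data, table_name, echo_type, date_field, years=None):
    """Keep rows whose year falls within years=(start, end), inclusive."""
    if years is None:
        return program_data
    start, end = years
    if echo_type in ("TRI", "GHG") or table_name in YEAR_COLUMN_TABLES:
        # The field is already a year; unparseable values become NaN and are dropped.
        year = pd.to_numeric(program_data[date_field], errors="coerce")
    else:
        # The field is a full date; extract its year for the comparison.
        year = pd.to_datetime(program_data[date_field], errors="coerce").dt.year
    return program_data[year.between(start, end)]


# Usage with made-up column names and values.
air = pd.DataFrame({"REPORTING_YEAR": [2018, 2021, 2023], "ANNUAL_EMISSION": [1.0, 2.0, 3.0]})
recent = apply_date_filter(
    air, "COMBINED_AIR_EMISSIONS_MVIEW", "", "REPORTING_YEAR", years=(2020, 2023)
)
print(recent)
```

Keying on the table name keeps the special case next to the existing echo_type test, so any other year-only table could be added to the same set.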

@shansen5
Collaborator Author

I've got the COMBINED_AIR_EMISSIONS issue fixed. I'll also add the SDWA fixes and resubmit.

@shansen5
Collaborator Author

These issues should be fixed now. Fixed SDWA and Combined Air Emissions date fields.

@ericnost
Member

Hooray, thank you Steve!

ericnost merged commit ef7aa99 into main on Feb 16, 2025