
Analyzing water quality across ground and river sources in India using statistical models, evaluating parameters such as DO, TDS, and conductivity from 1,000+ data points, and identifying high-risk zones for potential conservation efforts.


Shiva-Khatter/-Statistical-Water-Quality-Analysis-Across-India


Statistical-Water-Quality-Analysis-Across-India

This project conducts a statistical analysis of groundwater quality across various states in India. Using Python and machine learning techniques, it processes water quality data, generates visualizations, and builds predictive models that classify water as "Safe" or "Unsafe" and predict specific water quality parameters (e.g., pH, DO, Conductivity, TDS).

This project aims to:

  • Analyze key water quality parameters such as pH, Conductivity, BOD, Nitrate, Faecal Coliform, Total Coliform, Total Dissolved Solids (TDS), and Fluoride.
  • Provide state-wise statistical summaries and visualizations.
  • Predict water quality safety using classification models (KNN and Random Forest).
  • Predict continuous water quality parameters (e.g., pH) using regression models (KNN Regressor).

The project leverages Python libraries like pandas, numpy, matplotlib, seaborn, and scikit-learn to process, visualize, and model the data.
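As a rough illustration of the state-wise statistical summaries listed above, the sketch below groups a small table by state and computes a few descriptive statistics with pandas. The column names (`State`, `pH`, `TDS`), the placeholder state labels, and all values are made up for the example and are not taken from the project's dataset; `statewise_summary` is a hypothetical helper, not a function from this repository.

```python
import pandas as pd

# Illustrative placeholder data; not values from the project's dataset
df = pd.DataFrame({
    "State": ["State A", "State A", "State B", "State B"],
    "pH": [7.0, 7.4, 8.0, 7.6],
    "TDS": [200.0, 300.0, 500.0, 700.0],
})


def statewise_summary(frame, state_col="State"):
    """Return mean, median, min and max per state for the remaining columns.

    Hypothetical helper: assumes every column other than `state_col` is numeric.
    """
    return frame.groupby(state_col).agg(["mean", "median", "min", "max"])


summary = statewise_summary(df)
print(summary)
```

The result has one row per state and a two-level column index (parameter, statistic), so a single value can be read with `summary.loc["State A", ("pH", "mean")]`.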

Features:

  • Data Cleaning: Handles missing values, erroneous entries (e.g., #DIV/0!), and converts data types for analysis.
  • Exploratory Data Analysis (EDA): Generates descriptive statistics, histograms, box plots, violin plots, and correlation heatmaps.
  • Visualization: Creates state-wise visualizations for water quality parameters using line, bar, scatter, and histogram plots.
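The data-cleaning step described above could look roughly like the following. This is a minimal sketch, assuming the raw file stores numbers as text and contains spreadsheet errors such as `#DIV/0!`; the column names, sample values, and the choice of median imputation are assumptions for illustration, and `clean_numeric` is a hypothetical helper rather than the project's own code.

```python
import pandas as pd

# Illustrative raw data: numbers stored as strings, with spreadsheet errors
raw = pd.DataFrame({
    "State": ["State A", "State A", "State B", "State B"],
    "pH": ["7.1", "#DIV/0!", "6.8", None],
    "TDS": ["250", "310", "#DIV/0!", "480"],
})


def clean_numeric(frame, numeric_cols):
    """Convert the given columns to floats and fill gaps with the column median."""
    out = frame.copy()
    for col in numeric_cols:
        # Entries that cannot be parsed (e.g. "#DIV/0!") become NaN
        out[col] = pd.to_numeric(out[col], errors="coerce")
        # Median imputation is one option among several (assumed here)
        out[col] = out[col].fillna(out[col].median())
    return out


clean = clean_numeric(raw, ["pH", "TDS"])
print(clean.dtypes)
print(clean)
```

Dropping incomplete rows with `dropna` instead of imputing is an equally reasonable choice when only a few rows are affected.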

Machine Learning:

  • Classification: Predicts water quality as "Safe" or "Unsafe" using KNN and Random Forest classifiers.
  • Regression: Predicts continuous parameters (e.g., pH) using KNN Regressor.
  • Evaluation: Reports accuracy, precision, recall, and F1-score for the classifiers, and MAE, MSE, and R² for the regressor.
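The modelling and evaluation steps above can be sketched with scikit-learn as follows. Everything about the data here is an assumption: the features are synthetic stand-ins for pH, TDS, and conductivity, the "Safe" label comes from an illustrative threshold rule (pH between 6.5 and 8.5 and TDS at most 500 mg/L) that may differ from the rule used in the project, and the regression target is TDS predicted from conductivity rather than pH, simply because the synthetic data has a clear relationship there.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score,
                             r2_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption, not the project's dataset)
rng = np.random.default_rng(0)
n = 300
ph = rng.uniform(5.5, 9.5, n)
tds = rng.uniform(50, 1500, n)
cond = 1.5 * tds + rng.normal(0, 50, n)
X = np.column_stack([ph, tds, cond])
# Assumed labelling rule, for illustration only: 1 = "Safe", 0 = "Unsafe"
y_safe = ((ph >= 6.5) & (ph <= 8.5) & (tds <= 500)).astype(int)


def evaluate_classifier(model, X_tr, X_te, y_tr, y_te):
    """Fit a classifier and return the four classification metrics."""
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    return {
        "accuracy": accuracy_score(y_te, pred),
        "precision": precision_score(y_te, pred, zero_division=0),
        "recall": recall_score(y_te, pred, zero_division=0),
        "f1": f1_score(y_te, pred, zero_division=0),
    }


X_tr, X_te, y_tr, y_te = train_test_split(
    X, y_safe, test_size=0.25, random_state=0, stratify=y_safe)

# KNN is distance-based, so features are standardised first
knn_clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
rf_clf = RandomForestClassifier(n_estimators=100, random_state=0)
knn_scores = evaluate_classifier(knn_clf, X_tr, X_te, y_tr, y_te)
rf_scores = evaluate_classifier(rf_clf, X_tr, X_te, y_tr, y_te)

# Regression: predict TDS from conductivity with a KNN regressor
X_reg = cond.reshape(-1, 1)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(
    X_reg, tds, test_size=0.25, random_state=0)
knn_reg = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
knn_reg.fit(Xr_tr, yr_tr)
reg_pred = knn_reg.predict(Xr_te)
reg_scores = {
    "mae": mean_absolute_error(yr_te, reg_pred),
    "mse": mean_squared_error(yr_te, reg_pred),
    "r2": r2_score(yr_te, reg_pred),
}

print("KNN classifier:", knn_scores)
print("Random Forest:", rf_scores)
print("KNN regressor:", reg_scores)
```

Wrapping KNN in a pipeline with `StandardScaler` keeps parameters measured on very different scales (pH versus TDS) from dominating the distance computation; Random Forest does not need this step.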
