This project analyzes electricity consumption data to provide insights for resource allocation and supply chain enhancement. The analysis addresses key questions about demand patterns, regional variations, forecasting, and supply risk areas.
- electricity_analysis.ipynb - Main Jupyter notebook with complete analysis
- Yearly_Demand_Profile_converted (1).csv - Hourly electricity demand data for 2025
- cleaned_electricity_data.csv - State-wise electricity data (2015-2025)
- electricity_clean.csv - Monthly aggregated data (2024-2025)
- How does electricity usage vary by region, time of day, and season?
  - Hourly demand patterns
  - Day of week analysis
  - Seasonal variations
  - State-wise regional comparison
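The temporal breakdowns listed above can be sketched with pandas group-bys. This is a minimal illustration, not the notebook's actual code: the column names `timestamp` and `demand` are assumptions, since the real column names in the hourly CSV are not documented here, and months stand in for seasons.

```python
import pandas as pd

def demand_profiles(df, time_col="timestamp", value_col="demand"):
    """Return mean demand by hour of day, day of week, and calendar month.

    Column names are hypothetical; adjust them to the actual dataset.
    """
    ts = pd.to_datetime(df[time_col])
    demand = df[value_col]
    return {
        "hourly": demand.groupby(ts.dt.hour).mean(),
        "weekday": demand.groupby(ts.dt.day_name()).mean(),
        "monthly": demand.groupby(ts.dt.month).mean(),
    }

# Synthetic two-day sample in which demand rises with the hour of day.
timestamps = pd.date_range("2025-01-01", periods=48, freq=pd.Timedelta(hours=1))
sample = pd.DataFrame({
    "timestamp": timestamps,
    "demand": [100 + t.hour for t in timestamps],
})

profiles = demand_profiles(sample)
peak_hour = profiles["hourly"].idxmax()
print(f"Peak hour of day: {peak_hour}:00")
```

Grouping by month rather than by named season avoids hard-coding a season definition, which varies by region.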
- Are there recurring demand spikes that could stress the grid?
  - Spike detection using percentile analysis
  - Temporal distribution of spikes
  - Peak hour identification
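As a rough sketch of percentile-based spike detection, the snippet below flags every reading above a chosen percentile and then counts the flagged readings by hour of day. The 95th-percentile cutoff and the synthetic series are assumptions for illustration; the notebook may use a different threshold.

```python
import pandas as pd

def detect_spikes(demand, percentile=95):
    """Return readings strictly above the given percentile, plus the threshold."""
    threshold = demand.quantile(percentile / 100)
    return demand[demand > threshold], threshold

# Synthetic four-day hourly series with one high reading at 18:00 each day.
index = pd.date_range("2025-01-01", periods=96, freq=pd.Timedelta(hours=1))
demand = pd.Series([200 if t.hour == 18 else 100 for t in index], index=index)

spikes, threshold = detect_spikes(demand)

# Temporal distribution: how many spikes fall in each hour of the day.
spike_hours = spikes.groupby(spikes.index.hour).size()
print(spike_hours)
```

Counting spikes by hour (or by weekday or month in the same way) shows whether they recur at predictable times or are scattered.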
- How do different regions differ in consumption?
  - State categorization by consumption levels
  - Supply efficiency analysis
  - Regional consumption patterns
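One possible way to categorize states and compute a supply efficiency figure is shown below. Both definitions are assumptions made for this sketch: consumption levels are taken as equal-sized quantile bins, and supply efficiency as energy supplied divided by energy required. The column names `requirement_mu` and `supplied_mu` and the state labels are hypothetical.

```python
import pandas as pd

def categorize_states(df, labels=("Low", "Medium", "High")):
    """Add a supply efficiency percentage and a quantile-based consumption level."""
    out = df.copy()
    # Assumed definition: share of required energy that was actually supplied.
    out["supply_efficiency_pct"] = 100 * out["supplied_mu"] / out["requirement_mu"]
    # Split states into equal-sized groups by supplied energy.
    out["consumption_level"] = pd.qcut(
        out["supplied_mu"], q=len(labels), labels=list(labels)
    )
    return out

# Illustrative values only, not taken from the project datasets.
states = pd.DataFrame({
    "state": ["A", "B", "C", "D", "E", "F"],
    "requirement_mu": [10, 25, 30, 40, 50, 80],
    "supplied_mu": [10, 20, 30, 40, 50, 60],
}).set_index("state")

result = categorize_states(states)
print(result[["supply_efficiency_pct", "consumption_level"]])
```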
- Can you forecast electricity demand for the next 5–10 years?
  - CAGR-based forecasting
  - State-level projections
  - National demand forecast
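CAGR-based forecasting boils down to two formulas: the growth rate is `(end / start) ** (1 / years) - 1`, and a projection compounds that rate forward from the latest value. The sketch below uses made-up demand figures over a 2015 to 2025 window; the function names are illustrative and not taken from the notebook.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1

def project(last_value, rate, horizon_years):
    """Compound `last_value` forward by `rate` for `horizon_years` years."""
    return last_value * (1 + rate) ** horizon_years

# Illustrative figures: demand doubles between 2015 and 2025.
rate = cagr(start_value=100.0, end_value=200.0, years=10)
forecast_5y = project(200.0, rate, horizon_years=5)
forecast_10y = project(200.0, rate, horizon_years=10)

print(f"CAGR: {rate:.2%}")
print(f"5-year forecast: {forecast_5y:.1f}")
print(f"10-year forecast: {forecast_10y:.1f}")
```

The same two functions apply per state or to the national total. CAGR assumes a constant growth rate, so long-horizon projections should be read as extrapolations rather than predictions.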
- Which areas are at risk of under-supply?
  - Shortage trend analysis
  - High-risk state identification
  - Future supply gap projections
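A simple way to approximate the shortage trend and supply gap steps is a per-state linear fit. This sketch swaps in an ordinary least-squares line via `numpy.polyfit`, which may differ from the method used in the notebook. The columns `state`, `year`, and `shortage_mu` are hypothetical, and a state is treated as high-risk when its shortage slope is positive.

```python
import numpy as np
import pandas as pd

def shortage_trends(df, horizon_year):
    """Fit a linear shortage trend per state and extrapolate to `horizon_year`."""
    rows = []
    for state, grp in df.groupby("state"):
        # Centre the years on the first observation for a well-conditioned fit.
        base_year = grp["year"].min()
        x = (grp["year"] - base_year).to_numpy(dtype=float)
        y = grp["shortage_mu"].to_numpy(dtype=float)
        slope, intercept = np.polyfit(x, y, deg=1)
        projected = slope * (horizon_year - base_year) + intercept
        rows.append({
            "state": state,
            "slope_per_year": slope,
            # A negative projection means no expected shortage, so clip at zero.
            "projected_shortage": max(float(projected), 0.0),
        })
    return pd.DataFrame(rows).set_index("state")

# Illustrative data: shortage grows in state A and shrinks in state B.
data = pd.DataFrame({
    "state": ["A"] * 5 + ["B"] * 5,
    "year": list(range(2021, 2026)) * 2,
    "shortage_mu": [10, 12, 14, 16, 18, 5, 4, 3, 2, 1],
})

trends = shortage_trends(data, horizon_year=2030)
high_risk = trends[trends["slope_per_year"] > 0].index.tolist()
print(trends)
print("High-risk states:", high_risk)
```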
- Ensure you have Python 3.x installed
- Install required packages:
    pip install numpy pandas matplotlib seaborn jupyter
- Open the notebook:
    jupyter notebook electricity_analysis.ipynb
- Run all cells sequentially
To run this analysis on Kaggle:
- Create a new Kaggle notebook
  - Go to Kaggle
  - Click "Code" → "New Notebook"
- Upload the datasets
  - Click "Add Data" → "Upload"
  - Upload all three CSV files:
    - Yearly_Demand_Profile_converted (1).csv
    - cleaned_electricity_data.csv
    - electricity_clean.csv
- Update file paths in the notebook
  - In the "Data Loading" section, change the file paths from:
        hourly_data = pd.read_csv('Yearly_Demand_Profile_converted (1).csv')
        state_data = pd.read_csv('cleaned_electricity_data.csv')
        monthly_data = pd.read_csv('electricity_clean.csv')
  - To:
        hourly_data = pd.read_csv('/kaggle/input/your-dataset-name/Yearly_Demand_Profile_converted (1).csv')
        state_data = pd.read_csv('/kaggle/input/your-dataset-name/cleaned_electricity_data.csv')
        monthly_data = pd.read_csv('/kaggle/input/your-dataset-name/electricity_clean.csv')
  - Replace your-dataset-name with the actual dataset name created by Kaggle
- Copy the notebook code
  - Copy all cells from electricity_analysis.ipynb
  - Paste into your Kaggle notebook
- Run the analysis
  - Click "Run All" or execute cells sequentially
  - All required libraries (numpy, pandas, matplotlib, seaborn) are pre-installed on Kaggle
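As an alternative to editing the paths by hand, the data directory can be chosen at runtime so the same cell works locally and on Kaggle. This is an optional sketch rather than part of the notebook: `resolve_data_dir` and `data_paths` are hypothetical helpers, and `your-dataset-name` remains a placeholder that must be replaced with the real dataset name.

```python
from pathlib import Path

# Placeholder: replace your-dataset-name with the name Kaggle assigns.
KAGGLE_DIR = Path("/kaggle/input/your-dataset-name")

def resolve_data_dir(kaggle_dir=KAGGLE_DIR, local_dir=Path(".")):
    """Use the Kaggle input directory when it exists, else the local directory."""
    return kaggle_dir if kaggle_dir.is_dir() else local_dir

def data_paths(data_dir):
    """Build the paths of the three CSV files inside `data_dir`."""
    return {
        "hourly": data_dir / "Yearly_Demand_Profile_converted (1).csv",
        "state": data_dir / "cleaned_electricity_data.csv",
        "monthly": data_dir / "electricity_clean.csv",
    }

paths = data_paths(resolve_data_dir())
for name, path in paths.items():
    print(f"{name}: {path}")

# In the notebook, each file would then be loaded with, for example:
#   hourly_data = pd.read_csv(paths["hourly"])
```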
- Temporal Analysis: Hourly, daily, and seasonal demand patterns
- Regional Analysis: State-wise consumption and efficiency metrics
- Spike Detection: Identification of grid stress periods
- Forecasting: 5-year and 10-year demand projections using CAGR
- Risk Assessment: Under-supply risk identification and quantification
- Time series plots
- Bar charts and horizontal bar charts
- Pie charts for distribution analysis
- Multi-panel comparison plots
- Trend lines with confidence intervals
- Descriptive statistics
- Percentile-based spike detection
- Compound Annual Growth Rate (CAGR) calculation
- Supply efficiency metrics
- Shortage volatility analysis
The analysis provides actionable insights on:
- Peak Demand Periods: Identifies critical hours requiring grid reinforcement
- Regional Priorities: Highlights states needing infrastructure investment
- Growth Projections: Quantifies future capacity requirements
- Supply Risks: Pinpoints areas vulnerable to under-supply
- Investment Recommendations: Data-driven guidance for resource allocation
- numpy: Numerical computations
- pandas: Data manipulation and analysis
- matplotlib: Basic plotting
- seaborn: Advanced visualizations
- datetime: Date/time handling (part of the Python standard library, so it needs no separate installation)
All dependencies are standard Python data science libraries and are pre-installed in Kaggle environments.
The notebook generates:
- 15+ visualizations
- Statistical summaries
- Forecast tables
- Risk assessment reports
- Actionable recommendations
- The notebook is designed to work seamlessly on Kaggle with minimal modifications
- Only file paths need to be updated for Kaggle compatibility
- All visualizations are inline and will display in the notebook
- The analysis is fully reproducible with the provided datasets
Data Mining Project - Electricity Consumption Analysis for Resource Allocation