Esalsac/Property-prices-machine-learning-project

Property Prices – Machine Learning Project

Authors

Janete Barbosa - github
Harmandeep Singh - github
Eric Salsac - github

Project Overview

This project explores different machine learning regression models and their efficacy, focusing on:

  • Building and evaluating multiple regression‑based machine learning models, linear and polynomial.
  • Handling real‑world dataset challenges, such as missing values, skewed distributions, and poor documentation.
  • Comparing our standard models to ensemble models to evaluate how much our models benefit from particular ensembling methods.

Dataset Overview

The models were trained and evaluated using the King County Houses dataset, available on Kaggle.

Summary of Experiments

Across the project, several models were developed, including:

  • Linear Regression
  • ElasticNet, as a way of comparing L1 and L2 penalties
  • Random Forest Regressor
  • XGBoost Regressor
  • Gradient Boosting Regressor - our overall winner, which did not appear to get stuck in any local minima and achieved impressive results at a fraction of the computational cost

Throughout the data refinement process, our base models were re-evaluated to measure the impact of each refinement on their efficacy at predicting house prices. The data processing was minor, so its impact on the results was equally minor.
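The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn is installed, uses synthetic data from `make_regression` as a stand-in for the King County dataset, and picks 5-fold cross-validated R² as the metric, which the project may not have used. The hyperparameter values are arbitrary, and XGBoost is left out to keep the example to a single dependency.

```python
# Sketch only: synthetic data stands in for the King County dataset.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Hyperparameters here are illustrative assumptions, not the project's settings.
models = {
    "linear": LinearRegression(),
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

scores = {}
for name, model in models.items():
    # Mean R^2 over 5 cross-validation folds; higher is better.
    scores[name] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

# Print models from best to worst mean score.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: mean R^2 = {score:.3f}")
```

Re-running the same loop after each data refinement step gives a like-for-like view of how much that step changed each model's score.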

Key Findings

  • Polynomial vs. linear regressors: polynomial regressors of an appropriate degree can significantly outperform linear baselines in some price ranges (house prices < $650k) and fail in others (house prices > $650k)
  • Robust scaling is effective for handling multiple features with highly skewed distributions
  • Gradient boosting deserves much more attention to hyperparameter tuning to extract the best results
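To illustrate why robust scaling copes well with skewed features, here is a small standard-library sketch of the underlying computation: subtract the median and divide by the interquartile range. The `robust_scale` helper and the `sqft` values are hypothetical and written for this example. In practice scikit-learn's `RobustScaler`, which uses the same median and 25th to 75th percentile range by default, would be the more likely choice.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    # method="inclusive" uses linear interpolation between data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the feature is (nearly) constant")
    center = median(values)
    return [(v - center) / iqr for v in values]


# Hypothetical living-area values with one extreme outlier.
sqft = [1000, 1500, 2000, 2500, 13000]
scaled = robust_scale(sqft)
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 11.0]
```

Because the median and IQR ignore the tails, the outlier does not compress the typical values toward zero as it would under mean and standard deviation scaling.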

Future Work

  • Grid search with cross-validation for gradient boosting, to explore the effect of its hyperparameter settings
  • Explore other regressors of similar scope to gradient boosting
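As a starting point for the hyperparameter search, a sketch using scikit-learn's `GridSearchCV` with `GradientBoostingRegressor` might look like this. It assumes scikit-learn is installed and again uses synthetic data in place of the King County dataset. The grid values, the 3-fold split, and the RMSE scoring are placeholder assumptions, not tuned recommendations.

```python
# Sketch only: synthetic data and an arbitrary, deliberately small grid.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Candidate values are assumptions chosen to keep the search fast.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=3,
    # scikit-learn maximizes scores, so RMSE is reported as a negative number.
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV RMSE:", -search.best_score_)
```

The number of fits grows multiplicatively with each added parameter, so a coarse grid followed by a finer one around the best region is usually cheaper than one large grid.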
