Predicting a life insurance decision for an applicant with machine learning, based on his/her health and history, including BMI, Age, Ht, Wt, Employment_History, Insurance_History, Medical_History, etc.
This project uses the Prudential Life Insurance Assessment dataset. It consists of over a hundred variables describing attributes of life insurance applicants and over 50k rows (one per application).
BMI is one of the most important factors in deciding the risk. |
The same holds for age: younger applicants are the preferred candidates for insurance. |
---|
This clearly shows that most people who got insurance had a BMI between approx. 0.12 and 0.50. The remaining features don't show any patterns.
I used a Random Forest Classifier model for prediction, with GridSearchCV for hyperparameter tuning and roc_auc as the scoring metric.
Achieved an accuracy of 80.8% on the training data and 80.4% on the test data.
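
The tuning setup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, the target is assumed to be binary, and the parameter grid is hypothetical, since the values searched are not listed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the applicant features and a binary target (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid; the values actually searched are not given in this README.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # candidates are ranked by cross-validated ROC AUC
    cv=3,
)
search.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is retrained on the whole training split.
best_model = search.best_estimator_
train_acc = accuracy_score(y_train, best_model.predict(X_train))
test_acc = accuracy_score(y_test, best_model.predict(X_test))
print(search.best_params_, round(train_acc, 3), round(test_acc, 3))
```

Note that the grid search selects by ROC AUC while the reported figures are accuracies, so the two are computed separately here.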
Confusion matrix for training data | Confusion matrix for test data |
---|
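
For reference, confusion matrices like the ones captioned above could be computed along these lines. This is a sketch on synthetic data with a binary target (both assumptions), and the hyperparameters are placeholders rather than the tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real applicant data (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Placeholder hyperparameters, not the ones found by the grid search.
model = RandomForestClassifier(n_estimators=100, max_depth=8, random_state=0)
model.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm_train = confusion_matrix(y_train, model.predict(X_train))
cm_test = confusion_matrix(y_test, model.predict(X_test))
print(cm_train)
print(cm_test)
```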
Important features for prediction |
---|
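
A feature ranking of this kind can be read from a fitted random forest through its `feature_importances_` attribute. The sketch below uses synthetic data and hypothetical feature names (`feature_0`, `feature_1`, ...); with the real dataset the names would be the column names such as BMI or Age.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the applicant features (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized so that they sum to 1.
importances = model.feature_importances_
order = np.argsort(importances)[::-1]  # indices from most to least important
top = [(feature_names[i], float(importances[i])) for i in order[:5]]

for name, score in top:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favor high-cardinality features, so they are best read alongside a second view such as the SHAP analysis.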
Understanding the model's test predictions using SHAP |