Sentiment analysis on movie reviews
For an assignment in Info 159 (Natural Language Processing), I implemented the training procedure in Sentiment-Analysis.ipynb for a model that predicts whether a movie review is positive or negative. Predictions come from logistic regression, and the features include n-grams (2- through 5-grams), negation tagging of phrases, and counts of positive and negative words drawn from the VADER and MPQA lexicons. The model also uses the beginning and end of each review as features.
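To make the feature families above concrete, here is a minimal sketch of what such a featurizer could look like. It is not the notebook's code: the function names, the feature-name prefixes, the `edge` parameter, the negation rule (tag everything after a negation word until the next punctuation mark), and the tiny stand-in word lists are all assumptions for illustration. The real VADER and MPQA lexicons are much larger.

```python
import re
from collections import Counter

# Tiny stand-in lexicons (assumption); the real VADER/MPQA lists are far larger.
POSITIVE_WORDS = {"good", "great", "love", "funny"}
NEGATIVE_WORDS = {"bad", "boring", "hate", "dull"}
NEGATION_WORDS = {"not", "no", "never"}
CLAUSE_END = {".", ",", "!", "?", ";"}


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"[a-z']+|[.,!?;]", text.lower())


def tag_negation(tokens):
    """Prefix tokens with NOT_ from a negation word up to the next punctuation."""
    tagged, negating = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negating = False
            tagged.append(tok)
        elif tok in NEGATION_WORDS or tok.endswith("n't"):
            negating = True
            tagged.append(tok)
        else:
            tagged.append("NOT_" + tok if negating else tok)
    return tagged


def featurize(text, ngram_sizes=(2, 3, 4, 5), edge=3):
    """Map a review to a sparse feature dict (hypothetical feature names)."""
    tokens = tokenize(text)
    words = [t for t in tokens if t not in CLAUSE_END]
    feats = Counter()

    # Word n-grams for each requested size (punctuation is skipped).
    for n in ngram_sizes:
        for i in range(len(words) - n + 1):
            feats[f"{n}gram=" + " ".join(words[i:i + n])] += 1

    # Negation-tagged tokens.
    for tok in tag_negation(tokens):
        if tok.startswith("NOT_"):
            feats["neg=" + tok] += 1

    # Lexicon-based counts of positive and negative words.
    feats["lex_pos_count"] = sum(w in POSITIVE_WORDS for w in words)
    feats["lex_neg_count"] = sum(w in NEGATIVE_WORDS for w in words)

    # Indicator features for the first and last few words of the review.
    for w in words[:edge]:
        feats["start=" + w] = 1
    for w in words[-edge:]:
        feats["end=" + w] = 1
    return feats


features = featurize("This movie was not good, but the ending was great.")
```

A dict like this can be turned into a sparse matrix and passed to any logistic regression implementation. Keeping every feature name prefixed by its family makes the learned weights easier to read later.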
Check out featurize_weights to see which features most strongly push predictions toward positive or negative!
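As a rough illustration of reading such weights, the sketch below ranks features by their learned coefficient. It assumes the weights are available as a mapping from feature name to coefficient, which may not match how featurize_weights is actually stored. The helper `top_features` is hypothetical, and the numbers in `example_weights` are made up purely to show the mechanics; they are not results from the model.

```python
def top_features(weights, k=3):
    """Return the k most positive and k most negative (feature, weight) pairs."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_positive, most_negative


# Made-up weights for illustration only (not output from the real model).
example_weights = {
    "2gram=well worth": 1.8,
    "neg=NOT_good": -1.4,
    "lex_pos_count": 0.9,
    "end=boring": -2.1,
    "start=the": 0.05,
}

positive, negative = top_features(example_weights, k=2)
for name, weight in positive:
    print(f"pushes positive: {name} ({weight:+.2f})")
for name, weight in negative:
    print(f"pushes negative: {name} ({weight:+.2f})")
```

In logistic regression, a large positive coefficient raises the predicted probability of the positive class when the feature is present, and a large negative one lowers it, so sorting by weight is a quick way to see what the model relies on.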