Angana1/NLP-Sentiment-Prediction

NLP Projects (Autumn 2021, CS60075: Natural Language Processing)

Natural Language Processing projects in this repository:

  1. Language Modeling for sentence auto-completion: Builds language models for sentence auto-completion. The text corpus is preprocessed, unigram, bigram, and trigram language models are created, and the smoothed bigram and trigram models are used to predict the next words in a sentence. Perplexity scores are also calculated.
  2. Named Entity Recognition using BERT: Develops a Named Entity Recognition pipeline for sentences using a pre-trained BERT language model. The NER data is split into training and validation sets, and the model is evaluated on the validation set.
  3. Sentiment Prediction using Naive Bayes and LSTM Classifier: Classifies movie reviews from the IMDB dataset as positive or negative using a Naive Bayes classifier and a bidirectional LSTM-based classifier. Stemming and lemmatization are performed as preprocessing steps, and the accuracy scores of the two models are compared.
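
As a rough illustration of the first project, the sketch below shows one way a smoothed bigram model can predict the next word and report perplexity. It is not the code from this repository: the function names (`train_bigram`, `bigram_prob`, `predict_next`, `perplexity`), the toy corpus, and the choice of add-one (Laplace) smoothing are all assumptions made for the example, and the project may use a different smoothing method.

```python
import math
from collections import Counter

# Hypothetical helper names; add-one smoothing is an assumption of this sketch.

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def predict_next(prev, unigrams, bigrams):
    """Return the most probable next token after `prev` (never the start marker)."""
    candidates = [w for w in unigrams if w != "<s>"]
    return max(candidates, key=lambda w: bigram_prob(prev, w, unigrams, bigrams))

def perplexity(sentences, unigrams, bigrams):
    """Perplexity = exp of the negative average log-probability per predicted token."""
    log_prob, n_tokens = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            log_prob += math.log(bigram_prob(prev, word, unigrams, bigrams))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

if __name__ == "__main__":
    # Toy corpus, invented for this example only.
    corpus = [["i", "like", "nlp"], ["i", "like", "movies"], ["i", "love", "nlp"]]
    unigrams, bigrams = train_bigram(corpus)
    print(predict_next("i", unigrams, bigrams))
    print(perplexity([["i", "love", "movies"]], unigrams, bigrams))
```

Smoothing matters here because an unseen bigram would otherwise get probability zero, which makes the perplexity undefined. A trigram model follows the same pattern with two words of context.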

Python libraries used:

  • BeautifulSoup
  • NLTK
  • Pandas
  • NumPy
  • scikit-learn
  • Regex
  • Keras