NLP_Portfolio

Following the Spring 2023 CS 4395 course taught by Karen Mazidi at UT Dallas

Index

0. Introduction to NLP

A document discussing basic background information (definition, history, personal interest) about NLP.

1. Text Processing with Python

A program that reads a CSV file and checks the format of each entry. It then builds a dictionary from all of the entries and saves it to a pickle file. It immediately loads the pickle file back and confirms that the objects and their methods are still usable.

How to Run:

Download the contents of the folder labeled "Homework1" and open it in your preferred IDE. The location of the data file must be passed as a system argument (sysarg); in my case it was data\data.csv. The script path should be set to the .py file named "Homework1_cmt180004.py."
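As a rough illustration of how a script can pick up the data file location from the system arguments, here is a minimal sketch. The helper name `get_data_path` is hypothetical and is not taken from the homework file.

```python
from pathlib import Path


def get_data_path(argv):
    """Return the data file path from an argv-style list.

    `argv` is expected to look like sys.argv: [script_name, data_path].
    The function name is an assumption made for this example.
    """
    if len(argv) < 2:
        # No data file given, so stop with a usage hint.
        raise SystemExit("Usage: python Homework1_cmt180004.py <path to data file>")
    return Path(argv[1])
```

In a script this would typically be called as `get_data_path(sys.argv)` after `import sys`.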

Reflection:

Python is useful for text processing because of how dynamically it treats data. Strings behave like sequences of characters, so indexing, slicing, and iteration are simple and straightforward. Python also has many built-in string methods for checking and manipulating text, such as testing whether a string is alphabetic, changing case, and detecting empty strings. This flexibility can also be a weakness. Code can easily run without raising errors while still behaving unexpectedly, and catching that requires extensive testing.
This assignment taught me how to use regular expressions. I had also never used pickle files before. They seem helpful when developing code and working with data that would take a long time to process over and over. Instead, the processed data can be saved in a pickle file and loaded again in later steps. This assignment was also a useful review of Python lists and classes.
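The format checking and the pickle round trip described in this section can be sketched as follows. This is not the homework code: the `Entry` class, its fields, and the phone format are assumptions chosen only to make the example concrete.

```python
import pickle
import re
import tempfile
from pathlib import Path


class Entry:
    """Hypothetical record type; the fields are illustrative only."""

    def __init__(self, name, phone):
        self.name = name
        self.phone = phone

    def display(self):
        return f"{self.name}: {self.phone}"


# Assumed format for this example: 123-456-7890
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def is_valid_phone(text):
    """Return True only if the whole string matches the assumed format."""
    return PHONE_PATTERN.fullmatch(text) is not None


def round_trip(entries, path):
    """Save a dict of entries to a pickle file, then load it straight back."""
    with open(path, "wb") as f:
        pickle.dump(entries, f)
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    entries = {"a1": Entry("Ada", "123-456-7890")}
    with tempfile.TemporaryDirectory() as tmp:
        loaded = round_trip(entries, Path(tmp) / "entries.pickle")
    # The unpickled object keeps its methods.
    print(loaded["a1"].display())
```

Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.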

2. Word Guessing Game

A program that accepts a text file and does some preprocessing (including calculating lexical diversity, filtering tokens, and POS tagging with NLTK) before starting a hangman game with the user.
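Lexical diversity is commonly computed as the number of unique tokens divided by the total number of tokens. The sketch below assumes that definition and uses a simple regular expression as a stand-in for NLTK tokenization, so it may not match the homework's exact numbers.

```python
import re


def lexical_diversity(text):
    """Ratio of unique tokens to total tokens (0.0 for empty input)."""
    # Assumption: tokens are lowercase runs of letters; the homework
    # itself uses NLTK for preprocessing.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

For example, "the cat and the dog" has five tokens and four unique ones, giving a diversity of 0.8.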

3. Using WordNet

A program that explores different features of WordNet and sentiment analysis.

4. Ngrams Language Modeling

This assignment uses two programs. The first uses the given texts to train three models, each representing a different language. The second unpacks the models and runs them on test data to identify the language of each line. Finally, the models' predictions are compared with the labels provided by a human annotator to assess performance.
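One way to picture this pipeline is with word bigram counts and add-one (Laplace) smoothing. The sketch below is an assumption about the general approach, not the assignment's code: the function names, the whitespace tokenization, and the way the vocabulary size is computed are all illustrative.

```python
import math
from collections import Counter


def train(text):
    """Count unigrams and bigrams over lowercased, whitespace-split tokens."""
    tokens = text.lower().split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(text, model, vocab_size):
    """Add-one smoothed bigram log-probability of `text` under `model`."""
    unigrams, bigrams = model
    tokens = text.lower().split()
    total = 0.0
    for pair in zip(tokens, tokens[1:]):
        numerator = bigrams[pair] + 1
        denominator = unigrams[pair[0]] + vocab_size
        total += math.log(numerator / denominator)
    return total


def identify(text, models):
    """Return the language whose model scores `text` highest."""
    # Assumption: vocabulary size is the number of distinct words
    # seen across all training texts.
    vocab_size = len(set().union(*(m[0] for m in models.values())))
    return max(models, key=lambda lang: log_prob(text, models[lang], vocab_size))


def accuracy(predicted, gold):
    """Fraction of predictions that match the annotator's labels."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.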

5. Sentence Parsing

A document that applies three kinds of parsers to a complex sentence. Each parser aims to reduce ambiguity in a different way.

6. Building a Corpus

A program that recursively scrapes websites to get info about a predefined topic (Mediterranean food!). It does its best to filter out what may not be helpful and keep what is. It builds a knowledge base that can have further applications, such as a chatbot.
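The crawl-and-filter idea can be sketched without any network access by passing in the page fetcher as a function. This is an illustration only: it uses an iterative breadth-first queue rather than recursion, and `crawl`, `fetch`, `is_relevant`, and `max_pages` are hypothetical names, not the project's.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Simplified link extraction; a real scraper would use an HTML parser.
LINK_PATTERN = re.compile(r'href="([^"#]+)"')


def crawl(start_url, fetch, is_relevant, max_pages=15):
    """Breadth-first crawl that keeps only pages judged relevant.

    `fetch(url)` returns the page text, or None on failure.
    `is_relevant(text)` decides whether a page is worth keeping.
    """
    seen = {start_url}
    queue = deque([start_url])
    kept = []
    while queue and len(kept) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        if is_relevant(html):
            kept.append(url)
        for href in LINK_PATTERN.findall(html):
            link = urljoin(url, href)
            # Follow only web links that have not been visited yet.
            if urlparse(link).scheme in ("http", "https") and link not in seen:
                seen.add(link)
                queue.append(link)
    return kept
```

Injecting `fetch` keeps the crawl logic testable with an in-memory dictionary of pages before pointing it at real sites.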

7. Trying Machine Learning Approaches

A report in which I created three different ML models that attempt sentiment analysis, a common application of NLP. I found that logistic regression performed best with this data set and my chosen hyperparameters.

8. Implementing a Chatbot

A program that attempts to implement a chatbot that can answer AI/ML/NLP related questions.

9. Text Classification 2

A program that does multi-class classification using deep learning. Two different architectures are examined, and embeddings are tested.

Summary

Throughout this semester I learned about the many uses of natural language processing and the approaches to it, and I was able to strengthen the skills listed here. As with most technology, advancements are constantly being made. I believe this is especially true for NLP because, to be honest, some of the libraries I used were far from perfect. Even with the complexity of the human mind, it takes 10,000 years for a new language to evolve from an existing one, and no one fully masters a language in their lifetime. I hope to stay in the know about new developments in NLP, and not just the 'hyped' ones. I have a strong interest in literature, so I think one day I may use the skills I learned from this class. I hope to one day be a data scientist, and this class further confirmed my interest in statistics and artificial intelligence.

christinemtrinh/NLP_Portfolio
