most_common_words_in_news

The file news.csv contains text extracted from online news items, each labeled with a classification based on its content. There are five classifications: economy, sports, science, culture and entertainment.

We create a function that reads the CSV file and prints, for each of the five classifications, a list of its x most frequent words. The function takes two parameters: the name of the file to read and the number of words (x) to show.

We use pandas, NLTK, gensim and collections.
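
A minimal sketch of how such a function could look, not the repository's actual code. To stay runnable without extra dependencies it swaps in `csv` and `re` from the standard library for pandas, NLTK and gensim, and keeps only `collections` from the list above. The column names `text` and `category`, the function names, and the short stop-word set are all assumptions made for illustration; the real header of news.csv is not shown here.

```python
import csv
import re
from collections import Counter, defaultdict

# Assumed column names; adjust them to the real header of news.csv.
TEXT_COLUMN = "text"
CATEGORY_COLUMN = "category"

# Tiny illustrative stop-word set; NLTK's stopwords corpus would be a fuller choice.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def most_common_words(filename, n_words):
    """Return {classification: [(word, count), ...]} for the n_words most frequent words."""
    counters = defaultdict(Counter)
    with open(filename, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Lowercase the text and keep runs of letters only (no digits or punctuation).
            tokens = re.findall(r"[^\W\d_]+", row[TEXT_COLUMN].lower())
            counters[row[CATEGORY_COLUMN]].update(
                token for token in tokens if token not in STOP_WORDS
            )
    return {name: counter.most_common(n_words) for name, counter in counters.items()}


def print_most_common_words(filename, n_words):
    """Print each classification followed by its most frequent words."""
    for name, words in most_common_words(filename, n_words).items():
        print(name, [word for word, _ in words])
```

Calling `print_most_common_words("news.csv", 10)` would then print one line per classification with its ten most frequent words, provided the file uses the assumed column names.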