A simple Python script that finds the 20 most frequent words in an English Wikipedia article, along with each word's frequency as a percentage. Enter a search string and, using the Wikipedia Search API, the script returns the top 20 words.
I built this to put my basic learning into practice and play around with some libraries 📚. To exclude stop words (such as "and", "the", "a", "an", and similar words) from the frequency table, simply add yes after your string.
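As a rough illustration of the counting step described above, here is a minimal sketch of how a top-20 frequency table with percentages could be built. The function name `top_words`, the tokenizing regex, and the tiny `STOP_WORDS` set are assumptions made for this example and are not taken from `main.py`, which may work differently.

```python
import re
from collections import Counter

# Illustrative stop-word list only (assumption); a real list would be longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}


def top_words(text, limit=20, remove_stop_words=False):
    """Return (word, count, percentage) tuples for the most frequent words."""
    # Lowercase the text and keep alphabetic words, allowing one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if remove_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    total = len(words)
    counts = Counter(words)
    # most_common() yields nothing for empty input, so total is never zero here.
    return [
        (word, count, round(100.0 * count / total, 2))
        for word, count in counts.most_common(limit)
    ]
```

Note that in this sketch the percentage is computed against the words that remain after filtering, so removing stop words changes the denominator as well as the ranking.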
- Clone project
git clone https://github.com/prabhakar267/wikipedia-frequency-lookup.git
cd wikipedia-frequency-lookup
- Add virtual environment
pip install virtualenv
virtualenv venv
source venv/bin/activate
- Install dependencies
[sudo] pip install -r requirements.txt
- Run script
python main.py <your-string> [yes]
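
The command line above takes a search string and an optional trailing yes. One possible way to interpret those arguments is sketched below. The helper `parse_args` is hypothetical, and treating several words as a single query is an assumption; the actual `main.py` may parse its arguments differently.

```python
def parse_args(argv):
    """Split argv into (query, remove_stop_words).

    Hypothetical helper: a trailing "yes" is read as the stop-word flag
    only when at least one other word precedes it.
    """
    args = argv[1:]  # drop the script name
    if not args:
        raise SystemExit("usage: python main.py <your-string> [yes]")
    remove_stop_words = len(args) > 1 and args[-1].lower() == "yes"
    query_parts = args[:-1] if remove_stop_words else args
    return " ".join(query_parts), remove_stop_words
```

With this approach, `python main.py Alan Turing yes` would search for "Alan Turing" with stop words removed, while `python main.py yes` would search for the word "yes" itself.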