A versatile web scraper that extracts text, images, video, and audio from different websites, including those that load content dynamically.
- Python.
- Selenium.
- Scrapy.
- PyCharm IDE.
- Install and configure Python.
- Install Scrapy and Selenium using the following command:
pip install scrapy selenium
- Open a terminal in the WebScrapping folder.
- Use the following command to run the Scrapy spider spider1.py:
scrapy crawl spider1
- Run spider2.py or spider3.py using the following command:
python spider2.py
spider2.py and spider3.py are Selenium spiders.
- You must change the path in the code where the scraped data will be stored.
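
The storage path could also be made configurable instead of being edited by hand in each spider. The following is only a minimal sketch under stated assumptions: the environment variable SCRAPER_OUTPUT_DIR, the default folder scraped_data, and the helper names resolve_output_dir and save_records are all hypothetical and are not taken from the project's spiders.

```python
import json
import os
from pathlib import Path

# Hypothetical default folder; the real spiders define their own path.
DEFAULT_OUTPUT_DIR = Path("scraped_data")


def resolve_output_dir(env_var="SCRAPER_OUTPUT_DIR"):
    """Return the output folder, creating it if it does not exist.

    Uses the (hypothetical) SCRAPER_OUTPUT_DIR environment variable when it
    is set, and falls back to DEFAULT_OUTPUT_DIR otherwise.
    """
    out_dir = Path(os.environ.get(env_var, DEFAULT_OUTPUT_DIR))
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


def save_records(records, filename, out_dir=None):
    """Write a list of dicts to <out_dir>/<filename> as JSON; return the file path."""
    target = (out_dir or resolve_output_dir()) / filename
    target.write_text(
        json.dumps(records, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return target
```

With helpers like these, a spider would call save_records(items, "items.json") after scraping, and the destination could be changed by setting SCRAPER_OUTPUT_DIR before running python spider2.py, without touching the code.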