git clone https://github.com/farukalampro/yelp-webscraper-using-scrapy-python.git
cd yelp-webscraper-using-scrapy-python
python -m venv env
- For Windows:
.\env\Scripts\activate
- For macOS/Linux:
source env/bin/activate
pip install -r requirements.txt
- Open the data.py file and add your Yelp search links to start_urls.
- One sample link is already in data.py. You can add as many links as you want.
start_urls = [
# This is the sample URL.
# Replace it with your own search link, or add more links below it.
'https://www.yelp.com/search?find_desc=Restaurants&find_loc=San+Francisco%2C+CA'
]
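If you would rather build the search links than paste them by hand, the following is a minimal sketch using only the Python standard library. The helper name `build_search_url` and the second search ("Coffee" in "Austin, TX") are illustrative assumptions and are not part of this repository. The query parameters `find_desc` and `find_loc` are taken from the sample URL above.

```python
from urllib.parse import urlencode

YELP_SEARCH = "https://www.yelp.com/search"


def build_search_url(description: str, location: str) -> str:
    """Return a Yelp search URL for a search term and a location.

    Hypothetical helper, not part of the repository.
    """
    # urlencode escapes spaces as '+' and commas as '%2C',
    # which matches the format of the sample URL in data.py.
    query = urlencode({"find_desc": description, "find_loc": location})
    return f"{YELP_SEARCH}?{query}"


start_urls = [
    build_search_url("Restaurants", "San Francisco, CA"),
    build_search_url("Coffee", "Austin, TX"),  # made-up second search
]

for url in start_urls:
    print(url)
```

The first generated link is identical to the sample URL, so the resulting list can be used as the value of start_urls in data.py.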
scrapy crawl data -o sample_file.csv
- You can export the data as CSV, JSON, or XML. The general form of the command is shown below; change the file extension to .json or .xml for the other formats.
scrapy crawl <spider_name> -o <file_name>.csv
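To check an exported CSV file quickly, something like the sketch below may help. It assumes the file was written by the crawl command above as sample_file.csv. The function name `preview_csv` is hypothetical, and the column names depend on the fields the spider yields, so none are assumed here.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, limit=5):
    """Return (column names, first `limit` rows) of an exported CSV file."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # islice stops after `limit` rows, so large exports are not fully read.
        rows = list(islice(reader, limit))
        return reader.fieldnames, rows


if __name__ == "__main__":
    export = Path("sample_file.csv")  # file name used in the crawl command
    if export.exists():
        columns, rows = preview_csv(export)
        print("Columns:", columns)
        for row in rows:
            print(row)
```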
- The Sample File folder contains some restaurant data scraped with this spider.
- Yelp updates its website frequently, so make sure you keep the XPath expressions in the spider up to date.
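One way to make XPath updates less painful is to keep the expressions in one list and try them in order, newest layout first. The sketch below illustrates the idea with the standard library's ElementTree, which supports only a subset of XPath and needs well-formed markup. The class names, the HTML snippet, and the helper `first_text` are all made up for illustration and are not the selectors used in data.py. In a Scrapy spider the equivalent lookup would be `response.xpath(xpath).get()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical selectors: newest page layout first, older layouts after it.
NAME_XPATHS = [
    ".//h3[@class='business-name']/a",
    ".//a[@class='biz-name']",
]


def first_text(root, xpaths):
    """Return the stripped text of the first element any XPath matches, else None."""
    for xpath in xpaths:
        node = root.find(xpath)
        if node is not None and node.text:
            return node.text.strip()
    return None


# Made-up, well-formed snippet standing in for a page with an older layout.
old_layout = ET.fromstring("<div><a class='biz-name'> Sample Diner </a></div>")
print(first_text(old_layout, NAME_XPATHS))  # prints: Sample Diner
```

When Yelp changes its markup, a new expression can be added at the top of the list while the old ones stay as fallbacks.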