scrapy shell <url>
    Opens an interactive shell with a `response` object we can use to access data from the page.
print(response.text)
    Prints the whole page content.
response.css('div.author')
    Returns a selector list for that div (selector objects, not raw HTML).
response.css('div.author').extract()
    Returns the actual HTML of the selected elements.
response.css('div.author::text').extract()
    Returns a list containing only the text of those elements (.getall() is the newer equivalent).
response.css('div.author::text')[0].extract()
    Returns the text of the first matching element as a string.
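Scrapy's selectors are backed by the parsel library; to illustrate what `::text` extraction returns without needing Scrapy installed, here is a stdlib-only sketch that mimics `response.css('div.author::text').extract()` on a small sample page (the sample HTML and author names are made up for illustration):

```python
from html.parser import HTMLParser

class AuthorTextExtractor(HTMLParser):
    """Collects the text inside every <div class="author">,
    mimicking response.css('div.author::text').extract()."""
    def __init__(self):
        super().__init__()
        self.depth = 0   # >0 while we are inside an author div
        self.texts = []
    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1           # nested tag inside the div
        elif tag == "div" and ("class", "author") in attrs:
            self.depth = 1            # entered an author div
    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
    def handle_data(self, data):
        if self.depth and data.strip():
            self.texts.append(data.strip())

html = '<div class="author">Jane Doe</div><div class="author">John Smith</div>'
p = AuthorTextExtractor()
p.feed(html)
print(p.texts)  # ['Jane Doe', 'John Smith'] -- a list of text strings, like .extract()
```

Indexing that list and taking one element corresponds to the `[0].extract()` form above, which yields a single string instead of a list.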
scrapy genspider <spider-name> <domain-name-url>
    Creates a <spider-name>.py spider file in the same directory.
scrapy runspider filename.py
    Runs the spider in that file.
scrapy runspider filename.py -o file-name.json
    Runs the spider and saves the scraped items to file-name.json.
more file-name.json
    Shows the file content.
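A spider file like the one `genspider` scaffolds might look as follows; the spider name, domain, and selectors below are assumptions for illustration (quotes.toscrape.com is Scrapy's usual practice site), not the exact generated template:

```python
# quotes_spider.py -- roughly what `scrapy genspider quotes quotes.toscrape.com`
# produces, with a parse() body filled in (names/selectors assumed for this sketch)
import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    allowed_domains = ["quotes.toscrape.com"]
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # each dict yielded here becomes one record in file-name.json
        # when the spider is run with -o file-name.json
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").extract_first(),
                "author": quote.css("small.author::text").extract_first(),
            }
```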
sudo apt install docker.io
    Installs Docker.
sudo docker pull scrapinghub/splash
    Downloads the Splash JavaScript rendering engine (note the image name is scrapinghub/splash).
sudo docker run -p 8050:8050 scrapinghub/splash
    Runs Splash on port 8050.
pip install scrapy-splash
    Installs the Scrapy plugin that sends requests through Splash.
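After installing, scrapy-splash is wired into a Scrapy project through settings.py; a typical configuration, following the scrapy-splash README, looks like this (the localhost URL assumes the Splash container started above):

```python
# settings.py additions for scrapy-splash (values per the scrapy-splash README)
SPLASH_URL = 'http://localhost:8050'  # the Splash instance run on port 8050 above

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
```

With this in place, spiders can yield `scrapy_splash.SplashRequest` instead of a plain request to have pages rendered by Splash before parsing.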