Commit e0ef202

Update README.md
1 parent c1b4925 commit e0ef202

1 file changed, +28 -2 lines changed

README.md

@@ -1,2 +1,28 @@
-# crawler-python
-email scraper/crawls using python (Google/Bing)
+Python Email Crawler
+====================
+
+This Python script searches Google for certain keywords, crawls the webpages from the results, and returns all emails found.
+
+Requirements
+------------
+
+- sqlalchemy
+- urllib2
+
+If you don't have sqlalchemy, simply `sudo pip install sqlalchemy`.
+
+
+Usage
+-------
+
+Start the search with a keyword. We use "iphone developers" as an example.
+
+    python email_crawler.py "iphone developers"
+
+The search and crawling process will take quite a while, as it retrieves up to 500 search results (from Google) and crawls up to 2 levels deep. It should crawl around 10,000 webpages :)
+
+After the process has finished, run this command to get the list of emails:
+
+    python email_crawler.py --emails
+
+The emails will be saved in ./data/emails.csv
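
The README in this diff describes crawling search results up to two levels deep and collecting every email address found. A minimal sketch of that idea, written for Python 3 rather than the `urllib2`-era Python the script targets, might look like the following. It is not the code in `email_crawler.py`: the names `crawl_for_emails`, `LinkParser`, `EMAIL_RE`, and the injected `fetch` callable are all assumptions made for illustration, and the regular expression is deliberately simple.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Simplified pattern (an assumption, not the script's own regex): it will
# miss some valid addresses and accept some invalid ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_for_emails(start_urls, fetch, max_depth=2, max_pages=10000):
    """Breadth-first crawl that follows links up to max_depth levels.

    start_urls play the role of the search results (depth 0).
    fetch is a hypothetical callable: fetch(url) returns the page HTML
    as a string, or None if the page could not be retrieved.
    """
    emails = set()
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    pages = 0

    while queue and pages < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        pages += 1
        if html is None:
            continue

        # Collect addresses from the raw HTML, normalised to lower case.
        emails.update(match.lower() for match in EMAIL_RE.findall(html))

        # Pages at the depth limit are scanned but their links are not followed.
        if depth >= max_depth:
            continue

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

    return emails
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. For a quick check without network access, a dictionary's `get` method can stand in for it, for example `crawl_for_emails(["http://example.com/"], pages.get)` where `pages` maps URLs to HTML strings.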
