Nexvision search engine

Dark Web & Deep Web Search Engine: a data crawler and indexer for the dark web, plus OSINT tools.

Features

  • Crawls the darknet looking for new hidden services
  • Finds hidden services from a number of clearnet sources
  • Optional full-text Elasticsearch support
  • Marks clone sites of the /r/darknet super list
  • Finds SSH fingerprints across hidden services
  • Finds email addresses across hidden services
  • Finds bitcoin addresses across hidden services
  • Shows incoming / outgoing links to onion domains
  • Up-to-date alive/dead hidden service status
  • Port scanner
  • Searches for "interesting" URL paths, with useful 404 detection
  • Automatic language detection
  • Fuzzy clone detection (requires Elasticsearch; more advanced than the super list clone detection)

Components

Elasticsearch

The Elasticsearch cluster consists of two Elasticsearch instances for high availability and load balancing. The scraped page data is stored and searched here.
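As a quick sketch of how the stored pages can be queried directly, the example below assumes Elasticsearch is published on the default host port 9200 and that pages live in an index named crawled_pages with a body field; the index and field names are assumptions, so check the compose file and elasticsearch_migrate.sh for the real ones.

curl -s 'http://localhost:9200/crawled_pages/_search?pretty' -H 'Content-Type: application/json' -d '{"query": {"match": {"body": "bitcoin"}}}'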

Kibana

It runs on port 5601 and can be used to inspect the data in Elasticsearch.

Web-General

The web interface for the domain search engine. It runs on port 7000.

MySQL

It stores the domains, page URLs, bitcoin addresses, and other extracted data.
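Once the crawler has been running for a while, the collected data can be sanity-checked straight from the MySQL container. The service, database, and table names below are assumptions; check docker-compose.yml and the scraper's models for the actual ones.

docker compose exec mysql mysql -u root -p -e 'SELECT COUNT(*) FROM torscraper.domains;'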

TOR Proxy

Used to access the onion pages. Ten proxy containers are deployed, and HAProxy distributes traffic across them.
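To confirm that traffic really leaves through Tor, you can send a request through the proxy front end. The SOCKS port below is an assumption; use whichever port your compose file maps for HAProxy.

curl --socks5-hostname localhost:9050 https://check.torproject.org/api/ip

A response containing "IsTor":true means the request exited via the Tor network.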

Scraper

It fetches the domain list from the MySQL database, harvests pages and new domains from onion sites through the TOR proxies, and stores the domains and page data in Elasticsearch and MySQL. Based on the Python Scrapy framework.

Installation

Clone the project and replace the yourdomain:yourport placeholder in web-general/templates/layout_footer.html with your own host:port.
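If you prefer to script that edit, a one-liner along these lines works (example.com:7000 is a placeholder for your own host:port):

sed -i 's|yourdomain:yourport|example.com:7000|g' web-general/templates/layout_footer.html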

Update the list of onion addresses you want to use as the crawler seed in onions_list/onions.txt. The current addresses are taken from Ahmia's list.
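The file presumably holds one onion address per line, for example (the address below is a made-up placeholder, not a real hidden service):

exampleonionaddressexampleonionaddressexampleonionaddrxd.onion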

Build the Docker images and start the services defined in docker-compose.

docker compose build
docker compose up -d
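To confirm everything came up, list the running services:

docker compose ps

Each service (Elasticsearch, Kibana, web-general, MySQL, and the TOR proxies) should show a running state.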

Build the scraper image.

docker build --tag scraper_crawler ./

Run the scraper.

docker run -d --name darkweb-search-engine-onion-crawler --network=darkweb-search-engine_default scraper_crawler /opt/torscraper/scripts/start_onion_scrapy.sh
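You can follow the crawler's output to verify that it is fetching pages:

docker logs -f darkweb-search-engine-onion-crawler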

After the first deployment, initialize the indexes in Elasticsearch.

docker exec darkweb-search-engine-onion-crawler /opt/torscraper/scripts/elasticsearch_migrate.sh
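If Elasticsearch is published on the host (an assumption; check the compose file), you can verify that the indexes were created:

curl -s 'http://localhost:9200/_cat/indices?v'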

Import the initial domain list.

docker exec darkweb-search-engine-onion-crawler /opt/torscraper/scripts/push_list.sh /opt/torscraper/onions_list/onions.txt &
