Arachnado

Arachnado is a tool to crawl a specific website. It provides a Tornado-based HTTP API and a web UI for a Scrapy-based crawler.

License is MIT.

Install

Arachnado requires Python 2.7 or Python 3.5. To install Arachnado use pip:

pip install arachnado

Run

To start Arachnado, execute the arachnado command:

arachnado

and then visit http://0.0.0.0:8888 (or whatever URL is configured).

To see the available command-line options, run:

arachnado --help

Arachnado can be configured using a config file. Put it in one of the common locations ('/etc/arachnado.conf', '~/.config/arachnado.conf' or '~/.arachnado.conf') or pass the file name as an argument when starting the server:

arachnado --config ./my-config.conf
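
To check which of these files exist on a given machine, a short script along these lines may help. It is only a sketch and not Arachnado's own loading code: the function names are hypothetical, and the order in which the locations are tried is an assumption made here for illustration.

```python
import os

# Locations named in this README, in the order they are listed.
# The precedence between them is an assumption of this sketch.
COMMON_LOCATIONS = [
    "/etc/arachnado.conf",
    "~/.config/arachnado.conf",
    "~/.arachnado.conf",
]


def candidate_config_paths(explicit_path=None):
    """Return the config paths to look at, expanded to absolute paths.

    explicit_path stands in for a file passed via --config; it is
    appended after the common locations (hypothetical ordering).
    """
    paths = list(COMMON_LOCATIONS)
    if explicit_path is not None:
        paths.append(explicit_path)
    return [os.path.abspath(os.path.expanduser(p)) for p in paths]


def existing_config_paths(explicit_path=None):
    """Return only those candidate paths that exist as regular files."""
    return [p for p in candidate_config_paths(explicit_path)
            if os.path.isfile(p)]


if __name__ == "__main__":
    for path in existing_config_paths("./my-config.conf"):
        print(path)
```

Running the script prints the config files that are present, which helps when it is unclear which file the server might pick up.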

For the available options, see https://github.com/TeamHG-Memex/arachnado/blob/master/arachnado/config/defaults.conf.

Tests

To run the tests, make sure tox is installed, then execute the tox command from the source root.

Development

Source code: https://github.com/TeamHG-Memex/arachnado
Issue tracker: https://github.com/TeamHG-Memex/arachnado/issues

To build the Arachnado static assets, node.js and npm are required. Install all JavaScript requirements with npm by running the following command from the repo root:

npm install

then rebuild static files (we use Webpack):

npm run build

or auto-build static files on each change during development:

npm run watch
