👀 Simple website monitor via aiohttp + Kafka + PostgreSQL
poetry install
export KAFKA_URI='***'
export KAFKA_SSL_KEYFILE='conf/service.key'
export KAFKA_SSL_CERTFILE='conf/service.cert'
export KAFKA_SSL_CAFILE='conf/ca.pem'
export KAFKA_TOPIC='website-monitor'
export PG_URI='***'
export PG_DATABASE='website-monitor'
export PG_TABLE='logs'
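As a rough illustration of how the exported variables might be consumed, the sketch below reads them into a small dataclass and fails early if any are missing. The `Settings` class and `load_settings` function are hypothetical names chosen for this example, not taken from the project source.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Names of the environment variables exported above.
REQUIRED = (
    "KAFKA_URI",
    "KAFKA_SSL_KEYFILE",
    "KAFKA_SSL_CERTFILE",
    "KAFKA_SSL_CAFILE",
    "KAFKA_TOPIC",
    "PG_URI",
    "PG_DATABASE",
    "PG_TABLE",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the connection settings."""

    kafka_uri: str
    kafka_ssl_keyfile: str
    kafka_ssl_certfile: str
    kafka_ssl_cafile: str
    kafka_topic: str
    pg_uri: str
    pg_database: str
    pg_table: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from `env`, raising if any variable is missing."""
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Field names are simply the lower-cased variable names.
    return Settings(**{name.lower(): env[name] for name in REQUIRED})
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in unit tests.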
Periodically request the given URLs and produce each URL status to Kafka.
python -m src.monitor
or
python -m src.monitor -l top.list -i 2 -c 1
python -m src.monitor --help
Usage: monitor.py [OPTIONS]

Options:
  -l, --urls TEXT         URLs file to monitor
  -i, --interval INTEGER  Periodic interval in seconds
  -c, --count INTEGER     Periodic counts in this run
  --help                  Show this message and exit.
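The interval and count options can be pictured with a minimal asyncio loop. This is a sketch under assumptions, not the project's implementation: `run_periodically` and `fake_check` are hypothetical names, and `fake_check` stands in for a real aiohttp request so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def run_periodically(
    check: Callable[[str], Awaitable[int]],
    urls: Iterable[str],
    interval: float,
    count: int,
) -> List[List[int]]:
    """Check every URL concurrently, `count` times, `interval` seconds apart."""
    url_list = list(urls)
    rounds: List[List[int]] = []
    for i in range(count):
        # gather() runs the checks concurrently and keeps the input order.
        statuses = await asyncio.gather(*(check(url) for url in url_list))
        rounds.append(list(statuses))
        if i < count - 1:
            await asyncio.sleep(interval)  # no wait after the final round
    return rounds


async def fake_check(url: str) -> int:
    """Stand-in for an HTTP request: returns a canned status code."""
    await asyncio.sleep(0)
    return 500 if url.endswith("/500") else 200


if __name__ == "__main__":
    demo_urls = ["https://httpbin.org/status/200", "https://httpbin.org/status/500"]
    print(asyncio.run(run_periodically(fake_check, demo_urls, interval=0.1, count=2)))
```

Injecting the check function keeps the scheduling logic separate from the HTTP client, so the loop can be tested without any real requests.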
Consume Kafka messages and save them into PostgreSQL.
python -m src.logging
 id  |               url                | status |     start_time      |      end_time
-----+----------------------------------+--------+---------------------+---------------------
 383 | https://httpbin.org/delay/2      |    200 | 1618049751369818000 | 1618049754355952000
 382 | https://httpbin.org/delay/1      |    200 | 1618049751369370000 | 1618049753310107000
 381 | https://httpbin.org/status/500   |    500 | 1618049751368899000 | 1618049752363546000
 380 | https://httpbin.org/status/200   |    200 | 1618049751366424000 | 1618049752343412000
 379 | https://httpbin.org/status/300   |    300 | 1618049751367437000 | 1618049752323979000
 378 | https://httpbin.org/status/400   |    400 | 1618049751368099000 | 1618049752304502000
 377 | https://google.com               |    200 | 1618049751370261000 | 1618049751913038000
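The start_time and end_time values look like nanoseconds since the Unix epoch (an assumption based on their magnitude; the README does not state the unit). Under that assumption, a short sketch for turning a row into a duration and a readable timestamp could look like this. The helper names are hypothetical.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def elapsed_seconds(start_ns: int, end_ns: int) -> float:
    """Request duration in seconds, assuming nanosecond timestamps."""
    return (end_ns - start_ns) / NS_PER_SECOND


def to_utc(ns: int) -> datetime:
    """Convert a nanosecond epoch timestamp to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(ns // NS_PER_SECOND, tz=timezone.utc)


if __name__ == "__main__":
    # Row 383 from the table above (https://httpbin.org/delay/2).
    start, end = 1618049751369818000, 1618049754355952000
    print(elapsed_seconds(start, end))  # 2.986134
    print(to_utc(start).isoformat())    # 2021-04-10T10:15:51+00:00
```

A duration of roughly three seconds for the `/delay/2` URL is consistent with a two-second server-side delay plus network overhead.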
- Use aiohttp as asynchronous HTTP client
- Package management by poetry
- Linting by pylint
- Unit Testing by pytest
- CLI configuration by click
- CI via GitHub Actions
- Use the psycopg2.sql module to generate SQL statements safely and avoid SQL injection
- Define URLStatus structure using dataclass
- Follow PEP 526 for type annotation
- Mock request status and delay using httpbin.org
- Use __enter__ and __exit__ to manage resources in a Pythonic way
- Design a scheduler before monitor to support different intervals
- Plugin parser to extract content by regex
- Replace kafka-python with aiokafka
- Replace psycopg2 with asyncpg
- More Unit Tests and coverage report
- Dockerized
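Two of the notes listed above, the URLStatus dataclass and resource management through __enter__ and __exit__, can be sketched together as follows. This is an illustrative guess at the shape, not the project's code: the field names are taken from the table columns, the JSON encoding is an assumption, and `InMemorySink` is a hypothetical stand-in for a Kafka producer or database connection.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class URLStatus:
    """One check result; fields mirror the url/status/start_time/end_time columns."""

    url: str
    status: int
    start_time: int
    end_time: int

    def to_bytes(self) -> bytes:
        """Serialize to JSON bytes, e.g. for use as a message payload."""
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "URLStatus":
        """Rebuild a URLStatus from bytes produced by to_bytes()."""
        return cls(**json.loads(payload.decode("utf-8")))


class InMemorySink:
    """Hypothetical resource that must be closed after use."""

    def __init__(self) -> None:
        self.messages: List[bytes] = []
        self.closed: bool = False

    def __enter__(self) -> "InMemorySink":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.closed = True  # release the resource even if the block raised
        return False        # do not swallow exceptions

    def send(self, status: URLStatus) -> None:
        self.messages.append(status.to_bytes())


if __name__ == "__main__":
    with InMemorySink() as sink:
        sink.send(URLStatus("https://httpbin.org/status/200", 200, 1, 2))
    print(sink.closed, URLStatus.from_bytes(sink.messages[0]))
```

Because cleanup lives in `__exit__`, the `with` block releases the resource on both normal exit and error, which is the main appeal of this pattern for producers and connections.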