Commit e21f1f0: add basic api sources
1 parent fdf172e

20 files changed (+337, -15 lines)

README.md

+3
@@ -10,6 +10,8 @@ By flagging clients originating from these sources you can achieve a nice securi

 The databases created from the gathered data will be and stay open-source!

+If you (*just*) want to keep track of abusers internally - you could also host your own dedicated instance of [this app](https://github.com/O-X-L/risk-db/blob/latest/src).
+
 <a href="https://github.com/O-X-L/risk-db/blob/latest/visualization">
 <img src="https://raw.githubusercontent.com/O-X-L/risk-db/refs/heads/latest/visualization/world_map_example.webp" alt="World Map Example" width="800"/>
 <img src="https://raw.githubusercontent.com/O-X-L/risk-db/refs/heads/latest/visualization/asn_chart_example.webp" alt="ASN Chart Example" width="800"/>
@@ -48,6 +50,7 @@ You may also want to check out these projects: (*not open/free data*)
 * [CrowdSec](https://www.crowdsec.net/)
 * [AbuseIP-DB](https://www.abuseipdb.com/)
 * [IPInfo Privacy-DB](https://ipinfo.io/products/proxy-vpn-detection-api)
+* [nitefood/asn CLI-Tools](https://github.com/nitefood/asn)

 ----

reporting/Graylog.md

+9-7
@@ -4,6 +4,8 @@ We can create a Graylog Alert Notification to Report Abusers to this Risk-Databa

 You can find an example on how to split HAProxy logs into different fields here: [gist.github.com](https://gist.github.com/superstes/a2f6c5d855857e1f10dcb51255fe08c6#haproxy-split) (*via Pipeline Rules*)

+Hint: You can use [Lookup Tables](https://graylog.org/post/how-to-use-graylog-lookup-tables/) to check whether an IP address is in your custom safe-IP list and flag it for further filtering (*to exclude it from being reported*).
+
 ## API Service

 As Graylog has no option to add advanced filters for the data sent by the notifications, we will have to add a minimal service to do so.
@@ -36,8 +38,8 @@ As Graylog has no option to add advanced filters for the data sent by the notifi
 app = Flask(__name__)


-@app.route('/report-abuse/haproxy', methods=['POST'])
-def report_abuse_haproxy():
+@app.route('/report-abuse', methods=['POST'])
+def report_abuse():
     unique_list = []

     for log in request.json['backlog']:
@@ -141,9 +143,9 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

 `https://<SERVER>/alerts/notifications`

-* **Title**: `Report Abuse - HAProxy`
+* **Title**: `Report Abuse`
 * **Notification Type**: `HTTP Notification`
-* **URL**: `http://127.0.0.1:8000/report-abuse/haproxy`
+* **URL**: `http://127.0.0.1:8000/report-abuse`


 ### Create an Alert-Event
@@ -152,13 +154,13 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

 **Event Details**:

-* **Title**: `HAProxy Abuse`
+* **Title**: `Abuse`
 * **Priority**: `Low`

 **Condition**:

 * **Condition Type**: `Filter & Aggregation`
-* **Streams**: Select your HAProxy Access-Log stream
+* **Streams**: Select your App's Access-Log stream
 * **Search Query**: Filter Logs to only include blocks of your security filters. Also exclude your `safe-ips` and so on
 * **Search within the last**: 1 minute
 * **Execute search every**: 1 minute
@@ -168,6 +170,6 @@ As Graylog has no option to add advanced filters for the data sent by the notifi

 **Notifications**:

-* **Choose Notification**: `Report Abuse - HAProxy`
+* **Choose Notification**: `Report Abuse`
 * **Grace Period**: Disable
 * **Message Backlog**: 500 (duplicates will be filtered by the API-service)
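
The handler in this file collects a `unique_list` from `request.json['backlog']`, and the notification settings note that duplicates are filtered by the API service. A minimal sketch of such a de-duplication step might look like the following. It assumes each backlog entry carries its parsed fields under a `fields` key; the field name `src_ip` and the helper `unique_ips` are hypothetical and not taken from the commit.

```python
from typing import Any, Dict, List


def unique_ips(backlog: List[Dict[str, Any]], ip_field: str = 'src_ip') -> List[str]:
    """Return every client IP found in the backlog once, in first-seen order."""
    unique_list = []
    seen = set()

    for log in backlog:
        # Assumption: parsed log fields live under 'fields'; entries without
        # the IP field are skipped instead of raising a KeyError.
        ip = log.get('fields', {}).get(ip_field)
        if ip is None or ip in seen:
            continue

        seen.add(ip)
        unique_list.append(ip)

    return unique_list
```

With a backlog of 500 messages per notification, filtering on the service side keeps the number of reports sent to the Risk-DB proportional to the number of distinct clients rather than to the number of log lines.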

src/README.md

+8-2
@@ -1,6 +1,8 @@
-# Risk-DB Generator
+# Risk-DB Sources

-This Python3 scripts are used to generate the Risk-Databases from the reports we received.
+These Python3 scripts are used for building and managing the Risk-DB.
+
+You can also run your own dedicated instances of these services.

 We want to be transparent. All code that is not security-related will be Open-Source.

@@ -9,3 +11,7 @@ We want to be transparent. All code that is not security-related will be Open-So
 Contributions like [reporting issues](https://github.com/O-X-L/risk-db/issues/new), [engaging in discussions](https://github.com/O-X-L/risk-db/discussions) or [PRs](https://github.com/O-X-L/risk-db/pulls) are welcome!

 Feel free to share your opinion about possible optimizations/extensions.
+
+## Docker
+
+Dockerized services will be added later on.

src/api/README.md

+76
@@ -0,0 +1,76 @@
+# Risk-DB API
+
+This Python3 script serves as the Risk-DB API.
+
+We want to be transparent. All code that is not security-related will be Open-Source.
+
+## Contribute
+
+Contributions like [reporting issues](https://github.com/O-X-L/risk-db/issues/new), [engaging in discussions](https://github.com/O-X-L/risk-db/discussions) or [PRs](https://github.com/O-X-L/risk-db/pulls) are welcome!
+
+Feel free to share your opinion about possible optimizations/extensions.
+
+----
+
+## Service User
+
+To allow the API to run as non-root, you need to add a user:
+
+```bash
+useradd -U --shell /usr/sbin/nologin --home-dir /var/local/lib/risk-db --create-home risk-db
+```
+
+----
+
+## VirtualEnv
+
+You need to create a Python3 virtualenv to run this app:
+
+```bash
+sudo apt install python3-virtualenv
+python3 -m virtualenv /var/local/lib/risk-db/venv
+source /var/local/lib/risk-db/venv/bin/activate
+pip install flask waitress maxminddb
+```
+
+----
+
+## Service
+
+You can run it as a systemd service:
+
+```
+# file: /etc/systemd/system/risk-db.service
+
+[Unit]
+Description=Service to run OXL Risk-DB API Service
+Documentation=https://github.com/O-X-L/oxl-riskdb
+
+[Service]
+Type=simple
+Environment=PYTHONUNBUFFERED=1
+WorkingDirectory=/var/local/lib/risk-db
+ExecStart=/bin/bash -c 'source /var/local/lib/risk-db/venv/bin/activate && \
+python3 /var/local/lib/risk-db/main.py'
+User=risk-db
+Group=risk-db
+Restart=on-failure
+RestartSec=10s
+
+StandardOutput=journal
+StandardError=journal
+SyslogIdentifier=oxl-riskdb
+
+[Install]
+WantedBy=multi-user.target
+```
+
+Enable & Start:
+
+```
+systemctl enable risk-db.service
+systemctl start risk-db.service
+```
+
+
+
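
Once the service is running, the `/api/report` endpoint from `src/api/main.py` can be exercised from Python as well as with `curl`. The sketch below builds the same JSON request using only the standard library. `API_BASE`, `build_report` and `send_report` are illustrative names, and the base URL assumes the `127.0.0.1:8000` listener configured in `main.py`.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen

# Assumption: a local instance listening where main.py binds by default.
API_BASE = 'http://127.0.0.1:8000'


def build_report(ip: str, category: str, comment: Optional[str] = None,
                 token: Optional[str] = None) -> Request:
    """Build (but do not send) a POST request for the /api/report endpoint."""
    payload = {'ip': ip, 'cat': category}
    if comment is not None:
        payload['cmt'] = comment

    # The endpoint rejects requests whose Content-Type is not application/json.
    headers = {'Content-Type': 'application/json'}
    if token is not None:
        headers['Token'] = token

    return Request(
        url=f'{API_BASE}/api/report',
        data=json.dumps(payload).encode('utf-8'),
        headers=headers,
        method='POST',
    )


def send_report(req: Request, timeout: float = 5.0) -> dict:
    """Send a prepared request and decode the JSON answer of the API."""
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode('utf-8'))
```

Splitting request construction from sending keeps the payload easy to inspect or log before anything goes over the wire, for example `send_report(build_report('1.1.1.1', 'bot'))`.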

src/api/main.py

+224
@@ -0,0 +1,224 @@
+#!/usr/bin/env python3
+
+from ipaddress import IPv4Address, IPv6Address, AddressValueError, IPv4Interface, IPv6Interface
+from re import sub as regex_replace
+from threading import Lock
+from json import dumps as json_dumps
+from json import loads as json_loads
+from time import time
+from socket import gethostname
+from pathlib import Path
+from datetime import datetime
+
+from flask import Flask, request, Response, json, redirect
+from waitress import serve
+import maxminddb
+
+app = Flask('risk-db')
+BASE_DIR = Path('/var/local/lib/risk-db')
+RISKY_DB_FILE = {
+    4: BASE_DIR / 'risk_ip4_med.mmdb',
+    6: BASE_DIR / 'risk_ip6_med.mmdb',
+}
+ASN_JSON_FILE = BASE_DIR / 'risk_asn_med.json'
+NET_JSON_FILES = {
+    4: BASE_DIR / 'risk_net4_med.json',
+    6: BASE_DIR / 'risk_net6_med.json',
+}
+
+RISK_CATEGORIES = ['bot', 'attack', 'crawler', 'rate', 'hosting', 'vpn', 'proxy', 'probe']
+RISK_REPORT_DIR = BASE_DIR / 'reports'
+TOKENS = []
+NET_SIZE = {4: '24', 6: '64'}
+report_lock = Lock()
+
+
+def _valid_ipv4(ip: str) -> bool:
+    try:
+        IPv4Address(ip)
+        return True
+
+    except AddressValueError:
+        return False
+
+
+def _valid_public_ip(ip: str) -> bool:
+    ip = str(ip)
+    try:
+        ip = IPv4Address(ip)
+        return ip.is_global and \
+            not ip.is_loopback and \
+            not ip.is_reserved and \
+            not ip.is_multicast and \
+            not ip.is_link_local
+
+    except AddressValueError:
+        try:
+            ip = IPv6Address(ip)
+            return ip.is_global and \
+                not ip.is_loopback and \
+                not ip.is_reserved and \
+                not ip.is_multicast and \
+                not ip.is_link_local
+
+        except AddressValueError:
+            return False
+
+
+def _valid_asn(_asn: str) -> bool:
+    return _asn.isdigit() and 0 <= int(_asn) <= 4_294_967_294
+
+
+def _safe_comment(cmt: str) -> str:
+    return regex_replace(r'[^\sa-zA-Z0-9_=+.-]', '', cmt)[:50]
+
+
+def _response_json(code: int, data: dict) -> Response:
+    return app.response_class(
+        response=json.dumps(data, indent=2),
+        status=code,
+        mimetype='application/json'
+    )
+
+
+def _get_ipv(ip: str) -> int:
+    if _valid_ipv4(ip):
+        return 4
+
+    return 6
+
+
+def _get_src_ip() -> str:
+    if _valid_public_ip(request.remote_addr):
+        return request.remote_addr
+
+    if 'X-Real-IP' in request.headers:
+        return request.headers['X-Real-IP'].replace('::ffff:', '')
+
+    if 'X-Forwarded-For' in request.headers:
+        return request.headers['X-Forwarded-For'].replace('::ffff:', '')
+
+    return request.remote_addr
+
+
+# curl -XPOST https://risk.oxl.app/api/report --data '{"ip": "1.1.1.1", "cat": "bot"}' -H 'Content-Type: application/json'
+@app.route('/api/report', methods=['POST'])
+def report() -> Response:
+    if 'Content-Type' not in request.headers or request.headers['Content-Type'] != 'application/json':
+        return _response_json(code=400, data={'msg': 'Expected JSON'})
+
+    data = request.get_json()
+
+    if 'ip' in data and data['ip'].startswith('::ffff:'):
+        data['ip'] = data['ip'].replace('::ffff:', '')
+
+    if 'ip' not in data or not _valid_public_ip(data['ip']):
+        return _response_json(code=400, data={'msg': 'Invalid IP provided'})
+
+    if 'cat' not in data or data['cat'].lower() not in RISK_CATEGORIES:
+        return _response_json(
+            code=400,
+            data={'msg': f'Invalid Category provided - must be one of: {RISK_CATEGORIES}'},
+        )
+
+    r = {
+        'ip': data['ip'], 'cat': data['cat'].lower(), 'time': int(time()),
+        'v': 4 if _valid_ipv4(data['ip']) else 6, 'cmt': None, 'token': None, 'by': _get_src_ip(),
+    }
+
+    if 'cmt' in data:
+        r['cmt'] = _safe_comment(data['cmt'])
+
+    if 'Token' in request.headers and request.headers['Token'] in TOKENS:
+        r['token'] = request.headers['Token']
+
+    out_file = RISK_REPORT_DIR / f'{datetime.now().strftime("%Y-%m-%d")}_{gethostname()}.txt'
+    with report_lock:
+        with open(out_file, 'a+', encoding='utf-8') as f:
+            f.write(json_dumps(r) + '\n')
+
+    return _response_json(code=200, data={'msg': 'Reported'})
+
+
+@app.route('/api/ip/<ip>', methods=['GET'])
+def check(ip) -> Response:
+    if ip.startswith('::ffff:'):
+        ip = ip.replace('::ffff:', '')
+
+    if not _valid_public_ip(ip):
+        return _response_json(code=400, data={'msg': 'Invalid IP provided'})
+
+    try:
+        with maxminddb.open_database(RISKY_DB_FILE[_get_ipv(ip)]) as m:
+            r = m.get(ip)
+            if r is None:
+                return _response_json(code=404, data={'msg': 'Provided IP not reported'})
+
+            return _response_json(code=200, data=r)
+
+    except FileNotFoundError:
+        return _response_json(code=404, data={'msg': 'Temporary lookup failure'})
+
+
+@app.route('/api/net/<ip>', methods=['GET'])
+def check_net(ip) -> Response:
+    if ip.startswith('::ffff:'):
+        ip = ip.replace('::ffff:', '')
+
+    if ip.find('/') != -1:
+        ip = ip.split('/', 1)[0]
+
+    if not _valid_public_ip(ip):
+        return _response_json(code=400, data={'msg': 'Invalid IP provided'})
+
+    ipv = _get_ipv(ip)
+
+    if ipv == 4:
+        net = IPv4Interface(f"{ip}/{NET_SIZE[ipv]}").network.network_address.compressed
+
+    else:
+        net = IPv6Interface(f"{ip}/{NET_SIZE[ipv]}").network.network_address.compressed
+
+    net = f"{net}/{NET_SIZE[ipv]}"
+
+    try:
+        return _response_json(code=200, data={**NET_DATA[ipv][net], 'network': net})
+
+    except KeyError:
+        return _response_json(code=404, data={'msg': 'Provided network not reported'})
+
+
+@app.route('/api/asn/<nr>', methods=['GET'])
+def check_asn(nr) -> Response:
+    if not _valid_asn(nr):
+        return _response_json(code=400, data={'msg': 'Invalid ASN provided'})
+
+    try:
+        return _response_json(code=200, data=ASN_DATA[str(nr)])
+
+    except KeyError:
+        return _response_json(code=404, data={'msg': 'Provided ASN not reported'})
+
+
+@app.route('/')
+def catch_base():
+    return redirect(f"/api/ip/{_get_src_ip()}", code=302)
+
+
+@app.route('/<path:path>')
+def catch_all(path):
+    del path
+    return redirect(f"/api/ip/{_get_src_ip()}", code=302)
+
+
+if __name__ == '__main__':
+    with open(ASN_JSON_FILE, 'r', encoding='utf-8') as f:
+        ASN_DATA = json_loads(f.read())
+
+    NET_DATA = {}
+
+    for _ipv, file in NET_JSON_FILES.items():
+        with open(file, 'r', encoding='utf-8') as f:
+            NET_DATA[_ipv] = json_loads(f.read())
+
+    serve(app, host='127.0.0.1', port=8000)
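
The `/api/net/<ip>` lookup keys its data by the enclosing /24 (IPv4) or /64 (IPv6) network. As a rough illustration of that mapping outside of Flask, the following sketch derives the same key with `ipaddress.ip_network(..., strict=False)`. The function name `report_network` is hypothetical, and this is a condensed equivalent rather than the code used in the commit.

```python
from ipaddress import ip_address, ip_network

# Same prefix lengths as NET_SIZE in main.py.
NET_SIZE = {4: '24', 6: '64'}


def report_network(ip: str) -> str:
    """Map a single address to the network key used for the net lookup."""
    addr = ip_address(ip)  # raises ValueError on malformed input
    prefix = NET_SIZE[addr.version]

    # strict=False masks the host bits instead of rejecting the address.
    return ip_network(f'{addr}/{prefix}', strict=False).compressed
```

For example, `report_network('192.0.2.77')` yields `'192.0.2.0/24'`, which is the kind of key `check_net` looks up in `NET_DATA`.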

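
The handlers in `main.py` strip the `::ffff:` prefix of IPv4-mapped addresses with a string replacement. One possible alternative, shown here only as a sketch, is to let the `ipaddress` module detect mapped addresses via `IPv6Address.ipv4_mapped`. The helper name `normalize_ip` is an assumption for illustration and is not part of the commit.

```python
from ipaddress import IPv6Address, ip_address


def normalize_ip(ip: str) -> str:
    """Return the plain IPv4 form of IPv4-mapped IPv6 addresses, else the compressed IP."""
    addr = ip_address(ip)  # raises ValueError on malformed input

    # ipv4_mapped is None unless the address lies in ::ffff:0:0/96.
    if isinstance(addr, IPv6Address) and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)

    return addr.compressed
```

Parsing first means that a prefix match only happens on syntactically valid addresses, at the cost of raising `ValueError` for input the string-based approach would pass on to the later validation step.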