Malicious URL detection with an autoencoder neural network

This repository contains the source code for detecting malicious URLs with an autoencoder neural network. An article describing how it works is available at https://www.linkedin.com/pulse/anomaly-detection-autoencoder-neural-network-applied-urls-daboubi/

Requirements

Python 3.9
x64 CPU
TensorFlow-compatible NVIDIA GPU

Install required libraries

pip3 install -r requirements.txt
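
Once the libraries are installed, a quick sanity check of the environment can save a failed training run. The sketch below is not part of the repository: `check_environment` is a hypothetical helper, and it assumes that Python 3.9 is a minimum version rather than an exact pin. It relies only on `tf.config.list_physical_devices`, and it reports `None` for the GPU list if TensorFlow cannot be imported.

```python
import sys


def check_environment(min_python=(3, 9)):
    """Report whether the interpreter and GPU requirements look satisfied.

    Hypothetical helper; treating 3.9 as a minimum version is an assumption.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python, "gpus": None}
    try:
        import tensorflow as tf  # only needed for the GPU lookup
    except ImportError:
        return report  # "gpus" stays None: TensorFlow is not importable
    # An empty list means TensorFlow is installed but sees no GPU.
    report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return report


if __name__ == "__main__":
    print(check_environment())
```

An empty `gpus` list usually points to a driver or CUDA mismatch rather than a problem with the Python packages.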

Merge the Inversion blocklist (Google_hostnames.txt) with url_data.csv

python merge_url_data.py
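
To illustrate what a merge of this kind involves, here is a minimal sketch that uses only the standard library. It is not the code in merge_url_data.py. The column names `url` and `label`, the `bad` label value, the one-hostname-per-line blocklist layout with `#` comments, and the function name `merge_blocklist` are all assumptions made for the example.

```python
import csv
import io


def merge_blocklist(url_rows, blocklist_lines, label="bad"):
    """Append blocklist hostnames to existing rows, skipping duplicates.

    Column names ("url", "label") and the label value are assumptions.
    """
    seen = {row["url"] for row in url_rows}
    merged = list(url_rows)
    for line in blocklist_lines:
        host = line.strip()
        # Ignore blank lines, comment lines, and entries already present.
        if not host or host.startswith("#") or host in seen:
            continue
        seen.add(host)
        merged.append({"url": host, "label": label})
    return merged


# Tiny in-memory stand-ins for url_data.csv and Google_hostnames.txt.
csv_text = "url,label\nexample.com/index.html,good\nevil.example/login,bad\n"
blocklist_text = "# comment\nphish.example\n\nevil.example/login\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))
merged = merge_blocklist(rows, blocklist_text.splitlines())

# Write the combined rows back out as CSV.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["url", "label"])
writer.writeheader()
writer.writerows(merged)
print(out.getvalue())
```

Deduplicating on the raw string keeps the example short; a real merge may need to normalize case or strip schemes before comparing entries.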

Generate new enriched data

python enrich_urls_data.py
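
Enrichment here means turning each raw URL into numeric features that a model can consume. The sketch below shows the general idea with a few common lexical features. The feature set and the names `url_features` and `shannon_entropy` are illustrative assumptions, not the features that enrich_urls_data.py actually computes.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def url_features(url):
    """Return a dict of simple lexical features (an illustrative set)."""
    # urlsplit only fills in the host when a scheme or "//" prefix is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.hostname or ""
    return {
        "length": len(url),
        "digit_count": sum(ch.isdigit() for ch in url),
        "host_dot_count": host.count("."),
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "entropy": shannon_entropy(url),
    }


if __name__ == "__main__":
    print(url_features("http://a.b.example.com/x/y1"))
```

Features like these are cheap to compute, which matters when the input file holds a large number of URLs.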

Build and test a model

python train_and_test_urls_autoencoder.py
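
An autoencoder used for anomaly detection is typically trained to reconstruct normal inputs, and inputs it reconstructs poorly are flagged. The sketch below shows only that final decision step, reconstruction-error thresholding, in plain Python. It does not train a model, and the "mean plus k standard deviations" rule, the value of `k`, and all function names are assumptions; the repository's script may choose its threshold differently.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(original)


def fit_threshold(benign_errors, k=3.0):
    """Mean + k * population std of errors on benign data (assumed rule)."""
    return statistics.mean(benign_errors) + k * statistics.pstdev(benign_errors)


def is_anomalous(error, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return error > threshold


# Made-up errors standing in for a trained model's output on benign URLs.
benign_errors = [0.010, 0.012, 0.009, 0.011, 0.013]
threshold = fit_threshold(benign_errors)

# A made-up sample that the stand-in model reconstructs badly.
suspect_error = reconstruction_error([1.0, 0.0], [0.2, 0.1])
print(is_anomalous(suspect_error, threshold))
```

A larger `k` lowers the false-positive rate at the cost of missing more malicious URLs, so in practice the value is tuned on a labeled validation set.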

TODO

Add a REST API

