Project work for 02476 Machine Learning Operations
We plan to use the Hugging Face Transformers framework to train a DETR model for object detection on the Pascal VOC 2012 dataset. We aim to make our results reproducible and to keep the implementation as simple and readable as possible.
The Transformers framework supports the DETR object detection model, and pretrained weights can easily be loaded. We plan to use these pretrained weights together with the DETR model classes to obtain a model that can readily be finetuned on a new dataset. The main motivation for using the Transformers framework is that it takes care of the hard parts of the implementation, so we do not have to deal with model architecture and efficiency directly.
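As a minimal sketch, loading a COCO-pretrained DETR from Transformers and swapping the classification head for the 20 VOC classes could look like this (the `facebook/detr-resnet-50` checkpoint and the exact arguments are illustrative, not the project's final configuration):

```python
from transformers import DetrForObjectDetection, DetrImageProcessor

# Load COCO-pretrained DETR weights and replace the classification head so it
# matches the 20 Pascal VOC classes. `ignore_mismatched_sizes=True` lets the
# new head be randomly initialized instead of raising a shape-mismatch error.
model = DetrForObjectDetection.from_pretrained(
    "facebook/detr-resnet-50",
    num_labels=20,
    ignore_mismatched_sizes=True,
)

# The image processor handles resizing, normalization and target formatting.
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
```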
DETR is an end-to-end trainable transformer neural network for object detection. It avoids some of the complexity of previous methods: because it predicts a set of boxes directly and matches them to the ground truth with bipartite matching, it needs no hand-crafted components such as anchor generation or non-maximum suppression, which makes it simpler to implement.
The Pascal VOC 2012 dataset is an object detection dataset with 17112 images and 20 object classes, and it is commonly used as a benchmark for object detection models.
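For reference, the raw dataset can also be fetched directly through torchvision (the project's own `make data` target, described below, handles download and preprocessing; this standalone snippet is only illustrative):

```python
from torchvision.datasets import VOCDetection

# Download the Pascal VOC 2012 trainval split to data/raw; annotations are
# returned as parsed XML dictionaries alongside PIL images.
voc = VOCDetection(root="data/raw", year="2012", image_set="trainval", download=True)

image, target = voc[0]
print(target["annotation"]["object"])  # list of labelled bounding boxes
```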
In this section, a short guide to getting started with the project is provided.
To get started, install all dependencies needed for the project with

pip install -r requirements.txt
To download and process the Pascal VOC 2012 dataset, use
make data
To train a new model, use
make train
To make a prediction using a trained model, use
python src/models/predict_model.py {path to model checkpoint} {path OR url to images}
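For example (both arguments below are hypothetical placeholders):

python src/models/predict_model.py models/checkpoint.pth data/examples/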
If you wish to deploy a model using TorchServe, first serialize the model using:

python src/models/serialize_model.py {path to model checkpoint}

The files index_to_name.json and detr_handler.py can then be used to deploy the model for inference.
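As a sketch, packaging and serving with TorchServe's standard CLI tools might look like the following (the serialized file name detr_model.pt and the model_store directory are assumptions; the handler and label mapping are the files named above):

```sh
# Bundle the serialized model, the custom handler and the label mapping
# into a .mar archive that TorchServe can load.
torch-model-archiver \
    --model-name detr \
    --version 1.0 \
    --serialized-file detr_model.pt \
    --handler detr_handler.py \
    --extra-files index_to_name.json \
    --export-path model_store

# Start TorchServe and register the archived model for inference.
torchserve --start --model-store model_store --models detr=detr.mar
```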
├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io
Project based on the cookiecutter data science project template. #cookiecutterdatascience