An implementation of an image captioning system using a CNN encoder and an LSTM decoder. This project generates natural-language descriptions of images using deep learning.
- Encoder: ResNet-152 pretrained on ImageNet for image feature extraction
- Decoder: LSTM network for sequence generation
- Attention Mechanism: Generates captions by focusing on relevant parts of the image
- Pre-trained Models: Ready-to-use models available for quick inference
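At a high level, the encoder-decoder wiring described above can be sketched as follows. This is a minimal PyTorch sketch, not the project's exact code: a tiny stand-in CNN replaces the pretrained ResNet-152 backbone, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class EncoderCNN(nn.Module):
    """Stand-in for the ResNet-152 encoder. In the real project the
    pretrained backbone's final FC layer is replaced by a linear
    projection to the embedding size."""
    def __init__(self, embed_size):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, embed_size)

    def forward(self, images):
        feats = self.conv(images).flatten(1)  # (batch, 16)
        return self.fc(feats)                 # (batch, embed_size)

class DecoderRNN(nn.Module):
    """LSTM decoder: conditions caption generation on the image feature."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Prepend the image feature as the first "token" of the sequence.
        emb = self.embed(captions)                           # (B, T, E)
        inputs = torch.cat([features.unsqueeze(1), emb], 1)  # (B, T+1, E)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)                               # (B, T+1, V)
```

The decoder emits one vocabulary distribution per step; during training the image feature seeds the LSTM and the ground-truth caption tokens follow.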
- Clone the repository:
  git clone https://github.com/Saksham7685/image-captioning.git
- Install dependencies:
  pip install -r requirements.txt
To train the model:
- Download the COCO dataset using the provided download.sh script
- Preprocess the data:
python build_vocab.py
python resize.py
- Start training:
python train.py
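The build_vocab.py step maps caption words to integer ids before training. A minimal sketch of that idea follows; the frequency threshold and special tokens here are common conventions and an assumption, not necessarily the script's exact behavior.

```python
from collections import Counter

def build_vocab(captions, min_count=2):
    """Build a word-to-id mapping from training captions.

    Words occurring fewer than min_count times are dropped and will
    map to <unk> at encoding time.
    """
    counter = Counter(w for cap in captions for w in cap.lower().split())
    # Special tokens first so they get fixed, low ids.
    words = ["<pad>", "<start>", "<end>", "<unk>"]
    words += sorted(w for w, c in counter.items() if c >= min_count)
    return {w: i for i, w in enumerate(words)}
```

Rare words are collapsed into `<unk>` to keep the output softmax small and to avoid fitting one-off tokens.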
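A single training step typically uses teacher forcing: the decoder sees the ground-truth previous words and is trained to predict the next one. The sketch below illustrates this; the encoder/decoder interfaces and names are assumptions, not the exact code in train.py.

```python
import torch
import torch.nn as nn

def train_step(encoder, decoder, images, captions, optimizer, criterion):
    """One optimization step for an encoder-decoder captioner.

    `decoder(features, tokens)` is assumed to prepend the image feature
    and return logits of shape (batch, len(tokens) + 1, vocab_size).
    """
    features = encoder(images)
    # Teacher forcing: feed all caption tokens except the last,
    # and score the predictions against the full caption.
    outputs = decoder(features, captions[:, :-1])
    loss = criterion(outputs.reshape(-1, outputs.size(-1)),
                     captions.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Cross-entropy over the vocabulary is the standard criterion here, with `<pad>` usually masked via the loss's `ignore_index` argument.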
Sample generated caption: a dog is sitting on a green grass covered field
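A caption like the one above is typically produced by greedy decoding at inference time: the image feature seeds the LSTM, and the most likely word is fed back in at each step. This is a sketch under the assumption that the decoder exposes `embed`, `lstm`, and `fc` submodules; the project's sampling code may differ.

```python
import torch

def generate_caption(decoder, features, vocab, max_len=20):
    """Greedy decoding: repeatedly pick the most likely next word
    until <end> is emitted or max_len is reached."""
    idx_to_word = {i: w for w, i in vocab.items()}
    states = None
    inputs = features.unsqueeze(1)  # (1, 1, embed_size): image feature first
    words = []
    for _ in range(max_len):
        hidden, states = decoder.lstm(inputs, states)
        logits = decoder.fc(hidden.squeeze(1))      # (1, vocab_size)
        token = logits.argmax(1)                    # most likely word id
        word = idx_to_word[token.item()]
        if word == "<end>":
            break
        words.append(word)
        inputs = decoder.embed(token).unsqueeze(1)  # feed prediction back in
    return " ".join(words)
```

Beam search usually gives better captions than this greedy loop, at the cost of decoding several hypotheses in parallel.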