Bumblebee Media Search

A demo application that uses the CLIP model for natural language media search: searching images with text, and searching for related images with an image.
Built using Phoenix Framework, Bumblebee, Axon, Nx and HNSWLib.

Sneak Peek: Searching for Images with Text

Sneak Peek: Searching for Images with an Image

Nx Servings

This project uses Nx Servings to serve the CLIP model. There are two sets of Nx Servings in the codebase:
1. Nx Servings provided by Bumblebee for text & image embeddings: the ready-made Nx Servings that ship with the Bumblebee library.
2. Hand-rolled Nx Servings for text & image embeddings: custom Nx Servings, written as an exercise in implementing Nx Servings from scratch.
Both produce the same output and can be used interchangeably. If you're interested in learning how Nx Servings work and how to implement one, the hand-rolled Nx Serving files will be helpful.
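To give a feel for what a hand-rolled serving involves, here is a minimal sketch. It is not the code from this repository: the ToyEmbedding module is hypothetical, and a simple L2 normalization stands in for the CLIP forward pass so the example runs with Nx alone. It assumes Nx 0.7 or later, where the postprocessing callback receives an {output, metadata} tuple.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule ToyEmbedding do
  import Nx.Defn

  # Stand-in for the model's forward pass: L2-normalizes each row of the batch.
  # A real serving would run the CLIP text or image encoder here instead.
  defn normalize(batch) do
    norm = Nx.sqrt(Nx.sum(batch * batch, axes: [-1], keep_axes: true))
    batch / norm
  end

  def serving do
    # The function given to Nx.Serving.new/1 receives compiler options and
    # returns the compiled computation that processes one batch.
    Nx.Serving.new(fn opts -> Nx.Defn.jit(&normalize/1, opts) end)
    # Preprocessing turns raw client input into an Nx.Batch.
    |> Nx.Serving.client_preprocessing(fn vectors ->
      batch = vectors |> Enum.map(&Nx.tensor(&1, type: :f32)) |> Nx.Batch.stack()
      {batch, :no_client_info}
    end)
    # Postprocessing turns the output tensor back into plain Elixir data.
    |> Nx.Serving.client_postprocessing(fn {output, _metadata}, _client_info ->
      Nx.to_list(output)
    end)
  end
end

# Inline run: one list of unit-length vectors per input vector.
embeddings = Nx.Serving.run(ToyEmbedding.serving(), [[3.0, 4.0], [0.0, 2.0]])
IO.inspect(embeddings)
```

The same three pieces (a compiled computation, client preprocessing, client postprocessing) are what a CLIP serving would need; only the preprocessing (tokenizing text or featurizing images) and the computation itself would differ.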

Installation

This project uses Nix for dependency management. Install Nix if you don't have it already.
Clone the repository and run direnv allow to activate the environment.
To install dependencies, execute run deps.
To start the server, execute run server.

Using with Your Images

Create a directory priv/images and copy all your images to this directory.
Run the function build_index to create an index from the images. It vectorizes the images, builds the index, and saves it to the files priv/clip_index.ann and priv/clip_index_filenames.json. To run the function, start the server using run mix phx.server and then call MediaSearchDemo.Clip.build_index() in the iex shell.
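The two output files work as a pair: the index stores vectors under integer labels, and the JSON file maps those labels back to filenames. The sketch below illustrates that idea with the hnswlib package; it is not the repository's build_index implementation. The three-dimensional vectors are toy stand-ins for CLIP embeddings, the files are written to a temporary directory, and it assumes labels are assigned in insertion order when no ids are passed.

```elixir
Mix.install([{:hnswlib, "~> 0.1"}, {:nx, "~> 0.7"}, {:jason, "~> 1.4"}])

# Toy stand-ins for image embeddings; real CLIP embeddings are much larger.
filenames = ["cat.jpg", "dog.jpg", "car.jpg"]

embeddings =
  Nx.tensor([[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]], type: :f32)

dim = 3

# Build an index over cosine distance with room for 1_000 vectors.
{:ok, index} = HNSWLib.Index.new(:cosine, dim, 1_000)
:ok = HNSWLib.Index.add_items(index, embeddings)

# Persist the index and the label-to-filename mapping side by side.
dir = System.tmp_dir!()
index_path = Path.join(dir, "clip_index.ann")
names_path = Path.join(dir, "clip_index_filenames.json")
:ok = HNSWLib.Index.save_index(index, index_path)
File.write!(names_path, Jason.encode!(filenames))

# Later: reload both files and look up the neighbours of a query vector.
{:ok, loaded} = HNSWLib.Index.load_index(:cosine, dim, index_path)
names = names_path |> File.read!() |> Jason.decode!()

query = Nx.tensor([0.9, 0.1, 0.0], type: :f32)
{:ok, labels, _distances} = HNSWLib.Index.knn_query(loaded, query, k: 2)

# Labels are positions in the filename list (assumed insertion order).
results = labels |> Nx.to_flat_list() |> Enum.map(&Enum.at(names, &1))
IO.inspect(results)
```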

Using with Unsplash Sample Dataset (25,000 images)

Download the dataset from https://unsplash.com/data/lite/latest.
Extract the archive and copy the photos.tsv000 file to the priv directory. (You can also download the photos.tsv file directly from here without downloading the whole dataset.)
Run the script download_unsplash_dataset.ex with run elixir priv/scripts/download_unsplash_dataset.ex to download the images in the dataset. It downloads the images concurrently to the priv/images directory.
Once the images are downloaded to priv/images directory, you have two options:
1. Follow the steps in the Using with Your Images section to create an index from the 25k Unsplash images. (This will take some time.)
2. Download the pre-built index files from here and here and save both to priv directory.
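The concurrent download step can be pictured with the sketch below. It is an illustration, not the contents of download_unsplash_dataset.ex: the DownloadSketch module is hypothetical, the column names photo_id and photo_image_url are assumptions about the TSV header, and the fetch function is a stub so the example runs without network access.

```elixir
defmodule DownloadSketch do
  # Parses TSV text into {photo_id, image_url} pairs.
  # The column names are assumptions about the dataset's header row.
  def parse_tsv(tsv) do
    [header | rows] = String.split(tsv, "\n", trim: true)
    columns = String.split(header, "\t")
    id_idx = Enum.find_index(columns, &(&1 == "photo_id"))
    url_idx = Enum.find_index(columns, &(&1 == "photo_image_url"))

    for row <- rows do
      fields = String.split(row, "\t")
      {Enum.at(fields, id_idx), Enum.at(fields, url_idx)}
    end
  end

  # Runs `fetch` for every photo with bounded concurrency.
  # Timed-out tasks are killed and reported as errors instead of crashing.
  def download_all(photos, fetch, max_concurrency \\ 8) do
    photos
    |> Task.async_stream(fetch,
      max_concurrency: max_concurrency,
      timeout: 30_000,
      on_timeout: :kill_task
    )
    |> Enum.map(fn
      {:ok, result} -> result
      {:exit, reason} -> {:error, reason}
    end)
  end
end

tsv =
  "photo_id\tphoto_image_url\nabc\thttps://example.com/abc\nxyz\thttps://example.com/xyz\n"

# Stub fetcher: a real one would issue an HTTP request and write the file.
fetch = fn {id, _url} -> {:ok, Path.join("priv/images", id <> ".jpg")} end

results = tsv |> DownloadSketch.parse_tsv() |> DownloadSketch.download_all(fetch)
IO.inspect(results)
```

Bounding max_concurrency keeps the number of simultaneous requests predictable, which matters when the list has 25,000 entries.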

How does it work?

The application uses the CLIP model with Bumblebee and Nx to create an index of images and then search the index for related images.
For more details, see the talk slides, which can be found here.
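Conceptually, searching the index means finding the stored image embeddings closest to the query embedding. The sketch below shows that idea as an exact, brute-force cosine similarity search in plain Elixir. It is a simplification: the application uses HNSWLib, which performs an approximate version of this lookup, and the three-dimensional vectors and the SearchSketch module here are made up for illustration.

```elixir
defmodule SearchSketch do
  # Cosine similarity between two equal-length lists of numbers.
  def cosine_similarity(a, b) do
    dot = a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    dot / (norm(a) * norm(b))
  end

  defp norm(v), do: v |> Enum.map(&(&1 * &1)) |> Enum.sum() |> :math.sqrt()

  # Scores every indexed embedding against the query and
  # returns the names of the k most similar entries, best first.
  def search(index, query, k) do
    index
    |> Enum.map(fn {name, embedding} -> {name, cosine_similarity(embedding, query)} end)
    |> Enum.sort_by(fn {_name, score} -> score end, :desc)
    |> Enum.take(k)
    |> Enum.map(fn {name, _score} -> name end)
  end
end

# Made-up embeddings; in the application these come from the CLIP image encoder.
index = [
  {"beach.jpg", [0.9, 0.1, 0.0]},
  {"forest.jpg", [0.1, 0.9, 0.1]},
  {"city.jpg", [0.0, 0.2, 0.9]}
]

# Made-up query embedding; for a text search it would come from the CLIP text encoder.
query = [0.8, 0.3, 0.0]

results = SearchSketch.search(index, query, 2)
IO.inspect(results)
```

Because CLIP places text and images in the same embedding space, the same lookup serves both text-to-image and image-to-image search; only the encoder that produces the query changes.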
