
Bumblebee Media Search

A demo application that uses the CLIP model for natural language media search (searching images with text, and searching related images with an image).

Sneak Peek: Searching for Images with Text

[Four example screenshots of searching images with text]

Sneak Peek: Searching for Images with an Image

[Four example screenshots of searching images with an image]

Nx Servings

  • This uses Nx Servings for serving the CLIP model. There are two sets of Nx Servings in the codebase:
    1. Nx Servings provided by Bumblebee for text & image embeddings: ready-made servings that the Bumblebee library ships with.
    2. Hand-rolled Nx Servings for text & image embeddings: custom implementations written to show how an Nx Serving is built from scratch.
  • Both produce the same output and can be used interchangeably. However, if you're interested in how Nx Serving works and how to implement one yourself, the hand-rolled Nx Serving files are the place to look. A sketch of the Bumblebee-provided route follows below.
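
To make the Bumblebee-provided route concrete, here is a minimal sketch of building CLIP text- and image-embedding servings with Bumblebee. The checkpoint name and options are illustrative assumptions and may differ from what this repo actually configures.

```elixir
# A minimal sketch of the Bumblebee-provided servings. The checkpoint name
# and options below are illustrative assumptions, not necessarily what this
# repo configures.
{:ok, clip_text} =
  Bumblebee.load_model({:hf, "openai/clip-vit-base-patch32"},
    module: Bumblebee.Text.ClipText,
    architecture: :for_embedding
  )

{:ok, clip_vision} =
  Bumblebee.load_model({:hf, "openai/clip-vit-base-patch32"},
    module: Bumblebee.Vision.ClipVision,
    architecture: :for_embedding
  )

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-base-patch32"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/clip-vit-base-patch32"})

# Ready-made servings for text and image embeddings.
text_serving = Bumblebee.Text.text_embedding(clip_text, tokenizer)
image_serving = Bumblebee.Vision.image_embedding(clip_vision, featurizer)

# Embed a text query directly (in an application the servings would normally
# be started under a supervisor and called with Nx.Serving.batched_run/2).
Nx.Serving.run(text_serving, "two dogs playing in the snow")
#=> %{embedding: #Nx.Tensor<f32[512] ...>}
```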

Installation

  • Uses Nix for dependency management. Install Nix if you don't have it already.
  • Clone the repository and run direnv allow to activate the environment.
  • To install dependencies, execute run deps.
  • To start the server, execute run server.

Using with Your Images

  • Create a directory priv/images and copy all your images to this directory.
  • Run the build_index function to create an index from the images. It vectorizes the images, builds the index, and saves it to the priv/clip_index.ann and priv/clip_index_filenames.json files. To run it, start the server using run mix phx.server and then call MediaSearchDemo.Clip.build_index() in the IEx shell, as sketched below.
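
As a rough illustration of what this indexing step involves, here is a hedged sketch. The serving name (MediaSearchDemo.ImageEmbeddingServing) and the use of StbImage for image loading are assumptions, not the repo's actual code; only the file paths come from the steps above.

```elixir
# Rough sketch of the indexing flow. The serving name and image-loading helper
# are assumptions for illustration; only the file paths come from the README.
filenames = Path.wildcard("priv/images/*.{jpg,jpeg,png}")

embeddings =
  Enum.map(filenames, fn path ->
    {:ok, image} = StbImage.read_file(path)

    %{embedding: embedding} =
      Nx.Serving.batched_run(MediaSearchDemo.ImageEmbeddingServing, image)

    embedding
  end)

# Persist the filename list; the embedding vectors themselves go into the
# approximate-nearest-neighbor index written to priv/clip_index.ann.
File.write!(
  "priv/clip_index_filenames.json",
  Jason.encode!(Enum.map(filenames, &Path.basename/1))
)
```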

Using with Unsplash Sample Dataset (25,000 images)

  • Download the dataset from https://unsplash.com/data/lite/latest.
  • Extract it and copy the photos.tsv000 file to the priv directory. (You can download just the photos.tsv file from here without downloading the whole dataset.)
  • Run the download_unsplash_dataset.ex script with run elixir priv/scripts/download_unsplash_dataset.ex to download the images from the dataset. It downloads the images concurrently into the priv/images directory (see the sketch after this list).
  • Once the images are downloaded to priv/images directory, you have two options:
    1. Follow the steps in the Using with Your Images section to create an index from the 25k Unsplash images (this will take some time).
    2. Download the pre-built index files from here and here and save both to the priv directory.
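
For illustration, here is a hedged sketch of a concurrent downloader in the spirit of that script; it is not the repo's actual download_unsplash_dataset.ex. It assumes the Req HTTP client and that photos.tsv000 has photo_id and photo_image_url columns (adjust the column names if the header differs).

```elixir
# Hedged sketch of a concurrent downloader, not the repo's actual script.
# Assumes the Req HTTP client and that photos.tsv000 has photo_id and
# photo_image_url columns; adjust the column names if the header differs.
Mix.install([{:req, "~> 0.4"}])

[header | rows] =
  "priv/photos.tsv000"
  |> File.stream!()
  |> Enum.map(fn line -> line |> String.trim_trailing("\n") |> String.split("\t") end)

columns = header |> Enum.with_index() |> Map.new()
id_idx = Map.fetch!(columns, "photo_id")
url_idx = Map.fetch!(columns, "photo_image_url")

File.mkdir_p!("priv/images")

rows
|> Task.async_stream(
  fn row ->
    id = Enum.at(row, id_idx)
    url = Enum.at(row, url_idx)
    body = Req.get!(url).body
    File.write!("priv/images/#{id}.jpg", body)
  end,
  max_concurrency: 16,
  timeout: :infinity
)
|> Stream.run()
```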

How does it work?

  • The application uses the CLIP model with Bumblebee and Nx to create an index of images and then search that index for related images; a brute-force version of the search step is sketched below.
  • For more details, please check the talk slides, which can be found here.
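
As a rough illustration of the search step, this hedged sketch embeds a text query and scores it against the image embeddings with brute-force cosine similarity in Nx; the actual app queries the prebuilt .ann index instead, and the serving name used here is an assumption.

```elixir
# Brute-force sketch of the search step. The actual app queries the prebuilt
# .ann index; the serving name below is an assumption. `index` is an {n, d}
# Nx tensor of L2-normalized image embeddings and `filenames` the list loaded
# from priv/clip_index_filenames.json.
defmodule SearchSketch do
  def search(query_text, index, filenames, top_k \\ 4) do
    %{embedding: query} =
      Nx.Serving.batched_run(MediaSearchDemo.TextEmbeddingServing, query_text)

    # Normalize the query so the dot product is cosine similarity.
    query = Nx.divide(query, Nx.LinAlg.norm(query))

    index
    |> Nx.dot(query)
    |> Nx.argsort(direction: :desc)
    |> Nx.to_flat_list()
    |> Enum.take(top_k)
    |> Enum.map(&Enum.at(filenames, &1))
  end
end
```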
