Evaluate OpenAI's Whisper on (almost) any Hugging Face Hub dataset

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

It enables transcription in multiple languages, as well as translation from those languages into English.

This notebook provides an easy-to-use interface for evaluating Whisper on audio recordings of text passages sampled from Hugging Face datasets.

The notebook sets up a Gradio UI that allows the user to:

  1. Sample text passages from any dataset hosted on the Hugging Face Hub,
  2. Record an audio snippet narrating the text,
  3. Transcribe the audio with Whisper, and
  4. Save the audio, the transcribed and reference text, and the word error rate to Comet for further evaluation and analysis.

The Evaluation UI


To use the Evaluation UI, you will need a Comet account.

The UI has the following input components:

  1. Dataset Name. The root name of the text dataset on the Hugging Face Hub.
  2. Subset. Some datasets on the Hub are divided into subsets; use this field to identify one. If the dataset has no subsets, leave this field blank.
  3. Split. Which split of the dataset to sample from, e.g. train, test, etc.
  4. Seed. The random seed used for sampling. If no seed is set, a random one is generated automatically.
  5. Audio. Once you have sampled a text passage, hit the record button and record yourself narrating the passage.

Finally, hit the Transcribe button. That's it! All the relevant data will be logged to Comet for further analysis.
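The seed behavior described in item 4 can be sketched with the standard library. This is an illustrative stand-in, not the notebook's actual code: `sample_passage` is a hypothetical helper, and the passage list is a placeholder for text drawn from the chosen Hub dataset.

```python
import random

def sample_passage(passages, seed=None):
    """Sample one text passage; generate a random seed if none is given."""
    if seed is None:
        # Mirror the UI's behavior: pick a random seed so the sample
        # can still be reproduced and logged alongside the results.
        seed = random.randrange(2**32)
    rng = random.Random(seed)
    return rng.choice(passages), seed

passages = ["first passage", "second passage", "third passage"]

# A fixed seed always yields the same passage, making runs reproducible.
text_a, _ = sample_passage(passages, seed=42)
text_b, _ = sample_passage(passages, seed=42)
assert text_a == text_b
```

Returning the (possibly generated) seed is what lets the UI log it as a dataset parameter even when the user left the field blank.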

Logging Evaluations to Comet

The Evaluation UI will log the following data to a Comet Experiment.

You can check out an example project here.

Model Parameters

  1. Model Type (tiny, base, small, medium, etc.)
  2. Beam Search Width
  3. Model Language

Dataset Parameters

  1. Dataset Name
  2. Dataset Subset
  3. Column in the dataset that contains the text
  4. Split (train, test, etc.) of the dataset
  5. Seed used to sample the text passage from the dataset
  6. Sample Text Length

Model Metrics

  1. Word Error Rate Score
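Word error rate is the word-level edit distance between the reference text and the transcription, divided by the number of reference words. The notebook may rely on an existing library for this, so treat the following dependency-free version as an illustrative sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming,
    # keeping only the previous row of the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[-1] / len(ref)

print(word_error_rate("the quick brown fox", "the quack brown fox"))  # one substitution -> 0.25
```

A score of 0.0 means a perfect transcription; scores above 1.0 are possible when the hypothesis contains many insertions.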

Evaluation Assets

  1. Audio Snippet of the narrated text
  2. Reference/Sampled Text
  3. Transcribed Text
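To show how the parameters, metric, and assets above might be logged together, here is a runnable sketch. `StubExperiment` and `log_evaluation` are hypothetical stand-ins so the snippet runs offline; in the notebook the experiment object would come from the Comet SDK, whose `log_parameters`, `log_metric`, `log_text`, and `log_audio` methods the stub mirrors.

```python
class StubExperiment:
    """Offline stand-in for a Comet experiment object."""
    def __init__(self):
        self.logged = {"parameters": {}, "metrics": {}, "text": [], "audio": []}

    def log_parameters(self, params):
        self.logged["parameters"].update(params)

    def log_metric(self, name, value):
        self.logged["metrics"][name] = value

    def log_text(self, text, metadata=None):
        self.logged["text"].append((text, metadata))

    def log_audio(self, audio_path, metadata=None):
        self.logged["audio"].append((audio_path, metadata))

def log_evaluation(experiment, params, wer, reference, transcription, audio_path):
    """Log one evaluation run: parameters, the WER score, both texts, and the audio."""
    experiment.log_parameters(params)
    experiment.log_metric("word_error_rate", wer)
    experiment.log_text(reference, metadata={"type": "reference"})
    experiment.log_text(transcription, metadata={"type": "transcription"})
    experiment.log_audio(audio_path, metadata={"type": "narration"})

experiment = StubExperiment()
log_evaluation(
    experiment,
    params={"model_type": "base", "dataset_name": "wikitext", "seed": 42},
    wer=0.25,
    reference="the quick brown fox",
    transcription="the quack brown fox",
    audio_path="narration.wav",
)
```

Grouping everything under one experiment per run is what makes each recording comparable in the Comet project view.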