WhisperBenchmarks
This repository provides easy-to-use Whisper benchmarks built on audio and video content from the Internet Archive, chosen to cover challenging recording scenarios.
Links
Categories | Title | Links |
---|---|---|
Poor mic placement | Body camera footage from July 10 traffic stop | Internet Archive |
Thick accents | Moonshine for Medicine Popcorn Sutton | Internet Archive |
Low-quality audio | 1994 90210 Melrose Place Promos Commercial | Internet Archive |
Artifacts in audio | 2002 007 Movie Trailer Commercial Bad Video | Internet Archive |
Ideal audio (one speaker) | 8 Bit Bookclub | Internet Archive |
How to Run Whisper Benchmarks
1. Download Video from the Internet Archive
Visit the links above and download the file listed under MPEG or H264 in each item's download options. An MP3 download will also work.
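If you would rather script the download than click through the page, the following is a minimal sketch. It assumes the Internet Archive's usual direct-download URL layout (`https://archive.org/download/<identifier>/<filename>`). The identifier and filename shown are placeholders, not the real items from the table, and the helper names `build_download_url` and `download_file` are made up for this example.

```python
import shutil
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen


def build_download_url(identifier: str, filename: str) -> str:
    """Build a direct download URL for one file of an Internet Archive item."""
    return f"https://archive.org/download/{quote(identifier)}/{quote(filename)}"


def download_file(identifier: str, filename: str, dest_dir: str = "samples") -> Path:
    """Download one file into dest_dir and return the local path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(filename).name
    # Stream the response to disk so large videos are not held in memory.
    with urlopen(build_download_url(identifier, filename)) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


# Placeholder values: replace them with the identifier and filename
# shown on the item's Internet Archive page.
print(build_download_url("example-item", "example clip.mp4"))
```

Calling `download_file("example-item", "example clip.mp4")` with real values would save the file under `samples/`.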
2. Visit Whisper WebUI Repository
https://gitlab.com/aadnk/whisper-webui
Run Whisper in Docker or natively by following the instructions in that repository. Set the options appropriate for the test you want to run (for example, omit "--gpus=all" if you would like to run a CPU benchmark).
Note: The provided Docker image occasionally does not work. If you run into any errors, try the image from the GitLab registry instead. The Docker method is generally recommended because it is much easier to get up and running.
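The results tables record times in seconds. One way to measure them consistently is to time the whole transcription command from start to finish. The sketch below is only an illustration: `time_command` is a hypothetical helper, and the command it runs is a stand-in that you would replace with whatever Docker or native invocation the Whisper WebUI instructions give you.

```python
import subprocess
import sys
import time


def time_command(cmd: list[str]) -> float:
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command fails,
    # so a crashed run is never recorded as a valid timing.
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Stand-in command (an empty Python process). Replace it with the actual
# transcription command for the benchmark you are running.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} s")
```

Wall-clock time includes model loading and container startup, so note whether a run used Docker or a native install when comparing numbers.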
Results
Links are embedded for each category
CPU Benchmarks
CPU Model | Poor mic placement (s) | Thick accents (s) | Low-quality audio (s) | Artifacts in audio (s) | Ideal audio (one speaker) (s) | (Docker/Native) |
---|---|---|---|---|---|---|
GPU Benchmarks
GPU Model | Poor mic placement (s) | Thick accents (s) | Low-quality audio (s) | Artifacts in audio (s) | Ideal audio (one speaker) (s) | (Docker/Native) |
---|---|---|---|---|---|---|
Todo:
- Write easy bash scripts for running a set of benchmarks with an easy cleanup
- Finalize a standard format for exporting the data into a spreadsheet
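As a starting point for the export format, here is one possible CSV layout that mirrors the columns of the results tables. Nothing here is finalized: the field names and the `results_to_csv` helper are assumptions made for illustration, and the example row contains a placeholder model name with no timings, not measured results.

```python
import csv
import io

# Hypothetical column names, one per column of the results tables.
FIELDS = [
    "device_type",           # "CPU" or "GPU"
    "model",                 # CPU or GPU model name
    "poor_mic_placement_s",
    "thick_accents_s",
    "low_quality_audio_s",
    "artifacts_in_audio_s",
    "ideal_audio_s",
    "environment",           # "Docker" or "Native"
]


def results_to_csv(rows: list[dict]) -> str:
    """Serialize benchmark rows to CSV text; missing fields are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder row with no timings filled in yet.
print(results_to_csv([{"device_type": "CPU", "model": "example-cpu", "environment": "Docker"}]))
```

A flat CSV like this opens directly in any spreadsheet application, and keeping CPU and GPU runs in one file with a `device_type` column avoids maintaining two formats.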