This repository provides easy-to-use benchmarks using audio and video content from the Internet Archive, specifically targeting various challenging scenarios in audio recordings.

| Scenario | Recording | Source |
| --- | --- | --- |
| Poor mic placement | Body camera footage from July 10 traffic stop | [Internet Archive](https://archive.org/details/cobmn-Body_camera_footage_from_July_10_traffic_stop) |
| Thick accents | Moonshine for Medicine Popcorn Sutton | [Internet Archive](https://archive.org/details/this-is-the-last-dam-run-of-likker-ill-ever-make-full-movie/+Moonshine+for+Medicine++++Popcorn+Sutton.mp4) |
| Artifacts in audio | 2002 007 Movie Trailer Commercial Bad Video | [Internet Archive](https://archive.org/details/2002variouscommercials/2002+007+Movie+Trailer+Commercial+Bad+Video.mp4) |

Visit the links provided and download the relevant item listed under MPEG or H264 in the download options. MP3 will also work.
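
If you would rather script the download than click through each page, the following is a minimal sketch of one way to do it. It assumes the Internet Archive metadata endpoint (`https://archive.org/metadata/<identifier>`) returns a JSON document whose `files` entries carry `name` and `format` fields, and that files are served from `https://archive.org/download/<identifier>/<name>`. The helper names and the format keywords are illustrative assumptions, not part of this repository; the exact format labels differ per item, so check the download options on the page if nothing matches.

```python
import json
from urllib.parse import quote, urlsplit
from urllib.request import urlopen


def identifier_from_details_url(url):
    """Return the item identifier from an archive.org /details/ link."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not an archive.org details URL: {url}")
    return parts[1]


def fetch_metadata(identifier):
    """Fetch the item's metadata JSON (performs a network request)."""
    with urlopen(f"https://archive.org/metadata/{identifier}") as response:
        return json.load(response)


def pick_files(metadata, keywords=("mpeg", "h.264", "h264", "mp3")):
    """List file names whose format label contains one of the keywords.

    The keyword list is an assumption based on the MPEG/H264/MP3 hint above.
    """
    names = []
    for entry in metadata.get("files", []):
        fmt = entry.get("format", "").lower()
        if any(keyword in fmt for keyword in keywords):
            names.append(entry["name"])
    return names


def download_url(identifier, name):
    """Build the direct download link for one file of an item."""
    return f"https://archive.org/download/{identifier}/{quote(name)}"


# Example (needs network access, so it is left commented out):
# item = identifier_from_details_url(
#     "https://archive.org/details/cobmn-Body_camera_footage_from_July_10_traffic_stop"
# )
# for name in pick_files(fetch_metadata(item)):
#     print(download_url(item, name))
```

The printed links can then be fetched with a browser or any download tool.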
### 2. Visit Whisper WebUI Repository
https://gitlab.com/aadnk/whisper-webui
Run Whisper in Docker or natively by following the instructions in that repository. Make sure you set the appropriate options for the test you want to run (for example, omit `--gpus=all` if you would like to run a CPU benchmark).
Note: The provided Docker image occasionally does not work. If you run into any errors, try the image from the GitLab registry instead. The Docker method is generally recommended because it is much easier to get up and running.
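
If you start the transcription from a command line rather than the web page, one way to get the number of seconds for the results table is to wrap the command in a small timer. This is only a sketch: `time_command` is a hypothetical helper, and the command you pass in is whatever the Whisper WebUI instructions tell you to run for your setup. No Whisper-specific flags are assumed here.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the command fails
    return time.perf_counter() - start


# Stand-in command so the sketch runs anywhere; replace it with your real
# transcription command, given as a list of arguments.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.1f} s")
```

Wall-clock time includes model loading, so keep the method the same across runs if you want comparable numbers.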

| CPU Model | Poor mic placement (s) | Thick accents (s) | Low-quality audio (s) | Artifacts in audio (s) | Ideal audio (one speaker) (s) | Environment (Docker/Native) |
| --- | --- | --- | --- | --- | --- | --- |

| GPU Model | Poor mic placement (s) | Thick accents (s) | Low-quality audio (s) | Artifacts in audio (s) | Ideal audio (one speaker) (s) | Environment (Docker/Native) |
| --- | --- | --- | --- | --- | --- | --- |
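
To add your own result, a row has to list the timings in the same column order as the headers. The sketch below formats such a row from a dictionary of timings. `format_row` is a hypothetical helper, and the model name and timing in the example are made-up placeholders, not measured results.

```python
# Column order taken from the results table headers.
SCENARIOS = [
    "Poor mic placement",
    "Thick accents",
    "Low-quality audio",
    "Artifacts in audio",
    "Ideal audio (one speaker)",
]


def format_row(model, seconds_by_scenario, environment):
    """Build one Markdown table row; scenarios without a timing become '-'."""
    cells = [model]
    for scenario in SCENARIOS:
        seconds = seconds_by_scenario.get(scenario)
        cells.append("-" if seconds is None else f"{seconds:.1f}")
    cells.append(environment)
    return "| " + " | ".join(cells) + " |"


# Placeholder values for illustration only.
row = format_row("Example CPU", {"Poor mic placement": 12.34}, "Docker")
print(row)
```

Paste the printed row under the matching CPU or GPU table.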