# WhisperBenchmarks

This repository provides easy-to-use benchmarks using audio and video content from the Internet Archive, specifically targeting various challenging scenarios in audio recordings.

Based on https://gitlab.com/aadnk/whisper-webui

## Models

| Model | Command |
|-|-|
| faster-large-v3 | --whisper_implementation faster-whisper --model large-v3 |
| faster-medium | --whisper_implementation faster-whisper --model medium |
| faster-small | --whisper_implementation faster-whisper --model small |
| faster-tiny | --whisper_implementation faster-whisper --model tiny |

## Videos

Videos are chosen for being short and matching their given category.

| Categories | Title | Links | Length | Type |
|-|-|-|-|-|
| Poor mic placement | Body camera footage from July 10 traffic stop | [Internet Archive](https://archive.org/details/cobmn-Body_camera_footage_from_July_10_traffic_stop) | 2:22 | MP4 |
| Thick accents | Moonshine for Medicine Popcorn Sutton | [Internet Archive](https://archive.org/details/this-is-the-last-dam-run-of-likker-ill-ever-make-full-movie/+Moonshine+for+Medicine++++Popcorn+Sutton.mp4) | 1:35 | MP4 |
| Artifacts in audio | 2002 007 Movie Trailer Commercial Bad Video | [Internet Archive](https://archive.org/details/2002variouscommercials/2002+007+Movie+Trailer+Commercial+Bad+Video.mp4) | 0:14 | MP4 |
| Ideal audio (one speaker) | 8 Bit Bookclub | [Internet Archive](https://archive.org/details/8-bit-bookclub/36+-+ANNOUNCEMENT++SUMMER+HIATUS.mp3) | 1:44 | MP3 |

## How to Run Whisper Benchmarks

-- TODO --

## Results

Results are for the complete run which includes loading the model, running VAD, and running the transcription.
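
Because the reported times cover the complete run (model loading, VAD, and transcription), one way to capture them is to time the whole process from the outside. The following is a minimal sketch under stated assumptions, not this repository's benchmark script: `time_command` and `format_elapsed` are hypothetical helpers, the command shown is a placeholder for the real transcription invocation, and the `MM:SS.hh` layout is inferred from values such as `00:02.14` in the results tables.

```python
import subprocess
import sys
import time


def format_elapsed(seconds: float) -> str:
    """Format seconds as MM:SS.hh (assumed layout of values like 00:02.14)."""
    total_hundredths = round(seconds * 100)
    minutes, rem = divmod(total_hundredths, 6000)
    secs, hundredths = divmod(rem, 100)
    return f"{minutes:02d}:{secs:02d}.{hundredths:02d}"


def time_command(command: list[str]) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command: substitute the actual transcription invocation here.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(format_elapsed(elapsed))
```

Timing the process as a whole, rather than instrumenting the transcription call, keeps model loading and VAD inside the measured interval.
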
Links are embedded in the results for each category.

### CPU Benchmarks

| CPU Model | Poor mic placement (m:s:ms) | Thick accents (m:s:ms) | Artifacts in audio (m:s:ms) | Ideal audio (m:s:ms) | (Docker/Native) | Model |
|-|-|-|-|-|-|-|

### GPU Benchmarks

| GPU Model | Poor mic placement (m:s:ms) | Thick accents (m:s:ms) | Artifacts in audio (m:s:ms) | Ideal audio (m:s:ms) | (Docker/Native) | Model |
|-|-|-|-|-|-|-|
| RTX 2060S | 00:02.14 | 00:09.99 | 00:05.07 | 00:11.02 | Native | Faster-Medium |

## Todo:

- [ ] Write easy bash scripts for running a set of benchmarks with an easy cleanup
- [ ] Finalize a standard format for exporting the data into a spreadsheet
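
Until the export format is finalized, plain CSV is one possible interim option, since any spreadsheet application can open it. The sketch below is only an illustration: the column names mirror the results tables, while the `Device type` column, the `FIELDS` list, and the `results_to_csv` helper are assumptions rather than a decided standard. The sample row is the RTX 2060S entry from the GPU table.

```python
import csv
import io

# Columns mirror the results tables; "Device type" is an assumed extra column
# so that CPU and GPU rows can share one sheet.
FIELDS = [
    "Device type",
    "Device model",
    "Poor mic placement",
    "Thick accents",
    "Artifacts in audio",
    "Ideal audio",
    "Docker/Native",
    "Model",
]

# The single GPU result currently listed in the README.
GPU_ROW = {
    "Device type": "GPU",
    "Device model": "RTX 2060S",
    "Poor mic placement": "00:02.14",
    "Thick accents": "00:09.99",
    "Artifacts in audio": "00:05.07",
    "Ideal audio": "00:11.02",
    "Docker/Native": "Native",
    "Model": "Faster-Medium",
}


def results_to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize result rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(results_to_csv([GPU_ROW]), end="")
```
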