WhisperBenchmarks
This repository provides easy-to-use benchmarks using audio and video content from the Internet Archive, specifically targeting various challenging scenarios in audio recordings.
Based on https://gitlab.com/aadnk/whisper-webui
Models
Model | Command |
---|---|
faster-large-v3 | --whisper_implementation faster-whisper --model large-v3 |
faster-medium | --whisper_implementation faster-whisper --model medium |
faster-small | --whisper_implementation faster-whisper --model small |
faster-tiny | --whisper_implementation faster-whisper --model tiny |
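As a rough illustration of how the flags in the table could be assembled per model, here is a minimal Python sketch. The flag names and values come from the table; the `cli.py` entry point and the positional media path are assumptions about how the upstream project is invoked, and `build_command` is a hypothetical helper, not part of this repository.

```python
# Model labels and --model values copied from the Models table.
MODELS = {
    "faster-large-v3": "large-v3",
    "faster-medium": "medium",
    "faster-small": "small",
    "faster-tiny": "tiny",
}


def build_command(model_label, media_path, entry_point="cli.py"):
    """Return the argument list for one benchmark run.

    entry_point and the trailing media_path argument are assumptions;
    adjust them to whatever script actually accepts these flags.
    """
    return [
        "python", entry_point,
        "--whisper_implementation", "faster-whisper",
        "--model", MODELS[model_label],
        media_path,
    ]


if __name__ == "__main__":
    # Print the command for every model against a placeholder file name.
    for label in MODELS:
        print(" ".join(build_command(label, "sample.mp4")))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting.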
Videos
Videos are chosen for being short and for matching their assigned category.
Category | Title | Link | Length | Instant Download |
---|---|---|---|---|
Poor mic placement | Body camera footage from July 10 traffic stop | Internet Archive | 2:22 | MP4 |
Thick accents | Moonshine for Medicine Popcorn Sutton | Internet Archive | 1:35 | MP4 |
Artifacts in audio | 2002 007 Movie Trailer Commercial Bad Video | Internet Archive | 0:14 | MP4 |
Ideal audio (one speaker) | 8 Bit Bookclub | Internet Archive | 1:44 | MP3 |
Long form (many speakers) | Bionic Woman "Black Magic" (1976) | Internet Archive | 43:53 | MP4 |
How to Run Whisper Benchmarks
-- TODO --
Results
Results are for the complete run, which includes loading the model, running VAD, and running the transcription. Links are embedded in the results for each category.
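One way to capture a complete-run time is to measure the wall-clock duration of a fresh process, so that model loading, VAD, and transcription are all included. The sketch below assumes that approach; `time_run` and `format_elapsed` are hypothetical helpers. The three-digit millisecond field is also an assumption, since this README does not specify the width of the last field in its `m:s:ms` columns.

```python
import subprocess
import time


def format_elapsed(seconds):
    """Format a duration as MM:SS:mmm (minutes, seconds, milliseconds)."""
    total_ms = round(seconds * 1000)
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}:{ms:03d}"


def time_run(command):
    """Run one command to completion and return its wall-clock time in seconds.

    Starting a new process for every run means nothing is cached between
    runs, so the measurement covers the whole pipeline. Interpreter
    startup is included as well.
    """
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start
```

For example, `format_elapsed(time_run(["python", "--version"]))` times a trivial command and returns a string in the format above.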
CPU Benchmarks
CPU Model | Poor mic placement (m:s:ms) | Thick accents (m:s:ms) | Artifacts in audio (m:s:ms) | Ideal audio (m:s:ms) | Long form (m:s:ms) | (Docker/Native) | Model |
---|---|---|---|---|---|---|---|
GPU Benchmarks
GPU Model | Poor mic placement (m:s:ms) | Thick accents (m:s:ms) | Artifacts in audio (m:s:ms) | Ideal audio (m:s:ms) | Long form (m:s:ms) | (Docker/Native) | Model |
---|---|---|---|---|---|---|---|
RTX 2060S | 00:11:12 | 00:06:96 | 00:04:40 | 00:08:31 | 03:34:00 | Native | faster-medium |
RTX A5000 | 00:10:86 | 00:07:84 | 00:07:75 | 00:08:91 | 03:17:10 | Native | faster-large-v3 |
Todo:
- Write easy bash scripts for running a set of benchmarks with an easy cleanup
- Finalize a standard format for exporting the data into a spreadsheet
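
Since the export format is still undecided, the following is only one possible shape for it: a minimal Python sketch that writes one CSV row per hardware and category pair, which most spreadsheet tools can import. The column names in `FIELDS` and the `rows_to_csv` helper are hypothetical and not an agreed standard.

```python
import csv
import io

# Hypothetical column names, loosely mirroring the results tables.
FIELDS = ["hardware", "category", "elapsed", "environment", "model"]


def rows_to_csv(rows):
    """Serialize a list of result dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # One row taken from the GPU Benchmarks table.
    print(rows_to_csv([{
        "hardware": "RTX 2060S",
        "category": "Thick accents",
        "elapsed": "00:06:96",
        "environment": "Native",
        "model": "faster-medium",
    }]))
```

A long format like this, with one measurement per row, tends to be easier to filter and pivot in a spreadsheet than the wide tables above.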