Releases: ggml-org/whisper.cpp

v1.8.4

19 Mar 15:08
9386f23

Overview

Maintenance release, latest ggml, some performance gains across the board.

Full Changelog: v1.8.3...v1.8.4

v1.8.3

15 Jan 11:27
2eeeba5

Overview

Maintenance release, latest ggml, minor improvements in the tools/server/bindings.

Full Changelog: v1.8.2...v1.8.3

v1.8.2

15 Oct 08:32
4979e04

Overview

  • Fix a bug in the ggml norm CPU scalar operator

Full Changelog: v1.8.1...v1.8.2

v1.8.1

12 Oct 10:18
a91dd3b

Overview

  • Fix Vulkan builds
  • Fix memory leaks when using VAD
  • Support --carry-initial-prompt

Full Changelog: v1.8.0...v1.8.1

v1.8.0

30 Sep 18:44
41fc9de

Overview

  • Flash attention is now enabled by default
  • Performance improvements

M1 Pro

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M1 Pro METAL tiny 1 0 32.44 1.71 0.43 0.04 8a67c55
M1 Pro METAL base 1 0 63.54 2.62 0.71 0.06 8a67c55
M1 Pro METAL small 1 0 200.30 5.34 1.72 0.17 8a67c55
M1 Pro METAL medium 1 0 580.06 11.71 4.18 0.45 8a67c55

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M1 Pro METAL tiny 1 1 22.09 1.84 0.43 0.03 8a67c55
M1 Pro METAL base 1 1 40.57 2.22 0.44 0.04 8a67c55
M1 Pro METAL small 1 1 135.15 4.23 0.95 0.12 8a67c55
M1 Pro METAL medium 1 1 395.18 9.14 2.21 0.30 8a67c55
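
To put the M1 Pro numbers in perspective, the encoder gain from flash attention can be computed directly from the rows above. This is a minimal sketch, assuming the Enc. column is a time measurement where lower is better; the values are copied from the FA = 0 and FA = 1 rows, and the variable names are illustrative only.

```python
# Enc. values for M1 Pro (Metal), copied from the rows above.
# Assumption: the column is a time, so a lower number is better.
enc_fa_off = {"tiny": 32.44, "base": 63.54, "small": 200.30, "medium": 580.06}
enc_fa_on = {"tiny": 22.09, "base": 40.57, "small": 135.15, "medium": 395.18}

# Speedup = time without flash attention / time with flash attention.
speedup = {model: enc_fa_off[model] / enc_fa_on[model] for model in enc_fa_off}

for model, ratio in speedup.items():
    print(f"{model:<7} {ratio:.2f}x")
```

On these four rows the ratio stays close to 1.5x regardless of model size.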

M2 Ultra

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 0 8.63 1.09 0.27 0.01 b57b9d3
M2 ULTRA METAL tiny-q5_0 1 0 9.04 1.06 0.28 0.01 b57b9d3
M2 ULTRA METAL tiny-q5_1 1 0 8.98 1.06 0.28 0.01 b57b9d3
M2 ULTRA METAL tiny-q8_0 1 0 8.69 1.06 0.27 0.01 b57b9d3
M2 ULTRA METAL base 1 0 15.39 1.54 0.43 0.02 b57b9d3
M2 ULTRA METAL base-q5_0 1 0 16.50 1.50 0.42 0.02 b57b9d3
M2 ULTRA METAL base-q5_1 1 0 16.45 1.49 0.43 0.02 b57b9d3
M2 ULTRA METAL base-q8_0 1 0 15.62 1.51 0.42 0.02 b57b9d3
M2 ULTRA METAL small 1 0 45.99 2.99 0.90 0.05 b57b9d3
M2 ULTRA METAL small-q5_0 1 0 50.65 2.98 0.92 0.06 b57b9d3
M2 ULTRA METAL small-q5_1 1 0 50.74 2.96 0.92 0.06 b57b9d3
M2 ULTRA METAL small-q8_0 1 0 47.16 2.83 0.89 0.06 b57b9d3
M2 ULTRA METAL medium 1 0 132.78 6.46 2.02 0.13 b57b9d3
M2 ULTRA METAL medium-q5_0 1 0 149.35 6.11 2.09 0.14 b57b9d3
M2 ULTRA METAL medium-q5_1 1 0 149.11 6.09 2.11 0.14 b57b9d3
M2 ULTRA METAL medium-q8_0 1 0 137.37 6.05 2.03 0.13 b57b9d3
M2 ULTRA METAL medium-dis 1 0 121.60 0.90 0.25 0.02 b57b9d3
M2 ULTRA METAL large-v2 1 0 231.19 9.40 3.10 0.22 b57b9d3
M2 ULTRA METAL large-v2-q5_0 1 0 265.90 8.98 3.11 0.25 b57b9d3
M2 ULTRA METAL large-v2-q5_1 1 0 265.18 8.92 3.13 0.25 b57b9d3
M2 ULTRA METAL large-v2-q8_0 1 0 240.23 9.06 2.98 0.23 b57b9d3
M2 ULTRA METAL large-v2-dis 1 0 210.25 0.99 0.28 0.02 b57b9d3
M2 ULTRA METAL large-v3-turbo 1 0 211.72 1.52 0.46 0.03 b57b9d3
M2 ULTRA METAL large-v3-turbo-q5_0 1 0 242.17 1.40 0.47 0.04 b57b9d3
M2 ULTRA METAL large-v3-turbo-q8_0 1 0 219.75 1.40 0.45 0.04 b57b9d3

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 1 6.28 0.96 0.22 0.01 a77d11d
M2 ULTRA METAL tiny-q5_0 1 1 6.69 0.92 0.22 0.01 a77d11d
M2 ULTRA METAL tiny-q5_1 1 1 6.67 0.91 0.22 0.01 a77d11d
M2 ULTRA METAL tiny-q8_0 1 1 6.34 0.92 0.21 0.01 a77d11d
M2 ULTRA METAL base 1 1 10.77 1.30 0.32 0.02 a77d11d
M2 ULTRA METAL base-q5_0 1 1 11.84 1.23 0.33 0.02 a77d11d
M2 ULTRA METAL base-q5_1 1 1 11.95 1.24 0.33 0.02 a77d11d
M2 ULTRA METAL base-q8_0 1 1 11.14 1.23 0.32 0.02 a77d11d
M2 ULTRA METAL small 1 1 32.12 2.43 0.65 0.04 a77d11d
M2 ULTRA METAL small-q5_0 1 1 36.95 2.42 0.68 0.04 a77d11d
M2 ULTRA METAL small-q5_1 1 1 37.40 2.42 0.68 0.04 a77d11d
M2 ULTRA METAL small-q8_0 1 1 33.48 2.30 0.65 0.04 a77d11d
M2 ULTRA METAL medium 1 1 89.28 5.05 1.46 0.09 a77d11d
M2 ULTRA METAL medium-q5_0 1 1 105.24 4.89 1.48 0.11 a77d11d
M2 ULTRA METAL medium-q5_1 1 1 105.28 4.98 1.49 0.11 a77d11d
M2 ULTRA METAL medium-q8_0 1 1 93.61 4.89 1.43 0.10 a77d11d
M2 ULTRA METAL medium-dis 1 1 78.44 0.81 0.20 0.01 a77d11d
M2 ULTRA METAL large-v2 1 1 165.69 7.50 2.16 0.17 a77d11d
M2 ULTRA METAL large-v2-q5_0 1 1 199.40 7.37 2.18 0.20 a77d11d
M2 ULTRA METAL large-v2-q5_1 1 1 199.29 7.37 2.21 0.20 a77d11d
M2 ULTRA METAL large-v2-q8_0 1 1 174.60 6.87 2.16 0.18 a77d11d
M2 ULTRA METAL large-v2-dis 1 1 145.80 0.90 0.22 0.02 a77d11d
M2 ULTRA METAL large-v3-turbo 1 1 146.98 1.31 0.34 0.03 a77d11d
M2 ULTRA METAL large-v3-turbo-q5_0 1 1 176.77 1.19 0.35 0.03 a77d11d
M2 ULTRA METAL large-v3-turbo-q8_0 1 1 154.73 1.20 0.33 0.03 a77d11d

M4 Max

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M4 Max METAL tiny 1 0 10.51 0.86 0.23 0.01 47fcd7d
M4 Max METAL tiny-q8_0 1 0 10.73 0.84 0.24 0.01 47fcd7d
M4 Max METAL base 1 0 19.50 1.34 0.36 0.02 47fcd7d
M4 Max METAL base-q8_0 1 0 20.17 1.25 0.36 0.02 47fcd7d
M4 Max METAL small 1 0 61.91 2.77 0.78 0.06 47fcd7d
M4 Max METAL small-q8_0 1 0 64.17 2.43 0.78 0.06 47fcd7d
M4 Max METAL medium 1 0 181.50 6.44 1.85 0.15 47fcd7d
M4 Max METAL medium-q8_0 1 0 187.71 5.80 1.84 0.15 47fcd7d
M4 Max METAL large-v2 1 0 335.49 10.49 3.01 0.26 47fcd7d
M4 Max METAL large-v2-q8_0 1 0 349.89 8.65 2.97 0.27 47fcd7d
M4 Max METAL large-v3-turbo 1 0 301.34 1.83 0.49 0.04 47fcd7d

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M4 Max METAL tiny 1 1 8.23 0.71 0.16 0.01 47fcd7d
M4 Max METAL tiny-q8_0 1 1 8.47 0.67 0.16 0.01 47fcd7d
M4 Max METAL base 1 1 15.47 1.12 0.26 0.02 47fcd7d
M4 Max METAL base-q8_0 1 1 15.70 1.05 0.27 0.02 47fcd7d
M4 Max METAL small 1 1 49.82 2.37 0.53 0.05 47fcd7d
M4 Max METAL small-q8_0 1 1 51.76 1.99 0.53 0.05 47fcd7d
M4 Max METAL medium 1 1 147.76 5.52 1.27 0.12 47fcd7d
M4 Max METAL medium-q8_0 1 1 153.98 4.59 1.24 0.13 47fcd7d
M4 Max METAL large-v2 1 1 282.89 9.06 2.11 0.22 47fcd7d
M4 Max METAL large-v2-q8_0 1 1 296.43 7.44 2.09 0.23 47fcd7d
M4 Max METAL large-v3-turbo 1 1 249.91 1.65 0.38 0.04 47fcd7d

RTX 5090

| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- |...

v1.7.6

25 Jun 13:50
a8d002c

Overview

  • Add initial VAD support - feedback welcome and appreciated
  • Metal FA improvements

M2 Ultra

Flash Attention ON:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 1 7.72 1.05 0.32 0.01 dc8dda6
M2 ULTRA METAL tiny-q5_0 1 1 8.20 0.98 0.31 0.01 dc8dda6
M2 ULTRA METAL tiny-q5_1 1 1 8.13 0.99 0.31 0.01 dc8dda6
M2 ULTRA METAL tiny-q8_0 1 1 7.96 0.93 0.30 0.01 dc8dda6
M2 ULTRA METAL base 1 1 13.52 1.39 0.35 0.02 dc8dda6
M2 ULTRA METAL base-q5_0 1 1 14.88 1.31 0.34 0.02 dc8dda6
M2 ULTRA METAL base-q5_1 1 1 14.76 1.33 0.34 0.02 dc8dda6
M2 ULTRA METAL base-q8_0 1 1 14.04 1.28 0.34 0.02 dc8dda6
M2 ULTRA METAL small 1 1 38.78 2.72 0.67 0.04 dc8dda6
M2 ULTRA METAL small-q5_0 1 1 44.01 2.64 0.69 0.05 dc8dda6
M2 ULTRA METAL small-q5_1 1 1 44.02 2.66 0.69 0.05 dc8dda6
M2 ULTRA METAL small-q8_0 1 1 40.79 2.49 0.67 0.05 dc8dda6
M2 ULTRA METAL medium 1 1 104.48 5.57 1.61 0.10 dc8dda6
M2 ULTRA METAL medium-q5_0 1 1 122.24 5.00 1.58 0.12 dc8dda6
M2 ULTRA METAL medium-q5_1 1 1 121.99 5.02 1.59 0.12 dc8dda6
M2 ULTRA METAL medium-q8_0 1 1 111.68 4.99 1.52 0.11 dc8dda6
M2 ULTRA METAL medium-dis 1 1 93.23 0.87 0.21 0.01 dc8dda6
M2 ULTRA METAL large-v2 1 1 189.82 8.36 2.35 0.19 dc8dda6
M2 ULTRA METAL large-v2-q5_0 1 1 225.73 7.34 2.40 0.22 dc8dda6
M2 ULTRA METAL large-v2-q5_1 1 1 225.88 7.60 2.40 0.22 dc8dda6
M2 ULTRA METAL large-v2-q8_0 1 1 203.55 7.32 2.26 0.20 dc8dda6
M2 ULTRA METAL large-v2-dis 1 1 168.20 0.98 0.24 0.02 dc8dda6
M2 ULTRA METAL large-v3-turbo 1 1 170.22 1.46 0.37 0.03 dc8dda6
M2 ULTRA METAL large-v3-turbo-q5_0 1 1 201.88 1.27 0.38 0.04 dc8dda6
M2 ULTRA METAL large-v3-turbo-q8_0 1 1 182.37 1.24 0.36 0.03 dc8dda6

Flash Attention OFF:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 0 10.15 1.20 0.36 0.01 dc8dda6
M2 ULTRA METAL tiny-q5_0 1 0 10.21 1.15 0.39 0.01 dc8dda6
M2 ULTRA METAL tiny-q5_1 1 0 9.26 1.15 0.38 0.01 dc8dda6
M2 ULTRA METAL tiny-q8_0 1 0 9.00 1.12 0.37 0.01 dc8dda6
M2 ULTRA METAL base 1 0 15.77 1.73 0.45 0.02 dc8dda6
M2 ULTRA METAL base-q5_0 1 0 16.90 1.63 0.44 0.02 dc8dda6
M2 ULTRA METAL base-q5_1 1 0 16.93 1.64 0.44 0.02 dc8dda6
M2 ULTRA METAL base-q8_0 1 0 16.13 1.63 0.43 0.02 dc8dda6
M2 ULTRA METAL small 1 0 45.15 3.45 0.92 0.05 dc8dda6
M2 ULTRA METAL small-q5_0 1 0 50.63 3.36 0.94 0.06 dc8dda6
M2 ULTRA METAL small-q5_1 1 0 50.56 3.36 0.94 0.06 dc8dda6
M2 ULTRA METAL small-q8_0 1 0 47.52 3.20 0.92 0.05 dc8dda6
M2 ULTRA METAL medium 1 0 122.55 7.38 1.95 0.12 dc8dda6
M2 ULTRA METAL medium-q5_0 1 0 140.61 6.73 2.02 0.14 dc8dda6
M2 ULTRA METAL medium-q5_1 1 0 140.48 6.76 2.04 0.14 dc8dda6
M2 ULTRA METAL medium-q8_0 1 0 131.00 6.57 1.96 0.13 dc8dda6
M2 ULTRA METAL medium-dis 1 0 110.85 1.00 0.24 0.02 dc8dda6
M2 ULTRA METAL large-v2 1 0 222.28 10.96 3.03 0.21 dc8dda6
M2 ULTRA METAL large-v2-q5_0 1 0 258.64 9.79 3.04 0.25 dc8dda6
M2 ULTRA METAL large-v2-q5_1 1 0 258.32 9.87 3.05 0.24 dc8dda6
M2 ULTRA METAL large-v2-q8_0 1 0 236.55 9.61 2.87 0.23 dc8dda6
M2 ULTRA METAL large-v2-dis 1 0 199.84 1.14 0.27 0.02 dc8dda6
M2 ULTRA METAL large-v3-turbo 1 0 201.52 1.77 0.45 0.03 dc8dda6
M2 ULTRA METAL large-v3-turbo-q5_0 1 0 233.14 1.56 0.47 0.04 dc8dda6
M2 ULTRA METAL large-v3-turbo-q8_0 1 0 214.23 1.53 0.44 0.04 dc8dda6
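
The rows above are whitespace-separated, and the device name itself can contain spaces, so they are easiest to read from the right. The following is a minimal sketch of a parser under that assumption, applied to three rows from the Flash Attention ON table to compare the quantized large-v2 variants with the unquantized model. The function name `parse_row` and the field names are hypothetical, and the columns are assumed to be timings where lower is better.

```python
# Three "Flash Attention ON" rows copied verbatim from the M2 Ultra table above.
ROWS = """\
M2 ULTRA METAL large-v2 1 1 189.82 8.36 2.35 0.19 dc8dda6
M2 ULTRA METAL large-v2-q5_0 1 1 225.73 7.34 2.40 0.22 dc8dda6
M2 ULTRA METAL large-v2-q8_0 1 1 203.55 7.32 2.26 0.20 dc8dda6
"""


def parse_row(line):
    """Split one benchmark row into named fields.

    The device/config prefix may contain spaces ("M2 ULTRA METAL"),
    so the eight trailing columns are taken from the right.
    """
    tokens = line.split()
    model, th, fa, enc, dec, bch5, pp, commit = tokens[-8:]
    return {
        "device_config": " ".join(tokens[:-8]),
        "model": model,
        "threads": int(th),
        "fa": int(fa),
        "enc": float(enc),
        "dec": float(dec),
        "bch5": float(bch5),
        "pp": float(pp),
        "commit": commit,
    }


rows = [parse_row(line) for line in ROWS.splitlines()]
baseline = rows[0]  # unquantized large-v2

# Relative change of each quantized variant against the unquantized model.
for row in rows[1:]:
    enc_change = (row["enc"] / baseline["enc"] - 1) * 100
    dec_change = (row["dec"] / baseline["dec"] - 1) * 100
    print(f'{row["model"]}: Enc. {enc_change:+.1f}%, Dec. {dec_change:+.1f}%')
```

For these rows the quantized variants show a higher Enc. value and a lower Dec. value than the unquantized model.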

v1.7.5

02 Apr 14:34

Overview

This is a relatively big update with various build and CI improvements, especially for iOS and WASM. There are also some performance gains, especially for the Metal backend and probably for Arm-based devices.

Big shoutout to @danbev for stepping up and completing the maintenance roadmap for this release!

Mobile examples

All mobile examples have been refreshed. The iOS examples specifically are now much easier to build thanks to the new XCFramework workflow. This should significantly simplify the integration of whisper.cpp into third-party iOS and macOS apps. CoreML build and convert instructions have also been updated.

WASM examples

The WASM examples are now automatically updated on each new commit and hosted on GitHub Pages at
https://ggml.ai/whisper.cpp/

Problems with CORS rules should be resolved.


Some performance numbers for this release:

M2 Ultra

Flash Attention ON:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 1 7.82 1.31 0.35 0.01 ad4e350
M2 ULTRA METAL tiny-q5_0 1 1 8.32 1.28 0.37 0.01 ad4e350
M2 ULTRA METAL tiny-q5_1 1 1 8.21 1.28 0.37 0.01 ad4e350
M2 ULTRA METAL tiny-q8_0 1 1 7.97 1.23 0.36 0.01 ad4e350
M2 ULTRA METAL base 1 1 13.96 1.80 0.42 0.02 ad4e350
M2 ULTRA METAL base-q5_0 1 1 15.19 1.75 0.42 0.02 ad4e350
M2 ULTRA METAL base-q5_1 1 1 15.09 1.75 0.42 0.02 ad4e350
M2 ULTRA METAL base-q8_0 1 1 14.45 1.70 0.41 0.02 ad4e350
M2 ULTRA METAL small 1 1 40.08 3.54 0.86 0.05 ad4e350
M2 ULTRA METAL small-q5_0 1 1 45.07 3.51 0.88 0.05 ad4e350
M2 ULTRA METAL small-q5_1 1 1 45.05 3.52 0.88 0.05 ad4e350
M2 ULTRA METAL small-q8_0 1 1 42.04 3.34 0.85 0.05 ad4e350
M2 ULTRA METAL medium 1 1 107.20 7.28 1.79 0.11 ad4e350
M2 ULTRA METAL medium-q5_0 1 1 125.02 6.67 1.83 0.12 ad4e350
M2 ULTRA METAL medium-q5_1 1 1 124.83 6.70 1.84 0.12 ad4e350
M2 ULTRA METAL medium-q8_0 1 1 114.56 6.53 1.79 0.11 ad4e350
M2 ULTRA METAL medium-dis 1 1 95.96 1.01 0.23 0.01 ad4e350
M2 ULTRA METAL large-v2 1 1 194.29 10.57 2.67 0.20 ad4e350
M2 ULTRA METAL large-v2-q5_0 1 1 230.74 9.57 2.73 0.23 ad4e350
M2 ULTRA METAL large-v2-q5_1 1 1 229.97 9.69 2.74 0.23 ad4e350
M2 ULTRA METAL large-v2-q8_0 1 1 208.11 9.37 2.60 0.21 ad4e350
M2 ULTRA METAL large-v2-dis 1 1 172.72 1.12 0.26 0.02 ad4e350
M2 ULTRA METAL large-v3-turbo 1 1 174.46 1.74 0.42 0.03 ad4e350
M2 ULTRA METAL large-v3-turbo-q5_0 1 1 205.78 1.54 0.42 0.04 ad4e350
M2 ULTRA METAL large-v3-turbo-q8_0 1 1 186.33 1.50 0.40 0.03 ad4e350

Flash Attention OFF:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 ULTRA METAL tiny 1 0 8.74 1.20 0.36 0.01 ad4e350
M2 ULTRA METAL tiny-q5_0 1 0 10.30 1.15 0.38 0.01 ad4e350
M2 ULTRA METAL tiny-q5_1 1 0 10.71 1.13 0.38 0.01 ad4e350
M2 ULTRA METAL tiny-q8_0 1 0 9.97 1.12 0.37 0.01 ad4e350
M2 ULTRA METAL base 1 0 16.77 1.71 0.44 0.02 ad4e350
M2 ULTRA METAL base-q5_0 1 0 16.92 1.63 0.44 0.02 ad4e350
M2 ULTRA METAL base-q5_1 1 0 16.84 1.63 0.44 0.02 ad4e350
M2 ULTRA METAL base-q8_0 1 0 16.12 1.63 0.44 0.02 ad4e350
M2 ULTRA METAL small 1 0 45.29 3.44 0.92 0.05 ad4e350
M2 ULTRA METAL small-q5_0 1 0 50.43 3.34 0.94 0.06 ad4e350
M2 ULTRA METAL small-q5_1 1 0 50.49 3.35 0.93 0.06 ad4e350
M2 ULTRA METAL small-q8_0 1 0 47.37 3.20 0.91 0.05 ad4e350
M2 ULTRA METAL medium 1 0 122.81 7.39 1.99 0.12 ad4e350
M2 ULTRA METAL medium-q5_0 1 0 140.62 6.73 2.03 0.14 ad4e350
M2 ULTRA METAL medium-q5_1 1 0 140.44 6.74 2.04 0.14 ad4e350
M2 ULTRA METAL medium-q8_0 1 0 131.05 6.54 1.95 0.13 ad4e350
M2 ULTRA METAL medium-dis 1 0 110.95 0.99 0.24 0.02 ad4e350
M2 ULTRA METAL large-v2 1 0 222.19 10.93 3.01 0.21 ad4e350
M2 ULTRA METAL large-v2-q5_0 1 0 258.47 9.75 3.01 0.25 ad4e350
M2 ULTRA METAL large-v2-q5_1 1 0 258.40 9.85 3.01 0.24 ad4e350
M2 ULTRA METAL large-v2-q8_0 1 0 236.68 9.61 2.85 0.23 ad4e350
M2 ULTRA METAL large-v2-dis 1 0 199.28 1.12 0.27 0.02 ad4e350
M2 ULTRA METAL large-v3-turbo 1 0 201.49 1.76 0.45 0.03 ad4e350
M2 ULTRA METAL large-v3-turbo-q5_0 1 0 233.70 1.55 0.46 0.04 ad4e350
M2 ULTRA METAL large-v3-turbo-q8_0 1 0 214.20 1.51 0.44 0.04 ad4e350

M4 Max

Flash Attention ON:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M4 Max METAL tiny 1 1 15.22 0.89 0.26 0.01 ad4e350
M4 Max METAL tiny-q8_0 1 1 14.70 0.86 0.26 0.01 ad4e350
M4 Max METAL base 1 1 25.33 1.36 0.30 0.02 ad4e350
M4 Max METAL base-q8_0 1 1 21.27 1.31 0.30 0.02 ad4e350
M4 Max METAL small 1 1 58.43 2.78 0.60 0.05 ad4e350
M4 Max METAL small-q8_0 1 1 60.26 2.39 0.60 0.05 ad4e350
M4 Max METAL medium 1 1 169.73 6.03 1.31 0.14 ad4e350
M4 Max METAL medium-q8_0 1 1 176.61 4.99 1.31 0.14 ad4e350
M4 Max METAL large-v2 1 1 316.18 9.60 2.08 0.24 ad4e350
M4 Max METAL large-v2-q8_0 1 1 329.59 7.55 2.08 0.25 ad4e350

Flash Attention OFF:

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M4 Max METAL tiny 1 0 13.12 0.87 0.29 0.01 ad4e350
M4 Max METAL tiny-q8_0 1 0 15.90 0.88 0.31 0.01 ad4e350
M4 Max METAL base 1 0 23.10 1.42 0.34 0.02 ad4e350
M4 Max METAL base-q8_0 1 0 27.25 1.31 0.34 0.02 ad4e350
M4 Max METAL small 1 0 71.76 3.02 0.70 0.06 ad4e350
M4 Max METAL small-q8_0 1 0 73.88 2.60 0.71 0.06 ad4e350
M4 Max METAL medium 1 0 208.22 6.94 1.55 0.16 ad4e350
M4 Max METAL medium-q8_0 1 0 214.65 5.90 1.57 0.17 ad4e350
M4 Max METAL large-v2 1 0 381.72 11.28 2.51 0.29 ad4e350
M4 Max METAL large-v2-q8_0 1 0 394.97 8.90 2.45 0.30 ad4e350

V100

Flash Attention ON:

GPU Config Model Th FA Enc. Dec. Bch5 PP Commit
V100 AVX2 CUDA tiny 8 1 4.01 0.90 0.25 0.01 ad4e350
V100 AVX2 CUDA tiny-q5_1 8 1 4.12 0.88 0.18 0.01 ad4e350
V100 AVX2 CUDA base 8 1 7.00 1.30 0.35 0.01 ad4e350
V100 AVX2 CUDA base-q5_1 8 1 7.22 1.21 0.26 0.02 ad4e350
V100 AVX2 CUDA small 8 1 18.68 2.39 0.69 0.03 ad4e350
V100 AVX2 CUDA small-q5_1 8 1 19.38 2.32 0.51 0.03 ad4e350
V100 AVX2 CUDA medium 8 1 53.17 5.15 1.45 0.06 ad4e350
V100 AVX2 CUDA medium-q5_0 8 ...

b2365

31 Mar 15:04
e153b8e

android.java : re-add ggml source updates (#2975)

This commit updates the ggml source to include the new unary and binary
operations. I merged https://github.com/ggerganov/whisper.cpp/pull/2958
which seems to have overwritten the changes to the ggml source which
were added in https://github.com/ggerganov/whisper.cpp/pull/2972.

Sorry about this.

v1.7.4

06 Jan 13:16
8a9ad78

Overview

Minor release with mostly build fixes.

Full Changelog: v1.7.3...v1.7.4

v1.7.3

18 Dec 16:15
3de9dee

Overview

  • Massive performance improvements for the Metal backend, especially for beams > 1 and for quantized models
  • Reduce hallucinations during silence by @jkarthic in #2629
  • Implement no_speech_thold by @jkarthic in #2625

CPU Config Model Th FA Enc. Dec. Bch5 PP Commit
M2 Ultra Metal tiny 1 1 7.90 1.26 0.35 0.01 ed733e8
M2 Ultra Metal tiny-q5_0 1 1 8.44 1.23 0.36 0.01 ed733e8
M2 Ultra Metal tiny-q5_1 1 1 8.26 1.27 0.37 0.01 ed733e8
M2 Ultra Metal tiny-q8_0 1 1 8.03 1.21 0.35 0.01 ed733e8
M2 Ultra Metal base 1 1 13.77 1.80 0.42 0.02 ed733e8
M2 Ultra Metal base-q5_0 1 1 15.02 1.72 0.42 0.02 ed733e8
M2 Ultra Metal base-q5_1 1 1 14.93 1.74 0.42 0.02 ed733e8
M2 Ultra Metal base-q8_0 1 1 14.26 1.68 0.41 0.02 ed733e8
M2 Ultra Metal small 1 1 39.76 3.54 0.85 0.05 ed733e8
M2 Ultra Metal small-q5_0 1 1 45.07 3.47 0.87 0.05 ed733e8
M2 Ultra Metal small-q5_1 1 1 44.82 3.49 0.87 0.05 ed733e8
M2 Ultra Metal small-q8_0 1 1 41.79 3.30 0.84 0.05 ed733e8
M2 Ultra Metal medium 1 1 106.73 7.28 1.78 0.11 ed733e8
M2 Ultra Metal medium-q5_0 1 1 124.43 6.63 1.83 0.12 ed733e8
M2 Ultra Metal medium-q5_1 1 1 124.19 6.70 1.84 0.12 ed733e8
M2 Ultra Metal medium-q8_0 1 1 113.88 6.52 1.75 0.11 ed733e8
M2 Ultra Metal medium-dis 1 1 94.97 0.97 0.22 0.01 ed733e8
M2 Ultra Metal large-v2 1 1 193.33 10.53 2.65 0.20 ed733e8
M2 Ultra Metal large-v2-q5_0 1 1 229.22 9.52 2.72 0.23 ed733e8
M2 Ultra Metal large-v2-q5_1 1 1 229.40 9.62 2.73 0.23 ed733e8
M2 Ultra Metal large-v2-q8_0 1 1 207.30 9.36 2.59 0.21 ed733e8
M2 Ultra Metal large-v2-dis 1 1 171.43 1.09 0.25 0.02 ed733e8
M2 Ultra Metal large-v3-turbo 1 1 173.45 1.73 0.41 0.03 ed733e8
M2 Ultra Metal large-v3-turbo-q5_0 1 1 205.52 1.52 0.42 0.04 ed733e8
M2 Ultra Metal large-v3-turbo-q8_0 1 1 185.90 1.48 0.40 0.03 ed733e8
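
Because the same M2 Ultra configuration appears in several releases on this page, the tables can also be read as a rough trend. The sketch below takes the Dec. value for large-v2 with FA = 1 from each release and expresses it relative to v1.7.3. It assumes the column is a time where lower is better, and that the runs are comparable even though they were taken at different commits; neither assumption is confirmed by the release notes.

```python
# Dec. values for M2 Ultra, Metal, large-v2, FA = 1, one per release on this page.
dec_by_release = [
    ("v1.7.3", 10.53),  # commit ed733e8
    ("v1.7.5", 10.57),  # commit ad4e350
    ("v1.7.6", 8.36),   # commit dc8dda6
    ("v1.8.0", 7.50),   # commit a77d11d
]

base_tag, base_dec = dec_by_release[0]

# Percentage change of each release against the oldest one listed.
change_vs_base = {tag: (dec / base_dec - 1) * 100 for tag, dec in dec_by_release}

for tag, pct in change_vs_base.items():
    print(f"{tag}: {pct:+.1f}% vs {base_tag}")
```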

Full Changelog: v1.7.2...v1.7.3