oscar: Switch llama-cpp out for Vulkan extensions
This results in a substantial speedup. Before:
[ Prompt: 2.9 t/s | Generation: 2.5 t/s ]
After (I haven't figured out what the story is with variable speeds;
these are four successive messages of increasing length in the same
conversation):
[ Prompt: 95.7 t/s | Generation: 11.7 t/s ]
[ Prompt: 2866.0 t/s | Generation: 13.4 t/s ]
[ Prompt: 133.1 t/s | Generation: 14.0 t/s ]
[ Prompt: 188.3 t/s | Generation: 13.6 t/s ]
(benchmarks on Framework 13 AMD 7640U)
Parent: 36df179501
Commit: 0ae0946f7a
1 changed file with 1 addition and 1 deletion
@@ -161,7 +161,7 @@
   wl-clipboard

   # ✨ AI ✨
-  llama-cpp
+  llama-cpp-vulkan

   # compilers/language utils
   cargo
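For context, the diff touches a single entry in a package list. Below is a minimal sketch of how the surrounding Nix configuration plausibly reads after this commit; the attribute name environment.systemPackages and the list layout are assumptions, and only the package names and comments come from the diff itself.

  # Hedged sketch of the surrounding NixOS package list (assumed shape);
  # only the entries visible in the diff are taken from the commit.
  environment.systemPackages = with pkgs; [
    wl-clipboard

    # ✨ AI ✨
    llama-cpp-vulkan   # llama.cpp packaged with the Vulkan backend

    # compilers/language utils
    cargo
  ];

Swapping the package rather than patching flags keeps the change to one line, which matches the 1 addition / 1 deletion stat above.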