oscar: Switch llama-cpp out for Vulkan extensions

This results in a substantial speedup. Before:

    [ Prompt: 2.9 t/s | Generation: 2.5 t/s ]

After (I haven't figured out what the story is with variable speeds,
these are four successive messages of increasing length in the same
conversation):

    [ Prompt: 95.7 t/s | Generation: 11.7 t/s ]
    [ Prompt: 2866.0 t/s | Generation: 13.4 t/s ]
    [ Prompt: 133.1 t/s | Generation: 14.0 t/s ]
    [ Prompt: 188.3 t/s | Generation: 13.6 t/s ]

(benchmarks on Framework 13 AMD 7640U)
Chandler Swift 2025-12-25 18:16:31 -06:00
parent 36df179501
commit 0ae0946f7a
Signed by: chandlerswift
GPG key ID: A851D929D52FB93F

    @@ -161,7 +161,7 @@
         wl-clipboard
         # ✨ AI ✨
    -    llama-cpp
    +    llama-cpp-vulkan
         # compilers/language utils
         cargo
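
For context, the changed line sits inside a Nix package list. A minimal sketch of the resulting configuration, assuming the list lives in `environment.systemPackages` of a NixOS module (the enclosing attribute isn't shown in the diff hunk):

    { pkgs, ... }:
    {
      environment.systemPackages = with pkgs; [
        wl-clipboard
        # ✨ AI ✨
        # was llama-cpp; llama-cpp-vulkan builds llama.cpp with the
        # Vulkan backend, offloading inference to the GPU
        llama-cpp-vulkan
        # compilers/language utils
        cargo
      ];
    }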