This results in a substantial speedup. Before:
[ Prompt: 2.9 t/s | Generation: 2.5 t/s ]
After (I haven't figured out why the speeds vary; these are three
successive messages of increasing length in the same conversation):
[ Prompt: 95.7 t/s | Generation: 11.7 t/s ]
[ Prompt: 2866.0 t/s | Generation: 13.4 t/s ]
[ Prompt: 133.1 t/s | Generation: 14.0 t/s ]
[ Prompt: 188.3 t/s | Generation: 13.6 t/s ]
(benchmarks on Framework 13 AMD 7640U)
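
To put a number on "substantial", here is a minimal sketch that parses the stats lines quoted above and computes per-message speedup factors. The `parse_stats` helper and the regular expression are illustrative assumptions, not part of this repository; only the numbers come from the benchmark output above.

```python
import re

# Matches lines such as "[ Prompt: 2.9 t/s | Generation: 2.5 t/s ]".
# The pattern is an assumption based on the output format shown above.
STATS_RE = re.compile(r"\[ Prompt: ([\d.]+) t/s \| Generation: ([\d.]+) t/s \]")


def parse_stats(line):
    """Return (prompt_tps, generation_tps) parsed from one stats line."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError(f"not a stats line: {line!r}")
    return float(match.group(1)), float(match.group(2))


before = parse_stats("[ Prompt: 2.9 t/s | Generation: 2.5 t/s ]")
after = [
    parse_stats(line)
    for line in [
        "[ Prompt: 95.7 t/s | Generation: 11.7 t/s ]",
        "[ Prompt: 2866.0 t/s | Generation: 13.4 t/s ]",
        "[ Prompt: 133.1 t/s | Generation: 14.0 t/s ]",
        "[ Prompt: 188.3 t/s | Generation: 13.6 t/s ]",
    ]
]

# Speedup factor = rate after the change / rate before the change.
prompt_speedups = [prompt / before[0] for prompt, _ in after]
generation_speedups = [gen / before[1] for _, gen in after]

for p, g in zip(prompt_speedups, generation_speedups):
    print(f"prompt x{p:.1f}, generation x{g:.1f}")
```

Generation speed is the steadier of the two figures: it improves by roughly 4.7x to 5.6x across these messages, while the prompt speedup swings widely from one message to the next.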
| Name |
|---|
| bert |
| bigbird |
| factorio |
| oscar |
| sam |
| .envrc |
| .gitignore |
| create-server.py |
| git-pre-commit-hook |
| shell.nix |