geoffsee/predict-otron-9001
mirror of https://github.com/geoffsee/predict-otron-9001.git synced 2025-09-08 22:46:44 +00:00
inference-overhaul
predict-otron-9001/scripts
geoffsee 315ef17605 supports small llama and gemma models
Refactor inference

dedicated crates for llama and gemma inferencing, not integrated
2025-08-29 20:00:41 -04:00
| File | Last commit | Date |
| --- | --- | --- |
| cli.ts | update docs | 2025-08-28 12:54:09 -04:00 |
| curl_chat_stream.sh | Refactor apply_cached_repeat_penalty for optimized caching and reuse, add extensive unit tests, and integrate special handling for gemma-specific models. | 2025-08-27 16:15:01 -04:00 |
| curl_chat.sh | Refactor apply_cached_repeat_penalty for optimized caching and reuse, add extensive unit tests, and integrate special handling for gemma-specific models. | 2025-08-27 16:15:01 -04:00 |
| performance_test_embeddings.sh | Refactor apply_cached_repeat_penalty for optimized caching and reuse, add extensive unit tests, and integrate special handling for gemma-specific models. | 2025-08-27 16:15:01 -04:00 |
| performance_test_inference.sh | Refactor apply_cached_repeat_penalty for optimized caching and reuse, add extensive unit tests, and integrate special handling for gemma-specific models. | 2025-08-27 16:15:01 -04:00 |
| run_llama.sh | supports small llama and gemma models | 2025-08-29 20:00:41 -04:00 |
| run_server.sh | update docs | 2025-08-28 12:54:09 -04:00 |
| test.sh | update docs | 2025-08-28 12:54:09 -04:00 |