Pull requests: ggerganov/llama.cpp

Pull requests list

server : tuning tests
#7388 opened May 19, 2024 by ggerganov Draft
Add alpaca chat template
Labels: testing
#7383 opened May 19, 2024 by jukofyork
Automate vocab support and model conversion
Labels: python
#7379 opened May 19, 2024 by teleprint-me Draft
7 tasks
rpc: free buffer after client disconnect
#7378 opened May 19, 2024 by chraac
Tokenizer SPM fixes for phi-3 and llama-spm
Labels: python, testing
#7375 opened May 18, 2024 by jaime-m-p
OpenELM support
Labels: model, python, review complexity : high
#7359 opened May 18, 2024 by icecream95 Draft
examples: cache hf model when --model not provided
Labels: enhancement, review complexity : low
#7353 opened May 17, 2024 by amirzia
SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat
Labels: enhancement, examples, review complexity : medium, server, testing
#7350 opened May 17, 2024 by hanishkvc
Another threadpool: Avoid creating hundreds of threads in GGML
Labels: performance, review complexity : medium
#7342 opened May 17, 2024 by besnardjb
add Viking tokenizer support
Labels: model, python, review complexity : low
#7329 opened May 16, 2024 by jonabur
Viking-7B tokenizer support
Labels: model, python, review complexity : low
#7328 opened May 16, 2024 by akx Draft
Fixed painfully slow single process builds.
Labels: build, need feedback, performance
#7326 opened May 16, 2024 by jboero
[SYCL] Update SYCL upscale operation
Labels: generation quality, review complexity : medium, SYCL
#7321 opened May 16, 2024 by AidanBeltonS
sched : support async weight copy
Labels: performance, review complexity : medium
#7315 opened May 15, 2024 by slaren Draft
Add phi-2 tokenizer
Labels: model, review complexity : medium
#7300 opened May 15, 2024 by BramVanroy
avoid to get prompt in infill mode and embedding mode
Labels: examples, review complexity : low, server
#7286 opened May 14, 2024 by woodx9 Draft
common: free ctx_gguf when exiting llama_control_vector_load_one
Labels: bugfix, review complexity : low
#7285 opened May 14, 2024 by stevegrubb
ggml-opencl, llama: using reserve() if count already known
Labels: refactoring, review complexity : high
#7272 opened May 14, 2024 by GermanAizek Draft
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes:
Labels: refactoring, review complexity : low
#7270 opened May 14, 2024 by GermanAizek Draft
ggml, ngram-cache, log: added const and const ref for function params
Labels: refactoring, review complexity : medium
#7269 opened May 14, 2024 by GermanAizek
ggml llama: align structs for memory optimization on 64-bit platforms
Labels: refactoring, review complexity : medium
#7267 opened May 13, 2024 by GermanAizek
Windows support for AVX512_BF16 and associated bug fixes for BF16 model
Labels: bugfix, build, merging soon, review complexity : high
#7258 opened May 13, 2024 by Srihari-mcw