Pull requests: ggml-org/llama.cpp
webui: fix flicker issue on dismiss animation on overlay primitives
examples
server/webui
server
#22773
opened May 6, 2026 by
vignesh191
Contributor
codeowners : add ZenDNN backend codeowner
#22772
opened May 6, 2026 by
z-vishal
Contributor
webui: fix ?model= URL param race in router mode
examples
server/webui
server
#22771
opened May 6, 2026 by
ServeurpersoCom
Contributor
mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT
examples
#22770
opened May 6, 2026 by
ServeurpersoCom
Contributor
ggml-cpu: Optimized risc-v cpu q1_0 dot
ggml
changes relating to the ggml tensor library for machine learning
#22768
opened May 6, 2026 by
pl752
Contributor
ggml-sycl : use malloc_shared for UMA/integrated GPU devices
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22766
opened May 6, 2026 by
vmartirosyan
android: extract GgufMetadataReader factory to break cyclic dependency
android
Issues specific to Android
examples
#22763
opened May 6, 2026 by
Juste-Leo2
Contributor
server: fix /infill prompt placement after FIM_MID
examples
server
#22761
opened May 6, 2026 by
Aayush7g
[ggml] fix vulkan spv shadowing
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#22760
opened May 6, 2026 by
miyanyan
server: opt-in Codec binary streaming (msgpack/protobuf token frames)
examples
server
#22757
opened May 6, 2026 by
wdunn001
ggml-opencl: add opt-in Adreno xmem F16xF32 GEMM for prefill
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#22755
opened May 6, 2026 by
happyyzy
ggml-cpu: extend RVV quantization vec dot to higher VLENs
ggml
changes relating to the ggml tensor library for machine learning
#22754
opened May 6, 2026 by
rehan-10xengineer
Contributor
llama : add missing call to ggml_backend_load_all()
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
#22752
opened May 6, 2026 by
angt
Member
Add more wav-compatiable MIME types and enhance MIME type normalization
examples
server/webui
server
#22744
opened May 6, 2026 by
guangchenli
Draft
ggml : add ggml_quantize_chunk_mt for parallel row quantization
ggml
changes relating to the ggml tensor library for machine learning
#22743
opened May 6, 2026 by
shikaku2
common : revert reasoning budget +inf change
testing
Everything test related
#22740
opened May 6, 2026 by
aldehir
Contributor
Fix Bad Substitution Error
examples
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22737
opened May 6, 2026 by
dogunbound
webui : [ChatFormActionAdd][a11y] fix accessibility issues in add menu trigger and items
examples
server/webui
server
#22736
opened May 6, 2026 by
vignesh191
Contributor
opencl: add q4_0 MoE GEMM for Adreno
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#22731
opened May 5, 2026 by
shawngu-quic
Contributor
Write a readme on Multi-GPU usage in llama.cpp
documentation
Improvements or additions to documentation
#22729
opened May 5, 2026 by
gaugarg-nv
Contributor
server, webui: support continue generation on reasoning models
examples
server/webui
server
#22727
opened May 5, 2026 by
ServeurpersoCom
Contributor