Pull requests: ggml-org/llama.cpp

Pull requests list

codeowners : add ZenDNN backend codeowner
#22772 opened May 6, 2026 by z-vishal Contributor
ggml-cpu: Optimized risc-v cpu q1_0 dot (labels: ggml)
#22768 opened May 6, 2026 by pl752 Contributor
ggml-sycl : use malloc_shared for UMA/integrated GPU devices (labels: ggml, SYCL)
#22766 opened May 6, 2026 by vmartirosyan
Draft: ggml-opencl: Early proof-of-concept implementation of plans via command buffers (labels: ggml, OpenCL)
#22764 opened May 6, 2026 by jansol Draft
2 tasks done
android: extract GgufMetadataReader factory to break cyclic dependency (labels: android, examples)
#22763 opened May 6, 2026 by Juste-Leo2 Contributor
[ggml] fix vulkan spv shadowing (labels: ggml, Vulkan)
#22760 opened May 6, 2026 by miyanyan
ggml-opencl: add opt-in Adreno xmem F16xF32 GEMM for prefill (labels: ggml, OpenCL)
#22755 opened May 6, 2026 by happyyzy
ggml-cpu: extend RVV quantization vec dot to higher VLENs (labels: ggml)
#22754 opened May 6, 2026 by rehan-10xengineer Contributor
llama : add missing call to ggml_backend_load_all() (labels: merge ready)
#22752 opened May 6, 2026 by angt Member
ggml : add ggml_quantize_chunk_mt for parallel row quantization (labels: ggml)
#22743 opened May 6, 2026 by shikaku2
common : revert reasoning budget +inf change (labels: testing)
#22740 opened May 6, 2026 by aldehir Contributor
Fix Bad Substitution Error (labels: examples, SYCL)
#22737 opened May 6, 2026 by dogunbound
SYCL: reduce allocation overhead during flash attention (labels: ggml, SYCL)
#22732 opened May 5, 2026 by sanmai
opencl: add q4_0 MoE GEMM for Adreno (labels: ggml, OpenCL)
#22731 opened May 5, 2026 by shawngu-quic Contributor
Write a readme on Multi-GPU usage in llama.cpp (labels: documentation)
#22729 opened May 5, 2026 by gaugarg-nv Contributor
llama : extend embeddings API (labels: model)
#22728 opened May 5, 2026 by ggerganov Member Draft