mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT by ServeurpersoCom · Pull Request #22770 · ggml-org/llama.cpp

ServeurpersoCom · 2026-05-06T16:36:56Z

Overview

Fixes tail truncation of audio input longer than 30 s.

Additional information

In log_mel_spectrogram, the whisper-style padding branch allocated samples_padded with a 30 s silence tail but never reassigned samples and n_samples to point at it, unlike the no_padding and center_padding branches. As a result, the pad never made it into the mel, and the chunking loop dropped the final partial slice along with the real audio it contained.
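The failure mode described above can be sketched in isolation. This is a simplified illustration, not the actual mtmd code: `kept_samples`, `SAMPLE_RATE`, `CHUNK_SAMPLES`, and the `reassign` flag are hypothetical names, and the chunking is assumed to keep only full 30 s chunks. Only `samples`, `n_samples`, and `samples_padded` mirror the names mentioned in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constants for illustration (assumed, not taken from llama.cpp).
constexpr size_t SAMPLE_RATE   = 16000;
constexpr size_t CHUNK_SAMPLES = 30 * SAMPLE_RATE; // one 30 s chunk

// Returns how many real (non-padding) samples survive a chunking loop
// that keeps only full 30 s chunks and drops the final partial slice.
// `reassign` toggles the fix: point samples/n_samples at the padded buffer.
static size_t kept_samples(const std::vector<float> & input, bool reassign) {
    const float * samples   = input.data();
    size_t        n_samples = input.size();

    // Padding branch: copy the input and append a 30 s silence tail.
    std::vector<float> samples_padded(input.begin(), input.end());
    samples_padded.resize(input.size() + CHUNK_SAMPLES, 0.0f);

    if (reassign) {
        // Without these two lines the padded buffer is allocated but unused.
        samples   = samples_padded.data();
        n_samples = samples_padded.size();
    }
    (void) samples; // a real implementation would feed this buffer to the FFT

    // Only full chunks are processed; the partial remainder is dropped.
    const size_t covered = (n_samples / CHUNK_SAMPLES) * CHUNK_SAMPLES;
    return covered < input.size() ? covered : input.size();
}
```

Under these assumptions, a 45 s clip keeps only its first 30 s when the buffer is not reassigned (`kept_samples(audio, false)`), and all 45 s once it is (`kept_samples(audio, true)`), because the silence tail turns the trailing partial slice into a full chunk.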

Fixes #22591. Reproduced and tested with:

ggml-org/Qwen3-Omni-30B-A3B-Instruct-GGUF/
Qwen3-Omni-30B-A3B-Instruct-Q8_0.gguf
mmproj-Qwen3-Omni-30B-A3B-Instruct-Q8_0.gguf
mmproj-Qwen3-Omni-30B-A3B-Instruct-bf16.gguf

Testing

Speak for more than 30 seconds and say a random magic word at the end. The LLM must be able to repeat it.

No patch

With patch

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES Opus 4.7 + local MCP rootless disposable pod with shared GPU

ServeurpersoCom · 2026-05-06T17:00:46Z

cc @ngxson

ngxson

nice, thanks!

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT

9b29ba0

ServeurpersoCom requested a review from a team as a code owner May 6, 2026 16:36

ServeurpersoCom mentioned this pull request May 6, 2026

Feature Request: Qwen3-Omni-30B-A3B support #16186

Closed

4 tasks

ServeurpersoCom requested a review from ngxson May 6, 2026 17:00

github-actions Bot added the examples label May 6, 2026

ngxson approved these changes May 7, 2026

ngxson requested a review from a team May 7, 2026 10:41

CISC approved these changes May 7, 2026

ngxson merged commit cc97e45 into ggml-org:master May 7, 2026
45 of 46 checks passed

cetarthoriphros pushed a commit to cetarthoriphros/llama.cpp that referenced this pull request May 9, 2026

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT (ggml-org#22770)

eb2f4e4

meh pushed a commit to meh/llama.cpp that referenced this pull request May 10, 2026

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT (ggml-org#22770)

d7e71b8

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT (ggml-org#22770)

76a523c

ngxson merged 1 commit into ggml-org:master from ServeurpersoCom:mtmd/fix-whisper-audio-tail-truncation


3 participants
