docs: fix sign of online softmax rescaling factors #177

Open
tangpanyu wants to merge 1 commit into deepseek-ai:main from tangpanyu:fix-doc-online-softmax-scale-sign
@tangpanyu

This PR fixes the sign of the online softmax rescaling factors in docs/20250422-new-kernel-deep-dive.md.
When rescaling the previous accumulator from the old running max to the new running max, the factor should be:
exp(m_old - m_new)
rather than:
exp(m_new - m_old)

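Since m_new >= m_old, the correct factor is at most 1, whereas exp(m_new - m_old) would be at least 1 and would grow the accumulator instead of shrinking it. So I believe the formulas should be: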
scale_0 = exp(m_old - m_new_0)
scale_1 = exp(m_old - m_new_1)
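
As a sanity check on the sign, here is a minimal sketch of a generic online softmax in plain Python. It is not the kernel's code: the function names, the scalar values, and the block layout are assumptions made for illustration only. It rescales the running sum and accumulator by exp(m_old - m_new) and compares the result against a direct softmax-weighted sum.

```python
import math

def online_softmax_weighted_sum(score_blocks, value_blocks):
    """Streaming softmax-weighted sum over non-empty blocks of scores.

    Hypothetical helper for illustration; values are scalars for simplicity.
    """
    m = -math.inf  # running max (plays the role of m_old before each update)
    l = 0.0        # running sum of exp(score - m)
    acc = 0.0      # running sum of exp(score - m) * value
    for scores, values in zip(score_blocks, value_blocks):
        m_new = max(m, max(scores))
        # Rescale from the old max to the new max: exp(m_old - m_new) <= 1.
        scale = math.exp(m - m_new)
        l = l * scale + sum(math.exp(s - m_new) for s in scores)
        acc = acc * scale + sum(
            math.exp(s - m_new) * v for s, v in zip(scores, values)
        )
        m = m_new
    return acc / l

def reference_weighted_sum(scores, values):
    """Direct (non-streaming) softmax-weighted sum, used as the ground truth."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

score_blocks = [[1.0, 3.0], [5.0, 2.0]]
value_blocks = [[1.0, 2.0], [3.0, 4.0]]
online = online_softmax_weighted_sum(score_blocks, value_blocks)
direct = reference_weighted_sum(
    [s for block in score_blocks for s in block],
    [v for block in value_blocks for v in block],
)
print(math.isclose(online, direct))
```

With the sign flipped to exp(m_new - m_old), the earlier blocks would be scaled up rather than down, and the streaming result would in general no longer match the direct computation.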

If I misunderstood the notation here, please let me know.

