llms/huggingface: update default inference endpoint to router.huggingface.co #1501
Open
bozhouDev wants to merge 1 commit into
…face.co

HuggingFace deprecated https://api-inference.huggingface.co; requests to it now return 404 (the endpoint was retired on 2025-11-01). The package still used it as the default URL, so HuggingFace LLM and embedding calls fail by default unless the caller overrides the URL.

Point defaultURL at the hf-inference provider on the new router (https://router.huggingface.co/hf-inference). The client builds request URLs as "%s/models/%s", so the base must omit a trailing /models; the resulting URL is https://router.huggingface.co/hf-inference/models/<model>, which is HuggingFace's documented replacement for the legacy task endpoint.

Update the TestHuggingFaceLLMStandardInference replay fixture to target the new endpoint accordingly.

Fixes tmc#1428

Signed-off-by: bozhouDev <259759010+bozhouDev@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
PR Checklist

- `llms/huggingface: ...`).
- Fixes #1428.
- `TestHuggingFaceLLMStandardInference` replay fixture is updated to cover the new endpoint).
- `golangci-lint` checks: `go vet` passes; I couldn't run `golangci-lint` locally, but this is a one-line constant value change with no new code.

Description
Fixes #1428.
HuggingFace deprecated `https://api-inference.huggingface.co`, and requests to it now return `404` (the endpoint was retired on 2025-11-01). `llms/huggingface` still used it as `defaultURL`, so HuggingFace LLM and embedding calls fail by default unless the caller overrides the URL.

This points `defaultURL` at the `hf-inference` provider on the new router.

URL construction note: the client builds request URLs as `fmt.Sprintf("%s/models/%s", c.url, model)` (see `internal/huggingfaceclient/inference.go` and `embeddings.go`), so the base must omit a trailing `/models`. Using `https://router.huggingface.co/hf-inference` yields `https://router.huggingface.co/hf-inference/models/<model>`, which is HuggingFace's documented replacement for the legacy task-based endpoint. (Note: the literal `https://router.huggingface.co/hf-inference/models` suggested in the issue would produce a doubled path, `.../hf-inference/models/models/<model>`, so the base is set without the trailing `/models`.)

Verification
The `TestHuggingFaceLLMStandardInference` replay fixture is updated to target the new endpoint so the test stays consistent with the new default.

Caveat (please review): the replay fixture's recorded response is the prior `404` from the deprecated host (the test already `t.Skip`s on `404`). I don't have HuggingFace credentials to re-record a live `200` against the new endpoint, so a fresh recording with `HF_TOKEN` would be a welcome follow-up.
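
For reviewers who want to check the URL construction note without running the package, here is a minimal standalone sketch. It assumes only the format string quoted in the description; `buildModelURL` and `normalizeBase` are hypothetical names for illustration and are not functions in `internal/huggingfaceclient`, and `gpt2` is just a placeholder model ID.

```go
package main

import (
	"fmt"
	"strings"
)

// buildModelURL is a hypothetical stand-in that reuses the format string
// quoted in the description: "%s/models/%s".
func buildModelURL(base, model string) string {
	return fmt.Sprintf("%s/models/%s", base, model)
}

// normalizeBase is a hypothetical defensive helper (not part of this PR).
// It strips trailing slashes and a trailing "/models" so that either form
// of the base URL produces a single "/models/" segment.
func normalizeBase(base string) string {
	base = strings.TrimRight(base, "/")
	return strings.TrimSuffix(base, "/models")
}

func main() {
	const model = "gpt2" // placeholder model ID, for illustration only

	// Base without the trailing /models, as set in this PR.
	fmt.Println(buildModelURL("https://router.huggingface.co/hf-inference", model))

	// Base with the trailing /models, as literally suggested in the issue:
	// the path segment is doubled.
	fmt.Println(buildModelURL("https://router.huggingface.co/hf-inference/models", model))

	// The same base after normalization yields the single-segment form again.
	fmt.Println(buildModelURL(normalizeBase("https://router.huggingface.co/hf-inference/models/"), model))
}
```

The first and third lines print `https://router.huggingface.co/hf-inference/models/gpt2`; the second prints `https://router.huggingface.co/hf-inference/models/models/gpt2`. Normalizing the base inside the client would be a separate design choice; this PR only changes the constant.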