llms/huggingface: update default inference endpoint to router.huggingface.co #1501

Open
bozhouDev wants to merge 1 commit into tmc:main from bozhouDev:fix/huggingface-default-router-url
@bozhouDev

PR Checklist

  • Read the Contributing documentation.
  • Read the Code of conduct documentation.
  • Name your Pull Request title clearly, concisely, and prefixed with the affected package (llms/huggingface: ...).
  • Checked there isn't already a PR solving this the same way (searched open PRs; none touch the default URL).
  • Provide a description / reference the issue it solves — Fixes #1428.
  • Describes the source of new concepts.
  • References existing implementations as appropriate.
  • Contains test coverage for new functions — N/A (constant-only change; the existing TestHuggingFaceLLMStandardInference replay fixture is updated to cover the new endpoint).
  • Passes all golangci-lint checks — go vet passes; I couldn't run golangci-lint locally, but this is a one-line constant value change with no new code.

Description

Fixes #1428.

HuggingFace deprecated https://api-inference.huggingface.co and requests to it now return 404 (the endpoint was retired on 2025-11-01). llms/huggingface still used it as defaultURL, so HuggingFace LLM and embedding calls fail by default unless the caller overrides the URL.

This points defaultURL at the hf-inference provider on the new router.

URL construction note: the client builds request URLs as fmt.Sprintf("%s/models/%s", c.url, model) (see internal/huggingfaceclient/inference.go and embeddings.go), so the base must omit a trailing /models. Using https://router.huggingface.co/hf-inference yields:

https://router.huggingface.co/hf-inference/models/<model>

which is HuggingFace's documented replacement for the legacy task-based endpoint. (Note: the literal https://router.huggingface.co/hf-inference/models suggested in the issue would produce a doubled path — .../hf-inference/models/models/<model> — so the base is set without the trailing /models.)
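The path-doubling issue described above can be reproduced in isolation. The following is a minimal sketch, not the client's actual code: `buildModelURL` is a hypothetical helper that only mirrors the quoted format string, and `gpt2` is an arbitrary model ID chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// buildModelURL is a hypothetical helper that mirrors the format string
// quoted in the description: fmt.Sprintf("%s/models/%s", base, model).
func buildModelURL(base, model string) string {
	return fmt.Sprintf("%s/models/%s", base, model)
}

func main() {
	const model = "gpt2" // example model ID, assumed for illustration only

	// Base without a trailing /models: the value this PR sets as defaultURL.
	good := buildModelURL("https://router.huggingface.co/hf-inference", model)

	// Base with a trailing /models: the literal suggested in the issue.
	bad := buildModelURL("https://router.huggingface.co/hf-inference/models", model)

	fmt.Println(good)
	fmt.Println(bad)

	// Reports whether the path segment was doubled.
	fmt.Println(strings.Contains(bad, "/models/models/"))
}
```

Because the client appends `/models/<model>` itself, any base that already ends in `/models` ends up with the segment twice, which is why the base here stops at `/hf-inference`.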

Verification

go build ./llms/huggingface/...   # ok
go vet   ./llms/huggingface/...    # clean
go test  ./llms/huggingface/...    # ok (both packages)

The TestHuggingFaceLLMStandardInference replay fixture is updated to target the new endpoint so the test stays consistent with the new default.

Caveat (please review): the replay fixture's recorded response is still the 404 captured from the deprecated host, and the test already calls t.Skip on a 404, so the passing go test run above does not exercise a successful response. I don't have HuggingFace credentials to re-record a live 200 against the new endpoint, so a fresh recording made with HF_TOKEN would be a welcome follow-up.

…face.co

HuggingFace deprecated https://api-inference.huggingface.co; requests to it
now return 404 (the endpoint was retired on 2025-11-01). The package still
used it as the default URL, so HuggingFace LLM and embedding calls fail by
default unless the caller overrides the URL.

Point defaultURL at the hf-inference provider on the new router
(https://router.huggingface.co/hf-inference). The client builds request URLs
as "%s/models/%s", so the base must omit a trailing /models; the resulting
URL is https://router.huggingface.co/hf-inference/models/<model>, which is
HuggingFace's documented replacement for the legacy task endpoint.

Update the TestHuggingFaceLLMStandardInference replay fixture to target the
new endpoint accordingly.

Fixes tmc#1428

Signed-off-by: bozhouDev <259759010+bozhouDev@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Development

Successfully merging this pull request may close these issues.

Update defaultURL in huggingfacellm_option.go (old endpoint deprecated)
