
Fix endpoint parsing #241

Merged
bmerkle merged 3 commits into microsoft:main from gvanrossum:fix-endpoint-parsing
Apr 23, 2026

Conversation

@gvanrossum-ms
Contributor

@gvanrossum-ms gvanrossum-ms commented Apr 22, 2026

Looks like some regressions were caused by #231 that only showed when using real API keys. Shame on me for not running tests locally before approving PRs.

@gvanrossum-ms
Contributor Author

@KRRT7 Is this indeed caused by your #231? (And undiscovered due to me not running the tests manually.)

Do you agree with the fix-upon-fix that Claude came up with for me? It passes all tests and doesn't look completely implausible but I admit that I've lost track of all the details around the env vars we use (by old convention established in microsoft/TypeAgent) and I didn't review #231 well enough.

@bmerkle Once @KRRT7 approves the changes can you also review it?

@bmerkle
Collaborator

bmerkle commented Apr 22, 2026

Strange, because I ran tests locally today for #230 and #232 and all tests were green in my setup.

I thought we had already fixed the regression from #231 via #238, but it seems we missed a variation...

Once the final fix is ready I can review it quickly tomorrow.

@KRRT7
Contributor

KRRT7 commented Apr 22, 2026

let me take a quick look

@gvanrossum
Collaborator

> Strange, because I ran tests locally today for #230 and #232 and all tests were green in my setup.

Did you use Azure? How did you specify the endpoints? We have a convention in our .env file where endpoints look like this, e.g.:

AZURE_OPENAI_ENDPOINT=https://*****.openai.azure.com/openai/deployments/gpt-4o-2/chat/completions?api-version=2024-08-01-preview

Maybe it would be useful to run `make coverage` and see what code is or isn't hit here.

@bmerkle
Collaborator

bmerkle commented Apr 22, 2026

>> Strange, because I ran tests locally today for #230 and #232 and all tests were green in my setup.
>
> Did you use Azure? How did you specify the endpoints? We have a convention in our .env file where endpoints look like this, e.g.:
>
> AZURE_OPENAI_ENDPOINT=https://*****.openai.azure.com/openai/deployments/gpt-4o-2/chat/completions?api-version=2024-08-01-preview
>
> Maybe it would be useful to run `make coverage` and see what code is or isn't hit here.

AZURE_APIM_SUBSCRIPTION_KEY=****

AZURE_OPENAI_ENDPOINT="https://****-ai.azure-api.net/openai/openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview"

AZURE_OPENAI_ENDPOINT_EMBEDDING="https://****-ai.azure-api.net/openai/openai/deployments/text-embedding-3-small/embeddings?api-version=2025-01-01-preview"

Comment thread: src/typeagent/aitools/utils.py
@KRRT7
Contributor

KRRT7 commented Apr 23, 2026

In my opinion, every single line of code should have as close to 100% test coverage as possible, so maybe make that a requirement. Internally, I started doing this by establishing a baseline and increasing that baseline over time for an incremental migration; this helps catch these sorts of issues more easily.

Collaborator

@bmerkle bmerkle left a comment


LGTM

@bmerkle bmerkle merged commit d1ff811 into microsoft:main Apr 23, 2026
16 checks passed
@gvanrossum
Collaborator

> In my opinion, every single line of code should have as close to 100% test coverage as possible, so maybe make that a requirement. Internally, I started doing this by establishing a baseline and increasing that baseline over time for an incremental migration; this helps catch these sorts of issues more easily.

Nice theory. In practice this would add a huge number of tests, which would require a lot of effort (and LLM tokens) to write. Let's not go overboard here.

@bmerkle
Collaborator

bmerkle commented Apr 23, 2026

> In my opinion, every single line of code should have as close to 100% test coverage as possible, so maybe make that a requirement. Internally, I started doing this by establishing a baseline and increasing that baseline over time for an incremental migration; this helps catch these sorts of issues more easily.

There are different test coverage metrics, and we have to be realistic here... effort vs. ROI:

- High function coverage should be doable.
- High statement coverage is possible but takes some effort.
- High branch coverage is a huge effort and sometimes impossible, e.g. MC/DC... We did this only for safety-related projects, and it is no fun.

I have kicked off a branch with additional tests for function coverage. Let's start with this.
I will file a PR today with the first steps.

@KRRT7
Contributor

KRRT7 commented Apr 23, 2026

Fair point: I'm not suggesting writing tests for every line all at once. The approach I've found works well is to establish the current baseline (whatever `make coverage` reports today), set `fail_under` to that number, and just make sure it doesn't go down. New PRs that touch untested code add a few tests for what they changed; no big-bang effort required.

@bmerkle that sounds like a great starting point — function coverage first is exactly the right priority. Happy to review the PR when it's up.
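As a sketch of that baseline approach, assuming coverage.py configured via pyproject.toml (the threshold number here is illustrative, not the repo's actual figure):

```toml
[tool.coverage.report]
# Freeze today's measured coverage as the floor; raise it as tests are added.
fail_under = 72  # illustrative baseline, not the repo's actual number
show_missing = true
```

With this in place, `make coverage` (or `coverage report`) exits nonzero whenever a PR drops total coverage below the recorded floor.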

KRRT7 added a commit to KRRT7/typeagent-py that referenced this pull request Apr 23, 2026
gvanrossum pushed a commit that referenced this pull request Apr 23, 2026
## Summary
- Adds 3 tests for `resolve_azure_model_name` covering: deployment name
extraction from endpoint, fallback to provided model name, and default
`AZURE_OPENAI_ENDPOINT` envvar usage
- Fixes `test_create_embedding_model_uses_azure_deployment_name` —
wasn't clearing `OPENAI_EMBEDDING_MODEL` from the environment, so the
assertion failed when that envvar was set
- Addresses review comment from #241

## Test plan
- [x] `make check` (pyright) — 0 errors
- [x] `pytest tests/test_utils.py tests/test_model_adapters.py` — 33/33
pass
- [x] `make test` — 496 pass, 3 fail (all transient 429 rate limits,
addressed in #245)
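The envvar-leak fix described in the summary can be sketched like this. The envvar name follows the summary; the resolution helper is a stand-in for illustration, not the repo's actual code:

```python
import os


def resolve_model_name(default: str = "text-embedding-3-small") -> str:
    # Stand-in for the real resolution logic: an ambient envvar wins.
    return os.environ.get("OPENAI_EMBEDDING_MODEL", default)


def test_uses_default_when_envvar_cleared():
    # Simulate a developer machine where the envvar happens to be set.
    os.environ["OPENAI_EMBEDDING_MODEL"] = "leaked-local-value"
    # The fix: clear the envvar before asserting, so the test is
    # deterministic regardless of the local environment.
    os.environ.pop("OPENAI_EMBEDDING_MODEL", None)
    assert resolve_model_name() == "text-embedding-3-small"


test_uses_default_when_envvar_cleared()
```

In a real pytest suite the `monkeypatch` fixture (`monkeypatch.delenv(..., raising=False)`) does the same thing and restores the environment automatically after the test.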
