CUA Sample Agent #261

Open
MohamedAbdekader wants to merge 9 commits into microsoft:main from MohamedAbdekader:mabdelkader/cuaAgentSample
@MohamedAbdekader

No description provided.

MohamedAbdekader requested a review from a team as a code owner on March 30, 2026 17:20
Copilot AI review requested due to automatic review settings March 30, 2026 17:20

Copilot AI left a comment

Pull request overview

Adds a new .NET 8 “W365 Computer Use” sample agent that connects to the W365 Computer Use MCP server and drives a computer-use (CUA) loop via the OpenAI Responses API, with accompanying configuration, telemetry, and documentation.

Changes:

  • Introduces a new sample agent project/solution for Windows 365 computer-use via MCP + OpenAI Responses API.
  • Implements model providers (Azure OpenAI + custom endpoint), the computer-use orchestrator, and an Agent Framework-based bot.
  • Adds OpenTelemetry/observability helpers, local dev configuration assets, and a full README with setup/run instructions.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| dotnet/w365-computer-use/sample-agent/appsettings.json | Adds sample configuration for auth, connections, AI providers, and computer-use settings |
| dotnet/w365-computer-use/sample-agent/W365ComputerUseSample.csproj | New .NET 8 web project with A365/Agent Framework + OTEL dependencies |
| dotnet/w365-computer-use/sample-agent/ToolingManifest.json | Declares MCP server dependency for tooling discovery |
| dotnet/w365-computer-use/sample-agent/Telemetry/AgentMetrics.cs | Adds custom ActivitySource + metrics helpers for request/turn instrumentation |
| dotnet/w365-computer-use/sample-agent/Telemetry/A365OtelWrapper.cs | Wraps agent operations to attach baggage and register observability token cache |
| dotnet/w365-computer-use/sample-agent/ServiceExtensions.cs | Adds OpenTelemetry wiring (traces + metrics) to the web host |
| dotnet/w365-computer-use/sample-agent/README.md | Documents sample purpose, architecture, setup, configuration, and troubleshooting |
| dotnet/w365-computer-use/sample-agent/Properties/launchSettings.json | Adds local debug profile and URL binding |
| dotnet/w365-computer-use/sample-agent/Program.cs | App composition: DI registrations, routing, auth, endpoints, shutdown hook |
| dotnet/w365-computer-use/sample-agent/ComputerUse/Models/ComputerUseModels.cs | Adds JSON request/response models and tool definitions for Responses API |
| dotnet/w365-computer-use/sample-agent/ComputerUse/ICuaModelProvider.cs | Defines abstraction for calling a CUA-capable model endpoint |
| dotnet/w365-computer-use/sample-agent/ComputerUse/CustomEndpointProvider.cs | Implements certificate/MSAL-secured custom endpoint model provider |
| dotnet/w365-computer-use/sample-agent/ComputerUse/ComputerUseOrchestrator.cs | Core CUA loop translating model actions into W365 MCP tool calls |
| dotnet/w365-computer-use/sample-agent/ComputerUse/AzureOpenAIModelProvider.cs | Implements Azure OpenAI Responses API provider using API key |
| dotnet/w365-computer-use/sample-agent/AspNetExtensions.cs | Adds configurable JWT token validation wiring for ASP.NET |
| dotnet/w365-computer-use/sample-agent/Agent/MyAgent.cs | Agent entrypoint: auth selection, tool loading, streaming updates, orchestration |
| dotnet/w365-computer-use/sample-agent/.gitignore | Ignores dev settings and screenshots output |
| dotnet/w365-computer-use/W365ComputerUseSample.sln | New solution to open/build the sample project |

if (_cachedTools != null)
    return (_cachedTools, _cachedMcpClient);

var httpClient = _httpClient;

Copilot AI Mar 30, 2026

Setting _httpClient.DefaultRequestHeaders.Authorization on a long-lived HttpClient can leak/overwrite bearer tokens between users and affect unrelated requests from this orchestrator. Prefer per-request authorization headers (or a dedicated client per cached connection) rather than mutating DefaultRequestHeaders.

Suggested change
- var httpClient = _httpClient;
+ // Use a dedicated HttpClient instance for the MCP connection to avoid
+ // mutating authorization headers on a shared, long-lived HttpClient.
+ var httpClient = new HttpClient();
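
For the per-request alternative mentioned in this comment, a minimal sketch might look like the following. `McpRequestFactory` and `CreateAuthorized` are hypothetical names that do not appear in the PR, and this only applies where the calling code builds its own `HttpRequestMessage`; if the MCP client transport owns the requests, a dedicated client per connection may be the only option.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper (not part of the PR): attaches the bearer token to a
// single request instead of mutating HttpClient.DefaultRequestHeaders.
public static class McpRequestFactory
{
    public static HttpRequestMessage CreateAuthorized(HttpMethod method, Uri uri, string accessToken)
    {
        var request = new HttpRequestMessage(method, uri);
        // The header lives on this message only, so a shared, long-lived
        // HttpClient never carries one caller's token into another caller's request.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        return request;
    }
}
```

A caller would then send it with something like `await httpClient.SendAsync(McpRequestFactory.CreateAuthorized(HttpMethod.Post, uri, token), cancellationToken);`.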

Copilot uses AI. Check for mistakes.
}

/// <summary>
/// Run the CUA loop. Session must already be started by the caller.

Copilot AI Mar 30, 2026

XML doc says RunAsync “Session must already be started by the caller”, but this method starts the session itself when _sessionStarted is false. Update the comment to match behavior (or enforce the contract by removing the internal start).

Suggested change
- /// Run the CUA loop. Session must already be started by the caller.
+ /// Run the CUA loop. Starts a W365 session if one is not already active and
+ /// reuses the same session across calls for this application instance.

Comment on lines +560 to +564
var driveBase = string.IsNullOrEmpty(_oneDriveUserId)
    ? "https://graph.microsoft.com/v1.0/me/drive"
    : $"https://graph.microsoft.com/v1.0/users/{_oneDriveUserId}/drive";
var url = $"{driveBase}/root:/{_oneDriveFolder.TrimStart('/')}/{fileName}:/content";


Copilot AI Mar 30, 2026

The OneDrive upload doc comment says files go to /CUA-Sessions/{date}/, but the code builds a URL without any date-based subfolder. Either implement the dated folder structure or adjust the comment to match actual behavior.
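
If the dated folder structure is the intended behavior, one way to build the upload URL is sketched below. `OneDriveUploadUrl.Build` is a hypothetical helper, and the `yyyy-MM-dd` folder name is an assumption, since the doc comment only says `{date}`. The sketch also escapes the file name, which the quoted code does not do.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the PR): builds a Microsoft Graph upload URL
// of the form {drive}/root:/{folder}/{date}/{fileName}:/content.
public static class OneDriveUploadUrl
{
    public static string Build(string? userId, string folder, string fileName, DateTimeOffset timestamp)
    {
        var driveBase = string.IsNullOrEmpty(userId)
            ? "https://graph.microsoft.com/v1.0/me/drive"
            : $"https://graph.microsoft.com/v1.0/users/{Uri.EscapeDataString(userId)}/drive";

        // Assumed date format; UTC keeps the folder name stable across time zones.
        var dateFolder = timestamp.UtcDateTime.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

        var path = $"{folder.Trim('/')}/{dateFolder}/{Uri.EscapeDataString(fileName)}";
        return $"{driveBase}/root:/{path}:/content";
    }
}
```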

request.Content = new ByteArrayContent(Convert.FromBase64String(base64Data));
request.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("image/png");

var response = await _httpClient.SendAsync(request);

Copilot AI Mar 30, 2026

_httpClient.SendAsync(request) isn’t passed a CancellationToken, so cancelled requests/shutdown may hang until the HTTP call completes. Thread the token through and pass it into SendAsync here.

Suggested change
- var response = await _httpClient.SendAsync(request);
+ var response = await _httpClient.SendAsync(request, System.Threading.CancellationToken.None);
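
Note that `CancellationToken.None` selects the token-accepting overload but can never be cancelled, so on its own it does not address the hang described above. A sketch of threading a caller-supplied token through is shown here; `ScreenshotUploader` and `UploadAsync` are hypothetical names, and the real method signature in the PR may differ.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper (not part of the PR) showing the token being passed
// from the public method down into HttpClient.SendAsync.
public sealed class ScreenshotUploader
{
    private readonly HttpClient _httpClient;

    public ScreenshotUploader(HttpClient httpClient) => _httpClient = httpClient;

    public async Task<HttpStatusCode> UploadAsync(Uri url, byte[] png, CancellationToken cancellationToken)
    {
        using var request = new HttpRequestMessage(HttpMethod.Put, url)
        {
            Content = new ByteArrayContent(png)
        };
        request.Content.Headers.ContentType = new MediaTypeHeaderValue("image/png");

        // Cancelling the caller's token (for example on shutdown) aborts the upload.
        using var response = await _httpClient.SendAsync(request, cancellationToken);
        return response.StatusCode;
    }
}
```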

Comment on lines +71 to +73
agentId = agentId ?? Guid.Empty.ToString();
string? tempTenantId = turnContext?.Activity?.Conversation?.TenantId ?? turnContext?.Activity?.Recipient?.TenantId;
string tenantId = tempTenantId ?? Guid.Empty.ToString();

Copilot AI Mar 30, 2026

agentId = agentId ?? Guid.Empty.ToString(); won’t replace an empty string, so agentId can remain "" and be propagated into baggage/observability. Use string.IsNullOrEmpty(agentId) (or make it nullable) and fall back to a stable placeholder when it’s missing.
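
A small sketch of the fallback described here, using a hypothetical `BaggageDefaults.OrEmptyGuid` helper. It uses `string.IsNullOrWhiteSpace`, which is slightly stricter than the `IsNullOrEmpty` check suggested above, on the assumption that a whitespace-only ID is equally unusable.

```csharp
using System;

// Hypothetical helper (not part of the PR): replaces null, empty, or
// whitespace-only IDs with the all-zero GUID placeholder.
public static class BaggageDefaults
{
    public static string OrEmptyGuid(string? value) =>
        string.IsNullOrWhiteSpace(value) ? Guid.Empty.ToString() : value;
}
```

With this in place, `agentId = BaggageDefaults.OrEmptyGuid(agentId);` would cover both the null and the empty-string cases.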

Comment on lines +108 to +112
finally
{
    stopwatch.Stop();
    FinalizeMessageHandlingActivity(activity, context, stopwatch.ElapsedMilliseconds, true);
}

Copilot AI Mar 30, 2026

FinalizeMessageHandlingActivity(..., success: true) is always called with true, even after an exception path. This can overwrite the Activity status to OK and misreport duration metrics. Track a success flag based on whether func() completed without throwing and pass that value.
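
The success-flag pattern could be sketched roughly as follows. `MessageInstrumentation.InvokeAsync` is a hypothetical stand-in for the wrapper in the PR, and the `finalize` callback stands in for `FinalizeMessageHandlingActivity`, whose exact signature is simplified here.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical wrapper (not part of the PR): reports success only when
// func() completed without throwing.
public static class MessageInstrumentation
{
    public static async Task InvokeAsync(Func<Task> func, Action<long, bool> finalize)
    {
        var stopwatch = Stopwatch.StartNew();
        var success = false;
        try
        {
            await func();
            success = true; // Reached only if func() did not throw.
        }
        finally
        {
            stopwatch.Stop();
            // The exception, if any, still propagates after this call.
            finalize(stopwatch.ElapsedMilliseconds, success);
        }
    }
}
```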

Comment on lines +263 to +265
if (_cachedTools != null)
    return (_cachedTools, _cachedMcpClient);


Copilot AI Mar 30, 2026

This global cache returns the same _cachedTools/_cachedMcpClient for all callers. If multiple users/tenants hit the agent, sessions and tool state can be unintentionally shared. Cache per conversation/agent identity instead of a single global instance.
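
One possible shape for a per-conversation cache is sketched below. `PerConversationCache<T>` is a hypothetical type; in the PR, `T` would presumably be the tools/MCP client pair. The `Lazy<Task<T>>` wrapper is there so the factory runs at most once per conversation even under concurrent calls. One caveat of this sketch is that a failed factory task stays cached until `Remove` is called.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical cache (not part of the PR): one lazily created entry per
// conversation ID instead of a single global instance.
public sealed class PerConversationCache<T>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<T>>> _entries = new();

    public Task<T> GetOrCreateAsync(string conversationId, Func<string, Task<T>> factory) =>
        // Lazy defaults to ExecutionAndPublication, so only the Lazy instance
        // that wins GetOrAdd ever invokes the factory.
        _entries.GetOrAdd(conversationId, id => new Lazy<Task<T>>(() => factory(id))).Value;

    // Call when a conversation ends (or its entry faulted) to release the slot.
    public bool Remove(string conversationId) => _entries.TryRemove(conversationId, out _);
}
```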

Comment on lines +3 to +7
## Overview

This sample demonstrates how to build an agent that controls a Windows 365 Cloud PC using the OpenAI Responses API and the W365 Computer Use MCP server.

The agent receives a natural language task from the user, provisions a W365 desktop session via MCP tools, then runs a CUA (Computer Use Agent) loop: the model sees screenshots, decides actions (click, type, scroll), and the MCP server executes them on the VM.

Copilot AI Mar 30, 2026

The README doesn’t include an explicit “Demonstrates” section (used by other Agent 365 samples to quickly summarize what the sample teaches). Add a short “Demonstrates” section near the top so the learning goals are scannable.

Comment on lines +4 to +6
using Microsoft.Agents.A365.Observability;
using Microsoft.Agents.A365.Observability.Extensions.AgentFramework;
using Microsoft.Agents.A365.Observability.Runtime;

Copilot AI Mar 30, 2026

These using directives appear unused in this file and will generate build warnings. Remove the unused Microsoft.Agents.A365.Observability* usings (or start using the referenced APIs) to keep the sample warning-free.

Suggested change
- using Microsoft.Agents.A365.Observability;
- using Microsoft.Agents.A365.Observability.Extensions.AgentFramework;
- using Microsoft.Agents.A365.Observability.Runtime;

Comment on lines +55 to +58

// Register the Computer Use orchestrator
builder.Services.AddSingleton<ComputerUseOrchestrator>();


Copilot AI Mar 30, 2026

ComputerUseOrchestrator is registered as a singleton but it holds mutable per-conversation state (_conversationHistory, _sessionStarted, cached MCP client/tools, screenshot counter). This can cause cross-user data leakage and races if multiple conversations/messages are processed concurrently. Consider making it scoped/per-conversation (or keying state by conversation/user id with locking).
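
If the orchestrator stays a singleton, the "keying state by conversation with locking" option could be approached with a per-conversation gate like the sketch below. `ConversationGate` is a hypothetical type, and this covers only the locking half: the mutable fields themselves would still need to move into a per-conversation object.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gate (not part of the PR): serializes work within one
// conversation while letting different conversations run in parallel.
public sealed class ConversationGate
{
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();

    public async Task<T> RunExclusiveAsync<T>(
        string conversationId,
        Func<Task<T>> work,
        CancellationToken cancellationToken = default)
    {
        // GetOrAdd may construct a spare semaphore under contention; only the
        // stored one is ever returned, so mutual exclusion still holds.
        var gate = _locks.GetOrAdd(conversationId, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync(cancellationToken);
        try
        {
            return await work();
        }
        finally
        {
            gate.Release();
        }
    }
}
```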

Mohamed Abdelkader and others added 6 commits April 2, 2026 14:55
- Add ConversationSession class to track per-conversation W365 sessions
- Refactor ComputerUseOrchestrator to use ConcurrentDictionary keyed by conversationId
- Parse sessionId from QuickStartSession response and pass to all MCP tool calls
- Pass conversationId from turnContext to orchestrator
- Add deployment artifacts to .gitignore (a365 configs, app.zip, publish/)
…loop

Instead of sending the full conversation history (including all base64
screenshots) on every model call, use the OpenAI Responses API's
previous_response_id to let the server reconstruct prior context.
Only new items (computer_call_output, function_call_output) are sent
per iteration, reducing API payload by ~15x.

Between user messages, computer actions and screenshots are pruned
from history while text context is preserved for conversational
continuity.
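
As an illustration of the follow-up request this commit describes, the payload might be assembled roughly like this. `ResponsesPayload.BuildFollowUp` is a hypothetical helper rather than code from the PR. The field names follow the OpenAI Responses API as I understand it (`previous_response_id`, a `computer_call_output` input item with a `computer_screenshot` output), and a real request would likely also need to repeat the `tools` and `truncation` settings, which are omitted here.

```csharp
using System.Text.Json;

// Hypothetical helper (not part of the PR): builds the JSON body for one
// loop iteration, sending only the new screenshot plus a reference to the
// previous response instead of the full history.
public static class ResponsesPayload
{
    public static string BuildFollowUp(string model, string previousResponseId, string callId, string screenshotBase64)
    {
        var payload = new
        {
            model,
            previous_response_id = previousResponseId,
            input = new object[]
            {
                new
                {
                    type = "computer_call_output",
                    call_id = callId,
                    output = new
                    {
                        type = "computer_screenshot",
                        image_url = $"data:image/png;base64,{screenshotBase64}"
                    }
                }
            }
        };
        return JsonSerializer.Serialize(payload);
    }
}
```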

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: use previous_response_id to avoid resending screenshots in CUA …
3 participants