An end-to-end AI-powered FAQ assistant that uses vector semantic search in Azure SQL Hyperscale, Retrieval-Augmented Generation (RAG) with GPT-4o, and Model Context Protocol (MCP) to ground AI responses in approved support content — with no hallucinations.
Built as part of the Microsoft Build 2026 Digital Lab
User Question
↓
Azure SQL Hyperscale ←─── 1536-dim embeddings (Azure OpenAI)
VECTOR_DISTANCE()
↓
Top 3 FAQ matches (semantic similarity)
↓
Grounded Prompt Assembly (T-SQL + STRING_AGG)
↓
GPT-4o via sp_invoke_external_rest_endpoint
↓
Grounded AI Answer (no hallucinations)
Agent flow (Exercise 4):
User → Foundry Agent → MCP tool call → Local Python server → Azure SQL → Grounded response
Analytics flow (Exercise 5):
Azure SQL → Fabric Mirroring (no ETL) → OneLake → SQL Analytics Endpoint → Power BI
- Vector semantic search — FAQ questions stored as 1536-dimension embeddings; cosine similarity via
VECTOR_DISTANCEin Azure SQL - RAG pipeline in pure T-SQL — retrieval, prompt assembly, and GPT-4o call all orchestrated inside Azure SQL using
sp_invoke_external_rest_endpoint - Custom MCP server — Python FastMCP server exposes a
search_faqtool to any MCP-compatible client (Foundry Agents, VS Code Copilot) - Foundry Agent orchestration — Microsoft Foundry Agent calls the MCP tool over dev tunnel; grounds every response in retrieved FAQ content
- Data API Builder (DAB) — Microsoft-native MCP layer exposing Azure SQL as a read-only tool without writing custom API code
- Fabric Mirroring —
FAQ_Contentauto-synced into OneLake in near real-time; Power BI report built directly on mirrored data - Hallucination prevention — agent and SQL pipeline both instructed to respond "I do not know" when no relevant FAQ content is found
├── sql-scripts/
│ └── rag_pipeline.sql # Full RAG pipeline: vector retrieval + GPT-4o call
├── mcp-server/
│ ├── server.py # FastMCP server exposing search_faq tool
│ ├── invoke_mcp.py # CLI test client for the MCP server
│ ├── run_query.py # Direct SQL query runner (no MCP)
│ ├── diag_pyodbc.py # Connection string diagnostic utility
│ ├── requirements.txt
│ └── .env.example # Environment variable template
├── dab-config/
│ └── dab-config.json # Data API Builder MCP configuration
└── README.md
- Python 3.10+
- ODBC Driver 17 for SQL Server
- Azure SQL Database with
dbo.FAQ_Content,dbo.FAQ_Embeddings, anddbo.SearchFAQstored procedure - Azure OpenAI deployment (GPT-4o)
-
Create a virtual environment and install dependencies:
python -m venv .venv .venv\Scripts\Activate.ps1 # Windows source .venv/bin/activate # macOS/Linux pip install -r mcp-server/requirements.txt
-
Copy
.env.exampleto.envand fill in your connection string:AZURE_SQL_CONN_STR=Driver={ODBC Driver 17 for SQL Server};Server=<server>.database.windows.net;Database=<db>;Uid=<user>;Pwd=<password>;Encrypt=yes -
Start the server:
python mcp-server/server.py
Server runs at
http://0.0.0.0:8000/mcp -
Test it:
python mcp-server/invoke_mcp.py "how do I track my order?"
Open sql-scripts/rag_pipeline.sql in VS Code with the MSSQL extension. Replace the placeholder API key and endpoint URL with your Azure OpenAI values, then run the script.
-
Install DAB (.NET SDK required):
dotnet new tool-manifest dotnet tool install microsoft.dataapibuilder
-
Update the connection string placeholder in
dab-config/dab-config.json. -
Start the MCP server:
dotnet tool run dab start --mcp-stdio role:anonymous --config dab-config/dab-config.json
| Component | Role |
|---|---|
| Azure SQL Hyperscale | Vector storage, RAG retrieval, REST API calls |
| Azure OpenAI (GPT-4o) | Embedding generation + answer generation |
| Python FastMCP | Custom MCP server (search_faq tool) |
| Microsoft Foundry Agents | Agent orchestration layer |
| Data API Builder | Microsoft-native MCP layer for Azure SQL |
| Microsoft Fabric + OneLake | Near real-time analytics mirroring |
| Power BI | FAQ analytics and reporting |
| dev tunnel | Expose local MCP server to cloud Foundry |
- How vector embeddings enable semantic search that goes beyond keyword matching
- How to implement a complete RAG pipeline entirely in T-SQL — no Python required for retrieval
- How Model Context Protocol (MCP) standardizes tool interfaces for AI agents
- How Microsoft Fabric Mirroring eliminates ETL pipelines for analytics on operational data
- The difference between a grounded AI response and a hallucinated one — and how to enforce grounding
Built at Microsoft Build 2026 — LAB513D