This solution connects Microsoft Copilot Studio to LLMs running locally on your machine through Azure Relay. Instead of calling cloud-based models, your copilot can route requests to locally hosted LLMs.
The setup uses Azure Relay as a bridge between cloud and local resources:
- RelayClient (Azure Function) - Deployed in Azure, receives requests from Copilot Studio and forwards them through Azure Relay
- LocalRelayServer - Runs locally, listens for requests via Azure Relay and routes them to local LLM clients
- LocalClients - Wrapper library for talking to local LLM inference servers
The solution uses both FoundryLocalClient and LMStudioClient for a specific reason: Foundry Local doesn't currently support OpenAI-style tool/function calling when using its OpenAI-compatible endpoint.
According to this thread, Foundry Local's API is compatible only with "basic chat completion, embedding, and completion endpoints, not function calling." This will likely change as Foundry Local evolves.
For tasks that need tool calling, such as interacting with the local machine or calling the Foundry Local SDK to load models, this solution routes requests to LM Studio, which supports OpenAI's function-calling spec. Simple completions that don't need tools go to Foundry Local.
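To make the split concrete, here is a minimal sketch (not this repo's LMStudioClient) of an OpenAI-style function-calling request posted straight to LM Studio's local endpoint on its default port 1234; the model name, prompt, and the load_foundry_model tool are illustrative placeholders:

```csharp
// Minimal sketch: an OpenAI-style function-calling request to LM Studio's local
// OpenAI-compatible endpoint. Model name, prompt, and tool are placeholders.
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;

var http = new HttpClient { BaseAddress = new Uri("http://localhost:1234/v1/") };

var payload = new
{
    model = "local-model",                      // whatever model LM Studio has loaded
    messages = new object[]
    {
        new { role = "user", content = "Load a model in Foundry Local." }
    },
    tools = new object[]                        // OpenAI function-calling spec
    {
        new
        {
            type = "function",
            function = new
            {
                name = "load_foundry_model",    // hypothetical tool
                description = "Loads a model in Foundry Local via its SDK.",
                parameters = new
                {
                    type = "object",
                    properties = new { modelAlias = new { type = "string" } },
                    required = new[] { "modelAlias" }
                }
            }
        }
    }
};

var response = await http.PostAsJsonAsync("chat/completions", payload);
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
var message = doc.RootElement.GetProperty("choices")[0].GetProperty("message");

// If the model decided to call the tool, the call shows up under message.tool_calls.
if (message.TryGetProperty("tool_calls", out var toolCalls))
    Console.WriteLine(toolCalls);
else
    Console.WriteLine(message.GetProperty("content").GetString());
```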
Core library with OpenAI-compatible clients for local models:
- FoundryLocalClient - Uses Microsoft AI Foundry Local for running models locally (basic completions only)
- LMStudioClient - Connects to LM Studio with full tool/function calling support
- LocalOpenAiClient - Base class for OpenAI-compatible endpoints
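As a rough illustration of what such a client boils down to, the sketch below posts a plain chat completion to an OpenAI-compatible endpoint. The class name, base URL, and model name are assumptions, not the repo's actual LocalOpenAiClient (LM Studio defaults to http://localhost:1234/v1; Foundry Local assigns its own local port):

```csharp
// Rough sketch of an OpenAI-compatible client for local endpoints; the real
// LocalOpenAiClient in this repo may differ.
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;
using System.Threading.Tasks;

public class LocalOpenAiClientSketch
{
    private readonly HttpClient _http;
    private readonly string _model;

    public LocalOpenAiClientSketch(string baseUrl, string model)
    {
        _http = new HttpClient { BaseAddress = new Uri(baseUrl) };
        _model = model;
    }

    // Sends a plain chat completion (no tools) to {baseUrl}chat/completions.
    public async Task<string> CompleteAsync(string prompt)
    {
        var payload = new
        {
            model = _model,
            messages = new[] { new { role = "user", content = prompt } }
        };

        var response = await _http.PostAsJsonAsync("chat/completions", payload);
        response.EnsureSuccessStatusCode();

        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return doc.RootElement
                  .GetProperty("choices")[0]
                  .GetProperty("message")
                  .GetProperty("content")
                  .GetString() ?? string.Empty;
    }
}

// Example usage, assuming LM Studio is listening on its default port:
// var client = new LocalOpenAiClientSketch("http://localhost:1234/v1/", "local-model");
// Console.WriteLine(await client.CompleteAsync("Hello from Copilot Studio"));
```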
Console app that maintains a persistent connection to Azure Relay and routes incoming requests:
- AdminTask requests → LMStudioClient (orchestration/planning with tool calling)
- ChatCompletion requests → FoundryLocalClient (simple completions)
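A simplified version of that routing decision might look like the following; RequestType, ILocalLlmClient, and the member names are hypothetical stand-ins for whatever LocalRelayServer actually uses:

```csharp
// Sketch of the routing rule above; the real LocalRelayServer types may differ.
using System;
using System.Threading.Tasks;

public enum RequestType { AdminTask, ChatCompletion }

public interface ILocalLlmClient
{
    Task<string> CompleteAsync(string prompt);
}

public static class RequestRouter
{
    public static ILocalLlmClient Route(
        RequestType type, ILocalLlmClient lmStudio, ILocalLlmClient foundryLocal) =>
        type switch
        {
            RequestType.AdminTask      => lmStudio,       // tool/function calling required
            RequestType.ChatCompletion => foundryLocal,   // plain completion is enough
            _ => throw new ArgumentOutOfRangeException(nameof(type))
        };
}
```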
Azure Function that acts as the cloud-side relay:
- Accepts HTTP requests from Copilot Studio
- Forwards them through Azure Relay to your local machine
- Returns the LLM response back to Copilot Studio
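For orientation, here is a condensed sketch of what such a function can look like, assuming the Azure Functions isolated worker model and the Microsoft.Azure.Relay package; the function name, config keys, and missing error handling are simplifications, not the repo's actual code:

```csharp
// Condensed sketch of the cloud-side forwarder; not the repo's actual RelayClient.
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.Azure.Relay;

public class RelayForwarder
{
    private static readonly HttpClient Http = new();

    [Function("Chat")]
    public async Task<HttpResponseData> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
    {
        var ns      = Environment.GetEnvironmentVariable("AzureRelay:RelayNamespace");
        var hc      = Environment.GetEnvironmentVariable("AzureRelay:ConnectionName");
        var keyName = Environment.GetEnvironmentVariable("AzureRelay:KeyName");
        var key     = Environment.GetEnvironmentVariable("AzureRelay:Key");

        // Hybrid Connections accept plain HTTPS requests at https://{namespace}/{connection}
        // when a SAS token is passed in the ServiceBusAuthorization header.
        var uri = $"https://{ns}/{hc}";
        var tokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider(keyName, key);
        var token = (await tokenProvider.GetTokenAsync(uri, TimeSpan.FromMinutes(30))).TokenString;

        var forward = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(await new StreamReader(req.Body).ReadToEndAsync())
        };
        forward.Headers.Add("ServiceBusAuthorization", token);

        var relayResponse = await Http.SendAsync(forward);

        var response = req.CreateResponse(HttpStatusCode.OK);
        await response.WriteStringAsync(await relayResponse.Content.ReadAsStringAsync());
        return response;
    }
}
```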
- Azure subscription (for Azure Relay and Azure Functions)
- LM Studio or compatible OpenAI server running locally
- .NET 9.0
Create an Azure Relay namespace and configure a Hybrid Connection, as described here. Make a note of the relay namespace, the hybrid connection name, and the SAS key name and key; these values go into the configuration files below.
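If you prefer the Azure CLI over the portal, commands along these lines create the namespace and Hybrid Connection and print the SAS keys (resource group, names, and location are placeholders):

```bash
# Placeholders: adjust the resource group, namespace, connection name, and location.
az relay namespace create --resource-group my-rg --name my-relay-ns --location westeurope
az relay hyco create --resource-group my-rg --namespace-name my-relay-ns --name my-connection

# RootManageSharedAccessKey exists by default at the namespace level; list its keys:
az relay namespace authorization-rule keys list \
  --resource-group my-rg --namespace-name my-relay-ns --name RootManageSharedAccessKey
```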
Create these config files (they're gitignored):
LocalRelayServer/appsettings.Development.json:
```json
{
  "AzureRelay": {
    "RelayNamespace": "your-namespace.servicebus.windows.net",
    "ConnectionName": "your-connection-name",
    "KeyName": "RootManageSharedAccessKey",
    "Key": "your-key-here"
  }
}
```

RelayClient/local.settings.json:
```json
{
  "Values": {
    "AzureRelay:RelayNamespace": "your-namespace.servicebus.windows.net",
    "AzureRelay:ConnectionName": "your-connection-name",
    "AzureRelay:KeyName": "RootManageSharedAccessKey",
    "AzureRelay:Key": "your-key-here"
  }
}
```

- Start your local LLM server (LM Studio on port 1234)
- Run LocalRelayServer: `dotnet run --project LocalRelayServer`
- Deploy RelayClient to Azure Functions
- Configure Copilot Studio to call your Azure Function endpoint
Copilot Studio can't reach your local machine directly, but Azure Relay creates a secure tunnel. Your local server maintains an outbound connection to Azure, so no inbound firewall rules are needed. When Copilot Studio needs a local model, the request flows through Azure Relay to your machine and back.
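Conceptually, the local side looks something like the sketch below (using the Microsoft.Azure.Relay package). The real LocalRelayServer adds the routing to the LLM clients where the comment indicates, and the hard-coded values stand in for the AzureRelay settings shown earlier:

```csharp
// Sketch of the listener side (what LocalRelayServer does conceptually).
// The listener dials out to Azure, so only outbound traffic leaves your machine.
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Relay;

class RelayListenerSketch
{
    static async Task Main()
    {
        var ns      = "your-namespace.servicebus.windows.net";
        var hc      = "your-connection-name";
        var keyName = "RootManageSharedAccessKey";
        var key     = "your-key-here";

        var tokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider(keyName, key);
        var listener = new HybridConnectionListener(new Uri($"sb://{ns}/{hc}"), tokenProvider);

        // Invoked for each HTTP request the cloud-side RelayClient forwards over the relay.
        listener.RequestHandler = async context =>
        {
            var body = await new StreamReader(context.Request.InputStream).ReadToEndAsync();
            Console.WriteLine($"Received request: {body}");

            // ...route `body` to FoundryLocalClient or LMStudioClient here...

            using (var writer = new StreamWriter(context.Response.OutputStream))
            {
                await writer.WriteAsync("{\"reply\":\"handled locally\"}");
            }
            context.Response.Close();
        };

        await listener.OpenAsync();   // outbound connection only; no inbound ports opened
        Console.WriteLine("Listening on Azure Relay. Press Enter to exit.");
        Console.ReadLine();
        await listener.CloseAsync();
    }
}
```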