TL;DR
- Default Azure logging = one big bucket (hard to split by team/project).
- AI Gateway adds model-level logging + per-subscription keys for clean attribution.
- You also capture prompts, completions, tokens, latency, and errors in a consistent schema.
- Plus a ready-to-use analytics workbook in the portal, no custom scripts required.
Before This Update: Default Logging Options
Before AI Gateway introduced model-level logging, you already had some tools in Azure to track usage and spend.
- In Cost Management + Billing, you could break down costs by subscription, resource group, individual resource, or by tags if you’d applied them. You’d also see spend grouped by service name (for example, “Azure OpenAI Service”), which gave you a category-level view.
- You could also turn on diagnostic settings for your Azure OpenAI resource. That pushed logs to a Log Analytics workspace (or another Azure Monitor destination) and captured details like prompt tokens, completion tokens, latency, and, if you opted in, even the full request and response bodies. A quick query sketch follows this list.
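For instance, once diagnostic settings were on, a query like this in Log Analytics would summarize Azure OpenAI traffic. This is a minimal sketch assuming the legacy AzureDiagnostics destination table; if you chose resource-specific tables, the table and column names will differ:

```kusto
// Azure OpenAI diagnostic logs, summarized per operation and resource.
// Assumes the AzureDiagnostics destination table (an assumption;
// resource-specific destination tables use different names).
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where TimeGenerated > ago(7d)
| summarize Requests = count(), AvgDurationMs = avg(DurationMs)
    by OperationName, _ResourceId
| order by Requests desc
```

Notice that the finest-grained dimension here is the resource itself (_ResourceId), which is exactly the limitation described next.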
So yes—you could see usage and request details.
But here’s the limitation: all of those logs were tied to the resource as a whole. If multiple teams or apps shared the same Azure OpenAI instance, the logs blended together. You ended up with one big blob of data that was hard to split cleanly by team, project, or model endpoint.
The Problem: One Big Blob of Logs
Now imagine three teams all working under the same subscription:
- Team A building a customer support chatbot
- Team B experimenting with document summarization
- Team C running analytics on external data
They’re all calling the same pool of model endpoints. With default logging, you might see total token usage, latency, or even the raw prompts and completions, but it’s all tied to the resource as a whole. You can’t answer questions like: Which team burned through millions of tokens last week? Which model has the worst latency or error rate? Which app sent the unsafe prompt that needs investigation?
What you really need is a way to split logs and usage by the slices that matter to you — whether that’s by team, by project, or by individual application. And that’s exactly what AI Gateway in API Management makes possible.
The Fix: Model-Level Logging with AI Gateway
That’s where the new AI Gateway capabilities in Azure API Management come in. With this update, you finally get fine-grained visibility and control.
- Model-level logging → Instead of all logs tied to a resource, you can log activity at the individual model endpoint level.
- Subscription keys per team or project → In API Management, you can create subscriptions and hand out unique keys to each team, app, or project. Every request is tagged with its key, so when logs land, attribution is automatic.
- Consistent schema → Logs capture prompts, completions, token counts, latency, and errors in a structured way.
- Per-team or per-model policies → Apply built-in LLM policies such as token limits and semantic caching, scoped to the slice you care about.
- Out-of-the-box dashboards → The Language models analytics workbook in the portal lights up automatically once diagnostics are enabled, showing usage split by subscription key, model, and API version. And if you want something more tailored, you can always build custom Log Analytics workbooks, using KQL to slice the same AI Gateway logs however your reporting needs demand (see the sketch below).
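To give a flavor of what custom KQL can do, here's a minimal sketch that joins the gateway's LLM logs with its request logs to split token usage by subscription key and model, assuming the diagnostic settings described later are enabled. The table and column names (ApiManagementGatewayLlmLog, ApimSubscriptionId, DeploymentName, the token columns) are taken from the current resource-specific schemas; verify them in your own workspace before relying on the query:

```kusto
// Token usage per subscription key and model deployment, last 7 days.
// Column names are assumptions based on the resource-specific tables;
// check your workspace's schema browser if the query comes back empty.
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(7d)
| join kind=inner (
    ApiManagementGatewayLogs
    | project CorrelationId, ApimSubscriptionId
  ) on CorrelationId
| summarize PromptTokens = sum(PromptTokens),
            CompletionTokens = sum(CompletionTokens),
            TotalTokens = sum(TotalTokens)
    by ApimSubscriptionId, DeploymentName
| order by TotalTokens desc
```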
How to Set It Up (Out-of-the-Box)
You don’t need custom scripts to get started—everything’s built right into Azure API Management. Here’s the flow:
1. Import your model API into API Management
- In the Azure portal, go to your API Management instance.
- Use Import from Azure AI Foundry or Create from OpenAI spec to bring in your model endpoint.
- ⚡ Important: Under Manage Token Consumption:
  - Select an Application Insights instance. This is required to collect token usage metrics (queried in the sketch after this list).
  - Choose how you want tokens to be grouped in your reports. In our earlier example, that was by subscription ID (so each team or project gets its own bucket), but you can also group by API ID, client IP address, user ID, and other dimensions, depending on how you want attribution to work.
  - In the same tab, you can also set a token limit policy per subscription, so one team can’t silently burn through everyone else’s quota.
- Once configured, your model is now an API that can be governed, logged, and monitored like any other in API Management.
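Once token consumption is wired up, the gateway emits token metrics into Application Insights that you can query directly. Here's a minimal sketch against the classic customMetrics table; the metric names and the "Subscription ID" dimension are assumptions based on the token-metric policy defaults, so swap in whatever grouping dimension you chose above:

```kusto
// Run in the Application Insights Logs blade.
// Metric and dimension names are assumptions; adjust to your configuration.
customMetrics
| where name in ("Prompt Tokens", "Completion Tokens", "Total Tokens")
| extend SubscriptionId = tostring(customDimensions["Subscription ID"])
| summarize Tokens = sum(valueSum) by name, SubscriptionId
| order by Tokens desc
```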
2. Create subscription keys for each team or project
This is where you fix the “one big blob of logs” problem. In API Management, you can issue unique subscription keys that represent whatever slice of usage you need to track: per team, per project, or even per application.
- Head to Subscriptions in your API Management instance.
- Create a new subscription for each slice you want to track (team, app, project).
- Each subscription generates its own primary and secondary keys. Hand these keys to the right group. From then on, every request carries a key tied to that slice, and when logs land in the analytics workbook you’ll see clean breakdowns by subscription key (see the query sketch below).
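With per-slice keys flowing, attribution in Log Analytics becomes a short query. A minimal sketch, assuming the ApiManagementGatewayLogs table that the diagnostic settings in the next step will populate (ResponseCode and TotalTime are taken from that table's schema):

```kusto
// Requests, errors, and latency per subscription key, last 24 hours.
ApiManagementGatewayLogs
| where TimeGenerated > ago(1d)
| summarize Requests = count(),
            Errors = countif(ResponseCode >= 400),
            P95Latency = percentile(TotalTime, 95)
    by ApimSubscriptionId
| order by Requests desc
```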
3. Enable diagnostic settings for AI Gateway logs
- Go to your API Management instance → Monitoring → Diagnostic settings.
- Add a setting, and make sure you check:
  - Logs related to API Management gateway
  - Logs related to generative AI gateway
- Send them to a Log Analytics workspace (so they’ll power dashboards and be queryable in KQL).
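After a few minutes and a couple of test requests, you can sanity-check that both log categories are arriving. The isfuzzy flag keeps the query from failing if one of the resource-specific tables hasn't been created yet:

```kusto
// Confirm both gateway and generative AI gateway logs are flowing.
union isfuzzy=true ApiManagementGatewayLogs, ApiManagementGatewayLlmLog
| where TimeGenerated > ago(1h)
| summarize Records = count() by Type
```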
4. Turn on request and response logging (per API)
- Go to APIs → [Your Model API] → Settings → Azure Monitor.
- Enable Log LLM messages and configure how much of the prompt/completion to log.
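To confirm message capture is working (and respecting the size limits you configured), here's a quick check; RequestMessages and ResponseMessages are the message-content columns per the current table schema, but verify them in your workspace:

```kusto
// How many recent LLM calls actually captured prompt and completion text.
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(1h)
| summarize TotalRecords = count(),
            WithPrompt = countif(isnotempty(RequestMessages)),
            WithCompletion = countif(isnotempty(ResponseMessages))
```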
5. Use the built-in Analytics workbook (or create your own)
- In the portal, go to Monitoring → Analytics → Language models.
- The default workbook will show token usage breakdown by model, API version, and subscription. And of course, you can always build custom workbooks if you want to go deeper or create tailored views for your teams.
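If you do go custom, the same tables power your workbook. For example, here's a per-model daily token trend you could drop into a workbook query step (again assuming the DeploymentName and TotalTokens columns):

```kusto
// Daily token totals per model deployment, rendered as a time chart.
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(30d)
| summarize TotalTokens = sum(TotalTokens)
    by DeploymentName, bin(TimeGenerated, 1d)
| render timechart
```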
With just these steps, you’ve gone from coarse resource-level metrics to detailed, team-aware diagnostics, with a dashboard ready on day one.
6. Drill into prompts and completions with Logs
- In your API Management instance, go to Monitoring → Logs. This opens the Log Analytics query editor for the workspace you selected when enabling diagnostics.
- Query the ApiManagementGatewayLlmLog table to explore the actual request and response content. Aggregated correctly, it gives you side-by-side prompt and response text for each correlation ID (see the sketch below).
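Long prompts and completions are split across multiple log records, so you stitch them back together per correlation ID. A minimal sketch modeled on the pattern in the Learn article, with SequenceNumber assumed as the ordering column:

```kusto
// Reassemble multi-record prompts and completions per request.
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(1d)
| order by CorrelationId asc, SequenceNumber asc
| summarize Prompt = strcat_array(make_list(RequestMessages), ""),
            Completion = strcat_array(make_list(ResponseMessages), "")
    by CorrelationId
```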
Wrapping Up
AI Gateway in API Management just leveled up. With model-level logging, out-of-the-box dashboards, and support across Azure OpenAI, Foundry, and OpenAI-compatible APIs, you finally get both diagnostic depth and team-level clarity.
✅ Bottom line: if your AI logs still feel like one big messy blob, it’s time to switch on model logging in AI Gateway.
References:
- AI gateway capabilities in Azure API Management | Microsoft Learn
- Set up logging for LLM APIs in Azure API Management | Microsoft Learn
- Import an Azure AI Foundry API | Microsoft Learn
- Monitor Azure OpenAI | Microsoft Learn: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/monitor-openai
- Quickstart: Start using Cost analysis | Microsoft Learn: https://learn.microsoft.com/en-us/azure/cost-management-billing/costs/quick-acm-cost-analysis
Nafisa Ahmed
Nafisa Ahmed is a computer science researcher and data scientist with over 7 years of experience in applied AI, machine learning, and software engineering. She has a strong track record of developing scalable AI models for anomaly detection, data drift monitoring, and public health applications such as COVID-19 detection. Nafisa has contributed to academia through high-impact publications and to industry through innovative solutions that integrate cutting-edge AI technologies. Her proficiency in cloud platforms and FaaS architectures, combined with her dedication to mentoring and collaboration, drives impactful technological advancements.