diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index bb331ae3..3a5d4cf1 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -172,7 +172,7 @@ Structure: - Standard library imports (time, json, tempfile, requests, pathlib, datetime) - `utils`, `apimtypes`, `console`, `azure_resources` (including `az`, `get_infra_rg_name`, `get_account_info`) 2. USER CONFIGURATION section: - - `rg_location`: Azure region (default: 'eastus2') + - `rg_location`: Azure region (default: `Region.EAST_US_2`) - `index`: Deployment index for resource naming (default: 1) - `deployment`: Selected infrastructure type (reference INFRASTRUCTURE enum options) - `api_prefix`: Prefix for APIs to avoid naming collisions @@ -407,6 +407,7 @@ Check `docs/README.md` for local preview instructions and styling notes. The pag - Existing cells must keep a unique `metadata.id` value. - New cells do not need a `metadata.id` value unless an editor or tool assigns one. - Keep notebook JSON logically structured and valid. Do not emit partial notebook fragments when a full notebook document is required. +- Place **all** `import` statements at the top of every code cell, before any other code. Never nest imports inside `if` / `else` / `try` blocks within a cell. Ruff's `PLC0415` does not flag imports inside module-level conditionals, so this must be enforced manually. - When describing notebook changes to users, refer to cells by visible cell number (Cell 1, Cell 2, etc.), not by internal cell IDs. ### Presentation Instructions diff --git a/.github/python.instructions.md b/.github/python.instructions.md index 1cc2d6e8..1ce87de4 100644 --- a/.github/python.instructions.md +++ b/.github/python.instructions.md @@ -20,6 +20,7 @@ This ensures all code changes comply with the project's linting standards from t - Use explicit imports (avoid `from module import *`), especially in notebooks, to prevent `F403/F405`. 
- Keep lines within the configured length limit (see `pyproject.toml`), and wrap long strings or calls. - Avoid f-strings without placeholders (e.g., `F541`). +- **Ruff gap:** `PLC0415` (`import-outside-toplevel`) only flags imports inside functions and classes. It does **not** flag imports inside module-level `if` / `else` / `try` blocks. Ruff will not catch those, so the top-of-file import rule below must be enforced manually. ## Goals @@ -35,7 +36,7 @@ This ensures all code changes comply with the project's linting standards from t ## Style and Conventions - Prefer Python 3.12+ features unless otherwise required. -- Keep all imports at the top of the file. +- Keep **all** imports at the top of the file. Do not place `import` statements inside `if` / `else` / `try` blocks or inside functions. Hoist them even when only one branch uses the module. Ruff `PLC0415` will catch function-scope imports but will **not** catch imports inside module-level conditional blocks, so apply this rule manually. - Use type hints and concise docstrings (PEP 257). - Use 4-space indentation and PEP 8 conventions. - Surround an equal sign by a space on each side. 
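Review note on the Ruff gap described above: because `PLC0415` reportedly leaves imports inside module-level `if` / `else` / `try` blocks unflagged, the manual check can be approximated with a small script. The following is a minimal sketch, assuming a standalone helper built on the standard-library `ast` module; the name `find_nested_module_imports` is hypothetical and is not part of this repository or of Ruff. It skips function and class scopes, since those are the cases `PLC0415` already covers.

```python
import ast


def find_nested_module_imports(source: str) -> list[int]:
    """Return line numbers of imports nested in module-level blocks (if/try/with/loops).

    Hypothetical helper for illustration only. Imports inside functions and
    classes are ignored because Ruff's PLC0415 already reports those.
    """

    hits: list[int] = []

    def visit(node: ast.AST, nested: bool) -> None:
        for child in ast.iter_child_nodes(node):
            # Function, lambda, and class bodies are a different scope: leave them to PLC0415.
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)):
                continue

            if isinstance(child, (ast.Import, ast.ImportFrom)):
                if nested:
                    hits.append(child.lineno)
                continue

            # Anything else (If, Try, With, For, ExceptHandler, ...) counts as nesting.
            visit(child, True)

    visit(ast.parse(source), False)

    return sorted(hits)


# Sample source: only parsed, never executed, so the imports need not exist.
sample = '''import os
if os.name == 'nt':
    import winreg
try:
    import tomllib
except ImportError:
    import tomli as tomllib
def f():
    import json
'''

print(find_nested_module_imports(sample))  # [3, 5, 7]
```

The import on line 9 of the sample sits inside a function, so it is deliberately left to `PLC0415`. For notebooks, the same function could be applied to each code cell's joined `source` list, since `pyproject.toml` excludes `*.ipynb` from Ruff.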
diff --git a/pyproject.toml b/pyproject.toml index 39e839ad..59a51809 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -37,14 +37,15 @@ exclude = ["*.ipynb"] [tool.ruff.lint] select = [ - "E", # pycodestyle errors - "W", # pycodestyle warnings - "F", # Pyflakes - "PLC", # Pylint convention - "PLE", # Pylint error - "PLR", # Pylint refactoring - "PLW", # Pylint warning - "Q", # flake8-quotes + "E", # pycodestyle errors + "W", # pycodestyle warnings + "F", # Pyflakes + "PLC", # Pylint convention + "PLC0415", # import-outside-toplevel (explicit; enforce imports at top of file/cell) + "PLE", # Pylint error + "PLR", # Pylint refactoring + "PLW", # Pylint warning + "Q", # flake8-quotes ] ignore = [ "PLR0911", # Too many return statements diff --git a/samples/costing/README.md b/samples/costing/README.md index 6e5f78c0..bffb09c3 100644 --- a/samples/costing/README.md +++ b/samples/costing/README.md @@ -1,6 +1,6 @@ # Samples: APIM Costing & Showback -This sample demonstrates how to track and allocate API costs using Azure API Management with Azure Monitor, Application Insights, Log Analytics, and Cost Management. This setup enables organizations to determine the cost of API consumption per business unit, department, or application. +This sample demonstrates how to track and allocate API costs using Azure API Management with Azure Monitor, Application Insights, Log Analytics, and Cost Management. It supports three complementary approaches: **subscription-based** tracking (using APIM subscription keys), **Entra ID application** tracking (using the `emit-metric` policy with JWT `appid` claims), and **AI Gateway token/PTU** tracking (using the `emit-metric` policy to capture per-client token consumption when APIM acts as an AI Gateway). All approaches share a single Azure Monitor Workbook with tabbed views. 
⚙️ **Supported infrastructures**: All infrastructures (or bring your own existing APIM deployment) @@ -9,10 +9,13 @@ This sample demonstrates how to track and allocate API costs using Azure API Man ## 🎯 Objectives 1. **Track API usage by caller** - Use APIM subscription keys to identify business units, departments, or applications -2. **Capture request metrics** - Log subscriptionId, apiName, operationName, and status codes -3. **Aggregate cost data** - Combine API usage metrics with Azure Cost Management data -4. **Visualize showback data** - Create Azure Monitor Workbooks to display cost allocation by caller -5. **Enable cost governance** - Establish patterns for consistent tagging and naming conventions +2. **Track API usage by Entra ID application** - Use the `emit-metric` policy to extract `appid`/`azp` JWT claims and emit per-caller custom metrics +3. **Capture request metrics** - Log subscriptionId, apiName, operationName, and status codes +4. **Aggregate cost data** - Combine API usage metrics with Azure Cost Management data +5. **Visualize showback data** - Create Azure Monitor Workbooks with tabbed views for each approach +6. **Enable cost governance** - Establish patterns for consistent tagging and naming conventions +7. **Enable budget alerts** - Create scheduled query alerts when callers exceed configurable thresholds +8. **Track AI token consumption per client** - When APIM is used as an AI Gateway, capture prompt, completion, and total token usage per calling application, enabling per-client cost attribution for PTU or pay-as-you-go OpenAI deployments ## ✅ Prerequisites @@ -109,6 +112,19 @@ Organizations often need to allocate the cost of shared API Management infrastru This sample focuses on **producing cost data**, not implementing billing processes. You determine costs; how you use that information (showback reports, chargeback, budgeting) is a separate business decision. 
+### Three Tracking Approaches + +| Aspect | Subscription-Based | Entra ID Application | AI Gateway Token/PTU | +|---|---|---|---| +| **Caller identification** | APIM subscription key (`ApimSubscriptionId`) | JWT `appid`/`azp` claim | JWT `appid`/`azp` claim | +| **Data source** | `ApiManagementGatewayLogs` in Log Analytics | `customMetrics` in Application Insights | `customMetrics` in Application Insights | +| **Tracking mechanism** | Built-in APIM logging | `emit-metric` policy | `emit-metric` policy (outbound response parsing) | +| **Metric name** | N/A (built-in logs) | `caller-requests` | `caller-tokens` | +| **Cost Management export** | Yes (storage account) | No (metrics-based) | No (metrics-based) | +| **Best for** | Dedicated subscriptions per BU | OAuth client-credentials flows, shared subscriptions | AI Gateway scenarios (Azure OpenAI, PTU capacity planning) | + +All three approaches are deployed together. Toggle `enable_entraid_tracking` and `enable_token_tracking` in the notebook to include or exclude each flow. 
+ ## 🛩️ Lab Components This lab deploys and configures: @@ -118,14 +134,13 @@ This lab deploys and configures: - **Storage Account** - Receives Azure Cost Management exports - **Cost Management Export** - Automated export of cost data (configurable frequency) - **Diagnostic Settings** - Links APIM to Log Analytics with `logAnalyticsDestinationType: Dedicated` for resource-specific tables -- **Sample API & Subscriptions** - 5 subscriptions representing different business units -- **Azure Monitor Workbook** - Pre-built dashboard with: - - Cost allocation table (base + variable cost per BU) - - Base vs variable cost stacked bar chart - - Cost breakdown by API - - Request count and distribution charts - - Success/error rate analysis - - Response code distribution +- **Sample API & Subscriptions** - 4 subscriptions representing different business units +- **Entra ID Tracking API** (optional) - A second API with the `emit-metric` policy that extracts `appid` from JWT tokens and emits `caller-requests` custom metrics +- **AI Gateway Token Tracking API** (optional) - A third API with the `emit-metric` policy that parses Azure OpenAI response bodies to extract `prompt_tokens`, `completion_tokens`, and `total_tokens`, emitting `caller-tokens` custom metrics with `CallerId`, `TokenType`, and `Model` dimensions +- **Azure Monitor Workbook** - Pre-built tabbed dashboard with: + - **Subscription-Based Costing tab**: Cost allocation table (base + variable cost per BU), base vs variable cost stacked bar chart, cost breakdown by API, request count and distribution charts, success/error rate analysis, response code distribution, business unit drill-down + - **Entra ID Application Costing tab**: Usage by caller ID (bar chart + table), cost allocation by caller (table + pie chart), hourly request trend by caller + - **AI Gateway Token/PTU tab**: Token consumption by client (prompt vs completion bar chart), token cost allocation table with configurable per-1K-token rates, 
token/cost distribution pie charts, hourly token trend with PTU capacity threshold line, prompt vs completion area chart, model breakdown table - **Live Pricing Integration** - Auto-detects your APIM SKU and fetches current pricing from the [Azure Retail Prices API](https://learn.microsoft.com/rest/api/cost-management/retail-prices/azure-retail-prices) - **Budget Alerts** (optional) - Per-BU scheduled query alerts when request thresholds are exceeded @@ -153,10 +168,10 @@ This lab deploys and configures: After running the notebook, you will have: -1. **Application Insights** showing real-time API requests +1. **Application Insights** showing real-time API requests plus `caller-requests` (Entra ID) and `caller-tokens` (AI Gateway) custom metrics 2. **Log Analytics** with queryable `ApiManagementGatewayLogs` (resource-specific table) 3. **Storage Account** receiving cost export data -4. **Azure Monitor Workbook** displaying cost allocation and usage analytics +4. **Azure Monitor Workbook** with tabbed views for subscription-based, Entra ID, and AI Gateway token cost allocation 5. **Portal links** printed in the notebook's final cell for quick access ### Cost Management Export @@ -181,6 +196,30 @@ The deployed workbook provides a comprehensive view of API cost allocation and u ![Dashboard - Response Code Analysis](screenshots/Dashboard-05.png) +![Dashboard - Drill-Down Details](screenshots/Dashboard-06.png) + +### Entra ID Application Costing Tab + +The Entra ID tab shows cost attribution by calling application, using the `emit-metric` policy's `caller-requests` custom metric. + +![Entra ID - Usage by Caller ID](screenshots/EntraID-01.png) + +![Entra ID - Cost Allocation](screenshots/EntraID-02.png) + +![Entra ID - Request Trend](screenshots/EntraID-03.png) + +### AI Gateway Token/PTU Tab + +The AI Gateway tab shows per-client token consumption and estimated costs when APIM is used as an AI Gateway in front of Azure OpenAI or other LLM backends. 
It uses the `emit-metric` policy's `caller-tokens` custom metric with `CallerId`, `TokenType` (prompt/completion/total), and `Model` dimensions. + +![AI Gateway - Token Consumption by Client](screenshots/AIGateway-01.png) + +![AI Gateway - Token Cost Allocation](screenshots/AIGateway-02.png) + +![AI Gateway - Token Trends & PTU Utilization](screenshots/AIGateway-03.png) + +![AI Gateway - Model & Caller Breakdown](screenshots/AIGateway-04.png) + ## 🧹 Clean Up To remove all resources created by this sample, open and run `clean-up.ipynb`. This deletes: @@ -199,6 +238,11 @@ To remove all resources created by this sample, open and run `clean-up.ipynb`. T - [Log Analytics Kusto Query Language](https://learn.microsoft.com/azure/data-explorer/kusto/query/) - [Azure Monitor Workbooks](https://learn.microsoft.com/azure/azure-monitor/visualize/workbooks-overview) - [APIM Diagnostic Settings](https://learn.microsoft.com/azure/api-management/api-management-howto-use-azure-monitor) +- [APIM emit-metric policy](https://learn.microsoft.com/azure/api-management/emit-metric-policy) +- [Application Insights custom metrics](https://learn.microsoft.com/azure/azure-monitor/essentials/metrics-custom-overview) +- [Microsoft Entra ID application model](https://learn.microsoft.com/entra/identity-platform/application-model) +- [Azure OpenAI usage and token metrics](https://learn.microsoft.com/azure/ai-services/openai/how-to/monitoring) +- [PTU provisioned throughput concepts](https://learn.microsoft.com/azure/ai-services/openai/concepts/provisioned-throughput) [infrastructure-architectures]: ../../README.md#infrastructure-architectures [infrastructure-folder]: ../../infrastructure/ diff --git a/samples/costing/create.ipynb b/samples/costing/create.ipynb index ffd75ba0..697586c0 100644 --- a/samples/costing/create.ipynb +++ b/samples/costing/create.ipynb @@ -17,9 +17,13 @@ "metadata": {}, "outputs": [], "source": [ + "import os\n", + "\n", "import utils\n", "\n", - "from apimtypes import 
API, APIM_SKU, GET_APIOperation2, INFRASTRUCTURE, Region\n", + "from pathlib import Path\n", + "\n", + "from apimtypes import API, APIM_SKU, GET_APIOperation2, INFRASTRUCTURE, PolicyFragment, Region\n", "from console import print_error, print_info, print_ok, print_val, print_warning\n", "from azure_resources import get_infra_rg_name, get_account_info\n", "\n", @@ -41,12 +45,21 @@ "generate_sample_load = True # Generate sample API calls to demonstrate cost tracking\n", "sample_requests_per_subscription = 50 # Base request count per business unit (multiplied by each BU's weight)\n", "\n", + "# Entra ID application tracking\n", + "enable_entraid_tracking = True # Deploy emit-metric API for Entra ID caller tracking\n", + "enable_token_tracking = True # Deploy AI Gateway token tracking API (per-caller token/PTU metrics)\n", + "# Real JWT testing (set to True to test with a real Entra ID app registration)\n", + "# Credentials are read from the root .env file (see README for details)\n", + "use_real_jwt = False # When True, acquires a real token and sends requests through the emit-metric API\n", + "real_jwt_tenant_id = os.getenv('COSTING_JWT_TENANT_ID', '')\n", + "real_jwt_client_id = os.getenv('COSTING_JWT_CLIENT_ID', '')\n", + "real_jwt_client_secret = os.getenv('COSTING_JWT_CLIENT_SECRET', '')\n", + "\n", "# Budget alerts\n", "alert_threshold = 1000 # Request count threshold per BU per hour\n", "alert_email = 'alerts@contoso.com' # Email for alert notifications (leave empty to skip)\n", "\n", "\n", - "\n", "# ------------------------------\n", "# SYSTEM CONFIGURATION\n", "# ------------------------------\n", @@ -87,6 +100,62 @@ " serviceUrl = 'https://httpbin.org'\n", " )\n", "]\n", + "# Policy fragment: shared caller-identity extraction (used by both Entra ID and token-tracking APIs)\n", + "pf_caller_id_xml = Path(utils.determine_policy_path('pf-extract-caller-id.xml', sample_folder)).read_text(encoding='utf-8')\n", + "pfs = [PolicyFragment('Extract-CallerId', 
pf_caller_id_xml, 'Extracts caller identity from JWT appid/azp claim into callerId variable.')]\n", + "\n", + "# Entra ID API definition (emit-metric policy for appid-based cost tracking)\n", + "if enable_entraid_tracking:\n", + " emit_metric_policy_path = utils.determine_policy_path('emit_metric_caller_id.xml', sample_folder)\n", + " emit_metric_policy_xml = Path(emit_metric_policy_path).read_text(encoding='utf-8')\n", + "\n", + " entraid_api_path = 'appid-cost-demo'\n", + " entraid_cost_demo_get = GET_APIOperation2('get-status', 'Get Status', '/get', 'Get Status')\n", + "\n", + " apis.append(\n", + " API(\n", + " f'{api_prefix}appid-tracking-api',\n", + " 'Cost Tracking by App ID',\n", + " entraid_api_path,\n", + " 'API for demonstrating cost tracking by Entra ID application',\n", + " policyXml = emit_metric_policy_xml,\n", + " operations = [entraid_cost_demo_get],\n", + " tags = ['costing', 'emit-metric', 'entra-appid'],\n", + " subscriptionRequired = True,\n", + " serviceUrl = 'https://httpbin.org'\n", + " )\n", + " )\n", + "\n", + "\n", + "# AI Gateway token tracking API (emit-metric policy for per-caller token/PTU consumption)\n", + "if enable_token_tracking:\n", + " token_metric_policy_path = utils.determine_policy_path('emit_metric_caller_tokens.xml', sample_folder)\n", + " token_metric_policy_xml = Path(token_metric_policy_path).read_text(encoding='utf-8')\n", + "\n", + " token_api_path = 'token-cost-demo'\n", + " token_cost_demo_get = GET_APIOperation2('get-status', 'Get Status', '/get', 'Get Status')\n", + "\n", + " apis.append(\n", + " API(\n", + " f'{api_prefix}token-tracking-api',\n", + " 'AI Gateway Token Tracking',\n", + " token_api_path,\n", + " 'API for demonstrating per-caller token/PTU tracking (AI Gateway pattern)',\n", + " policyXml = token_metric_policy_xml,\n", + " operations = [token_cost_demo_get],\n", + " tags = ['costing', 'emit-metric', 'ai-gateway', 'token-tracking'],\n", + " subscriptionRequired = True,\n", + " serviceUrl = 
'https://httpbin.org'\n", + " )\n", + " )\n", + "\n", + "# Simulated caller app IDs (for Entra ID tracking)\n", + "simulated_callers = [\n", + " {'appid': 'a5846c0e-1111-4000-8000-000000000001', 'name': 'HR Service', 'request_weight': 1.0},\n", + " {'appid': '9e6bfb3f-2222-4000-8000-000000000002', 'name': 'Finance Portal', 'request_weight': 2.5},\n", + " {'appid': 'c3d2e1f0-3333-4000-8000-000000000003', 'name': 'Mobile Gateway', 'request_weight': 0.5},\n", + " {'appid': 'b7a8c9d0-4444-4000-8000-000000000004', 'name': 'Engineering Tools', 'request_weight': 3.0}\n", + "]\n", "\n", "# Define business units\n", "business_units = [\n", @@ -98,7 +167,6 @@ "\n", "# Get Azure account information\n", "current_user, current_user_id, tenant_id, subscription_id = get_account_info()\n", - "\n", "if not subscription_id:\n", " print_error('Could not determine Azure subscription ID. Run: az login')\n", " raise SystemExit(1)\n", @@ -129,7 +197,8 @@ " 'costExportFrequency' : {'value': cost_export_frequency},\n", " 'index' : {'value': index},\n", " 'apis' : {'value': [api.to_dict() for api in apis]},\n", - " 'businessUnits' : {'value': [{'name': bu['name'], 'displayName': bu['display']} for bu in business_units]}\n", + " 'businessUnits' : {'value': [{'name': bu['name'], 'displayName': bu['display']} for bu in business_units]},\n", + " 'policyFragments' : {'value': [pf.to_dict() for pf in pfs]}\n", "}\n", "\n", "# Deploy the sample\n", @@ -147,9 +216,12 @@ " workbook_id = output.get('workbookId', '')\n", " cost_export_name = f'apim-cost-export-{index}-{rg_name}'\n", "\n", - " # Extract subscription keys\n", + " # Extract BU subscription keys (scoped to cost-tracking API)\n", " subscription_keys_output = output.getJson('subscriptionKeys', 'Subscription Keys', secure=True)\n", "\n", + " # Per-API subscription keys are retrieved on demand via\n", + " # get_apim_subscription_key() (RBAC-controlled ARM listSecrets).\n", + "\n", " # Map keys to business units\n", " subscriptions = {}\n", " 
if subscription_keys_output:\n", @@ -213,8 +285,7 @@ "# Register required resource provider\n", "print_info('Registering Microsoft.CostManagementExports resource provider...')\n", "register_result = run(\n", - " 'az provider register --namespace Microsoft.CostManagementExports --wait',\n", - " log_command=False\n", + " 'az provider register --namespace Microsoft.CostManagementExports --wait'\n", ")\n", "\n", "if register_result.success:\n", @@ -224,8 +295,7 @@ "existing_export = run(\n", " f'az rest --method GET '\n", " f'--url \"{export_scope}/providers/Microsoft.CostManagement/exports/{cost_export_name}'\n", - " f'?api-version={api_version}\" -o json',\n", - " log_command=False\n", + " f'?api-version={api_version}\" -o json'\n", ")\n", "\n", "if existing_export.success:\n", @@ -233,8 +303,7 @@ " run(\n", " f'az rest --method DELETE '\n", " f'--url \"{export_scope}/providers/Microsoft.CostManagement/exports/{cost_export_name}'\n", - " f'?api-version={api_version}\"',\n", - " log_command=False\n", + " f'?api-version={api_version}\"'\n", " )\n", "\n", "# Build recurrence settings\n", @@ -290,8 +359,7 @@ " f'az rest --method PUT '\n", " f'--url \"{export_scope}/providers/Microsoft.CostManagement/exports/{cost_export_name}'\n", " f'?api-version={api_version}\" '\n", - " f'--body @{body_file_path} -o json',\n", - " log_command=False\n", + " f'--body @{body_file_path} -o json'\n", " )\n", "finally:\n", " Path(body_file_path).unlink(missing_ok=True)\n", @@ -314,8 +382,7 @@ " f'--assignee-object-id {principal_id} '\n", " f'--assignee-principal-type ServicePrincipal '\n", " f'--role \"Storage Blob Data Contributor\" '\n", - " f'--scope {storage_account_id}',\n", - " log_command=False\n", + " f'--scope {storage_account_id}'\n", " )\n", "\n", " if role_assignment.success:\n", @@ -364,8 +431,7 @@ " run_result = run(\n", " f'az rest --method POST '\n", " f'--url \"{export_scope}/providers/Microsoft.CostManagement/exports/{cost_export_name}'\n", - " 
f'/run?api-version={api_version}\"',\n", - " log_command=False\n", + " f'/run?api-version={api_version}\"'\n", " )\n", "\n", " if run_result.success:\n", @@ -488,8 +554,7 @@ " result = run(\n", " f'az rest --method POST '\n", " f'--url \"https://management.azure.com{workspace_resource_id}/api/query?api-version=2020-08-01\" '\n", - " f'--body @{query_file_path} -o json',\n", - " log_command=False\n", + " f'--body @{query_file_path} -o json'\n", " )\n", "\n", " # A non-transient error (e.g. bad API version, auth failure) should stop immediately\n", @@ -527,6 +592,533 @@ " print_info(' 3. Re-run the traffic generation cell to send more requests')" ] }, + { + "cell_type": "markdown", + "id": "5c5d28c4", + "metadata": {}, + "source": [ + "### 🚀 Generate Entra ID Application Traffic\n", + "\n", + "Generate sample API calls simulating different Entra ID applications (via crafted JWT `appid` claims) to demonstrate caller-based cost tracking.\n", + "\n", + "This creates `caller-requests` custom metrics in Application Insights via the `emit-metric` policy.\n", + "\n", + "> **Note:** In a production environment, callers would present real Entra ID tokens. Here we simulate them with minimal JWTs so that the `emit-metric` policy can extract the `appid` claim." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cc604ec2", + "metadata": {}, + "outputs": [], + "source": [ + "import base64\n", + "import json\n", + "\n", + "from apimrequests import ApimRequests\n", + "from azure_resources import get_apim_subscription_key\n", + "\n", + "if 'apim_gateway_url' not in locals():\n", + " print_error('Please run the deployment cell first')\n", + " raise SystemExit(1)\n", + "\n", + "def make_fake_jwt(appid: str) -> str:\n", + " \"\"\"Create a minimal unsigned JWT with an appid claim for emit-metric extraction.\"\"\"\n", + " header = base64.urlsafe_b64encode(json.dumps({'alg': 'none', 'typ': 'JWT'}).encode()).rstrip(b'=').decode()\n", + " payload = base64.urlsafe_b64encode(json.dumps({'appid': appid}).encode()).rstrip(b'=').decode()\n", + " return f'{header}.{payload}.'\n", + "\n", + "if enable_entraid_tracking and generate_sample_load:\n", + " print_info('Generating sample API traffic with simulated caller app IDs...')\n", + "\n", + " # Retrieve the per-API subscription key via RBAC-controlled ARM listSecrets\n", + " entraid_subscription_key = get_apim_subscription_key(\n", + " apim_name, rg_name, sid=f'api-{api_prefix}appid-tracking-api'\n", + " )\n", + "\n", + " if not entraid_subscription_key:\n", + " print_error('Could not retrieve subscription key for the Entra ID tracking API')\n", + " raise SystemExit(1)\n", + "\n", + " # Determine endpoints, URLs, etc. 
prior to test execution\n", + " endpoint_url, request_headers, allow_insecure_tls = utils.get_endpoint(deployment, rg_name, apim_gateway_url)\n", + "\n", + " for caller in simulated_callers:\n", + " caller_request_count = max(1, int(sample_requests_per_subscription * caller.get('request_weight', 1.0)))\n", + " fake_jwt = make_fake_jwt(caller['appid'])\n", + "\n", + " # Merge auth header with any infrastructure-specific headers\n", + " auth_headers = dict(request_headers) if request_headers else {}\n", + " auth_headers['Authorization'] = f'Bearer {fake_jwt}'\n", + "\n", + " reqs = ApimRequests(endpoint_url, entraid_subscription_key, auth_headers, allowInsecureTls=allow_insecure_tls)\n", + " reqs.multiGet(\n", + " f'/{entraid_api_path}/get',\n", + " caller_request_count,\n", + " msg = f'Generating {caller_request_count} requests for {caller[\"name\"]} ({caller[\"appid\"][:12]}...)',\n", + " printResponse = False,\n", + " sleepMs = 10\n", + " )\n", + "\n", + " # Display a summary table of simulated callers\n", + " print()\n", + " print_info('Simulated Caller Summary:')\n", + " print()\n", + " print(f' {\"App ID\":<40} {\"Caller Name\":<20} {\"Weight\":>8} {\"Requests\":>10}')\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 8} {\"-\" * 10}')\n", + " for caller in simulated_callers:\n", + " req_count = max(1, int(sample_requests_per_subscription * caller.get('request_weight', 1.0)))\n", + " print(f' {caller[\"appid\"]:<40} {caller[\"name\"]:<20} {caller[\"request_weight\"]:>8.1f} {req_count:>10}')\n", + " total_reqs = sum(max(1, int(sample_requests_per_subscription * c.get('request_weight', 1.0))) for c in simulated_callers)\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 8} {\"-\" * 10}')\n", + " print(f' {\"TOTAL\":<40} {\"\":<20} {\"\":>8} {total_reqs:>10}')\n", + "\n", + " print()\n", + " print_info('Note: Custom metrics typically take 5-10 minutes to appear in Application Insights')\n", + "elif not enable_entraid_tracking:\n", + " print_info('Entra ID 
tracking disabled (enable_entraid_tracking = False)')\n", + "else:\n", + " print_info('Sample load generation skipped (generate_sample_load = False)')" + ] + }, + { + "cell_type": "markdown", + "id": "471013f5", + "metadata": {}, + "source": [ + "### 🔍 Verify Metric Ingestion (Entra ID)\n", + "\n", + "Waits for `caller-requests` custom metrics to arrive in Application Insights (auto-retries for up to 10 minutes)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d8ba61f3", + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "import tempfile\n", + "import time\n", + "from pathlib import Path\n", + "\n", + "from azure_resources import run\n", + "\n", + "if not enable_entraid_tracking:\n", + " print_info('Entra ID tracking disabled - skipping metric verification')\n", + "elif 'app_insights_name' not in locals():\n", + " print_error('Please run the deployment cell first')\n", + " raise SystemExit(1)\n", + "else:\n", + " print_info('Waiting for caller-requests custom metrics to arrive in Application Insights...')\n", + " print_info('Metric ingestion typically takes 5-10 minutes after generating traffic')\n", + " print()\n", + "\n", + " print_val('Application Insights', app_insights_name)\n", + "\n", + " # Build the Application Insights resource ID for the ARM query endpoint\n", + " app_insights_resource_id = (\n", + " f'/subscriptions/{subscription_id}'\n", + " f'/resourceGroups/{rg_name}'\n", + " f'/providers/microsoft.insights/components/{app_insights_name}'\n", + " )\n", + "\n", + " # Load KQL from external file and wrap it in a JSON body\n", + " kql_path = utils.determine_policy_path('verify-metric-ingestion.kql', sample_folder)\n", + " kql_query = Path(kql_path).read_text(encoding='utf-8')\n", + "\n", + " query_body = {'query': kql_query}\n", + "\n", + " with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:\n", + " json.dump(query_body, f)\n", + " query_file_path = f.name\n", + "\n", + " # Poll 
Application Insights until caller-requests metrics appear\n", + " max_wait_minutes = 10\n", + " poll_interval_seconds = 30\n", + " max_attempts = (max_wait_minutes * 60) // poll_interval_seconds\n", + " metrics_found = False\n", + "\n", + " try:\n", + " for attempt in range(1, max_attempts + 1):\n", + " result = run(\n", + " f'az rest --method POST '\n", + " f'--url \"https://management.azure.com{app_insights_resource_id}/query?api-version=2018-04-20\" '\n", + " f'--body @{query_file_path} -o json'\n", + " )\n", + "\n", + " if not result.success:\n", + " print_error(f'Query failed: {result.text[:300]}')\n", + " break\n", + "\n", + " if result.json_data:\n", + " tables = result.json_data.get('tables', [])\n", + " if tables:\n", + " rows = tables[0].get('rows', [])\n", + " if rows and len(rows) > 0:\n", + " metric_count = float(rows[0][0])\n", + " if metric_count > 0:\n", + " print_ok(f'Found {int(metric_count)} caller-requests metric entries')\n", + " metrics_found = True\n", + " break\n", + "\n", + " elapsed = attempt * poll_interval_seconds\n", + " remaining = (max_wait_minutes * 60) - elapsed\n", + " print_info(f' No metrics yet... 
retrying in {poll_interval_seconds}s ({remaining}s remaining)')\n", + " time.sleep(poll_interval_seconds)\n", + " finally:\n", + " Path(query_file_path).unlink(missing_ok=True)\n", + "\n", + " if metrics_found:\n", + " print_ok('Metric ingestion verified - Entra ID tab in workbook should now display data')\n", + "\n", + " # Query per-caller breakdown from external KQL file\n", + " breakdown_kql_path = utils.determine_policy_path('verify-metric-breakdown.kql', sample_folder)\n", + " breakdown_kql = Path(breakdown_kql_path).read_text(encoding='utf-8')\n", + " breakdown_body = {'query': breakdown_kql}\n", + "\n", + " with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:\n", + " json.dump(breakdown_body, f)\n", + " breakdown_file = f.name\n", + "\n", + " try:\n", + " bd_result = run(\n", + " f'az rest --method POST '\n", + " f'--url \"https://management.azure.com{app_insights_resource_id}/query?api-version=2018-04-20\" '\n", + " f'--body @{breakdown_file} -o json'\n", + " )\n", + "\n", + " if bd_result.success and bd_result.json_data:\n", + " bd_tables = bd_result.json_data.get('tables', [])\n", + " if bd_tables:\n", + " bd_rows = bd_tables[0].get('rows', [])\n", + " if bd_rows:\n", + " # Build a lookup from appid -> name\n", + " caller_lookup = {c['appid']: c['name'] for c in simulated_callers}\n", + "\n", + " print()\n", + " print_info('Metric Breakdown by Caller:')\n", + " print()\n", + " print(f' {\"App ID\":<40} {\"Caller Name\":<20} {\"Requests\":>10}')\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 10}')\n", + " total = 0\n", + " for row in bd_rows:\n", + " caller_id = row[0]\n", + " count = int(row[1])\n", + " name = caller_lookup.get(caller_id, 'Unknown')\n", + " print(f' {caller_id:<40} {name:<20} {count:>10}')\n", + " total += count\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 10}')\n", + " print(f' {\"TOTAL\":<40} {\"\":<20} {total:>10}')\n", + " finally:\n", + " Path(breakdown_file).unlink(missing_ok=True)\n", + "\n", + 
" elif result.success:\n", + " print_warning(f'Metrics did not appear within {max_wait_minutes} minutes')\n", + " print_info('This can happen with newly deployed emit-metric policies. Tips:')\n", + " print_info(' 1. Wait a few more minutes and re-run this cell')\n", + " print_info(' 2. Verify the emit-metric policy is applied in Azure Portal')\n", + " print_info(' 3. Re-run the Entra ID traffic generation cell to send more requests')\n" + ] + }, + { + "cell_type": "markdown", + "id": "a7f0e2c1", + "metadata": {}, + "source": [ + "### 🔑 Test with a Real JWT Token (Optional)\n", + "\n", + "When `use_real_jwt = True` is set in the configuration above, this cell acquires a real Entra ID token using the\n", + "**client credentials** flow (OAuth 2.0) and sends requests through the `emit-metric` API.\n", + "\n", + "This validates the end-to-end flow:\n", + "\n", + "1. An Entra ID app registration presents a JWT with a real `appid` claim\n", + "2. The `emit-metric` APIM policy extracts the `appid` from the token\n", + "3. A `caller-requests` custom metric is emitted with the real caller ID\n", + "\n", + "> **Prerequisites:** Create an [App Registration](https://learn.microsoft.com/entra/identity-platform/quickstart-register-app)\n", + "> and generate a client secret. 
No API permissions are required - the `emit-metric` policy only reads the `appid` claim from the JWT.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b8e1f3d2", + "metadata": {}, + "outputs": [], + "source": [ + "import base64\n", + "import json\n", + "import requests\n", + "\n", + "from apimrequests import ApimRequests\n", + "from azure_resources import get_apim_subscription_key\n", + "\n", + "if not use_real_jwt:\n", + " print_info('Real JWT testing is disabled (use_real_jwt = False)')\n", + " print_info('To enable, set use_real_jwt = True and provide client credentials in the init cell')\n", + "elif not all([real_jwt_tenant_id, real_jwt_client_id, real_jwt_client_secret]):\n", + " print_error('Missing credentials: set COSTING_JWT_TENANT_ID, COSTING_JWT_CLIENT_ID, and COSTING_JWT_CLIENT_SECRET in the root .env file')\n", + "elif 'apim_gateway_url' not in locals():\n", + " print_error('Please run the deployment cell first')\n", + " raise SystemExit(1)\n", + "else:\n", + " # Acquire a token using the client credentials flow\n", + " token_url = f'https://login.microsoftonline.com/{real_jwt_tenant_id}/oauth2/v2.0/token'\n", + " token_payload = {\n", + " 'grant_type': 'client_credentials',\n", + " 'client_id': real_jwt_client_id,\n", + " 'client_secret': real_jwt_client_secret,\n", + " 'scope': f'api://{real_jwt_client_id}/.default'\n", + " }\n", + "\n", + " print_info('Acquiring token from Microsoft Identity Platform...')\n", + " token_response = requests.post(token_url, data=token_payload, timeout=30)\n", + "\n", + " if token_response.status_code != 200:\n", + " print_error(f'Token acquisition failed ({token_response.status_code}): {token_response.text[:300]}')\n", + " else:\n", + " access_token = token_response.json().get('access_token', '')\n", + " payload_part = access_token.split('.')[1]\n", + " payload_part += '=' * (4 - len(payload_part) % 4) # pad base64\n", + " claims = json.loads(base64.urlsafe_b64decode(payload_part))\n", + " real_appid = 
claims.get('appid', claims.get('azp', 'unknown'))\n", + "\n", + " print()\n", + " print_info('Token Details:')\n", + " print(f' {\"App ID (appid)\":<25} {real_appid}')\n", + " print(f' {\"Tenant ID (tid)\":<25} {claims.get(\"tid\", \"unknown\")}')\n", + " print(f' {\"Token type\":<25} {claims.get(\"typ\", claims.get(\"token_type\", \"unknown\"))}')\n", + " print()\n", + "\n", + " # Send requests with the real token\n", + " real_jwt_request_count = 10 # Small number for testing\n", + "\n", + " endpoint_url, request_headers, allow_insecure_tls = utils.get_endpoint(deployment, rg_name, apim_gateway_url)\n", + "\n", + " # Retrieve the per-API subscription key via RBAC-controlled ARM listSecrets\n", + " entraid_subscription_key = get_apim_subscription_key(\n", + " apim_name, rg_name, sid=f'api-{api_prefix}appid-tracking-api'\n", + " )\n", + "\n", + " if not entraid_subscription_key:\n", + " print_error('Could not retrieve subscription key for the Entra ID tracking API')\n", + " raise SystemExit(1)\n", + "\n", + " auth_headers = dict(request_headers) if request_headers else {}\n", + " auth_headers['Authorization'] = f'Bearer {access_token}'\n", + "\n", + " reqs = ApimRequests(endpoint_url, entraid_subscription_key, auth_headers, allowInsecureTls=allow_insecure_tls)\n", + " reqs.multiGet(\n", + " f'/{entraid_api_path}/get',\n", + " real_jwt_request_count,\n", + " msg = f'Sending {real_jwt_request_count} requests with real JWT (appid: {real_appid[:12]}...)',\n", + " printResponse = False,\n", + " sleepMs = 100\n", + " )\n", + "\n", + " print()\n", + " print_ok(f'Sent {real_jwt_request_count} requests with real Entra ID token')\n", + " print_info(f'The emit-metric policy will emit caller-requests with CallerId = {real_appid}')\n", + " print_info('Check Application Insights custom metrics or re-run the metric verification cell above')" + ] + }, + { + "cell_type": "markdown", + "id": "1ac1bf10", + "metadata": {}, + "source": [ + "### 🤖 AI Gateway: Token Consumption per 
Client\n", + "\n", + "When APIM is used as an **AI Gateway** in front of Azure OpenAI or other LLM backends, built-in model metrics\n", + "(PTU utilization, token counts) are reported at the deployment level and **do not break down by client**.\n", + "\n", + "This section demonstrates how to track **token consumption per calling application** using the `emit-metric`\n", + "policy. The policy extracts `prompt_tokens`, `completion_tokens`, and `total_tokens` from the response body\n", + "and emits `caller-tokens` custom metrics with `CallerId`, `TokenType`, and `Model` dimensions.\n", + "\n", + "> **Note:** To use this with a real backend, point the API's `serviceUrl` to your Azure OpenAI deployment.\n", + "> The policy only emits token metrics when the response body contains a `usage` object.\n", + "\n", + "| Dimension | Description |\n", + "|---|---|\n", + "| **CallerId** | Entra ID application (from JWT `appid`/`azp` claim) |\n", + "| **TokenType** | `prompt`, `completion`, or `total` |\n", + "| **Model** | AI model name (from response `model` field, or `unknown` if absent) |\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b15d79cc", + "metadata": {}, + "outputs": [], + "source": [ + "import base64\n", + "import json\n", + "import tempfile\n", + "import time\n", + "from pathlib import Path\n", + "\n", + "from apimrequests import ApimRequests\n", + "from azure_resources import get_apim_subscription_key, run\n", + "\n", + "if not enable_token_tracking:\n", + " print_info('Token tracking disabled (enable_token_tracking = False)')\n", + "elif 'apim_gateway_url' not in locals():\n", + " print_error('Please run the deployment cell first')\n", + " raise SystemExit(1)\n", + "else:\n", + " # --- Generate token-tracking traffic ---\n", + " print_info('Generating AI Gateway traffic with simulated callers...')\n", + " print_info('The emit-metric policy will extract/simulate token counts and emit caller-tokens metrics')\n", + " print()\n", + "\n", + 
" def make_fake_jwt_token(appid: str) -> str:\n", + " \"\"\"Create a minimal unsigned JWT with an appid claim.\"\"\"\n", + " header = base64.urlsafe_b64encode(json.dumps({'alg': 'none', 'typ': 'JWT'}).encode()).rstrip(b'=').decode()\n", + " payload = base64.urlsafe_b64encode(json.dumps({'appid': appid}).encode()).rstrip(b'=').decode()\n", + " return f'{header}.{payload}.'\n", + "\n", + " # Retrieve the per-API subscription key via RBAC-controlled ARM listSecrets\n", + " token_subscription_key = get_apim_subscription_key(\n", + " apim_name, rg_name, sid=f'api-{api_prefix}token-tracking-api'\n", + " )\n", + "\n", + " if not token_subscription_key:\n", + " print_error('Could not retrieve subscription key for the token-tracking API')\n", + " raise SystemExit(1)\n", + "\n", + " endpoint_url, request_headers, allow_insecure_tls = utils.get_endpoint(deployment, rg_name, apim_gateway_url)\n", + "\n", + " token_requests_per_caller = 20 # Fewer requests needed since each emits 3 metric entries\n", + "\n", + " for caller in simulated_callers:\n", + " caller_request_count = max(1, int(token_requests_per_caller * caller.get('request_weight', 1.0)))\n", + " fake_jwt = make_fake_jwt_token(caller['appid'])\n", + "\n", + " auth_headers = dict(request_headers) if request_headers else {}\n", + " auth_headers['Authorization'] = f'Bearer {fake_jwt}'\n", + "\n", + " reqs = ApimRequests(endpoint_url, token_subscription_key, auth_headers, allowInsecureTls=allow_insecure_tls)\n", + " reqs.multiGet(\n", + " f'/{token_api_path}/get',\n", + " caller_request_count,\n", + " msg = f'Generating {caller_request_count} requests for {caller[\"name\"]} ({caller[\"appid\"][:12]}...)',\n", + " printResponse = False,\n", + " sleepMs = 10\n", + " )\n", + "\n", + " # Display summary table\n", + " print()\n", + " print_info('AI Gateway Token Traffic Summary:')\n", + " print()\n", + " print(f' {\"App ID\":<40} {\"Caller Name\":<20} {\"Weight\":>8} {\"Requests\":>10} {\"~Tokens/Req\":>12}')\n", + " print(f' 
{\"-\" * 40} {\"-\" * 20} {\"-\" * 8} {\"-\" * 10} {\"-\" * 12}')\n", + " for caller in simulated_callers:\n", + " req_count = max(1, int(token_requests_per_caller * caller.get('request_weight', 1.0)))\n", + " print(f' {caller[\"appid\"]:<40} {caller[\"name\"]:<20} {caller[\"request_weight\"]:>8.1f} {req_count:>10} {\"~10-700\":>12}')\n", + " total_reqs = sum(max(1, int(token_requests_per_caller * c.get('request_weight', 1.0))) for c in simulated_callers)\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 8} {\"-\" * 10} {\"-\" * 12}')\n", + " print(f' {\"TOTAL\":<40} {\"\":>20} {\"\":>8} {total_reqs:>10} {\"\":>12}')\n", + " print()\n", + " print_info('Each request emits 3 caller-tokens metric entries (prompt, completion, total)')\n", + " print_info(f'Expected metric entries: ~{total_reqs * 3} + {total_reqs} caller-requests')\n", + " print_info('Note: Custom metrics typically take 5-10 minutes to appear in Application Insights')\n", + "\n", + " # --- Verify token metrics ---\n", + " print()\n", + " print_info('Waiting for caller-tokens metrics in Application Insights...')\n", + "\n", + " app_insights_resource_id = (\n", + " f'/subscriptions/{subscription_id}'\n", + " f'/resourceGroups/{rg_name}'\n", + " f'/providers/Microsoft.Insights/components/{app_insights_name}'\n", + " )\n", + "\n", + " # Load KQL from external file and prepend let bindings\n", + " token_kql_path = utils.determine_policy_path('verify-token-metric-ingestion.kql', sample_folder)\n", + " token_kql_template = Path(token_kql_path).read_text(encoding='utf-8')\n", + " token_kql = f\"let tokenType = '*';\\nlet timeWindow = 30m;\\n{token_kql_template}\"\n", + "\n", + " token_query_body = json.dumps({'query': token_kql})\n", + "\n", + " with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:\n", + " f.write(token_query_body)\n", + " token_query_file = f.name\n", + "\n", + " max_wait_minutes = 10\n", + " poll_interval_seconds = 30\n", + " max_attempts = (max_wait_minutes * 60) // 
poll_interval_seconds\n", + " token_metrics_found = False\n", + "\n", + " try:\n", + " for attempt in range(1, max_attempts + 1):\n", + " result = run(\n", + " f'az rest --method POST '\n", + " f'--url \"https://management.azure.com{app_insights_resource_id}/query?api-version=2018-04-20\" '\n", + " f'--body @{token_query_file} -o json'\n", + " )\n", + "\n", + " if not result.success:\n", + " print_error(f'Query failed: {result.text[:300]}')\n", + " break\n", + "\n", + " if result.json_data:\n", + " tables = result.json_data.get('tables', [])\n", + " if tables:\n", + " rows = tables[0].get('rows', [])\n", + " if rows and len(rows) > 0:\n", + " print_ok(f'Found {len(rows)} caller-tokens metric entries')\n", + " token_metrics_found = True\n", + " break\n", + "\n", + " elapsed = attempt * poll_interval_seconds\n", + " remaining = (max_wait_minutes * 60) - elapsed\n", + " print_info(f' No metrics yet... retrying in {poll_interval_seconds}s ({remaining}s remaining)')\n", + " time.sleep(poll_interval_seconds)\n", + " finally:\n", + " Path(token_query_file).unlink(missing_ok=True)\n", + "\n", + " if token_metrics_found:\n", + " print_ok('Token metric ingestion verified')\n", + " print()\n", + "\n", + " # Build caller lookup\n", + " caller_lookup = {c['appid']: c['name'] for c in simulated_callers}\n", + "\n", + " print_info('Token Consumption by Caller:')\n", + " print()\n", + " print(f' {\"Caller ID\":<40} {\"Name\":<20} {\"Type\":<12} {\"Tokens\":>10} {\"Model\":<15}')\n", + " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 12} {\"-\" * 10} {\"-\" * 15}')\n", + "\n", + " grand_total = 0\n", + " for row in rows:\n", + " caller_id = row[0]\n", + " token_type = row[1]\n", + " model = row[2]\n", + " tokens = int(row[3])\n", + " req_count = int(row[4])\n", + " name = caller_lookup.get(caller_id, caller_id[:12] + '...')\n", + " print(f' {caller_id:<40} {name:<20} {token_type:<12} {tokens:>10} {model:<15}')\n", + " if token_type == 'total':\n", + " grand_total += tokens\n", + "\n", 
+ " print(f' {\"-\" * 40} {\"-\" * 20} {\"-\" * 12} {\"-\" * 10} {\"-\" * 15}')\n", + " print(f' {\"GRAND TOTAL (total tokens)\":<40} {\"\":>20} {\"\":>12} {grand_total:>10}')\n", + " print()\n", + " print_info(\"In production, multiply total tokens by your model's per-token rate for cost allocation\")\n", + "\n", + " elif result.success:\n", + " print_warning(f'Token metrics did not appear within {max_wait_minutes} minutes')\n", + " print_info('Tips:')\n", + " print_info(' 1. Wait a few more minutes and re-run this cell')\n", + " print_info(' 2. Verify the emit-metric policy is applied in Azure Portal')\n", + " print_info(' 3. Check that enable_token_tracking = True in the init cell')\n" + ] + }, { "cell_type": "markdown", "id": "6ec1ac38", @@ -608,8 +1200,7 @@ " f'az monitor log-analytics workspace show '\n", " f'--resource-group {rg_name} '\n", " f'--workspace-name {log_analytics_name} '\n", - " f'--query id -o tsv',\n", - " log_command=False\n", + " f'--query id -o tsv'\n", " )\n", " workspace_id = workspace_result.text.strip()\n", "\n", @@ -623,8 +1214,7 @@ " f'--name {action_group_name} '\n", " f'--short-name apimcost '\n", " f'--action email cost-alert-email {alert_email} '\n", - " f'-o json',\n", - " log_command=False\n", + " f'-o json'\n", " )\n", "\n", " if ag_result.success:\n", @@ -696,8 +1286,7 @@ " result = run(\n", " f'az rest --method PUT '\n", " f'--uri https://management.azure.com{alert_id}?api-version=2023-03-15-preview '\n", - " f'--body @{alert_body_path}',\n", - " log_command=False\n", + " f'--body @{alert_body_path}'\n", " )\n", " finally:\n", " Path(alert_body_path).unlink(missing_ok=True)\n", diff --git a/samples/costing/emit_metric_caller_id.xml b/samples/costing/emit_metric_caller_id.xml new file mode 100644 index 00000000..5610d7d8 --- /dev/null +++ b/samples/costing/emit_metric_caller_id.xml @@ -0,0 +1,32 @@ + + + + + + + + + + + + + + + + + + + + + + diff --git a/samples/costing/emit_metric_caller_tokens.xml 
b/samples/costing/emit_metric_caller_tokens.xml new file mode 100644 index 00000000..dafae127 --- /dev/null +++ b/samples/costing/emit_metric_caller_tokens.xml @@ -0,0 +1,121 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + (); + }" /> + () : 0; + }" /> + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/samples/costing/main.bicep b/samples/costing/main.bicep index 5dbee471..d05545f7 100644 --- a/samples/costing/main.bicep +++ b/samples/costing/main.bicep @@ -48,6 +48,9 @@ param apis array = [] @description('Array of business units to create subscriptions for') param businessUnits array = [] +@description('Array of policy fragments to deploy') +param policyFragments array = [] + // ------------------ // VARIABLES @@ -70,15 +73,34 @@ resource apimService 'Microsoft.ApiManagement/service@2024-06-01-preview' existi name: apimName } +// APIM Policy Fragments +module policyFragmentModule '../../shared/bicep/modules/apim/v1/policy-fragment.bicep' = [for pf in policyFragments: { + name: 'pf-${pf.name}' + params: { + apimName: apimName + policyFragmentName: pf.name + policyFragmentDescription: pf.description + policyFragmentValue: pf.policyXml + } +}] + // APIM APIs +// Use the costing sample's own App Insights logger so that emit-metric +// custom metrics (caller-requests, caller-tokens) flow to the costing +// App Insights resource instead of the infrastructure-level one. 
module apisModule '../../shared/bicep/modules/apim/v1/api.bicep' = [for api in apis: if(!empty(apis)) { name: 'api-${api.name}' params: { apimName: apimName + apimLoggerName: 'applicationinsights-logger' appInsightsInstrumentationKey: appInsightsInstrKey appInsightsId: appInsightsResourceId api: api } + dependsOn: [ + apimDiagnosticsModule + policyFragmentModule + ] }] // Create subscriptions for different business units @@ -179,6 +201,14 @@ module apimDiagnosticsModule '../../shared/bicep/modules/apim/v1/diagnostics.bic } +// The workbook JSON contains '__APP_INSIGHTS_NAME__' tokens in cross-resource +// KQL queries (Entra ID tab). Replace them with the Application Insights AppId +// (GUID) so the app() function resolves correctly at runtime. +#disable-next-line BCP318 +var appInsightsAppId = enableApplicationInsights ? applicationInsightsModule.outputs.appId : '' +var rawWorkbookJson = string(loadJsonContent('workbook.json')) +var workbookJsonWithAppInsights = replace(rawWorkbookJson, '__APP_INSIGHTS_NAME__', appInsightsAppId) + // https://learn.microsoft.com/azure/templates/microsoft.insights/workbooks resource workbook 'Microsoft.Insights/workbooks@2023-06-01' = if (enableLogAnalytics) { name: guid(resourceGroup().id, 'apim-costing-workbook', string(index)) @@ -186,7 +216,7 @@ resource workbook 'Microsoft.Insights/workbooks@2023-06-01' = if (enableLogAnaly kind: 'shared' properties: { displayName: workbookName - serializedData: string(loadJsonContent('workbook.json')) + serializedData: workbookJsonWithAppInsights version: '1.0' #disable-next-line BCP318 sourceId: enableLogAnalytics ? 
logAnalyticsModule.outputs.id : '' @@ -265,3 +295,8 @@ output subscriptionKeys array = [for (bu, i) in businessUnits: { name: bu.name primaryKey: listSecrets(subscriptions[i].id, '2024-06-01-preview').primaryKey }] + +@description('Per-API subscription metadata (subscription keys are not exposed; retrieve keys via APIM RBAC-controlled mechanisms)') +output apiSubscriptionKeys array = [for (api, i) in apis: { + name: api.name +}] diff --git a/samples/costing/pf-extract-caller-id.xml b/samples/costing/pf-extract-caller-id.xml new file mode 100644 index 00000000..2599a329 --- /dev/null +++ b/samples/costing/pf-extract-caller-id.xml @@ -0,0 +1,40 @@ + + + = 2) + { + var base64 = parts[1].Replace('-', '+').Replace('_', '/'); + var padded = base64.PadRight(base64.Length + (4 - base64.Length % 4) % 4, '='); + var payload = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(padded)); + var json = Newtonsoft.Json.Linq.JObject.Parse(payload); + var appId = json["appid"]?.ToString() ?? json["azp"]?.ToString(); + if (!string.IsNullOrEmpty(appId)) { return appId; } + } + } + catch { } + } + return context.Subscription?.Id ?? 
"unknown"; + }" /> + diff --git a/samples/costing/screenshots/AIGateway-01.png b/samples/costing/screenshots/AIGateway-01.png new file mode 100644 index 00000000..2874dcaf Binary files /dev/null and b/samples/costing/screenshots/AIGateway-01.png differ diff --git a/samples/costing/screenshots/AIGateway-02.png b/samples/costing/screenshots/AIGateway-02.png new file mode 100644 index 00000000..5dd41847 Binary files /dev/null and b/samples/costing/screenshots/AIGateway-02.png differ diff --git a/samples/costing/screenshots/AIGateway-03.png b/samples/costing/screenshots/AIGateway-03.png new file mode 100644 index 00000000..6d6e2746 Binary files /dev/null and b/samples/costing/screenshots/AIGateway-03.png differ diff --git a/samples/costing/screenshots/AIGateway-04.png b/samples/costing/screenshots/AIGateway-04.png new file mode 100644 index 00000000..dab1c997 Binary files /dev/null and b/samples/costing/screenshots/AIGateway-04.png differ diff --git a/samples/costing/screenshots/Dashboard-06.png b/samples/costing/screenshots/Dashboard-06.png new file mode 100644 index 00000000..66c46dc8 Binary files /dev/null and b/samples/costing/screenshots/Dashboard-06.png differ diff --git a/samples/costing/screenshots/EntraID-01.png b/samples/costing/screenshots/EntraID-01.png new file mode 100644 index 00000000..250715ab Binary files /dev/null and b/samples/costing/screenshots/EntraID-01.png differ diff --git a/samples/costing/screenshots/EntraID-02.png b/samples/costing/screenshots/EntraID-02.png new file mode 100644 index 00000000..76c0552e Binary files /dev/null and b/samples/costing/screenshots/EntraID-02.png differ diff --git a/samples/costing/screenshots/EntraID-03.png b/samples/costing/screenshots/EntraID-03.png new file mode 100644 index 00000000..8e83c7d4 Binary files /dev/null and b/samples/costing/screenshots/EntraID-03.png differ diff --git a/samples/costing/screenshots/README.md b/samples/costing/screenshots/README.md index 0c21cb0a..f39984cd 100644 --- 
a/samples/costing/screenshots/README.md +++ b/samples/costing/screenshots/README.md @@ -33,3 +33,39 @@ This directory contains screenshots showing expected results after running the c ### Response Code Analysis ![Dashboard - Response Code Analysis](Dashboard-05.png) + +### Drill-Down Details + +![Dashboard - Drill-Down Details](Dashboard-06.png) + +## Entra ID Application Costing Tab + +### Usage by Caller ID + +![Entra ID - Usage by Caller ID](EntraID-01.png) + +### Cost Allocation + +![Entra ID - Cost Allocation](EntraID-02.png) + +### Request Trend + +![Entra ID - Request Trend](EntraID-03.png) + +## AI Gateway Token/PTU Tab + +### Token Consumption by Client + +![AI Gateway - Token Consumption by Client](AIGateway-01.png) + +### Token Cost Allocation + +![AI Gateway - Token Cost Allocation](AIGateway-02.png) + +### Token Trends & PTU Utilization + +![AI Gateway - Token Trends & PTU Utilization](AIGateway-03.png) + +### Model & Caller Breakdown + +![AI Gateway - Model & Caller Breakdown](AIGateway-04.png) diff --git a/samples/costing/verify-metric-breakdown.kql b/samples/costing/verify-metric-breakdown.kql new file mode 100644 index 00000000..8eaf4a5d --- /dev/null +++ b/samples/costing/verify-metric-breakdown.kql @@ -0,0 +1,9 @@ +// Per-caller breakdown of caller-requests custom metrics from +// the emit-metric policy (Entra ID application tracking). +// +// Returns one row per calling application with the total request count. 
+customMetrics +| where name == 'caller-requests' +| where isnotempty(customDimensions.CallerId) +| summarize Requests = sum(value) by CallerId = tostring(customDimensions.CallerId) +| order by Requests desc diff --git a/samples/costing/verify-metric-ingestion.kql b/samples/costing/verify-metric-ingestion.kql new file mode 100644 index 00000000..f8fbca18 --- /dev/null +++ b/samples/costing/verify-metric-ingestion.kql @@ -0,0 +1,7 @@ +// Checks whether the emit-metric policy is emitting caller-requests +// custom metrics into Application Insights. +// Used to verify that the policy is applied and metrics are flowing. +customMetrics +| where name == 'caller-requests' +| where isnotempty(customDimensions.CallerId) +| summarize Count = sum(value) diff --git a/samples/costing/verify-token-metric-ingestion.kql b/samples/costing/verify-token-metric-ingestion.kql new file mode 100644 index 00000000..c013e35a --- /dev/null +++ b/samples/costing/verify-token-metric-ingestion.kql @@ -0,0 +1,21 @@ +// Checks whether the emit-metric policy is emitting caller-tokens +// custom metrics into Application Insights for AI Gateway token tracking. +// +// Parameters (prepend as KQL 'let' bindings before running): +// let tokenType = '*'; // Options: 'prompt', 'completion', 'total', '*' (all) +// let timeWindow = 30m; // How far back to look (e.g. 30m, 1h, 24h) +// +// Used to verify that the token tracking policy is applied and +// per-caller token consumption metrics are flowing. 
+customMetrics +| where name == 'caller-tokens' +| where timestamp > ago(timeWindow) +| where isnotempty(customDimensions.CallerId) +| where tokenType == '*' or tostring(customDimensions.TokenType) == tokenType +| summarize + TotalTokens = sum(value), + Requests = count() + by CallerId = tostring(customDimensions.CallerId), + TokenType = tostring(customDimensions.TokenType), + Model = tostring(customDimensions.Model) +| order by TotalTokens desc diff --git a/samples/costing/workbook.json b/samples/costing/workbook.json index 70a02a38..dc8557ee 100644 --- a/samples/costing/workbook.json +++ b/samples/costing/workbook.json @@ -30,48 +30,12 @@ "version": "KqlParameterItem/1.0" }, { - "id": "c1a2b3d4-e5f6-7890-abcd-ef1234567890", - "isRequired": true, - "label": "Base Monthly APIM Cost ($)", - "name": "BaseMonthlyCost", - "type": 1, - "typeSettings": { - "paramValidationRules": [ - { - "match": true, - "message": "Enter a valid dollar amount (e.g. 150.00)", - "regExp": "^\\d+(\\.\\d{1,2})?$" - } - ] - }, - "value": "150.00", - "version": "KqlParameterItem/1.0" - }, - { - "id": "d2b3c4e5-f6a7-8901-bcde-f12345678901", - "isRequired": true, - "label": "Variable Cost per 1000 Requests ($)", - "name": "PerRequestRate", - "type": 1, - "typeSettings": { - "paramValidationRules": [ - { - "match": true, - "message": "Enter a valid rate (e.g. 
0.003)", - "regExp": "^\\d+(\\.\\d{1,6})?$" - } - ] - }, - "value": "0.003", - "version": "KqlParameterItem/1.0" - }, - { - "id": "e3c4d5f6-a7b8-9012-cdef-234567890abc", + "id": "f0a1b2c3-d4e5-6789-abcd-tab000000001", "isHiddenWhenLocked": true, - "label": "Selected Business Unit", - "name": "SelectedBusinessUnit", + "label": "Selected Tab", + "name": "selectedTab", "type": 1, - "value": "*", + "value": "subscription", "version": "KqlParameterItem/1.0" } ], @@ -80,368 +44,1114 @@ "style": "pills", "version": "KqlParameterItem/1.0" }, - "name": "parameters - 0", + "name": "parameters - shared", "type": 9 }, { "content": { - "json": "## APIM Cost Allocation & Showback Dashboard\n\nThis workbook splits the **base APIM infrastructure cost** across business units proportionally by usage, then adds **variable per-API costs** based on request volume.\n\n| Parameter | Description |\n|---|---|\n| **Base Monthly APIM Cost** | Fixed platform cost (SKU, networking, etc.) split proportionally by request share |\n| **Variable Cost per 1K Requests** | Usage-based rate applied on top of the base allocation |\n\n> Adjust parameters above to model different pricing scenarios." - }, - "name": "text - header", - "type": 1 - }, - { - "content": { - "expandable": true, - "expanded": true, - "groupType": "editable", - "items": [ + "links": [ { - "content": { - "json": "| Component | Formula |\n|---|---|\n| **Base Cost** | Monthly platform cost for the APIM SKU (see parameter above): **${BaseMonthlyCost}** |\n| **Base Cost Share** | `Base Monthly Cost x (BU Requests / Total Requests)` |\n| **Variable Cost** | `BU Requests x (Rate per 1K / 1000)` |\n| **Total Allocated** | `Base Cost Share + Variable Cost` |\n\n> The base monthly cost and variable rate parameters are editable above. 
Use the notebook's pricing lookup cell to auto-detect values from the [Azure Retail Prices API](https://learn.microsoft.com/rest/api/cost-management/retail-prices/azure-retail-prices) and keep them in sync with your APIM SKU." - }, - "name": "text - cost-model-detail", - "type": 1 + "cellValue": "selectedTab", + "id": "f0a1b2c3-d4e5-6789-abcd-tab000000010", + "linkTarget": "parameter", + "linkLabel": "Subscription-Based Costing", + "preText": "", + "style": "link", + "subTarget": "subscription" + }, + { + "cellValue": "selectedTab", + "id": "f0a1b2c3-d4e5-6789-abcd-tab000000011", + "linkTarget": "parameter", + "linkLabel": "Entra ID Application Costing", + "preText": "", + "style": "link", + "subTarget": "entraid" + }, + { + "cellValue": "selectedTab", + "id": "f0a1b2c3-d4e5-6789-abcd-tab000000012", + "linkTarget": "parameter", + "linkLabel": "AI Gateway Token/PTU", + "preText": "", + "style": "link", + "subTarget": "aigateway" } ], - "loadType": "always", - "title": "Cost Allocation Model", - "version": "NotebookGroup/1.0" + "style": "tabs", + "version": "LinkItem/1.0" }, - "name": "group - cost-model", - "type": 12 + "name": "links - tabs", + "type": 11 }, { + "conditionalVisibility": { + "comparison": "isEqualTo", + "parameterName": "selectedTab", + "value": "subscription" + }, "content": { - "expandable": true, - "expanded": true, "groupType": "editable", "items": [ { "content": { - "exportDefaultValue": "*", - "exportFieldName": "Business Unit", - "exportParameterName": "SelectedBusinessUnit", - "gridSettings": { - "formatters": [ - { - "columnMatch": "Usage Share (%)", - "formatOptions": { - "max": 100, - "min": 0, - "palette": "blue" - }, - "formatter": 8 - }, - { - "columnMatch": "Base Cost Share ($)", - "formatOptions": { - "min": 0, - "palette": "blue" - }, - "formatter": 8 - }, - { - "columnMatch": "Total Allocated ($)", - "formatOptions": { - "min": 0, - "palette": "turquoise" - }, - "formatter": 8 - } - ] - }, - "query": "let baseCost = 
todouble('{BaseMonthlyCost}');\r\nlet perKRate = todouble('{PerRequestRate}');\r\nlet logs = ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\r\nlet totalRequests = toscalar(logs | summarize count());\r\nlogs\r\n| summarize RequestCount = count() by ApimSubscriptionId\r\n| extend UsageShare = round(RequestCount * 100.0 / totalRequests, 2)\r\n| extend BaseCostShare = round(baseCost * RequestCount / totalRequests, 2)\r\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\r\n| extend TotalAllocatedCost = round(BaseCostShare + VariableCost, 2)\r\n| order by TotalAllocatedCost desc\r\n| project\r\n ['Business Unit'] = ApimSubscriptionId,\r\n ['Requests'] = RequestCount,\r\n ['Usage Share (%)'] = UsageShare,\r\n ['Base Cost ($)'] = baseCost,\r\n ['Base Cost Share ($)'] = BaseCostShare,\r\n ['Variable Cost ($)'] = VariableCost,\r\n ['Total Allocated ($)'] = TotalAllocatedCost", + "parameters": [ + { + "id": "c1a2b3d4-e5f6-7890-abcd-ef1234567890", + "isRequired": true, + "label": "Base Monthly APIM Cost ($)", + "name": "BaseMonthlyCost", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid dollar amount (e.g. 150.00)", + "regExp": "^\\d+(\\.\\d{1,2})?$" + } + ] + }, + "value": "150.00", + "version": "KqlParameterItem/1.0" + }, + { + "id": "d2b3c4e5-f6a7-8901-bcde-f12345678901", + "isRequired": true, + "label": "Variable Cost per 1000 Requests ($)", + "name": "PerRequestRate", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid rate (e.g. 
0.003)", + "regExp": "^\\d+(\\.\\d{1,6})?$" + } + ] + }, + "value": "0.003", + "version": "KqlParameterItem/1.0" + }, + { + "id": "e3c4d5f6-a7b8-9012-cdef-234567890abc", + "isHiddenWhenLocked": true, + "label": "Selected Business Unit", + "name": "SelectedBusinessUnit", + "type": 1, + "value": "*", + "version": "KqlParameterItem/1.0" + } + ], "queryType": 0, "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Cost Allocation by Business Unit (click a row to filter charts below)", - "version": "KqlItem/1.0", - "visualization": "table" + "style": "pills", + "version": "KqlParameterItem/1.0" }, - "name": "query - cost-allocation-table", - "type": 3 + "name": "parameters - subscription", + "type": 9 }, { "content": { - "chartSettings": { - "customThresholdLine": "{BaseMonthlyCost}", - "customThresholdLineStyle": 1, - "seriesLabelSettings": [ - { "color": "blue", "label": "Base Cost ($)", "seriesName": "BaseCostShare" }, - { "color": "orange", "label": "Variable Cost ($)", "seriesName": "VariableCost" } - ], - "xAxis": "ApimSubscriptionId", - "ySettings": { - "min": 0 + "json": "## APIM Cost Allocation & Showback Dashboard\n\nThis tab splits the **base APIM infrastructure cost** across business units proportionally by usage, then adds **variable per-API costs** based on request volume. Data comes from `ApiManagementGatewayLogs` in Log Analytics, keyed by APIM subscription ID.\n\n| Parameter | Description |\n|---|---|\n| **Base Monthly APIM Cost** | Fixed platform cost (SKU, networking, etc.) split proportionally by request share |\n| **Variable Cost per 1K Requests** | Usage-based rate applied on top of the base allocation |\n\n> Adjust parameters above to model different pricing scenarios." 
+ }, + "name": "text - header-subscription", + "type": 1 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "json": "| Component | Formula |\n|---|---|\n| **Base Cost** | Monthly platform cost for the APIM SKU (see parameter above): **${BaseMonthlyCost}** |\n| **Base Cost Share** | `Base Monthly Cost x (BU Requests / Total Requests)` |\n| **Variable Cost** | `BU Requests x (Rate per 1K / 1000)` |\n| **Total Allocated** | `Base Cost Share + Variable Cost` |\n\n> The base monthly cost and variable rate parameters are editable above. Use the notebook's pricing lookup cell to auto-detect values from the [Azure Retail Prices API](https://learn.microsoft.com/rest/api/cost-management/retail-prices/azure-retail-prices) and keep them in sync with your APIM SKU." + }, + "name": "text - cost-model-detail", + "type": 1 } - }, - "query": "let baseCost = todouble('{BaseMonthlyCost}');\r\nlet perKRate = todouble('{PerRequestRate}');\r\nlet logs = ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\r\nlet totalRequests = toscalar(logs | summarize count());\r\nlogs\r\n| summarize RequestCount = count() by ApimSubscriptionId\r\n| extend BaseCostShare = round(baseCost * RequestCount / totalRequests, 2)\r\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\r\n| project ApimSubscriptionId, BaseCostShare, VariableCost", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Base vs Variable Cost Split by Business Unit", - "version": "KqlItem/1.0", - "visualization": "barchart" + ], + "loadType": "always", + "title": "Cost Allocation Model", + "version": "NotebookGroup/1.0" }, - "name": "query - cost-allocation-chart", - "type": 3 + "name": "group - cost-model", + "type": 12 }, { "content": { - "gridSettings": { - "formatters": [ - { - "columnMatch": "Total 
($)", - "formatOptions": { - "min": 0, - "palette": "turquoise" - }, - "formatter": 8 - } - ] - }, - "query": "let baseCost = todouble('{BaseMonthlyCost}');\r\nlet perKRate = todouble('{PerRequestRate}');\r\nlet selectedBU = '{SelectedBusinessUnit}';\r\nlet logs = ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\r\nlet filteredLogs = logs\r\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU;\r\nlet totalRequests = toscalar(logs | summarize count());\r\nfilteredLogs\r\n| summarize RequestCount = count() by ApimSubscriptionId, ApiId\r\n| extend BaseCostShare = round(baseCost * RequestCount / totalRequests, 2)\r\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\r\n| extend TotalCost = round(BaseCostShare + VariableCost, 2)\r\n| order by TotalCost desc\r\n| project\r\n ['Business Unit'] = ApimSubscriptionId,\r\n ['API'] = ApiId,\r\n ['Requests'] = RequestCount,\r\n ['Base Share ($)'] = BaseCostShare,\r\n ['Variable ($)'] = VariableCost,\r\n ['Total ($)'] = TotalCost\r\n| take 25", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Cost Breakdown by Business Unit & API (Top 25)", - "version": "KqlItem/1.0", - "visualization": "table" + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "exportDefaultValue": "*", + "exportFieldName": "Business Unit", + "exportParameterName": "SelectedBusinessUnit", + "gridSettings": { + "formatters": [ + { + "columnMatch": "Usage Share (%)", + "formatOptions": { + "max": 100, + "min": 0, + "palette": "blue" + }, + "formatter": 8 + }, + { + "columnMatch": "Base Cost Share ($)", + "formatOptions": { + "min": 0, + "palette": "blue" + }, + "formatter": 8 + }, + { + "columnMatch": "Total Allocated ($)", + "formatOptions": { + "min": 0, + "palette": "turquoise" + }, + "formatter": 8 + } + ] + }, + "query": "let baseCost 
= todouble('{BaseMonthlyCost}');\nlet perKRate = todouble('{PerRequestRate}');\nlet logs = ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\nlet totalRequests = toscalar(logs | summarize count());\nlogs\n| summarize RequestCount = count() by ApimSubscriptionId\n| extend UsageShare = round(RequestCount * 100.0 / totalRequests, 2)\n| extend BaseCostShare = round(baseCost * RequestCount / totalRequests, 2)\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\n| extend TotalAllocatedCost = round(BaseCostShare + VariableCost, 2)\n| order by TotalAllocatedCost desc\n| project\n ['Business Unit'] = ApimSubscriptionId,\n ['Requests'] = RequestCount,\n ['Usage Share (%)'] = UsageShare,\n ['Base Cost ($)'] = baseCost,\n ['Base Cost Share ($)'] = BaseCostShare,\n ['Variable Cost ($)'] = VariableCost,\n ['Total Allocated ($)'] = TotalAllocatedCost", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Cost Allocation by Business Unit (click a row to filter charts below)", + "version": "KqlItem/1.0", + "visualization": "table" + }, + "name": "query - cost-allocation-table", + "type": 3 + }, + { + "content": { + "chartSettings": { + "customThresholdLine": "{BaseMonthlyCost}", + "customThresholdLineStyle": 1, + "seriesLabelSettings": [ + { "color": "blue", "label": "Base Cost ($)", "seriesName": "BaseCostShare" }, + { "color": "orange", "label": "Variable Cost ($)", "seriesName": "VariableCost" } + ], + "xAxis": "ApimSubscriptionId", + "ySettings": { + "min": 0 + } + }, + "query": "let baseCost = todouble('{BaseMonthlyCost}');\nlet perKRate = todouble('{PerRequestRate}');\nlet logs = ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\nlet totalRequests = toscalar(logs | summarize count());\nlogs\n| summarize RequestCount = count() by ApimSubscriptionId\n| extend BaseCostShare = 
round(baseCost * RequestCount / totalRequests, 2)\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\n| project ApimSubscriptionId, BaseCostShare, VariableCost", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Base vs Variable Cost Split by Business Unit", + "version": "KqlItem/1.0", + "visualization": "barchart" + }, + "name": "query - cost-allocation-chart", + "type": 3 + }, + { + "content": { + "gridSettings": { + "formatters": [ + { + "columnMatch": "Total ($)", + "formatOptions": { + "min": 0, + "palette": "turquoise" + }, + "formatter": 8 + } + ] + }, + "query": "let baseCost = todouble('{BaseMonthlyCost}');\nlet perKRate = todouble('{PerRequestRate}');\nlet selectedBU = '{SelectedBusinessUnit}';\nlet logs = ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\nlet filteredLogs = logs\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU;\nlet totalRequests = toscalar(logs | summarize count());\nfilteredLogs\n| summarize RequestCount = count() by ApimSubscriptionId, ApiId\n| extend BaseCostShare = round(baseCost * RequestCount / totalRequests, 2)\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 2)\n| extend TotalCost = round(BaseCostShare + VariableCost, 2)\n| order by TotalCost desc\n| project\n ['Business Unit'] = ApimSubscriptionId,\n ['API'] = ApiId,\n ['Requests'] = RequestCount,\n ['Base Share ($)'] = BaseCostShare,\n ['Variable ($)'] = VariableCost,\n ['Total ($)'] = TotalCost\n| take 25", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Cost Breakdown by Business Unit & API (Top 25)", + "version": "KqlItem/1.0", + "visualization": "table" + }, + "name": "query - cost-per-api", + "type": 3 + } + ], + "loadType": "always", + "title": "Cost Allocation Summary", + 
"version": "NotebookGroup/1.0" }, - "name": "query - cost-per-api", - "type": 3 - } - ], - "loadType": "always", - "title": "Cost Allocation Summary", - "version": "NotebookGroup/1.0" - }, - "name": "group - cost-allocation", - "type": 12 - }, - { - "content": { - "expandable": true, - "expanded": true, - "groupType": "editable", - "items": [ + "name": "group - cost-allocation", + "type": 12 + }, { "content": { - "chartSettings": { - "ySettings": { - "min": 0 + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "chartSettings": { + "ySettings": { + "min": 0 + } + }, + "query": "ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\n| summarize RequestCount = count() by ApimSubscriptionId\n| order by RequestCount desc\n| project BusinessUnit = ApimSubscriptionId, RequestCount", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Request Count by Business Unit", + "version": "KqlItem/1.0", + "visualization": "barchart" + }, + "name": "query - request-count", + "type": 3 + }, + { + "content": { + "chartSettings": { + "seriesLabelSettings": [ + { "color": "blue", "seriesName": "RequestCount" } + ] + }, + "query": "let apimLogs = ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\nlet totalCount = toscalar(apimLogs | count);\napimLogs\n| summarize RequestCount = count() by ApimSubscriptionId\n| extend Percentage = round(RequestCount * 100.0 / totalCount, 2)\n| order by RequestCount desc\n| project BusinessUnit = ApimSubscriptionId, RequestCount, ['Percentage (%)'] = Percentage", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Request Distribution Across Business Units", + "version": "KqlItem/1.0", + "visualization": "piechart" + }, + 
"name": "query - distribution", + "type": 3 + }, + { + "content": { + "query": "let selectedBU = '{SelectedBusinessUnit}';\nApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\n| summarize RequestCount = count() by bin(TimeGenerated, 1h), ApimSubscriptionId\n| project TimeGenerated, BusinessUnit = ApimSubscriptionId, RequestCount", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Request Trends Over Time by Business Unit", + "version": "KqlItem/1.0", + "visualization": "timechart" + }, + "name": "query - trends", + "type": 3 } - }, - "query": "ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\r\n| summarize RequestCount = count() by ApimSubscriptionId\r\n| order by RequestCount desc\r\n| project BusinessUnit = ApimSubscriptionId, RequestCount", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Request Count by Business Unit", - "version": "KqlItem/1.0", - "visualization": "barchart" + ], + "loadType": "always", + "title": "Usage Analytics", + "version": "NotebookGroup/1.0" }, - "name": "query - request-count", - "type": 3 + "name": "group - usage-analytics", + "type": 12 }, { "content": { - "chartSettings": { - "seriesLabelSettings": [ - { "color": "blue", "seriesName": "RequestCount" } - ] - }, - "query": "let apimLogs = ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\r\nlet totalCount = toscalar(apimLogs | count);\r\napimLogs\r\n| summarize RequestCount = count() by ApimSubscriptionId\r\n| extend Percentage = round(RequestCount * 100.0 / totalCount, 2)\r\n| order by RequestCount desc\r\n| project BusinessUnit = ApimSubscriptionId, RequestCount, ['Percentage (%)'] = 
Percentage", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Request Distribution Across Business Units", - "version": "KqlItem/1.0", - "visualization": "piechart" + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "gridSettings": { + "formatters": [ + { + "columnMatch": "Success Rate (%)", + "formatOptions": { + "max": 100, + "min": 0, + "palette": "redGreen" + }, + "formatter": 8 + }, + { + "columnMatch": "Error Rate (%)", + "formatOptions": { + "max": 100, + "min": 0, + "palette": "greenRed" + }, + "formatter": 8 + } + ] + }, + "query": "let selectedBU = '{SelectedBusinessUnit}';\nApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\n| summarize \n TotalRequests = count(),\n SuccessRequests = countif(ResponseCode < 400),\n ClientErrors = countif(ResponseCode >= 400 and ResponseCode < 500),\n ServerErrors = countif(ResponseCode >= 500)\n by ApimSubscriptionId\n| extend SuccessRate = round(SuccessRequests * 100.0 / TotalRequests, 2)\n| extend ErrorRate = round((ClientErrors + ServerErrors) * 100.0 / TotalRequests, 2)\n| project \n BusinessUnit = ApimSubscriptionId, \n TotalRequests, \n SuccessRequests, \n ClientErrors, \n ServerErrors, \n ['Success Rate (%)'] = SuccessRate,\n ['Error Rate (%)'] = ErrorRate\n| order by TotalRequests desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Success & Error Metrics by Business Unit", + "version": "KqlItem/1.0", + "visualization": "table" + }, + "name": "query - success-errors", + "type": 3 + }, + { + "content": { + "chartSettings": { + "seriesLabelSettings": [ + { "color": "blue", "label": "2xx Success", "seriesName": "2xx" }, + { "color": "turquoise", 
"label": "3xx Redirect", "seriesName": "3xx" }, + { "color": "orange", "label": "4xx Client Error", "seriesName": "4xx" }, + { "color": "redBright", "label": "5xx Server Error", "seriesName": "5xx" } + ] + }, + "query": "let selectedBU = '{SelectedBusinessUnit}';\nApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\n| extend ResponseClass = case(\n ResponseCode >= 200 and ResponseCode < 300, '2xx',\n ResponseCode >= 300 and ResponseCode < 400, '3xx',\n ResponseCode >= 400 and ResponseCode < 500, '4xx',\n ResponseCode >= 500, '5xx',\n 'Other')\n| summarize RequestCount = count() by ApimSubscriptionId, ResponseClass\n| order by ApimSubscriptionId, ResponseClass\n| project BusinessUnit = ApimSubscriptionId, ResponseClass, RequestCount", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Response Code Distribution by Business Unit", + "version": "KqlItem/1.0", + "visualization": "categoricalbar" + }, + "name": "query - response-codes", + "type": 3 + } + ], + "loadType": "always", + "title": "Health & Reliability", + "version": "NotebookGroup/1.0" }, - "name": "query - distribution", - "type": 3 + "name": "group - health", + "type": 12 }, { "content": { - "query": "let selectedBU = '{SelectedBusinessUnit}';\r\nApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\r\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\r\n| summarize RequestCount = count() by bin(TimeGenerated, 1h), ApimSubscriptionId\r\n| project TimeGenerated, BusinessUnit = ApimSubscriptionId, RequestCount", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Request Trends Over Time by Business Unit", - "version": "KqlItem/1.0", - "visualization": 
"timechart" + "expandable": true, + "expanded": false, + "groupType": "editable", + "items": [ + { + "content": { + "json": "Select a **Business Unit** row in the Cost Allocation table above to see that unit's cost trend over time.\n\nThe chart shows daily base cost share and variable cost for the selected business unit. The horizontal line represents the total base monthly cost (${BaseMonthlyCost}) for reference." + }, + "name": "text - drilldown-help", + "type": 1 + }, + { + "content": { + "chartSettings": { + "customThresholdLine": "{BaseMonthlyCost}", + "customThresholdLineStyle": 1, + "seriesLabelSettings": [ + { "color": "blue", "label": "Base Cost Share ($)", "seriesName": "BaseCostShare" }, + { "color": "orange", "label": "Variable Cost ($)", "seriesName": "VariableCost" }, + { "color": "purple", "label": "Total Allocated ($)", "seriesName": "TotalAllocatedCost" } + ], + "ySettings": { + "min": 0 + } + }, + "query": "let baseCost = todouble('{BaseMonthlyCost}');\nlet perKRate = todouble('{PerRequestRate}');\nlet selectedBU = '{SelectedBusinessUnit}';\nlet allLogs = ApiManagementGatewayLogs\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\nlet dailyTotal = allLogs\n| summarize DayTotal = count() by bin(TimeGenerated, 1d);\nallLogs\n| where selectedBU != '*' and ApimSubscriptionId == selectedBU\n| summarize RequestCount = count() by bin(TimeGenerated, 1d)\n| join kind=inner dailyTotal on TimeGenerated\n| extend BaseCostShare = round(baseCost * RequestCount / DayTotal, 2)\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 4)\n| extend TotalAllocatedCost = round(BaseCostShare + VariableCost, 2)\n| project TimeGenerated, BaseCostShare, VariableCost, TotalAllocatedCost", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 2592000000 + }, + "title": "Cost Trend for: {SelectedBusinessUnit}", + "version": "KqlItem/1.0", + "visualization": "linechart" + }, + 
"conditionalVisibility": { + "comparison": "isNotEqualTo", + "parameterName": "SelectedBusinessUnit", + "value": "*" + }, + "name": "query - drilldown-cost-trend", + "type": 3 + }, + { + "content": { + "json": "> Select a business unit row in the Cost Allocation table to view its cost trend.", + "style": "info" + }, + "conditionalVisibility": { + "comparison": "isEqualTo", + "parameterName": "SelectedBusinessUnit", + "value": "*" + }, + "name": "text - drilldown-placeholder", + "type": 1 + } + ], + "loadType": "always", + "title": "Business Unit Drill-Down (over time)", + "version": "NotebookGroup/1.0" }, - "name": "query - trends", - "type": 3 + "name": "group - drilldown", + "type": 12 } ], "loadType": "always", - "title": "Usage Analytics", "version": "NotebookGroup/1.0" }, - "name": "group - usage-analytics", + "name": "group - tab-subscription", "type": 12 }, { + "conditionalVisibility": { + "comparison": "isEqualTo", + "parameterName": "selectedTab", + "value": "entraid" + }, "content": { - "expandable": true, - "expanded": true, "groupType": "editable", "items": [ { "content": { - "gridSettings": { - "formatters": [ - { - "columnMatch": "Success Rate (%)", - "formatOptions": { - "max": 100, - "min": 0, - "palette": "redGreen" - }, - "formatter": 8 - }, - { - "columnMatch": "Error Rate (%)", - "formatOptions": { - "max": 100, - "min": 0, - "palette": "greenRed" - }, - "formatter": 8 - } - ] - }, - "query": "let selectedBU = '{SelectedBusinessUnit}';\r\nApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != ''\r\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\r\n| summarize \r\n TotalRequests = count(),\r\n SuccessRequests = countif(ResponseCode < 400),\r\n ClientErrors = countif(ResponseCode >= 400 and ResponseCode < 500),\r\n ServerErrors = countif(ResponseCode >= 500)\r\n by ApimSubscriptionId\r\n| extend SuccessRate = round(SuccessRequests * 100.0 / TotalRequests, 2)\r\n| extend ErrorRate = 
round((ClientErrors + ServerErrors) * 100.0 / TotalRequests, 2)\r\n| project \r\n BusinessUnit = ApimSubscriptionId, \r\n TotalRequests, \r\n SuccessRequests, \r\n ClientErrors, \r\n ServerErrors, \r\n ['Success Rate (%)'] = SuccessRate,\r\n ['Error Rate (%)'] = ErrorRate\r\n| order by TotalRequests desc", + "parameters": [ + { + "id": "a1b2c3d4-e5f6-7890-abcd-100000000002", + "isRequired": true, + "label": "Monthly Base Cost (USD)", + "name": "BaseCost", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid dollar amount (e.g. 150.00)", + "regExp": "^\\d+(\\.\\d{1,2})?$" + } + ] + }, + "value": "150.00", + "version": "KqlParameterItem/1.0" + }, + { + "id": "a1b2c3d4-e5f6-7890-abcd-100000000003", + "isHiddenWhenLocked": true, + "label": "App ID Names (JSON)", + "name": "AppIdNames", + "type": 1, + "description": "Optional JSON mapping of App IDs to friendly names. Example: {\"a5846c0e-...\":\"HR Service\",\"9e6bfb3f-...\":\"Mobile Gateway\"}", + "value": "{}", + "version": "KqlParameterItem/1.0" + } + ], "queryType": 0, "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Success & Error Metrics by Business Unit", - "version": "KqlItem/1.0", - "visualization": "table" + "style": "pills", + "version": "KqlParameterItem/1.0" }, - "name": "query - success-errors", - "type": 3 + "name": "parameters - entraid", + "type": 9 }, { "content": { - "chartSettings": { - "seriesLabelSettings": [ - { "color": "blue", "label": "2xx Success", "seriesName": "2xx" }, - { "color": "turquoise", "label": "3xx Redirect", "seriesName": "3xx" }, - { "color": "orange", "label": "4xx Client Error", "seriesName": "4xx" }, - { "color": "redBright", "label": "5xx Server Error", "seriesName": "5xx" } - ] - }, - "query": "let selectedBU = '{SelectedBusinessUnit}';\r\nApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != 
''\r\n| where selectedBU == '*' or ApimSubscriptionId == selectedBU\r\n| extend ResponseClass = case(\r\n ResponseCode >= 200 and ResponseCode < 300, '2xx',\r\n ResponseCode >= 300 and ResponseCode < 400, '3xx',\r\n ResponseCode >= 400 and ResponseCode < 500, '4xx',\r\n ResponseCode >= 500, '5xx',\r\n 'Other')\r\n| summarize RequestCount = count() by ApimSubscriptionId, ResponseClass\r\n| order by ApimSubscriptionId, ResponseClass\r\n| project BusinessUnit = ApimSubscriptionId, ResponseClass, RequestCount", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Response Code Distribution by Business Unit", - "version": "KqlItem/1.0", - "visualization": "categoricalbar" + "json": "## APIM Cost Attribution by Caller ID\n\nThis tab shows API usage and cost allocation by **Entra ID application** (`appid` claim). Data comes from the `emit-metric` policy's `caller-requests` custom metric in Application Insights.\n\n| Parameter | Description |\n|---|---|\n| **Monthly Base Cost** | Fixed platform cost split proportionally by request share |\n| **App ID Names** | Optional JSON mapping of App IDs to friendly names |\n\n> **Note:** Data typically takes 5-10 minutes to appear after API calls." 
+ }, + "name": "text - header-entraid", + "type": 1 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "chartSettings": { + "ySettings": { + "min": 0 + } + }, + "query": "let appNames = parse_json('{AppIdNames}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-requests'\n| extend CallerId = tostring(customDimensions.CallerId)\n| where isnotempty(CallerId)\n| summarize RequestCount = sum(value) by CallerId\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| project Caller, RequestCount\n| order by RequestCount desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 1, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Total Requests by Caller ID", + "noDataMessage": "No caller-requests metrics found in the selected time range.", + "version": "KqlItem/1.0", + "visualization": "barchart" + }, + "customWidth": "60", + "name": "query - entraid-usage-chart", + "styleSettings": { + "maxWidth": "60%", + "showBorder": true + }, + "type": 3 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNames}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-requests'\n| extend CallerId = tostring(customDimensions.CallerId)\n| where isnotempty(CallerId)\n| summarize RequestCount = sum(value) by CallerId\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| project Caller, RequestCount\n| order by RequestCount desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Usage Summary", + "noDataMessage": "No caller-requests metrics found in the selected time 
range.", + "version": "KqlItem/1.0", + "visualization": "table", + "gridSettings": { + "formatters": [ + { + "columnMatch": "RequestCount", + "formatter": 1, + "numberFormat": { + "unit": 17, + "options": { + "style": "decimal", + "useGrouping": true + } + } + } + ], + "labelSettings": [ + { "columnId": "Caller", "label": "Caller" }, + { "columnId": "RequestCount", "label": "Requests" } + ] + } + }, + "customWidth": "40", + "name": "query - entraid-usage-table", + "styleSettings": { + "maxWidth": "40%", + "showBorder": true + }, + "type": 3 + } + ], + "loadType": "always", + "title": "Usage by Caller ID", + "version": "NotebookGroup/1.0" }, - "name": "query - response-codes", - "type": 3 + "name": "group - entraid-usage", + "type": 12 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "json": "Proportional cost breakdown based on the **Monthly Base Cost** parameter above. Each caller's share is calculated as their percentage of total requests applied to the base cost." 
+ }, + "name": "text - entraid-cost-description", + "type": 1 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNames}');\nlet baseCost = {BaseCost};\nlet metrics = app(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-requests'\n| extend CallerId = tostring(customDimensions.CallerId)\n| where isnotempty(CallerId);\nlet totalRequests = toscalar(metrics | summarize sum(value));\nmetrics\n| summarize RequestCount = sum(value) by CallerId\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| extend UsagePercent = round(RequestCount * 100.0 / totalRequests, 2)\n| extend AllocatedCost = round(baseCost * RequestCount / totalRequests, 2)\n| order by AllocatedCost desc\n| project Caller, RequestCount, ['Usage %'] = UsagePercent, ['Allocated Cost ($)'] = AllocatedCost", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Cost Allocation by Caller ID", + "noDataMessage": "No caller-requests metrics found in the selected time range.", + "version": "KqlItem/1.0", + "visualization": "table", + "gridSettings": { + "formatters": [ + { + "columnMatch": "Usage %", + "formatOptions": { + "min": 0, + "max": 100, + "palette": "blue" + }, + "formatter": 4 + }, + { + "columnMatch": "Allocated Cost ($)", + "formatOptions": { + "min": 0, + "palette": "turquoise" + }, + "formatter": 8 + } + ] + } + }, + "customWidth": "60", + "name": "query - entraid-cost-table", + "styleSettings": { + "maxWidth": "60%", + "showBorder": true + }, + "type": 3 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNames}');\nlet baseCost = {BaseCost};\nlet metrics = app(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-requests'\n| extend CallerId = tostring(customDimensions.CallerId)\n| where isnotempty(CallerId);\nlet 
totalRequests = toscalar(metrics | summarize sum(value));\nmetrics\n| summarize RequestCount = sum(value) by CallerId\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| extend AllocatedCost = round(baseCost * RequestCount / totalRequests, 2)\n| project Caller, AllocatedCost\n| order by AllocatedCost desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 1, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Cost Distribution", + "noDataMessage": "No caller-requests metrics found in the selected time range.", + "version": "KqlItem/1.0", + "visualization": "piechart" + }, + "customWidth": "40", + "name": "query - entraid-cost-pie", + "styleSettings": { + "maxWidth": "40%", + "showBorder": true + }, + "type": 3 + } + ], + "loadType": "always", + "title": "Cost Allocation", + "version": "NotebookGroup/1.0" + }, + "name": "group - entraid-cost-allocation", + "type": 12 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "json": "Hourly request volume by caller over time. Use this to spot traffic spikes, identify peak usage periods, and detect anomalies by caller." 
+ }, + "name": "text - entraid-trend-description", + "type": 1 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNames}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-requests'\n| extend CallerId = tostring(customDimensions.CallerId)\n| where isnotempty(CallerId)\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| summarize Requests = sum(value) by Caller, bin(timestamp, 1h)\n| order by timestamp asc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Hourly Request Trend by Caller ID", + "noDataMessage": "No caller-requests metrics found in the selected time range.", + "version": "KqlItem/1.0", + "visualization": "timechart" + }, + "name": "query - entraid-trend-chart", + "type": 3 + } + ], + "loadType": "always", + "title": "Request Trend", + "version": "NotebookGroup/1.0" + }, + "name": "group - entraid-trend", + "type": 12 } ], "loadType": "always", - "title": "Health & Reliability", "version": "NotebookGroup/1.0" }, - "name": "group - health", + "name": "group - tab-entraid", "type": 12 }, { + "conditionalVisibility": { + "comparison": "isEqualTo", + "parameterName": "selectedTab", + "value": "aigateway" + }, "content": { - "expandable": true, - "expanded": false, "groupType": "editable", "items": [ { "content": { - "json": "Select a **Business Unit** row in the Cost Allocation table above to see that unit's cost trend over time.\n\nThe chart shows daily base cost share and variable cost for the selected business unit. The horizontal line represents the total base monthly cost (${BaseMonthlyCost}) for reference." 
+ "parameters": [ + { + "id": "a1b2c3d4-e5f6-7890-abcd-200000000001", + "isRequired": true, + "label": "Cost per 1K Prompt Tokens ($)", + "name": "PromptTokenRate", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid rate (e.g. 0.03)", + "regExp": "^\\d+(\\.\\d{1,6})?$" + } + ] + }, + "value": "0.03", + "version": "KqlParameterItem/1.0" + }, + { + "id": "a1b2c3d4-e5f6-7890-abcd-200000000002", + "isRequired": true, + "label": "Cost per 1K Completion Tokens ($)", + "name": "CompletionTokenRate", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid rate (e.g. 0.06)", + "regExp": "^\\d+(\\.\\d{1,6})?$" + } + ] + }, + "value": "0.06", + "version": "KqlParameterItem/1.0" + }, + { + "id": "a1b2c3d4-e5f6-7890-abcd-200000000003", + "isRequired": false, + "label": "PTU Capacity (tokens/min)", + "name": "PtuCapacity", + "type": 1, + "typeSettings": { + "paramValidationRules": [ + { + "match": true, + "message": "Enter a valid number (e.g. 60000)", + "regExp": "^\\d+$" + } + ] + }, + "value": "60000", + "version": "KqlParameterItem/1.0" + }, + { + "id": "a1b2c3d4-e5f6-7890-abcd-200000000004", + "isHiddenWhenLocked": true, + "label": "App ID Names (JSON)", + "name": "AppIdNamesToken", + "type": 1, + "description": "Optional JSON mapping of App IDs to friendly names.", + "value": "{}", + "version": "KqlParameterItem/1.0" + } + ], + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "style": "pills", + "version": "KqlParameterItem/1.0" + }, + "name": "parameters - aigateway", + "type": 9 + }, + { + "content": { + "json": "## AI Gateway: Token & PTU Consumption per Client\n\nWhen APIM is used as an **AI Gateway**, built-in model-level metrics (PTU utilization, token counts) do not break down by client. 
This tab fills that gap using `caller-tokens` custom metrics emitted by the `emit-metric` policy.\n\n| Parameter | Description |\n|---|---|\n| **Cost per 1K Prompt Tokens** | Rate for input/prompt tokens (from your Azure OpenAI pricing) |\n| **Cost per 1K Completion Tokens** | Rate for output/completion tokens |\n| **PTU Capacity** | Your provisioned throughput unit capacity in tokens/minute (for utilization %) |\n\n> Rates default to approximate GPT-4o pricing. Adjust to match your model and region." }, - "name": "text - drilldown-help", + "name": "text - header-aigateway", "type": 1 }, { "content": { - "chartSettings": { - "customThresholdLine": "{BaseMonthlyCost}", - "customThresholdLineStyle": 1, - "seriesLabelSettings": [ - { "color": "blue", "label": "Base Cost Share ($)", "seriesName": "BaseCostShare" }, - { "color": "orange", "label": "Variable Cost ($)", "seriesName": "VariableCost" }, - { "color": "purple", "label": "Total Allocated ($)", "seriesName": "TotalAllocatedCost" } - ], - "ySettings": { - "min": 0 + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "chartSettings": { + "seriesLabelSettings": [ + { "color": "blue", "label": "Prompt Tokens", "seriesName": "PromptTokens" }, + { "color": "orange", "label": "Completion Tokens", "seriesName": "CompletionTokens" } + ], + "ySettings": { + "min": 0 + } + }, + "query": "let appNames = parse_json('{AppIdNamesToken}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| where isnotempty(CallerId) and TokenType != 'total'\n| summarize Tokens = sum(value) by CallerId, TokenType\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| project Caller, TokenType, Tokens\n| evaluate pivot(TokenType, sum(Tokens))\n| project Caller, 
PromptTokens = column_ifexists('prompt', 0.0), CompletionTokens = column_ifexists('completion', 0.0)\n| order by PromptTokens + CompletionTokens desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 1, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Token Consumption by Client (Prompt vs Completion)", + "noDataMessage": "No caller-tokens metrics found. Ensure the token tracking policy is deployed.", + "version": "KqlItem/1.0", + "visualization": "barchart" + }, + "name": "query - token-usage-chart", + "type": 3 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNamesToken}');\nlet promptRate = todouble('{PromptTokenRate}');\nlet completionRate = todouble('{CompletionTokenRate}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| extend Model = tostring(customDimensions.Model)\n| where isnotempty(CallerId)\n| summarize Tokens = sum(value), Requests = count() by CallerId, TokenType, Model\n| evaluate pivot(TokenType, sum(Tokens))\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| extend PromptTokens = column_ifexists('prompt', 0.0)\n| extend CompletionTokens = column_ifexists('completion', 0.0)\n| extend TotalTokens = column_ifexists('total', 0.0)\n| extend PromptCost = round(PromptTokens * promptRate / 1000.0, 4)\n| extend CompletionCost = round(CompletionTokens * completionRate / 1000.0, 4)\n| extend TotalCost = round(PromptCost + CompletionCost, 4)\n| order by TotalCost desc\n| project Caller, Model, PromptTokens, CompletionTokens, TotalTokens, ['Prompt Cost ($)'] = PromptCost, ['Completion Cost ($)'] = CompletionCost, ['Total Cost ($)'] = TotalCost", + "queryType": 0, + "resourceType": 
"microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Token Cost Allocation by Client & Model", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "table", + "gridSettings": { + "formatters": [ + { + "columnMatch": "TotalTokens", + "formatOptions": { + "min": 0, + "palette": "blue" + }, + "formatter": 8 + }, + { + "columnMatch": "Total Cost", + "formatOptions": { + "min": 0, + "palette": "turquoise" + }, + "formatter": 8 + } + ] + } + }, + "name": "query - token-cost-table", + "type": 3 } - }, - "query": "let baseCost = todouble('{BaseMonthlyCost}');\r\nlet perKRate = todouble('{PerRequestRate}');\r\nlet selectedBU = '{SelectedBusinessUnit}';\r\nlet allLogs = ApiManagementGatewayLogs\r\n| where TimeGenerated {TimeRange} and ApimSubscriptionId != '';\r\nlet dailyTotal = allLogs\r\n| summarize DayTotal = count() by bin(TimeGenerated, 1d);\r\nallLogs\r\n| where selectedBU != '*' and ApimSubscriptionId == selectedBU\r\n| summarize RequestCount = count() by bin(TimeGenerated, 1d)\r\n| join kind=inner dailyTotal on TimeGenerated\r\n| extend BaseCostShare = round(baseCost * RequestCount / DayTotal, 2)\r\n| extend VariableCost = round(RequestCount * perKRate / 1000.0, 4)\r\n| extend TotalAllocatedCost = round(BaseCostShare + VariableCost, 2)\r\n| project TimeGenerated, BaseCostShare, VariableCost, TotalAllocatedCost", - "queryType": 0, - "resourceType": "microsoft.operationalinsights/workspaces", - "size": 0, - "timeContext": { - "durationMs": 2592000000 - }, - "title": "Cost Trend for: {SelectedBusinessUnit}", - "version": "KqlItem/1.0", - "visualization": "linechart" + ], + "loadType": "always", + "title": "Token Usage & Cost Allocation", + "version": "NotebookGroup/1.0" }, - "conditionalVisibility": { - "comparison": "isNotEqualTo", - "parameterName": "SelectedBusinessUnit", - "value": "*" + "name": "group - token-usage", 
+ "type": 12 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "query": "let appNames = parse_json('{AppIdNamesToken}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| where isnotempty(CallerId) and TokenType == 'total'\n| summarize TotalTokens = sum(value) by CallerId\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| project Caller, TotalTokens\n| order by TotalTokens desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 1, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Total Token Share by Client", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "piechart" + }, + "customWidth": "50", + "name": "query - token-share-pie", + "styleSettings": { + "maxWidth": "50%", + "showBorder": true + }, + "type": 3 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNamesToken}');\nlet promptRate = todouble('{PromptTokenRate}');\nlet completionRate = todouble('{CompletionTokenRate}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| where isnotempty(CallerId)\n| summarize Tokens = sum(value) by CallerId, TokenType\n| evaluate pivot(TokenType, sum(Tokens))\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| extend PromptCost = round(column_ifexists('prompt', 0.0) * promptRate / 1000.0, 4)\n| extend CompletionCost = round(column_ifexists('completion', 0.0) * completionRate / 1000.0, 
4)\n| extend TotalCost = round(PromptCost + CompletionCost, 4)\n| order by TotalCost desc\n| project Caller, TotalCost", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 1, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Estimated Cost Share by Client", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "piechart" + }, + "customWidth": "50", + "name": "query - token-cost-pie", + "styleSettings": { + "maxWidth": "50%", + "showBorder": true + }, + "type": 3 + } + ], + "loadType": "always", + "title": "Token & Cost Distribution", + "version": "NotebookGroup/1.0" }, - "name": "query - drilldown-cost-trend", - "type": 3 + "name": "group - token-distribution", + "type": 12 }, { "content": { - "json": "> Select a business unit row in the Cost Allocation table to view its cost trend.", - "style": "info" + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "json": "Token consumption trend over time. Each data point shows the total tokens consumed per client in 1-hour bins. The dashed line marks the value of the **PTU Capacity** parameter.\n\n> **PTU utilization**: PTU Capacity is entered in tokens/minute, while this chart sums tokens per hour, so multiply the per-minute capacity by 60 when comparing it against the hourly totals. If a client's hourly total approaches that figure, consider provisioning more throughput or applying rate limiting." 
+ }, + "name": "text - token-trend-description", + "type": 1 + }, + { + "content": { + "chartSettings": { + "customThresholdLine": "{PtuCapacity}", + "customThresholdLineStyle": 1, + "ySettings": { + "min": 0 + } + }, + "query": "let appNames = parse_json('{AppIdNamesToken}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| where isnotempty(CallerId) and TokenType == 'total'\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| summarize TotalTokens = sum(value) by Caller, bin(timestamp, 1h)\n| order by timestamp asc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Hourly Token Consumption by Client (vs PTU Capacity)", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "timechart" + }, + "name": "query - token-trend-chart", + "type": 3 + }, + { + "content": { + "chartSettings": { + "seriesLabelSettings": [ + { "color": "blue", "label": "Prompt Tokens", "seriesName": "PromptTokens" }, + { "color": "orange", "label": "Completion Tokens", "seriesName": "CompletionTokens" } + ], + "ySettings": { + "min": 0 + } + }, + "query": "app(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend TokenType = tostring(customDimensions.TokenType)\n| where TokenType != 'total'\n| summarize Tokens = sum(value) by TokenType, bin(timestamp, 1h)\n| evaluate pivot(TokenType, sum(Tokens))\n| project timestamp, PromptTokens = column_ifexists('prompt', 0.0), CompletionTokens = column_ifexists('completion', 0.0)\n| order by timestamp asc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + 
"timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Prompt vs Completion Tokens Over Time (All Clients)", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "areachart" + }, + "name": "query - token-type-trend", + "type": 3 + } + ], + "loadType": "always", + "title": "Token Consumption Trends & PTU Utilization", + "version": "NotebookGroup/1.0" }, - "conditionalVisibility": { - "comparison": "isEqualTo", - "parameterName": "SelectedBusinessUnit", - "value": "*" + "name": "group - token-trends", + "type": 12 + }, + { + "content": { + "expandable": true, + "expanded": true, + "groupType": "editable", + "items": [ + { + "content": { + "json": "Per-model breakdown showing which AI models each client is consuming. Useful when multiple models (GPT-4o, GPT-4o-mini, etc.) are served through the same APIM gateway." + }, + "name": "text - model-breakdown-description", + "type": 1 + }, + { + "content": { + "query": "let appNames = parse_json('{AppIdNamesToken}');\napp(\"__APP_INSIGHTS_NAME__\").customMetrics\n| where name == 'caller-tokens'\n| extend CallerId = tostring(customDimensions.CallerId)\n| extend TokenType = tostring(customDimensions.TokenType)\n| extend Model = tostring(customDimensions.Model)\n| where isnotempty(CallerId) and TokenType == 'total'\n| summarize TotalTokens = sum(value), Requests = dcount(timestamp) by CallerId, Model\n| extend Caller = iif(isnotempty(tostring(appNames[CallerId])), strcat(tostring(appNames[CallerId]), ' (', CallerId, ')'), CallerId)\n| project Caller, Model, TotalTokens, Requests\n| order by TotalTokens desc", + "queryType": 0, + "resourceType": "microsoft.operationalinsights/workspaces", + "size": 0, + "timeContext": { + "durationMs": 0 + }, + "timeContextFromParameter": "TimeRange", + "title": "Token Usage by Client & Model", + "noDataMessage": "No caller-tokens metrics found.", + "version": "KqlItem/1.0", + "visualization": "table", + 
"gridSettings": { + "formatters": [ + { + "columnMatch": "TotalTokens", + "formatOptions": { + "min": 0, + "palette": "blue" + }, + "formatter": 8 + } + ] + } + }, + "name": "query - model-breakdown-table", + "type": 3 + } + ], + "loadType": "always", + "title": "Model Breakdown", + "version": "NotebookGroup/1.0" }, - "name": "text - drilldown-placeholder", - "type": 1 + "name": "group - model-breakdown", + "type": 12 } ], "loadType": "always", - "title": "Business Unit Drill-Down (over time)", "version": "NotebookGroup/1.0" }, - "name": "group - drilldown", + "name": "group - tab-aigateway", "type": 12 } ],