Monitoring Fabric with Capacity Metrics

Track Microsoft Fabric capacity utilization and identify performance bottlenecks. Dashboard monitoring, alerts, and optimization recommendations.

By Errin O'Connor, Chief AI Architect

The Microsoft Fabric Capacity Metrics app is the essential monitoring tool for Fabric administrators — it shows whether your capacity is right-sized, which workloads consume the most resources, when throttling occurs, and which items or users are monopolizing shared compute. Install this app on day one of any Fabric deployment. Without it, you are making capacity decisions blindfolded.

I configure the Capacity Metrics app as the first step in every Fabric engagement. In one recent deployment, the app revealed that a single unoptimized Spark notebook was consuming 43% of a client's F64 capacity during business hours — throttling interactive Power BI queries for 2,000 users. We optimized the notebook (broadcast joins, column pruning, proper partitioning), reduced its CU consumption by 78%, and eliminated all interactive throttling without upgrading the capacity SKU. Our Microsoft Fabric consulting services include capacity optimization as a standard engagement component.

Understanding Fabric Capacity Units

All Fabric workloads consume a shared pool of Capacity Units (CUs). Understanding how CUs work is fundamental to interpreting metrics:

| Concept | Description | Key Detail |
| --- | --- | --- |
| CU Seconds | Unit of compute measurement | 1 CU consumed for 1 second = 1 CU-second |
| CU Allocation | CUs available per SKU | F2 = 2 CUs, F8 = 8 CUs, F64 = 64 CUs, etc. |
| Interactive Operations | User-initiated queries, report renders | Evaluated in 30-second windows, throttled at 100% utilization |
| Background Operations | Scheduled refreshes, pipelines, Spark jobs | Evaluated in 24-hour windows, throttled at sustained overuse |
| Smoothing | CU consumption is smoothed over evaluation windows | A 10-second spike does not immediately trigger throttling |
| Burst | Short-term consumption can exceed SKU allocation | Bursting borrows from future capacity, repaid over the evaluation window |

The distinction between interactive and background operations is critical. Interactive operations (report queries, dashboard refreshes) are throttled quickly when capacity is overloaded — users see slow or failed reports within seconds. Background operations (scheduled refreshes, Spark jobs) are throttled more gradually, with jobs queued rather than rejected. I always tell clients: "Interactive throttling is a user-facing emergency. Background throttling is a planning problem."
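The effect of smoothing can be illustrated with a minimal sketch. This is a simplification for intuition only, not the actual Fabric smoothing algorithm; the window sizes follow the table above (30 seconds for interactive, 24 hours for background), and the sample values are hypothetical.

```python
# Simplified sketch of window smoothing (illustrative only -- the real
# Fabric algorithm is more involved).

def smoothed_utilization(cu_samples, window_seconds, sku_cus):
    """Average CU draw over the last `window_seconds` samples,
    as a fraction of the SKU's allocation (1.0 = 100%)."""
    window = cu_samples[-window_seconds:]
    return sum(window) / (len(window) * sku_cus)

# An F64 capacity: a 10-second spike to 128 CUs inside an otherwise
# quiet 30-second window does not push the smoothed value over 100%.
samples = [16] * 20 + [128] * 10          # one hypothetical sample per second
util = smoothed_utilization(samples, window_seconds=30, sku_cus=64)
print(f"smoothed utilization: {util:.0%}")   # 83% -- no interactive throttling
```

This is why the table above notes that a 10-second spike does not immediately trigger throttling: only the smoothed value is evaluated against the limit.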

Installing and Configuring the App

Prerequisites

  • Fabric Capacity Administrator or Global Administrator role
  • At least one Fabric capacity (F2 or higher, or Power BI Premium P1 or higher)
  • A workspace to host the app

Installation Steps

  1. Navigate to Power BI Service > Apps > Get apps and search for "Microsoft Fabric Capacity Metrics"
  2. Click Install and select the workspace where the app will be stored
  3. After installation, open the app and click Connect to configure the data source
  4. Enter your Capacity ID (found in the Fabric Admin Portal under Capacity Settings) and select the date range
  5. The app begins loading historical data — initial load may take several minutes for large capacities

Data Retention

The app stores 14 days of detailed metrics and 30 days of summarized data. For longer retention, create a dataflow that exports capacity metrics to a Lakehouse on a scheduled basis. I recommend 90-day retention for trend analysis and 12-month retention for capacity planning and budget justification.

Key Dashboard Pages

Overview Page

The overview page shows the most critical health indicators at a glance:

  • CU Utilization Trend: Line chart showing CU consumption percentage over time. Sustained utilization above 80% indicates you are approaching capacity limits. Sustained utilization below 20% suggests over-provisioning — you may be paying for capacity you do not need.
  • Throttling Events: Count and duration of throttling events. Any throttling of interactive operations means users experienced degraded performance. Zero interactive throttling should be your target.
  • Top Items by CU: Ranked list of Fabric items (datasets, notebooks, pipelines) consuming the most CUs. This identifies your optimization targets — the Pareto principle applies here. In most environments, 10-15% of items consume 70-80% of capacity.
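The Pareto observation in the last bullet is easy to verify against your own export. The sketch below, with hypothetical item names and CU figures, finds the smallest set of items covering a target share of total consumption:

```python
# Given (item, CU-seconds) pairs exported from the Capacity Metrics app,
# return the smallest set of items covering `target_share` of consumption.
# Item names and numbers are hypothetical.

def top_consumers(items, target_share=0.75):
    """Rank items by CU consumption and pick until target_share is reached."""
    ranked = sorted(items.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(items.values())
    picked, running = [], 0.0
    for name, cu in ranked:
        picked.append(name)
        running += cu / total
        if running >= target_share:
            break
    return picked

usage = {"SalesModel": 5200, "SparkETL": 3900, "FinanceRefresh": 900,
         "AdHocReports": 600, "DevNotebook": 400}
print(top_consumers(usage))  # ['SalesModel', 'SparkETL'] -- 2 of 5 items, ~83%
```

In this toy data, two of five items already account for roughly 83% of consumption, which is the shape of distribution you should expect to see in most real capacities.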

Workload Breakdown Page

Shows CU consumption segmented by workload type:

  • Power BI: Report queries, dataset refreshes, paginated report renders
  • Data Engineering: Spark notebook execution, Lakehouse operations
  • Data Factory: Pipeline runs, dataflow refreshes, copy activities
  • Real-Time Analytics: Eventstream processing, KQL queries
  • Data Science: ML experiment runs, model training

This breakdown reveals which workloads dominate your capacity. If 70% of CU consumption comes from Spark notebooks, optimizing those notebooks delivers the most savings. If Power BI queries dominate during business hours, focus on DAX optimization and aggregation tables.

Item-Level Detail Page

Drill into individual items to see:

  • CU consumption per refresh or execution
  • Duration trends over time (increasing duration signals degradation)
  • User who triggered the operation
  • Success/failure status
  • Queuing time (time waiting for available CUs before execution starts) — queuing time above zero is an early warning sign of capacity pressure

Interpreting Capacity Health

Healthy Capacity

  • CU utilization averages 40-60% during peak hours
  • Zero or minimal interactive throttling events
  • Background operations complete within scheduled windows
  • No single item consumes more than 20% of total CU budget

Capacity at Risk

  • CU utilization peaks above 80% regularly
  • Occasional interactive throttling during peak hours
  • Background operations starting to queue and delay
  • One or two items dominate CU consumption — these are your immediate optimization targets

Capacity in Crisis

  • CU utilization sustained above 100% (consuming burst capacity)
  • Frequent interactive throttling — users reporting slow or failed reports
  • Background operations significantly delayed or failing
  • Throttling policy may reject new operations entirely
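The three states above can be collapsed into a simple triage function for weekly reviews. The thresholds mirror this article's rules of thumb, not an official Microsoft formula, and the throttling-event cutoffs are illustrative assumptions:

```python
# Triage sketch using this article's rules of thumb (not an official formula).

def capacity_health(peak_utilization, interactive_throttling_events):
    """Classify capacity state from peak CU utilization (1.0 = 100%) and
    interactive throttling events observed in the review period."""
    if peak_utilization > 1.0 or interactive_throttling_events > 5:
        return "crisis"       # sustained burst debt, users seeing failures
    if peak_utilization > 0.8 or interactive_throttling_events > 0:
        return "at risk"      # occasional throttling, background queuing
    return "healthy"

print(capacity_health(0.55, 0))   # healthy
print(capacity_health(0.87, 2))   # at risk
print(capacity_health(1.10, 12))  # crisis
```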

Optimization Strategies by Finding

Finding: Single Refresh Consuming Excessive CUs

Root cause: A large dataset refresh (often with full refresh instead of incremental) monopolizes capacity during refresh windows.

Solution: Implement incremental refresh to reduce the data volume refreshed each cycle. Switch from full refresh to partition-level refresh where possible. Schedule the refresh during off-peak hours when capacity headroom is available. I once reduced a client's refresh CU consumption by 92% simply by implementing incremental refresh on their three largest datasets.
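The potential savings can be estimated before you do the work. Assuming refresh CU cost scales roughly with the volume of data processed (a simplifying assumption that holds for many import-mode models), the expected reduction falls out of the archive window versus the incremental window:

```python
# Back-of-the-envelope estimate: refresh CU cost assumed roughly
# proportional to rows processed (simplifying assumption).

def incremental_savings(history_days, incremental_days):
    """Fraction of refresh CU consumption avoided by refreshing only the
    incremental window instead of the full history."""
    return 1 - incremental_days / history_days

# Refreshing only the last 7 days of a 365-day model:
print(f"{incremental_savings(365, 7):.0%} of refresh CUs avoided")  # 98%
```

This is also why the savings from incremental refresh are largest on your biggest, longest-history datasets: the ratio of incremental window to total history drives the result.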

Finding: Spark Notebooks Consuming Majority of CUs

Root cause: Unoptimized Spark jobs with excessive shuffling, missing partitioning, or oversized cluster configurations.

Solution: Review Spark job optimization: apply proper partitioning, use broadcast joins for small tables, cache intermediate results, and reduce the cluster size to the minimum needed for the workload. Column pruning alone (selecting only needed columns early in the pipeline) typically reduces CU consumption by 20-40%.

Finding: Many Users Running Reports Simultaneously at Peak

Root cause: Report rendering generates interactive CU demand that exceeds capacity during business hours.

Solution: Optimize DAX queries in the most-consumed reports (use Performance Analyzer to identify slow measures). Enable query caching on frequently accessed datasets. Consider scaling the capacity up during peak hours and down during off-hours (Fabric supports capacity pause/resume via API for scheduling).
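The pause/resume scheduling mentioned above goes through the Azure management plane. The sketch below only builds the request URL; the provider path and api-version are assumptions you should verify against current Azure documentation, and real use requires a bearer token from Azure AD:

```python
# Hedged sketch of schedule-based scaling: Fabric capacities can be
# suspended/resumed via the Azure management API. The provider path and
# api-version are assumptions -- verify against current Azure docs before
# use. This only builds the URL; authentication is not shown.

API_VERSION = "2023-11-01"  # assumed; check the Microsoft.Fabric provider docs

def capacity_action_url(subscription_id, resource_group, capacity_name, action):
    """Build the management-plane URL for a suspend/resume POST."""
    assert action in ("suspend", "resume")
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Fabric/capacities/{capacity_name}"
        f"/{action}?api-version={API_VERSION}"
    )

# A scheduler (Power Automate, Azure Automation, cron) would POST to this
# URL with a bearer token at the start and end of business hours.
print(capacity_action_url("SUB_ID", "rg-fabric", "fabriccap01", "suspend"))
```

Pairing a resume call before business hours with a suspend call after hours is the cheapest way to capture the off-peak savings discussed above.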

Finding: Development Workloads Consuming Production Capacity

Root cause: Data engineers and analysts running ad-hoc Spark notebooks or dataflow tests on the production capacity.

Solution: Create a separate development capacity (smaller SKU, paused when not in use). Move development workspaces to the development capacity. Implement workspace governance that prevents development workloads from running on production. This is one of the most common issues I find in Fabric environments — development and production sharing capacity is a recipe for user-facing performance problems.

Setting Up Alerts

Create proactive alerts to catch capacity issues before users report them:

  • Power BI data alert: Pin the CU Utilization card to a dashboard, then create an alert for when utilization exceeds 85%
  • Power Automate flow: Trigger a Teams notification or email when throttling events are detected
  • Azure Monitor integration: For Premium capacities, configure Azure Monitor alerts on capacity metrics with automatic scaling responses
  • Weekly capacity review: Schedule a 15-minute weekly review of the Capacity Metrics app with your Fabric admin team. Track utilization trends and identify items that need optimization before they cause throttling.

Capacity Planning with Metrics Data

Use historical metrics data to plan capacity changes:

  • Growth trending: If average CU utilization increases 5% month-over-month, project when you will hit capacity limits. Plan your SKU upgrade 2-3 months before projected saturation.
  • Workload forecasting: When onboarding a new department or workload, review similar existing workloads to estimate CU impact
  • SKU optimization: If utilization never exceeds 30%, consider downgrading to a smaller SKU. If throttling occurs regularly, upgrade to the next SKU tier.
  • Cost modeling: Calculate cost-per-CU-second to determine ROI of optimization efforts. A $100/hour optimization effort that reduces daily CU consumption by 20% may pay for itself within a week.
  • Seasonal patterns: Many organizations have monthly, quarterly, or annual peaks (month-end close for finance, enrollment periods for education, holiday seasons for retail). Track these patterns over multiple cycles to right-size capacity for peak periods rather than average periods.
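The growth-trending bullet reduces to a one-line compound-growth projection. A small sketch, assuming utilization compounds at a steady month-over-month rate (real trends are noisier, so treat the result as a planning estimate):

```python
# Months until average utilization crosses a planning threshold,
# assuming steady compound month-over-month growth.

import math

def months_to_saturation(current_util, monthly_growth, limit=0.8):
    """E.g. current_util=0.5, monthly_growth=0.05 for 5% MoM growth."""
    if current_util >= limit:
        return 0
    return math.ceil(math.log(limit / current_util) / math.log(1 + monthly_growth))

# 50% average utilization growing 5% MoM hits the 80% planning threshold in:
print(months_to_saturation(0.5, 0.05))  # 10 months -> start the SKU upgrade at month 7-8
```

Subtracting the 2-3 month lead time recommended above from this figure tells you when to start the upgrade conversation with procurement.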

Building a Capacity Governance Process

The Capacity Metrics app is a tool, not a process. To maximize its value, build a governance process around it:

  1. Weekly review: A 15-minute standing meeting where the Fabric admin reviews the past week's utilization, throttling events, and top-consuming items
  2. Monthly optimization sprint: Identify the top 3 CU consumers and assign optimization tasks to the responsible teams
  3. Quarterly capacity planning: Project forward based on growth trends and planned workload additions. Present to IT leadership with specific SKU recommendations and cost implications.
  4. Incident response protocol: When interactive throttling is detected, have a documented escalation path — who gets notified, what immediate actions to take (pause non-critical background jobs, scale capacity via API), and how to conduct post-incident review

This process turns reactive capacity management ("users are complaining reports are slow") into proactive capacity optimization ("we have 3 months before we need to upgrade SKU based on current growth").

Frequently Asked Questions

How often does the Capacity Metrics app refresh?

The app refreshes every 30 minutes by default. You can see real-time utilization in the Azure portal, but the app provides more detailed historical analysis.

Can I set up alerts for capacity issues?

Yes, you can create Power BI data alerts on key metrics like utilization percentage or throttling events to receive notifications when thresholds are exceeded.

Tags: Microsoft Fabric, Monitoring, Capacity, Performance
