
Mastering Incremental Refresh in Power BI
Incremental refresh in Power BI reduces dataset refresh times by up to 95% by only processing new and changed data instead of reloading entire tables. If your Power BI dataset takes more than 10 minutes to refresh, incremental refresh is the single most impactful optimization you can implement - I have seen it cut a 2-hour refresh to under 8 minutes on a 100M-row IoT dataset. This guide walks through configuration step by step, covers advanced patterns, and shares the real-world pitfalls I have encountered across dozens of enterprise implementations.
The core concept is straightforward: instead of refreshing every row in a table during each cycle, Power BI partitions your table by date ranges and only reprocesses partitions within a defined "refresh window." Historical partitions stay untouched. For a sales table with 50 million rows spanning five years, the gateway extracts only the last 30 days of data (maybe 300,000 rows) instead of all 50 million. The math is simple - that is a 99.4% reduction in data volume per refresh. Our Power BI consulting services have implemented incremental refresh for datasets ranging from 10M to 2B rows across healthcare, finance, and retail.
How Incremental Refresh Works Under the Hood
Traditional dataset refresh replaces all data in every table during each refresh cycle. Power BI drops the existing data, sends a full query to the source, transfers every row through the gateway, and rebuilds the entire table in the dataset. For large tables, this is wasteful, slow, and expensive.
Incremental refresh changes this by creating time-based partitions automatically. When you configure a 5-year archive with a 30-day refresh window, Power BI creates partitions like this:
- Historical partitions (2020-01 through current month minus 30 days): Never refreshed after initial load
- Refresh partitions (last 30 days): Refreshed every cycle
- Optional DirectQuery partition (current day/hour): Real-time data via DirectQuery
Each partition maps to a separate query against the data source. During refresh, only the queries for the refresh window execute. Historical partitions are skipped entirely.
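Conceptually, the native query Power BI sends for a single monthly partition looks like this (table and column names here are placeholders, not from a real model):

```sql
-- Illustrative native query for one monthly partition:
-- each partition covers a half-open date range, so every row
-- falls into exactly one partition
SELECT *
FROM dbo.FactSales
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2024-02-01';
```

During an incremental refresh, only the handful of queries covering the refresh window are sent; the historical partitions generate no source queries at all.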
Step-by-Step Configuration
Step 1: Create RangeStart and RangeEnd Parameters
In Power Query Editor, create two DateTime parameters:
- RangeStart: Set to a sample date (e.g., 1/1/2024). Type must be DateTime
- RangeEnd: Set to a later date (e.g., 2/1/2024). Type must be DateTime
These parameters must be named exactly RangeStart and RangeEnd (case-sensitive). Power BI uses them to generate partition boundaries. I cannot overstate how critical the exact naming is - I have debugged issues where someone named a parameter "rangeStart" (lowercase r) and spent 3 hours wondering why incremental refresh was not working.
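For reference, this is roughly how Power Query serializes a DateTime parameter in M when you create it through Manage Parameters (the date value is only a placeholder that Power BI later overrides with real partition boundaries):

```m
// RangeStart parameter query - the initial value is a placeholder;
// Power BI substitutes actual partition boundaries at refresh time
#datetime(2024, 1, 1, 0, 0, 0)
    meta [IsParameterQuery = true, Type = type datetime, IsParameterQueryRequired = true]
```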
Step 2: Filter Your Source Query
Apply a filter to your date column using these parameters. In Power Query, filter the date column to be "is after or equal to RangeStart" AND "is before RangeEnd." This filter enables Power BI to generate the appropriate SQL WHERE clause for each partition.
The filter must use the native query folding capabilities of your data source. If query folding breaks, the entire table will be extracted for each partition, negating the performance benefits. This is the number one mistake I see - teams configure incremental refresh, celebrate the setup, then wonder why refreshes are actually slower. Always verify query folding before configuring incremental refresh.
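A minimal sketch of a correctly filtered query, assuming a hypothetical SQL Server source and a `FactSales` table with an `OrderDate` column:

```m
let
    // Hypothetical server and database names
    Source = Sql.Database("sql-prod-01", "SalesDW"),
    FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    // >= RangeStart and < RangeEnd: the half-open interval guarantees
    // each row lands in exactly one partition (no duplicates, no gaps)
    Filtered = Table.SelectRows(
        FactSales,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    )
in
    Filtered
```

Because `Table.SelectRows` here folds against a SQL source, the filter becomes a native `WHERE` clause rather than a client-side scan.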
Step 3: Define the Refresh Policy
Right-click the table in the model view and select "Incremental refresh and real-time data." Configure:
- Archive data starting: How far back to keep historical data (e.g., 5 years). These partitions are never refreshed. Set this based on your business reporting needs - do not archive more data than users actually query
- Incrementally refresh data starting: The rolling window of data to refresh each cycle (e.g., 30 days). Only these partitions are processed during refresh
- Detect data changes: Optionally specify a column (like LastModifiedDate) to only refresh partitions where data actually changed. This is extremely powerful for datasets where historical records get updated (financial adjustments, healthcare claim corrections)
- Only refresh complete days: Prevents partial-day data issues by waiting for a full day before including it. Enable this unless you need intraday data
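Behind the dialog, these settings are stored on the table as a refresh policy. A 5-year archive with a 30-day refresh window serializes in the model's TOM/TMSL JSON roughly as follows (the `sourceExpression` is abbreviated; it holds the RangeStart/RangeEnd-filtered query from Step 2):

```json
"refreshPolicy": {
    "policyType": "basic",
    "rollingWindowGranularity": "year",
    "rollingWindowPeriods": 5,
    "incrementalGranularity": "day",
    "incrementalPeriods": 30,
    "sourceExpression": [
        "let",
        "    Source = ... // RangeStart/RangeEnd-filtered query, abbreviated",
        "in",
        "    Source"
    ]
}
```

Seeing the serialized form is useful later: when you manage partitions over the XMLA endpoint, these are the properties you will encounter in Tabular Editor or SSMS.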
Step 4: Publish and Configure
Publish to a Premium or Pro workspace. The first refresh processes all historical data (this may take hours for large datasets). Subsequent refreshes only process the incremental window, completing in minutes.
Critical first-refresh tip: Schedule the initial full refresh during off-hours (weekends or overnight). I have seen organizations publish during business hours, trigger the initial refresh, and bring their gateway to its knees for 4+ hours. For datasets over 100M rows, consider loading historical data in batches using XMLA endpoint and TMSL scripts.
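A batch load via TMSL can be sketched like this, run from SSMS or another XMLA client against the workspace endpoint (the database, table, and partition names below are placeholders; actual partition names come from the refresh policy's generated partitions):

```json
{
    "refresh": {
        "type": "full",
        "objects": [
            {
                "database": "SalesModel",
                "table": "FactSales",
                "partition": "2020"
            }
        ]
    }
}
```

Issuing one such command per historical partition, spaced out over off-hours, keeps gateway memory pressure manageable compared with a single monolithic first refresh.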
Real-World Performance Impact
| Scenario | Full Refresh | Incremental Refresh | Improvement |
|---|---|---|---|
| 50M row sales table (5 years) | 45 minutes | 4 minutes | 91% faster |
| 100M row IoT sensor data | 2 hours | 8 minutes | 93% faster |
| 20M row financial transactions | 25 minutes | 3 minutes | 88% faster |
| 500M row healthcare claims | 6.5 hours | 12 minutes | 97% faster |
| 80M row retail POS data | 1.5 hours | 6 minutes | 93% faster |
The improvement ratio increases with dataset size. For datasets under 1M rows, incremental refresh adds complexity without meaningful benefit.
Advanced Configuration Options
Real-Time Data with DirectQuery: In Premium workspaces, combine incremental refresh with a DirectQuery partition for the most recent data. Historical data uses fast Import mode while the current partition queries the source directly for real-time results. I use this pattern for executive dashboards that need last-hour freshness without compromising performance on historical trend analysis.
Custom Partition Granularity: By default, Power BI creates daily, monthly, quarterly, or yearly partitions based on your refresh window. XMLA endpoint access (Premium) allows custom partition management for fine-grained control. For one retail client, we created hourly partitions for the current week and daily partitions for everything else.
Detect Data Changes: Configure a "watermark" column (like LastModifiedDate) so Power BI only refreshes partitions where data actually changed. This further reduces refresh times when only a few partitions have new data. In one financial services implementation, this reduced the effective refresh from 30 partitions to 2-3 partitions per cycle.
Managing Partitions via XMLA: For Premium workspaces, use the XMLA endpoint with tools like Tabular Editor or SSMS to inspect, merge, or manually refresh specific partitions. This is essential for handling edge cases like backdated data corrections that fall outside the normal refresh window.
Query Folding Requirements
Incremental refresh depends on query folding - the ability for Power Query to push filter operations back to the data source as native SQL. If query folding breaks, the entire dataset is pulled through the gateway before filtering. This is not just a performance issue - it can cause gateway crashes and timeout failures.
Sources with reliable query folding: SQL Server, Azure SQL, Synapse Analytics, Oracle, PostgreSQL, Snowflake, Databricks SQL, Google BigQuery, Amazon Redshift
Sources where folding may not work: Flat files (CSV, Excel), web APIs, SharePoint lists, some ODBC connections, any source after a non-foldable Power Query step
Verify query folding by right-clicking a step in Power Query and checking whether "View Native Query" is available. If it is greyed out, folding has broken at that step, and every step after the break also loses folding. Common folding breakers include:
- Table.Buffer or List.Buffer calls
- Custom M functions without query folding delegation
- Merging with non-foldable queries
- Certain type conversions applied before the filter step
Pro tip: Always apply the RangeStart/RangeEnd filter as early as possible in your query steps, ideally right after the source connection.
Troubleshooting Common Issues
- First refresh takes too long: Expected behavior - the initial load processes all historical data. Schedule during off-hours
- Refresh fails with timeout: Check gateway timeout settings. Large initial loads may exceed default timeouts
- Data not appearing for current day: Enable "Only refresh complete days" if your source updates throughout the day to avoid partial data
- Row counts do not match: Verify your RangeStart/RangeEnd filters include boundary conditions correctly (>= and <). This is the most common data accuracy issue I encounter
- Gateway memory pressure: Monitor gateway memory during refresh and consider dedicated gateway clusters for large implementations
- Partition management errors: If you see "cannot determine partitions" errors, verify the RangeStart/RangeEnd parameters have the correct DateTime type
Incremental Refresh vs Full Refresh Decision Matrix
| Factor | Full Refresh | Incremental Refresh |
|---|---|---|
| Dataset size | Under 1M rows | Over 5M rows |
| Refresh duration | Under 5 minutes | Over 10 minutes |
| Data source | Files, APIs | SQL databases |
| Query folding | Not available | Verified working |
| Historical data changes | Frequent full rewrites | Append-only or watermark |
| Complexity tolerance | Low | Medium to high |
Frequently Asked Questions
Does incremental refresh require Power BI Premium?
Basic incremental refresh works with Power BI Pro. However, advanced features like real-time DirectQuery partitions, XMLA endpoint partition management, and detect-data-changes require Power BI Premium or Premium Per User (PPU) licensing.
How much can incremental refresh reduce my refresh times?
Most organizations see 80-95% reduction in refresh times. A dataset that takes 45 minutes for full refresh typically completes in 3-5 minutes with incremental refresh configured. The improvement depends on the ratio of new data to historical data and your partition granularity.
What happens if my data source does not support query folding?
Without query folding, Power BI must download the entire dataset through the gateway for each partition, which can actually make refreshes slower than without incremental refresh. Verify query folding works by checking "View Native Query" in Power Query before configuring incremental refresh. SQL-based sources typically support folding; flat files and APIs typically do not.