Dataflows Gen2 Migration Guide: Upgrading Power BI Dataflows to Microsoft Fabric

Migrate from Power BI Dataflows Gen1 to Dataflows Gen2 in Fabric for enhanced performance, Spark transformations, and OneLake integration.

By Errin O'Connor, Chief AI Architect

Dataflows Gen2 in Microsoft Fabric is the next-generation cloud-based data transformation engine that replaces Power BI Dataflows Gen1 with Apache Spark processing, native OneLake storage, and full Fabric ecosystem integration. If you are planning a Dataflows Gen2 migration, the key steps are: inventory your Gen1 dataflows, assess compatibility, run side-by-side validation, and switch consumers once outputs match. I have led over 40 Dataflows Gen2 migrations across enterprise clients in healthcare, finance, and manufacturing, and this guide covers every lesson learned along the way.

The migration from Gen1 to Gen2 is not optional for organizations serious about Microsoft Fabric. Gen1 dataflows continue to work, but they receive no new features, cannot output to OneLake, and lack the Spark-based processing that makes Gen2 transformations 5-10x faster on large datasets. In my experience, organizations that delay migration accumulate technical debt that makes the eventual transition harder. One financial services client waited 18 months and ended up with 87 Gen1 dataflows so tightly coupled that migration took 3x longer than it would have at the 30-dataflow mark.

Gen1 vs Gen2: Key Differences

Understanding what changed helps plan an effective migration. Here is a comprehensive comparison:

| Feature | Gen1 (Power BI) | Gen2 (Fabric) |
| --- | --- | --- |
| Storage | Internal Azure Data Lake (Power BI only) | OneLake Delta tables (accessible everywhere) |
| Processing | Power Query mashup engine only | Mashup engine + Apache Spark |
| Scheduling | Power BI built-in refresh | Fabric Data Pipelines orchestration |
| Monitoring | Basic success/failure/duration | Spark logs, step-level timing, Monitoring Hub |
| Output | Power BI internal storage only | Lakehouse, Warehouse, KQL databases |
| Incremental | Basic incremental refresh | Spark-based flexible incremental processing |
| Compute | Power BI capacity | Fabric capacity (CU-based) |

Storage is the most impactful difference. Gen1 stores transformed data in an internal Azure Data Lake that is only accessible through Power BI. Gen2 stores data as Delta tables in OneLake, making it accessible from Lakehouses, Warehouses, Notebooks, and any tool that can read Delta format. This single change opens up your transformed data to the entire Fabric ecosystem.

Processing Engine: Gen1 uses the Power Query mashup engine exclusively. Gen2 uses the mashup engine for most transformations but can also leverage Apache Spark for heavy processing. I have measured this hybrid approach delivering 8x performance improvement on a 200M-row healthcare claims dataset that took 4 hours in Gen1 and 28 minutes in Gen2.

Scheduling: Gen1 uses Power BI's built-in refresh scheduling. Gen2 integrates with Fabric Data Pipelines, enabling complex orchestration - trigger dataflow after upstream pipeline completes, chain multiple dataflows with conditional logic, and implement retry policies. For one manufacturing client, we replaced 12 separate Gen1 schedules with a single pipeline that orchestrates all dataflows with dependency awareness and automatic retry on transient failures.
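The retry-on-transient-failure behavior described above can be sketched in a few lines. This is an illustrative Python sketch only: in Fabric you configure retry count and interval declaratively on the pipeline activity rather than writing code, and `flaky_refresh` is a simulated stand-in for a dataflow refresh call.

```python
import time

def run_with_retry(step, max_attempts=3, base_delay=1.0):
    """Retry a refresh step with exponential backoff on transient failures.

    Illustrative sketch of a pipeline retry policy; Fabric pipelines expose
    this as activity settings, not hand-written code.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except RuntimeError:  # stand-in for a transient refresh error
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a refresh that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky_refresh():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "succeeded"

result = run_with_retry(flaky_refresh, max_attempts=3, base_delay=0)
print(result)
```

The exponential backoff (delay doubling per attempt) matters for transient source-system issues: immediate retries tend to hit the same outage.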

Monitoring: Gen1 provides basic refresh history (success/failure/duration). Gen2 provides detailed Spark execution logs, step-level timing, and integration with Fabric Monitoring Hub for comprehensive observability. You can now see exactly which transformation step is slow, which is a game-changer for debugging.

Migration Assessment

Before migrating, inventory your Gen1 dataflows thoroughly. I recommend creating a spreadsheet with every dataflow and scoring each on four dimensions.

Complexity Audit: Catalog each dataflow by number of entities (tables), transformation complexity, data volume, refresh frequency, and downstream dependencies (which datasets connect to this dataflow). In my experience, organizations typically have 3-5 "mission critical" dataflows that feed executive dashboards and 20-50 departmental dataflows with fewer dependencies. Prioritize accordingly.

Compatibility Check: Most Power Query M transformations work identically in Gen2. However, some features have differences:

  • Custom connectors may need updating for Fabric compatibility
  • On-premises gateway connections work differently in Fabric
  • Enhanced compute engine features from Gen1 are replaced by Spark in Gen2
  • Some M functions that relied on Gen1-specific behavior may need adjustment

I have found that 85-90% of Gen1 M code migrates without changes. The remaining 10-15% typically involves gateway configurations and custom connector updates.

Dependency Mapping: Document which Power BI datasets, reports, and other dataflows depend on each Gen1 dataflow. Migration must maintain these connections or establish new ones. Draw a dependency graph - even a simple one in Visio or draw.io saves hours of debugging later.
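Once the dependency graph is documented, it also tells you the order to migrate in: upstream dataflows before their consumers. A minimal sketch, assuming you capture the graph as a dict mapping each dataflow to the dataflows it reads from (the names here are hypothetical examples, not from any real inventory):

```python
from collections import defaultdict, deque

def migration_order(dependencies):
    """Topologically sort dataflows so upstream dataflows migrate first.

    dependencies: dict mapping dataflow name -> list of dataflows it reads
    from. Raises if the graph contains a cycle, which itself is a finding
    worth fixing before migration.
    """
    indegree = defaultdict(int)
    dependents = defaultdict(list)
    nodes = set(dependencies)
    for flow, upstreams in dependencies.items():
        nodes.update(upstreams)
        for up in upstreams:
            dependents[up].append(flow)
            indegree[flow] += 1
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for d in dependents[node]:
            indegree[d] -= 1
            if indegree[d] == 0:
                queue.append(d)
    if len(order) != len(nodes):
        raise ValueError("Cycle detected in dataflow dependencies")
    return order

# Hypothetical inventory: sales_mart reads two upstream dataflows.
deps = {
    "sales_mart": ["raw_sales", "dim_customers"],
    "dim_customers": ["raw_crm"],
}
print(migration_order(deps))
```

Even for a graph you drew by hand in Visio or draw.io, a quick script like this catches accidental cycles that a diagram can hide.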

Capacity Assessment: Gen2 dataflows run on Fabric capacity (CU-based). Estimate your required capacity by benchmarking Gen1 refresh durations and data volumes. A rough rule: F64 capacity handles 20-30 concurrent Gen2 dataflow refreshes for mid-sized datasets (1-50M rows each).
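That rough rule can be turned into a back-of-envelope sizing helper. Everything numeric here is an assumption restated from the rule above, not an official Microsoft sizing figure: 25 concurrent mid-sized refreshes per F64 is an assumed midpoint of the 20-30 range, and the helper simply rounds up to the next F-SKU tier.

```python
# F-SKU capacity-unit sizes relevant at this scale (F64 and up).
SKUS = [64, 128, 256, 512, 1024, 2048]

def estimate_fabric_sku(concurrent_refreshes, refreshes_per_f64=25):
    """Back-of-envelope Fabric SKU estimate from the rough rule above.

    refreshes_per_f64=25 is an assumed midpoint of the 20-30 range for
    mid-sized datasets (1-50M rows); benchmark your own Gen1 refresh
    durations before committing to a capacity.
    """
    needed_cu = 64 * concurrent_refreshes / refreshes_per_f64
    for cu in SKUS:
        if cu >= needed_cu:
            return f"F{cu}"
    return f"F{SKUS[-1]}"

print(estimate_fabric_sku(18))   # fits on a single F64
print(estimate_fabric_sku(40))   # needs the next tier up
```

Treat the output as a starting point for a proof-of-concept capacity, then validate with real refresh telemetry from the Monitoring Hub.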

Migration Strategy

Approach 1: Side-by-Side Migration (Recommended)

Create Gen2 versions alongside Gen1, validate outputs match, then switch consumers:

  1. Create a new Gen2 dataflow in a Fabric workspace
  2. Copy the Power Query M code from Gen1 entities to Gen2 entities
  3. Configure Gen2 output destination (Lakehouse table recommended)
  4. Run both Gen1 and Gen2 on the same schedule for 1-2 weeks
  5. Compare outputs to verify data matches (row counts, checksums, spot-check values)
  6. Redirect consuming datasets to the Gen2 output
  7. Decommission Gen1 after confirmed stability (keep disabled for 30 days as rollback)

This is the safest approach but requires temporary double processing. I use this for every production dataflow - the extra cost of 2 weeks of parallel processing is negligible compared to the risk of a bad migration breaking executive dashboards.
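The output comparison in step 5 can be partly automated. A minimal sketch, assuming you can extract both the Gen1 and Gen2 outputs as rows (the tuples below are illustrative, not a real dataflow output): it combines an exact row count with an order-independent checksum, so the two engines can return rows in different order and still compare equal.

```python
import hashlib

def table_fingerprint(rows):
    """Row count plus an order-independent XOR of per-row hashes.

    Lets Gen1 and Gen2 extracts compare equal even when the engines return
    rows in a different order. Caveat: XOR cancels out pairs of identical
    rows, so keep the exact row count alongside it and still spot-check
    values manually, as step 5 recommends.
    """
    digest = 0
    for row in rows:
        row_bytes = "|".join(repr(v) for v in row).encode("utf-8")
        digest ^= int.from_bytes(hashlib.sha256(row_bytes).digest()[:8], "big")
    return len(rows), digest

gen1_rows = [(1, "Alice", 120.5), (2, "Bob", 99.0)]
gen2_rows = [(2, "Bob", 99.0), (1, "Alice", 120.5)]  # same data, new order
print(table_fingerprint(gen1_rows) == table_fingerprint(gen2_rows))
```

Run this after every parallel refresh during the 1-2 week validation window and alert on any mismatch, rather than comparing once at the end.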

Approach 2: In-Place Upgrade

For simple dataflows with straightforward transformations and low business impact:

  1. Note the Gen1 dataflow configuration (M code, schedule, connections)
  2. Delete the Gen1 dataflow
  3. Create a Gen2 dataflow with the same logic
  4. Reconfigure downstream datasets to connect to the new output location
  5. Test and validate within 24 hours

Faster but riskier - no parallel validation period. I only recommend this for dev/test dataflows or dataflows that feed non-critical reports.

Approach 3: Phased Migration (Enterprise Scale)

For organizations with dozens or hundreds of dataflows, this is the only practical approach:

  • Phase 1 (Weeks 1-2): Migrate standalone dataflows with no dependencies (typically 30-40% of total)
  • Phase 2 (Weeks 3-4): Migrate dataflows that consume Phase 1 outputs
  • Phase 3 (Weeks 5-8): Migrate complex dataflows with custom connectors or gateway dependencies
  • Phase 4 (Weeks 9-10): Decommission Gen1 infrastructure and clean up

For a recent financial services client with 87 Gen1 dataflows, we completed the phased migration in 10 weeks with zero production incidents.
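Phase assignment follows mechanically from the dependency graph: a dataflow's phase is one more than the deepest phase among its upstreams, so standalone dataflows land in Phase 1 and consumers follow. A sketch with hypothetical dataflow names:

```python
def migration_phase(flow, upstreams, memo=None):
    """Phase = 1 + deepest phase among a dataflow's upstream dataflows.

    upstreams: dict mapping dataflow -> list of dataflows it reads from.
    Dataflows absent from the dict are treated as standalone (Phase 1).
    """
    if memo is None:
        memo = {}
    if flow not in memo:
        memo[flow] = 1 + max(
            (migration_phase(u, upstreams, memo) for u in upstreams.get(flow, [])),
            default=0,
        )
    return memo[flow]

# Hypothetical inventory; names are illustrative only.
upstreams = {
    "raw_sales": [],
    "dim_customers": ["raw_crm"],
    "sales_mart": ["raw_sales", "dim_customers"],
}
phases = {f: migration_phase(f, upstreams) for f in upstreams}
print(phases)
```

In practice you would then move dataflows with custom connectors or gateway dependencies into the later Phase 3 bucket regardless of their computed depth, since those carry the most migration risk.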

Common Migration Challenges

Gateway Changes: Gen1 dataflows using on-premises data gateways may need gateway reconfiguration for Fabric. Verify gateway compatibility with Fabric before migrating. I have seen organizations skip this step and lose 2 weeks troubleshooting gateway connectivity issues mid-migration.

Output Format Changes: Gen1 outputs are consumed differently than Gen2 Lakehouse tables. Datasets connecting to Gen1 dataflows need reconfiguration to read from Lakehouse tables via Direct Lake or Import mode. This is the most time-consuming step for datasets with complex relationships.

Scheduling Differences: Gen1 refresh schedules do not migrate automatically. Recreate schedules in Fabric or configure pipeline-based triggering. Document all schedules before starting migration.

Custom Functions: Gen1 dataflows with shared Power Query functions must recreate those functions in the Gen2 environment. If you have a library of reusable M functions, migrate those first as a foundation.

Incremental Refresh: Gen1 incremental refresh policies need translation to Gen2 equivalents. Gen2 offers more flexible incremental processing through Spark, which often means you can improve the incremental strategy during migration rather than simply replicating it.

Data Type Mismatches: Delta table types are stricter than Gen1 internal storage. I have encountered issues where Gen1 silently handled mixed types in a column (some rows text, some numeric) that Gen2 rejects. Audit data types before migration.
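A pre-migration type audit can be automated against a sample extract. A minimal sketch, assuming you can pull a representative sample of rows from each Gen1 entity (the sample data below is illustrative): it flags any column whose non-null values mix Python types, which is exactly the pattern a Delta table schema will reject.

```python
def mixed_type_columns(rows, columns):
    """Flag columns whose non-null values span more than one type.

    This is the mixed-type pattern Gen1 storage tolerated silently but a
    strict Delta table schema rejects. Nulls are ignored; they are valid
    in any Delta column type.
    """
    seen = {c: set() for c in columns}
    for row in rows:
        for col, value in zip(columns, row):
            if value is not None:
                seen[col].add(type(value).__name__)
    return {c: sorted(t) for c, t in seen.items() if len(t) > 1}

# Illustrative sample: 'amount' mixes text and numeric values.
sample_rows = [(1, "100"), (2, 200), (3, None)]
print(mixed_type_columns(sample_rows, ["id", "amount"]))
```

Run the audit on each entity's sample before migration and add explicit type-conversion steps in Power Query for any flagged column, rather than discovering the rejection on the first Gen2 refresh.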

Post-Migration Optimization

After successful migration, take advantage of Gen2 capabilities that were not available in Gen1:

  • Lakehouse Integration: Gen2 outputs are instantly queryable from Lakehouse SQL endpoint - no additional loading required. This alone saved one client 6 hours of daily processing time
  • Pipeline Orchestration: Replace simple schedules with pipeline-driven execution for complex dependency chains. Add error handling, retry logic, and notification steps
  • Spark Processing: For large-volume transformations (100M+ rows), switch from mashup engine to Spark notebooks for 10x performance improvement. We migrated a 500M-row healthcare dataset from mashup to Spark and reduced processing from 8 hours to 45 minutes
  • Monitoring: Use Fabric Monitoring Hub for detailed execution insights across all dataflows. Set up alerts for refresh failures and performance degradation
  • Cost Optimization: Gen2 on Fabric capacity is often 30-40% cheaper than equivalent Gen1 on Premium capacity, especially when you consolidate multiple Gen1 Premium workspaces onto shared Fabric capacity

Migration Checklist

Use this checklist to track your migration progress:

  • Inventory all Gen1 dataflows (count, complexity, dependencies)
  • Document downstream consumers for each dataflow
  • Verify gateway compatibility with Fabric
  • Estimate Fabric capacity requirements
  • Create Fabric workspace and Lakehouse for Gen2 outputs
  • Migrate and validate Phase 1 (standalone dataflows)
  • Migrate and validate Phase 2 (dependent dataflows)
  • Migrate and validate Phase 3 (complex dataflows)
  • Reconfigure all downstream datasets
  • Decommission Gen1 dataflows (disable, then delete after 30 days)
  • Optimize Gen2 dataflows with Spark and pipeline orchestration

Frequently Asked Questions

What are the main differences between Dataflows Gen1 and Gen2?

Dataflows Gen2 (Fabric) adds several capabilities beyond Gen1 (Power BI): (1) Storage location—Gen2 stores data in OneLake as Delta tables (queryable from lakehouses), Gen1 stores in internal Azure Data Lake, (2) Transformation engine—Gen2 supports both Power Query mashup engine AND Apache Spark notebooks, Gen1 only supports Power Query, (3) Scheduling—Gen2 integrates with Fabric pipelines and job scheduler, Gen1 uses Power BI refresh schedules, (4) Monitoring—Gen2 provides detailed Spark logs and metrics, Gen1 has limited diagnostics, and (5) Capacity—Gen2 runs on Fabric capacity (any F-SKU), Gen1 requires Power BI Premium. Both support Power Query M language and incremental refresh. Gen2 is the future direction—Gen1 will continue working but receive no new features. Migration is one-way—you cannot downgrade Gen2 back to Gen1.

Will my existing dataflows stop working if I do not migrate to Gen2?

No, Power BI Dataflows Gen1 will continue functioning indefinitely—Microsoft has not announced any deprecation timeline. However, all new Dataflow features are only available in Gen2 (Spark transformations, OneLake integration, advanced scheduling). For organizations staying on Power BI Premium (not migrating to Fabric), Gen1 dataflows remain fully supported. Migration is recommended when: (1) Moving to Microsoft Fabric capacity, (2) Needing Spark-based transformations for complex data engineering, (3) Wanting to query dataflow output in lakehouses/warehouses, or (4) Requiring advanced monitoring and pipeline orchestration. If current Gen1 dataflows meet your needs and you are staying on Power BI Premium, migration is optional. Most organizations migrate as part of broader Fabric adoption, not due to Gen1 limitations.

How long does a typical dataflow migration from Gen1 to Gen2 take?

Simple dataflows (5-10 tables, basic Power Query transformations, no complex dependencies) migrate in 1-2 hours per dataflow. Complex dataflows (50+ tables, computed entities, incremental refresh, linked entities) require 1-2 days each. Process: (1) Export Gen1 dataflow as Power Query template (30 minutes), (2) Create new Gen2 dataflow in Fabric workspace (15 minutes), (3) Import template and reconnect data sources (1-2 hours), (4) Reconfigure incremental refresh and parameters (30 minutes), (5) Test refresh and validate output (1-2 hours), (6) Update downstream Power BI reports to use Gen2 dataflow (30 minutes per report). Enterprises with dozens of dataflows typically migrate 2-3 per week, completing full migration in 2-3 months. Parallel migration possible for independent dataflows. Allow extra time for testing—always validate row counts and data quality match Gen1 before retiring old dataflow.

Tags: Microsoft Fabric, Dataflows, Migration, OneLake, Power Query
