Data Factory in Microsoft Fabric: Complete Pipeline Guide
Master Data Factory in Fabric — data pipelines, Dataflows Gen2, connectors, scheduling, monitoring, and migration from Azure Data Factory.
Data Factory in Microsoft Fabric provides visual data integration and orchestration capabilities for building ETL/ELT pipelines, and it sits at the center of enterprise data engineering in Fabric.
What Is Data Factory in Fabric?
Data Factory in Fabric brings the pipeline and dataflow capabilities of Azure Data Factory into the unified Fabric experience. It provides:
- Data Pipelines: Orchestrate data movement and transformation with a visual designer
- Dataflows Gen2: Self-service data preparation with Power Query (no-code)
- 200+ Connectors: Connect to cloud services, databases, files, and APIs
- Scheduling: Automate data loads on time-based or event-based triggers
- Monitoring: Track pipeline runs, errors, and performance
Data Pipelines vs Dataflows Gen2
| Feature | Data Pipelines | Dataflows Gen2 |
|---|---|---|
| Interface | Visual pipeline designer | Power Query editor |
| Skill level | Data engineer | Business analyst |
| Scale | Enterprise ETL/ELT | Self-service data prep |
| Coding | No-code + expressions | No-code + M language |
| Output | Any Fabric destination | Lakehouse or Warehouse |
| Scheduling | Time + event triggers | Time triggers |
| Error handling | Advanced (retry, branching) | Basic |
Building Your First Pipeline
Step 1: Create a Pipeline
1. Open a Fabric workspace
2. Click New → Data Pipeline
3. Name your pipeline (e.g., "Daily Sales Load")
Step 2: Add Activities
Drag activities from the toolbox onto the canvas:
- Copy Data: Move data between sources and destinations
- Dataflow: Run a Dataflows Gen2 transformation
- Notebook: Execute a Spark notebook
- Stored Procedure: Run SQL in a warehouse
- ForEach: Loop over a set of items
- If Condition: Branch based on expressions
- Web: Call REST APIs
- Wait: Pause execution
Step 3: Configure Copy Data
1. Set Source: connection, table/query, authentication
2. Set Destination: Lakehouse table, warehouse table, or files
3. Configure mapping: column mapping, data types
4. Set performance: parallel copies, staging
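Conceptually, the column mapping configured above is a source-to-destination rename applied to each record. A minimal Python sketch (all column names and values here are hypothetical):

```python
# Hypothetical source-to-destination column mapping, as you would
# configure it in a Copy Data activity's mapping tab.
mapping = {"CustID": "customer_id", "OrderDt": "order_date", "Amt": "amount"}

def apply_mapping(row: dict, mapping: dict) -> dict:
    """Rename source columns to destination columns for one record."""
    return {dest: row[src] for src, dest in mapping.items()}

row = {"CustID": 7, "OrderDt": "2024-01-05", "Amt": 19.99}
print(apply_mapping(row, mapping))
```

In practice the Copy Data activity also handles data-type conversion per column; this sketch shows only the rename step.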
Step 4: Add Scheduling
1. Click Schedule on the pipeline toolbar
2. Set frequency: hourly, daily, weekly
3. Set time zone and start time
4. Enable/disable as needed
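Besides the built-in schedule, pipelines can be triggered on demand through the Fabric REST API's job scheduler. A hedged sketch: the endpoint shape and `jobType=Pipeline` value follow the public Fabric API, but verify against current documentation; the workspace ID, pipeline item ID, and token acquisition are placeholders.

```python
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"

def job_instances_url(workspace_id: str, pipeline_id: str) -> str:
    """Build the on-demand job endpoint for a pipeline item (assumed shape)."""
    return (f"{FABRIC_API}/workspaces/{workspace_id}"
            f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline")

def run_pipeline(workspace_id: str, pipeline_id: str, token: str) -> str:
    """POST an empty body to start a run; the Location response header
    points at the new run instance."""
    req = urllib.request.Request(
        job_instances_url(workspace_id, pipeline_id),
        data=b"{}",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Location"]
```

The bearer token comes from Microsoft Entra ID authentication against the Fabric API scope; scheduled runs need no such code, so this path is only for event-driven or ad hoc triggering from external systems.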
Step 5: Monitor
View pipeline runs in the Monitoring Hub:
- Run status (succeeded, failed, in progress)
- Duration and data volumes
- Error messages and retry counts
- Activity-level details
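Beyond the Monitoring Hub UI, run status can be checked programmatically by polling until the run leaves its in-progress state. A generic sketch: `fetch_status` is a placeholder for a real API call, and the terminal-state names are assumptions mirroring the statuses the Monitoring Hub displays.

```python
import time

# Assumed terminal states, mirroring what the Monitoring Hub shows
TERMINAL = {"Completed", "Failed", "Cancelled"}

def wait_for_run(fetch_status, timeout_s: float = 600, interval_s: float = 5) -> str:
    """Poll fetch_status() until a terminal state or timeout.

    fetch_status is a zero-argument callable returning the run's current
    status string (a placeholder for a real status-API call).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL:
            return status
        time.sleep(interval_s)
    raise TimeoutError("pipeline run did not finish in time")
```

A caller can raise an alert or retry downstream steps based on the returned status, which is the programmatic counterpart of the failure alerts recommended below.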
Dataflows Gen2: Self-Service ETL
Dataflows Gen2 use the Power Query interface familiar to Excel and Power BI users:
1. Create New → Dataflow Gen2
2. Connect to data source
3. Transform data (filter, merge, pivot, calculate)
4. Set destination (Lakehouse or Warehouse table)
5. Schedule refresh
Key advantage: Business analysts can build data pipelines without learning Spark or SQL.
See our Power Query guide for transformation techniques.
Migration from Azure Data Factory
If you're currently using Azure Data Factory (ADF):
What Migrates Easily
- Copy Data activities (same interface)
- Pipeline orchestration patterns
- Most connectors are available
- Scheduling and triggers
What Changes
- Linked Services → Connections (simplified)
- Integration Runtimes → Managed within Fabric
- Mapping Data Flows → Dataflows Gen2 (different engine)
- Storage → OneLake replaces ADLS
Migration Approach
1. Assess current ADF pipelines and prioritize by business value
2. Recreate high-priority pipelines in Fabric Data Factory
3. Test data quality and performance
4. Transition scheduling and decommission ADF pipelines
5. Retain ADF for unsupported scenarios (some connectors)
Best Practices
- Use Dataflows Gen2 for simple transformations — Don't over-engineer with pipelines when Power Query suffices
- Use pipelines for orchestration — Coordinate notebooks, stored procedures, and dataflows
- Implement medallion architecture — Bronze (raw) → Silver (cleaned) → Gold (business-ready)
- Monitor actively — Set up alerts for pipeline failures
- Version control — Use Fabric Git integration for pipeline version history
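The medallion flow recommended above can be sketched as a single notebook step. This pandas example stands in for the PySpark you would typically run in a Fabric notebook (the table contents are made up; in a real notebook you would read and write Lakehouse tables instead of in-memory frames):

```python
import pandas as pd

# Bronze: raw extract, duplicates and invalid rows included (made-up data)
bronze = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "amount": [100.0, 100.0, -5.0, 40.0],
})

# Silver: deduplicate on the business key and drop invalid amounts
silver = bronze.drop_duplicates("order_id").query("amount > 0")

# Gold: business-ready aggregate
gold_total = silver["amount"].sum()
print(gold_total)
```

A pipeline would typically run one notebook (or dataflow) per layer, with the bronze load first, so a failure in cleansing never corrupts the raw copy.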
Our Microsoft Fabric consulting team specializes in data pipeline design and migration from Azure Data Factory. Contact us for a migration assessment.
Frequently Asked Questions
What is the difference between Data Factory in Fabric and Azure Data Factory?
Data Factory in Fabric is a simplified, SaaS version of Azure Data Factory integrated into the Fabric platform. It shares the same pipeline design interface but uses Fabric connections instead of linked services, stores output in OneLake instead of ADLS, and benefits from unified Fabric governance and billing. Azure Data Factory remains available as a standalone Azure service for organizations not yet on Fabric or needing specific ADF features not yet in Fabric.
When should I use Dataflows Gen2 vs Data Pipelines?
Use Dataflows Gen2 when: business analysts need to prepare data without coding, transformations are straightforward (filter, merge, calculate), and the output goes to a single Lakehouse or Warehouse table. Use Data Pipelines when: you need orchestration (coordinate multiple steps), require error handling with retries and branching, need to call notebooks or stored procedures, or are building complex multi-step ETL processes.
Can Data Factory in Fabric connect to on-premises data sources?
Yes, through the on-premises data gateway. Install the gateway on a Windows server with access to your on-premises databases (SQL Server, Oracle, SAP, file shares), then configure the connection in Fabric. The gateway acts as a secure bridge between your on-premises network and Fabric cloud. For enterprise deployments, configure gateway clustering for high availability with 2-3 gateway servers.