Data Engineering · 14 min read

Deep Dive into Medallion Architecture

Advanced implementation patterns for Bronze, Silver, and Gold medallion architecture layers in Microsoft Fabric. Data quality, SCD handling, and optimization.

By Errin O'Connor, Chief AI Architect

<h2>Deep Dive into Medallion Architecture in Microsoft Fabric</h2>

<p>The medallion architecture (Bronze, Silver, Gold) is a data engineering pattern that progressively refines raw data into business-ready analytics assets through three distinct layers, each serving a specific purpose in data quality, transformation, and consumption. This architecture has become the de facto standard for organizing enterprise lakehouses in Microsoft Fabric, providing clear boundaries between raw ingestion, validated transformation, and curated business models.</p>

<p>Having implemented medallion architectures for organizations processing everything from healthcare claims to retail point-of-sale transactions, I can tell you that the concept is deceptively simple but the execution demands careful planning. The organizations that succeed treat each layer as a contract — with defined schemas, quality gates, and ownership — while those that fail treat it as a loose suggestion and end up with a disorganized data swamp wearing a "lakehouse" label.</p>

<h2>Why Medallion Architecture Matters in 2026</h2>

<p>Before medallion architecture became standard, most data lakes followed a "dump everything in and figure it out later" approach. The result was predictable: data scientists spent 80% of their time finding, understanding, and cleaning data rather than analyzing it. Business users had no confidence in the numbers because different teams applied different transformation logic to the same raw data, producing conflicting reports.</p>

<p>The medallion pattern solves this by creating explicit layers with clear responsibilities:</p>

<ul> <li><strong>Bronze (Raw):</strong> Exact copies of source data, preserving the original format and all records including errors. This is your audit trail and reprocessing safety net.</li> <li><strong>Silver (Validated):</strong> Cleaned, deduplicated, conformed data with standardized schemas. This is where data engineering applies business rules and quality checks.</li> <li><strong>Gold (Business):</strong> Aggregated, modeled, business-ready datasets optimized for specific consumption patterns — dashboards, ML models, or operational reports.</li> </ul>

<p>In Microsoft Fabric, this architecture maps naturally to the Lakehouse and Warehouse workloads, with Delta tables providing ACID transactions, time travel, and schema enforcement at every layer.</p>

<h2>Bronze Layer: Raw Ingestion Done Right</h2>

<p>The bronze layer captures source data in its original form with minimal transformation. The cardinal rule is: never lose data at this layer. Every record from every source system lands here, including malformed records, duplicates, and late-arriving data.</p>

<p><strong>Bronze layer design principles:</strong></p>

<ul> <li><strong>Append-only ingestion:</strong> Never overwrite bronze data. Use append mode so you maintain a complete history of everything received from source systems. This enables reprocessing when business rules change.</li> <li><strong>Include metadata columns:</strong> Add ingestion_timestamp, source_system, source_file_name, and batch_id to every bronze table. These columns are invaluable for debugging data issues, tracking lineage, and identifying when a source system sent bad data.</li> <li><strong>Preserve original data types:</strong> If the source sends dates as strings, store them as strings in bronze. Type conversion happens in silver. This prevents ingestion failures when source systems send unexpected formats.</li> <li><strong>Partition by ingestion date:</strong> This enables efficient incremental processing in the silver layer — the silver pipeline reads only new bronze partitions rather than rescanning the entire table.</li> </ul>
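<p>The metadata-column principle above can be sketched in plain Python. In a Fabric notebook this would typically be a PySpark <code>withColumn()</code> chain; dicts are used here to keep the example self-contained, and the field values (<code>erp</code>, the file name, the batch ID) are illustrative assumptions.</p>

```python
from datetime import datetime, timezone

def add_bronze_metadata(records, source_system, source_file_name, batch_id):
    """Attach the standard bronze metadata columns to every ingested record.

    Sketch only: a real Fabric pipeline would do this with PySpark's
    withColumn() while writing to a Delta table in append mode.
    """
    ingestion_timestamp = datetime.now(timezone.utc).isoformat()
    return [
        {
            **record,  # original source fields, types preserved as-is
            "ingestion_timestamp": ingestion_timestamp,
            "source_system": source_system,
            "source_file_name": source_file_name,
            "batch_id": batch_id,
        }
        for record in records
    ]

# Note the date stays a string in bronze; casting happens in silver.
rows = add_bronze_metadata(
    [{"cust_nm": "Acme", "order_dt": "2026-01-15"}],
    source_system="erp",
    source_file_name="orders_20260115.csv",
    batch_id="batch-0042",
)
```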

<p><strong>Fabric-specific bronze patterns:</strong></p>

<p>Use Fabric Data Factory pipelines or Spark notebooks for bronze ingestion. For real-time sources, Fabric Eventstream captures streaming data into bronze Delta tables. For file-based sources, use Lakehouse file shortcuts to reference files in OneLake or external storage (ADLS Gen2, S3) without copying data.</p>

<p>For high-volume ingestion scenarios, optimize your Spark jobs with the techniques in our <a href="/blog/fabric-spark-optimization">Spark optimization guide</a> to ensure bronze ingestion completes within your processing windows.</p>

<h2>Silver Layer: Where Data Engineering Happens</h2>

<p>The silver layer is where the heavy lifting occurs. This layer transforms raw bronze data into clean, validated, conformed datasets that multiple gold-layer consumers can trust. Getting silver right is the most important investment in your medallion architecture.</p>

<p><strong>Silver layer transformations:</strong></p>

<table>
<thead>
<tr><th>Transformation</th><th>Purpose</th><th>Example</th></tr>
</thead>
<tbody>
<tr><td>Schema enforcement</td><td>Standardize column names and types</td><td>Convert "cust_nm" to "customer_name" (STRING)</td></tr>
<tr><td>Deduplication</td><td>Remove duplicate records</td><td>Deduplicate by business key + timestamp</td></tr>
<tr><td>Data type casting</td><td>Convert to correct types</td><td>Parse date strings to DATE, amounts to DECIMAL</td></tr>
<tr><td>Null handling</td><td>Apply business rules for missing data</td><td>Default unknown regions to "Unassigned"</td></tr>
<tr><td>Referential integrity</td><td>Validate foreign key relationships</td><td>Flag orders with invalid customer IDs</td></tr>
<tr><td>Standardization</td><td>Normalize values across sources</td><td>Map "US," "USA," "United States" to "US"</td></tr>
<tr><td>Slowly changing dimensions</td><td>Track historical changes</td><td>Implement SCD Type 2 for customer addresses</td></tr>
</tbody>
</table>
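<p>The standardization row above can be sketched in a few lines of Python. The mapping itself is a hypothetical example; in practice the canonical values would live in a reference table in the silver lakehouse rather than in code.</p>

```python
# Hypothetical country-code mapping used for standardization.
COUNTRY_MAP = {"US": "US", "USA": "US", "United States": "US"}

def standardize_country(value):
    """Normalize free-form country values to a canonical code.

    Unmapped values pass through unchanged so a downstream quality gate
    can flag them, rather than silently dropping or guessing.
    """
    return COUNTRY_MAP.get(value.strip(), value.strip())

canonical = standardize_country("United States")
```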

<p><strong>Delta merge for incremental processing:</strong></p>

<p>The silver layer should use Delta MERGE operations to process only new or changed records from bronze. A typical pattern reads bronze records with ingestion_timestamp greater than the last silver processing watermark, applies transformations, and merges results into the silver table. This incremental approach processes only changed data rather than recomputing the entire silver table on every run.</p>
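<p>A minimal plain-Python simulation of this watermark pattern is sketched below. In a real Fabric notebook the upsert would be a Delta <code>MERGE INTO</code> on the business key; here a dict keyed by <code>order_id</code> (an assumed key name) stands in for the silver table so the control flow is easy to follow.</p>

```python
def incremental_merge(silver, bronze_records, watermark, key="order_id"):
    """Simulate the bronze-to-silver incremental pattern: read only records
    with ingestion_timestamp past the last watermark, then upsert by
    business key (the dict stands in for a Delta MERGE)."""
    new_watermark = watermark
    for rec in bronze_records:
        ts = rec["ingestion_timestamp"]
        if ts <= watermark:
            continue  # already processed in a previous run
        silver[rec[key]] = rec  # matched -> update, not matched -> insert
        new_watermark = max(new_watermark, ts)
    return new_watermark

silver = {}
bronze = [
    {"order_id": 1, "amount": 10, "ingestion_timestamp": "2026-01-01T00:00:00"},
    {"order_id": 1, "amount": 12, "ingestion_timestamp": "2026-01-02T00:00:00"},
]
wm = incremental_merge(silver, bronze, watermark="")
```

<p>Running the same batch again with the advanced watermark is a no-op, which is what makes the pattern safe to rerun.</p>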

<p><strong>Data quality gates:</strong></p>

<p>Implement explicit quality checks between bronze and silver. I use a quarantine pattern: records that fail validation rules land in a separate quarantine table with the failure reason, while clean records proceed to silver. This prevents bad data from propagating to gold while preserving the problematic records for investigation. Common quality checks include:</p>

<ul> <li>Required fields are not null</li> <li>Dates fall within reasonable ranges (not year 1900 or year 2099)</li> <li>Numeric values are within expected bounds</li> <li>Categorical values match allowed value lists</li> <li>Cross-field consistency (end_date >= start_date)</li> </ul>
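<p>The quarantine pattern described above can be expressed as a small routing function. The rule names and record fields below are illustrative assumptions; the shape of the logic — failed records carry a <code>failure_reason</code>, clean records continue to silver — is the point.</p>

```python
def apply_quality_gate(records, rules):
    """Route records through a quarantine pattern: records failing any
    rule land in quarantine with the failure reasons attached, while
    clean records proceed to silver."""
    clean, quarantine = [], []
    for rec in records:
        failures = [name for name, check in rules if not check(rec)]
        if failures:
            quarantine.append({**rec, "failure_reason": "; ".join(failures)})
        else:
            clean.append(rec)
    return clean, quarantine

# Hypothetical rules mirroring the checks listed above.
rules = [
    ("customer_id_not_null", lambda r: r.get("customer_id") is not None),
    ("end_after_start", lambda r: r["end_date"] >= r["start_date"]),
]
clean, quarantined = apply_quality_gate(
    [
        {"customer_id": 7, "start_date": "2026-01-01", "end_date": "2026-02-01"},
        {"customer_id": None, "start_date": "2026-01-01", "end_date": "2025-12-01"},
    ],
    rules,
)
```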

<h2>Gold Layer: Business-Ready Analytics</h2>

<p>The gold layer contains datasets optimized for specific business consumption patterns. Unlike silver — which aims to be a general-purpose clean data store — gold tables are purpose-built for their consumers. A gold table for a Power BI executive dashboard looks very different from a gold table feeding a machine learning model.</p>

<p><strong>Gold layer design patterns:</strong></p>

<p><strong>Star Schema for BI:</strong> For Power BI consumption, gold tables should follow star schema design — fact tables containing measures and foreign keys, surrounded by dimension tables containing descriptive attributes. This structure optimizes both Power BI import performance and DAX query patterns. See our guide on <a href="/blog/semantic-model-practices">semantic model best practices</a> for how gold layer design flows into Power BI.</p>

<p><strong>Aggregation Tables:</strong> Pre-aggregate common query patterns. If your executive dashboard always shows monthly revenue by region, create a gold table at that grain rather than forcing Power BI to aggregate millions of daily transaction rows at query time. This reduces dashboard load times from 30 seconds to under 2 seconds.</p>
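<p>The monthly-revenue-by-region example can be sketched as a simple pre-aggregation. In Fabric this would be a Spark groupBy writing a gold Delta table; the plain-Python version below shows the grain change from daily transactions to the (month, region) level the dashboard actually queries.</p>

```python
from collections import defaultdict

def build_monthly_revenue_gold(transactions):
    """Pre-aggregate daily transactions to the (month, region) grain
    consumed by the executive dashboard."""
    totals = defaultdict(float)
    for t in transactions:
        month = t["date"][:7]  # "YYYY-MM" from "YYYY-MM-DD"
        totals[(month, t["region"])] += t["revenue"]
    return [
        {"month": m, "region": r, "revenue": v}
        for (m, r), v in sorted(totals.items())
    ]

gold = build_monthly_revenue_gold([
    {"date": "2026-01-03", "region": "West", "revenue": 100.0},
    {"date": "2026-01-17", "region": "West", "revenue": 50.0},
    {"date": "2026-02-01", "region": "East", "revenue": 75.0},
])
```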

<p><strong>Feature Tables for ML:</strong> Machine learning models need wide, denormalized tables with engineered features. Gold tables for ML consumption should pre-compute features like "customer_lifetime_value," "days_since_last_purchase," and "rolling_30day_average" so data scientists can focus on modeling rather than feature engineering.</p>
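<p>As a small illustration of pre-computing such a feature, here is a sketch of <code>days_since_last_purchase</code>. A production feature pipeline would compute this per customer over the silver transaction table; the function signature is an assumption for the example.</p>

```python
from datetime import date

def days_since_last_purchase(purchase_dates, as_of):
    """Compute the days_since_last_purchase feature for one customer;
    None signals a customer with no purchase history."""
    if not purchase_dates:
        return None
    return (as_of - max(purchase_dates)).days

feature = days_since_last_purchase(
    [date(2026, 1, 1), date(2026, 1, 20)], as_of=date(2026, 2, 1)
)
```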

<p><strong>Denormalized Reporting Tables:</strong> Some reporting scenarios benefit from fully denormalized tables that combine dimensions and facts into a single wide table. This simplifies queries for Fabric SQL endpoints and non-Power-BI consumers like Excel direct query users.</p>

<h2>Implementing Medallion Architecture in Fabric Lakehouse</h2>

<p>Microsoft Fabric provides several options for organizing medallion layers:</p>

<p><strong>Option 1: Single Lakehouse, Multiple Schemas</strong></p> <p>Create bronze, silver, and gold schemas within a single Lakehouse. This is the simplest approach and works well for small-to-medium implementations. All data stays in one Lakehouse, simplifying security and management.</p>

<p><strong>Option 2: Separate Lakehouses per Layer</strong></p> <p>Create bronze_lakehouse, silver_lakehouse, and gold_lakehouse as separate Fabric items. This provides stronger isolation — you can assign different capacity, security policies, and access controls to each layer. The bronze lakehouse might allow data engineers full access, while the gold lakehouse is read-only for analysts.</p>

<p><strong>Option 3: Hybrid with Shortcuts</strong></p> <p>Use separate Lakehouses with OneLake shortcuts connecting them. The silver lakehouse references bronze tables via shortcuts (no data copy), and the gold lakehouse references silver tables similarly. This combines isolation benefits with the convenience of cross-lakehouse querying.</p>

<p>For most enterprise deployments, I recommend Option 2 (separate Lakehouses) because it provides the clearest security boundaries and enables independent scaling. The small additional management overhead is worth the governance clarity.</p>

<h2>Handling Cross-Cutting Concerns</h2>

<p><strong>Data Lineage:</strong> Track how data flows from bronze through silver to gold. Fabric's built-in lineage tracking helps, but I also recommend maintaining a metadata table that records processing timestamps, row counts, and transformation versions at each layer boundary. When a business user questions a number, you can trace it back through every transformation to the original source record.</p>

<p><strong>Schema Evolution:</strong> Source systems change their schemas regularly — new columns appear, data types change, columns get renamed. Bronze handles this naturally (append everything). Silver must handle schema evolution gracefully using Delta's schema evolution capabilities (mergeSchema option). Gold schemas should change through governed release processes because downstream reports depend on them.</p>
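<p>A simplified model of what <code>mergeSchema</code> does is sketched below: new columns are appended, while an incompatible type change on an existing column is rejected. This is deliberately reduced — real Delta schema evolution also handles nested types and certain type widenings — but it captures the behavior silver pipelines rely on.</p>

```python
def merge_schema(current, incoming):
    """Illustrate the mergeSchema behavior: append new columns, reject
    incompatible type changes on existing columns (simplified model)."""
    merged = dict(current)
    for col, dtype in incoming.items():
        if col in merged and merged[col] != dtype:
            raise TypeError(
                f"Incompatible type change for {col}: {merged[col]} -> {dtype}"
            )
        merged[col] = dtype
    return merged

silver_schema = {"customer_id": "bigint", "customer_name": "string"}
# Source system added a column; the merged schema simply gains it.
evolved = merge_schema(
    silver_schema, {"customer_name": "string", "loyalty_tier": "string"}
)
```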

<p><strong>Reprocessing:</strong> When you discover a bug in your silver transformation logic, you need to reprocess historical data. The bronze layer's complete history enables this — fix your silver pipeline, clear the affected silver partitions, and rerun from bronze. Without a complete bronze layer, this reprocessing is impossible, and you are stuck with incorrect historical data.</p>

<p><strong>Testing:</strong> Treat your medallion pipelines like software. Unit test individual transformations, integration test the bronze-to-silver-to-gold flow, and run data quality assertions after every pipeline execution. Fabric notebooks support assertion-based testing that can fail a pipeline run when quality thresholds are not met.</p>
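<p>An assertion-based quality check of the kind described above might look like the following sketch. The threshold and metric names are illustrative; the key idea is that a failed assertion fails the notebook cell, and therefore the pipeline run.</p>

```python
def assert_quality_thresholds(row_count, null_key_count, max_null_ratio=0.0):
    """Fail the pipeline run when a quality threshold is violated —
    the assertion-based notebook testing style described above."""
    assert row_count > 0, "silver table is empty"
    ratio = null_key_count / row_count
    assert ratio <= max_null_ratio, (
        f"null business keys: {ratio:.1%} exceeds {max_null_ratio:.1%}"
    )
    return True

# Counts would come from the just-written silver table in a real run.
ok = assert_quality_thresholds(row_count=1_000, null_key_count=0)
```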

<h2>Performance Optimization Across Layers</h2>

<p>Each layer has different performance characteristics and optimization strategies:</p>

<ul> <li><strong>Bronze:</strong> Optimize for write throughput. Use large batch sizes, minimize partition count, and disable unnecessary indexes. Write speed matters more than read speed at this layer.</li> <li><strong>Silver:</strong> Optimize for both read and write. Use Z-ordering on frequently filtered columns, compact small files regularly with OPTIMIZE, and implement efficient incremental processing.</li> <li><strong>Gold:</strong> Optimize aggressively for read performance. Use V-Order (Fabric's read-optimized format), Z-order on query-specific columns, pre-aggregate wherever possible, and tune partition sizes for your consumption patterns.</li> </ul>

<p>For connecting your gold layer to Power BI with optimal performance, explore <a href="/blog/real-time-analytics-fabric">real-time analytics in Fabric</a> and our guide on <a href="/blog/power-bi-direct-lake-mode-guide-2026">Direct Lake mode</a>, which reads gold Delta tables directly without import.</p>

<h2>Common Mistakes to Avoid</h2>

<ul> <li><strong>Skipping bronze:</strong> Teams that load directly into silver lose their reprocessing safety net. Always land raw data first.</li> <li><strong>Too many gold tables:</strong> Creating a separate gold table for every report leads to duplication and inconsistency. Build reusable gold datasets that serve multiple reports.</li> <li><strong>Silver = dump zone:</strong> Without clear transformation standards, silver becomes another unstructured mess. Define and enforce transformation patterns.</li> <li><strong>Ignoring data quality:</strong> Without quality gates between layers, errors in source data propagate all the way to executive dashboards. Implement validation at every layer boundary.</li> <li><strong>Overcomplicating the architecture:</strong> Some organizations add Platinum, Copper, and Diamond layers. Stick to three layers unless you have a very specific, documented reason to add more.</li> </ul>

<p>The medallion architecture is not just a technical pattern — it is an organizational framework that assigns clear responsibilities, quality standards, and ownership at each data transformation stage. Get the architecture right, and your entire analytics platform becomes more trustworthy, maintainable, and scalable.</p>

<h2>Frequently Asked Questions</h2>

<p><strong>Is medallion architecture required in Fabric?</strong></p>

<p>No, medallion is a recommended pattern but not required. It provides clear data lineage and progressive quality improvement. Smaller projects might simplify the layering, but enterprise implementations benefit from the structure.</p>

<p><strong>How do I handle data quality failures in medallion architecture?</strong></p>

<p>Implement quarantine tables in the bronze or silver layer. Records failing quality checks go to quarantine for review while valid data progresses. Track and resolve quarantined records through a defined process.</p>

Microsoft Fabric · Medallion Architecture · Data Engineering · Best Practices
