Microsoft Fabric Architecture: A Deep Dive Into the Unified Analytics Platform
A comprehensive technical deep dive into Microsoft Fabric's architecture, covering OneLake, lakehouses, warehouses, data pipelines, and how the components work together.
Al Rafay Consulting · ARC Team · Updated February 25, 2026
What Makes Fabric Different
Microsoft Fabric is not simply a rebrand of existing Azure data services. It represents a fundamental architectural shift: a unified analytics platform built on a single data lake (OneLake) with integrated compute engines for every analytics workload — data engineering, data warehousing, real-time intelligence, data science, and business intelligence.
Before Fabric, building an enterprise analytics platform on Azure meant provisioning and integrating separate services: Azure Data Factory for pipelines, Azure Synapse for warehousing, Azure Databricks for data science, Azure Data Lake Storage for storage, and Power BI for visualization. Each service had its own security model, storage format, metadata catalog, and billing structure.
Fabric collapses this complexity into a single SaaS platform with shared governance, a single security model, and one copy of the data.
OneLake: The Foundation
What Is OneLake?
OneLake is Fabric’s built-in data lake, analogous to OneDrive for data. Key characteristics:
- One lake per tenant — every Fabric capacity in your organization shares a single OneLake instance
- Built on Azure Data Lake Storage Gen2 — full ADLS Gen2 compatibility with hierarchical namespace
- Open format — data is stored in Delta Lake (Parquet + transaction log), an open standard accessible by any tool
- Automatic provisioning — no storage accounts to create, manage, or secure separately
- Multi-cloud shortcuts — create references to data in AWS S3 or Google Cloud Storage without copying it
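Because OneLake speaks the ADLS Gen2 API and stores tables in the open Delta format, any Delta-capable engine can read lakehouse data directly. Here is a minimal PySpark sketch, run from a Fabric notebook (which handles authentication for you); the workspace and lakehouse names are hypothetical:

```python
# Minimal sketch: read a lakehouse Delta table through OneLake's
# ADLS Gen2-compatible endpoint. "SalesAnalytics" and "SalesLakehouse"
# are hypothetical names; substitute your own workspace and lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-created in Fabric notebooks

# OneLake paths follow the pattern:
# abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<type>/Tables/<table>
path = (
    "abfss://SalesAnalytics@onelake.dfs.fabric.microsoft.com/"
    "SalesLakehouse.Lakehouse/Tables/orders"
)

orders = spark.read.format("delta").load(path)
orders.show(5)
```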
OneLake Hierarchy
OneLake organizes data in a familiar hierarchy:
```
OneLake (Tenant)
├── Workspace: Sales Analytics
│   ├── Lakehouse: SalesLakehouse
│   │   ├── Tables/
│   │   │   ├── customers (Delta table)
│   │   │   ├── orders (Delta table)
│   │   │   └── products (Delta table)
│   │   └── Files/
│   │       ├── raw_data/
│   │       └── staging/
│   ├── Warehouse: SalesWarehouse
│   ├── Semantic Model: SalesModel
│   └── Report: SalesReport
├── Workspace: HR Analytics
│   └── ...
└── Workspace: Finance
    └── ...
```
Shortcuts: Zero-Copy Data Access
Shortcuts are one of OneLake’s most powerful features. A shortcut is a reference to data stored elsewhere — another OneLake location, an ADLS Gen2 account, an S3 bucket, or a GCS bucket. The data is not copied; the shortcut provides a transparent access layer.
Use cases for shortcuts:
- Cross-workspace data sharing — the Finance workspace creates a shortcut to the Sales lakehouse’s customer table without duplicating data
- Hybrid cloud — reference data in AWS S3 alongside data in OneLake without ETL
- Legacy migration — create shortcuts to existing ADLS Gen2 storage while gradually migrating to Fabric-native items
- Data mesh — each domain owns its data in its workspace and publishes shortcuts for cross-domain access
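To make the cross-workspace sharing case concrete, here is a sketch against the OneLake shortcuts REST API. Every GUID and the bearer token are placeholders you would obtain from your own tenant and Microsoft Entra ID:

```python
# Sketch: create a cross-workspace shortcut with the OneLake shortcuts
# REST API. All GUIDs and the bearer token are placeholders; acquire a
# token via Microsoft Entra ID (e.g., MSAL) with the Fabric API scope.
import requests

FINANCE_WS = "<finance-workspace-guid>"        # workspace consuming the data
FINANCE_LAKEHOUSE = "<finance-lakehouse-guid>"

body = {
    "path": "Tables",   # where the shortcut appears in the consuming lakehouse
    "name": "customers",
    "target": {
        "oneLake": {
            "workspaceId": "<sales-workspace-guid>",  # data-owning workspace
            "itemId": "<sales-lakehouse-guid>",
            "path": "Tables/customers",
        }
    },
}

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{FINANCE_WS}"
    f"/items/{FINANCE_LAKEHOUSE}/shortcuts",
    headers={"Authorization": "Bearer <token>"},
    json=body,
)
resp.raise_for_status()
```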
Compute Engines
Data Engineering (Apache Spark)
Fabric provides a fully managed Apache Spark environment for large-scale data processing:
- Spark pools — auto-scaling Spark clusters with Starter pools (instant start) and custom pools
- Notebooks — interactive development with Python, Scala, R, and SparkSQL
- Spark job definitions — scheduled batch jobs for production pipelines
- VS Code integration — develop locally and deploy to Fabric
- Libraries — install custom Python and R packages per workspace or session
- Lakehouse integration — Spark reads and writes directly to OneLake Delta tables
Key architectural detail: Fabric Spark uses a shared metadata layer. When Spark writes a Delta table to a lakehouse, that table is immediately visible in the SQL endpoint and Power BI — no ETL, no sync, no delay.
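A short sketch of what this looks like in practice: a notebook attached to a lakehouse writes a Delta table, and that table is immediately queryable from the SQL endpoint and usable in Power BI. Table and column names are illustrative:

```python
# Sketch: write a Delta table from a Fabric notebook attached to a lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

orders = spark.createDataFrame(
    [(1, "2026-01-15", 250.0), (2, "2026-01-16", 99.5)],
    schema="order_id INT, order_date STRING, amount DOUBLE",
)

# saveAsTable registers the table in the lakehouse's shared metadata layer,
# making it visible to the SQL endpoint and Power BI without any sync step.
orders.write.format("delta").mode("overwrite").saveAsTable("orders")
```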
Data Warehouse
Fabric’s data warehouse provides a full T-SQL experience:
- T-SQL compatibility — familiar SQL Server syntax for queries, views, stored procedures, and functions
- Distributed query engine — columnar storage with distributed processing for fast analytical queries
- Cross-database queries — query across warehouses and lakehouses within the same workspace
- Clone tables — zero-copy table clones for development and testing
- Time travel — query data as it existed at a previous point in time
Warehouse vs. Lakehouse SQL Endpoint:
| Feature | Warehouse | Lakehouse SQL Endpoint |
|---|---|---|
| Write operations (INSERT, UPDATE, DELETE) | Full DML support | Read-only |
| Stored procedures | Yes | No |
| T-SQL views | Read-write | Read-only |
| Security (row-level, column-level) | Full support | Limited |
| Performance optimization | Manual (statistics, indexes) | Automatic |
| Best for | Complex transformations, reporting | Exploration, ad-hoc queries |
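As a sketch of the clone and time-travel features above, the following connects to a warehouse's TDS endpoint with pyodbc and issues Fabric T-SQL; the server name, database, and timestamp are placeholders:

```python
# Sketch: exercise warehouse features over the TDS endpoint with pyodbc.
# Fabric warehouses accept standard SQL Server connections with Entra ID auth.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=SalesWarehouse;"
    "Authentication=ActiveDirectoryInteractive;"
)
cur = conn.cursor()

# Zero-copy clone for a dev/test copy of a table (Fabric T-SQL syntax).
cur.execute("CREATE TABLE dbo.orders_dev AS CLONE OF dbo.orders;")

# Time travel: query the table as it existed at a past point in time.
cur.execute(
    "SELECT COUNT(*) FROM dbo.orders "
    "OPTION (FOR TIMESTAMP AS OF '2026-02-01T00:00:00');"
)
print(cur.fetchone()[0])
conn.commit()
```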
Real-Time Intelligence
Fabric’s real-time analytics engine (based on Azure Data Explorer / Kusto) handles streaming data:
- Eventstreams — ingest data from Azure Event Hubs, Kafka, IoT Hub, custom apps, and database change data capture
- KQL Database — store and query streaming data with Kusto Query Language
- Real-time dashboards — live visualizations that update as data arrives
- Reflexes — event-driven triggers that fire actions based on data conditions
Architecture pattern for IoT monitoring:
```
IoT Devices → Event Hubs → Eventstream → KQL Database → Real-Time Dashboard
                               │
                               └→ Reflex (alert when temperature > threshold)
```
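One way to feed such a pipeline: an Eventstream with a custom-endpoint source exposes an Event Hubs-compatible connection string, so producers can push events with the azure-eventhub SDK. The connection string, entity name, and payload below are placeholders:

```python
# Sketch: push telemetry into an Eventstream custom-endpoint source using
# its Event Hubs-compatible connection string (copied from the Eventstream
# UI; the values below are placeholders).
import json
from azure.eventhub import EventData, EventHubProducerClient

producer = EventHubProducerClient.from_connection_string(
    conn_str="<eventstream-custom-endpoint-connection-string>",
    eventhub_name="<entity-name>",
)

batch = producer.create_batch()
batch.add(EventData(json.dumps({"device_id": "sensor-01", "temperature": 87.2})))
producer.send_batch(batch)
producer.close()
```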
Data Science
Fabric integrates data science capabilities:
- Notebooks with MLflow tracking for experiment management
- Models registered in the Fabric model registry
- PREDICT function for scoring models directly in T-SQL or Spark
- Semantic link for accessing Power BI semantic models from notebooks
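A minimal sketch of experiment tracking with the MLflow integration in a Fabric notebook; the experiment name, model, and synthetic data are all illustrative:

```python
# Sketch: MLflow experiment tracking in a Fabric notebook (mlflow is
# preinstalled; the experiment appears as a Fabric experiment item).
import mlflow
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

mlflow.set_experiment("churn-experiments")

X, y = make_classification(n_samples=500, n_features=8, random_state=42)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Registering the model surfaces it as a Fabric ML model item.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-model")
```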
Data Factory
Fabric Data Factory handles data movement and orchestration:
- Dataflows Gen2 — Power Query-based transformations (low-code ETL)
- Data pipelines — orchestration workflows similar to Azure Data Factory
- Copy activity — move data from 100+ source connectors into OneLake
- Scheduling — cron-based and event-based pipeline triggers
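Beyond scheduled and event-based triggers, pipelines can be run programmatically. A sketch using the Fabric REST API's on-demand job endpoint; the GUIDs and token are placeholders:

```python
# Sketch: trigger a data pipeline on demand via the Fabric job scheduler API.
import requests

WORKSPACE = "<workspace-guid>"
PIPELINE = "<pipeline-item-guid>"

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE}"
    f"/items/{PIPELINE}/jobs/instances?jobType=Pipeline",
    headers={"Authorization": "Bearer <token>"},
)
resp.raise_for_status()  # 202 Accepted; the job status URL is in the headers
print(resp.headers.get("Location"))
```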
The Medallion Architecture in Fabric
The medallion architecture (Bronze → Silver → Gold) is the recommended pattern for organizing data in Fabric:
Bronze Layer (Raw)
- Raw data ingested from source systems without transformation
- Stored in the lakehouse Files section or as Delta tables
- Full fidelity — preserves the exact data as received from the source
- Serves as the system of record for auditability
Silver Layer (Curated)
- Cleaned, validated, and standardized data
- Data type enforcement, null handling, deduplication
- Conformed dimensions (consistent customer IDs, product codes)
- Stored as Delta tables in the lakehouse Tables section
Gold Layer (Business-Ready)
- Aggregated, enriched data optimized for specific business use cases
- Star schema models for reporting (fact and dimension tables)
- Pre-computed metrics and KPIs
- Consumed by Power BI semantic models and operational applications
```
Sources → Bronze (raw) → Silver (curated) → Gold (business-ready) → Power BI
              ↑                 ↑                   ↑                   ↑
         Data Factory       Spark/SQL           Spark/SQL           Semantic
          Pipelines/        Notebooks           Notebooks            Models
          Dataflows
```
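A typical bronze-to-silver step, sketched in PySpark with hypothetical table and column names: enforce types, handle nulls, and deduplicate before writing a silver Delta table.

```python
# Sketch: bronze-to-silver transformation. Assumes a default lakehouse is
# attached to the notebook, so relative Tables/ paths resolve against it.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

bronze = spark.read.format("delta").load("Tables/bronze_orders")

silver = (
    bronze
    .withColumn("order_date", F.to_date("order_date"))   # enforce types
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("order_id").isNotNull())                # drop bad rows
    .dropDuplicates(["order_id"])                         # deduplicate
)

silver.write.format("delta").mode("overwrite").saveAsTable("silver_orders")
```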
Security Architecture
Workspace-Level Security
- Roles: Admin, Member, Contributor, Viewer
- Microsoft Entra ID integration — assign roles to users, groups, and service principals
- Workspace identity — managed identity for accessing external resources
Item-Level Security
- OneLake data access roles — control who can read specific folders and tables within a lakehouse
- Row-level security (RLS) — filter data rows based on user identity in warehouses and semantic models
- Column-level security (CLS) — restrict access to sensitive columns in warehouses
- Object-level security (OLS) — hide tables and columns from users in semantic models
- Dynamic data masking — mask sensitive data (SSN, email) in query results
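As an illustration of row-level security in a warehouse, this sketch creates a predicate function and security policy over the TDS endpoint using standard SQL Server RLS syntax, which Fabric warehouses support; all object and column names are hypothetical:

```python
# Sketch: define row-level security in a warehouse. The predicate maps each
# user's Entra sign-in name to their own rows via a hypothetical owner_upn
# column on dbo.orders.
import pyodbc

conn = pyodbc.connect("<warehouse-connection-string>")  # as in the earlier sketch
cur = conn.cursor()

cur.execute("CREATE SCHEMA Security;")
cur.execute("""
CREATE FUNCTION Security.fn_rls_orders(@owner AS VARCHAR(128))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS fn_result WHERE @owner = USER_NAME();
""")
cur.execute("""
CREATE SECURITY POLICY Security.OrdersFilter
ADD FILTER PREDICATE Security.fn_rls_orders(owner_upn) ON dbo.orders
WITH (STATE = ON);
""")
conn.commit()
```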
Network Security
- Private endpoints — access Fabric from within your VNet
- Managed private endpoints — connect Fabric to data sources via private network
- Trusted workspace access — allow specific workspaces to access secured storage accounts
Capacity and Licensing
Fabric uses capacity-based licensing measured in Capacity Units (CUs):
| SKU | CU | Spark VCores | Max Memory | Approximate Monthly Cost |
|---|---|---|---|---|
| F2 | 2 | 4 | 6 GB | ~$262 |
| F4 | 4 | 8 | 12 GB | ~$525 |
| F8 | 8 | 16 | 24 GB | ~$1,049 |
| F16 | 16 | 32 | 48 GB | ~$2,099 |
| F32 | 32 | 64 | 96 GB | ~$4,197 |
| F64 | 64 | 128 | 192 GB | ~$8,395 |
All workloads — Spark, SQL, pipelines, Power BI — share the same capacity pool. Fabric uses bursting to temporarily exceed your CU allocation for short workloads, and smoothing to average consumption over time, so you do not need to provision for peak demand.
Pause and resume: Fabric capacities can be paused when not in use, which is particularly valuable for development and testing environments.
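Pausing can be automated through Azure Resource Manager. A sketch with the requests library; the subscription, resource group, capacity name, token, and api-version are assumptions to adapt to your environment:

```python
# Sketch: pause a Fabric capacity through the ARM suspend endpoint.
# All identifiers are placeholders; an ARM-scoped bearer token is required.
import requests

url = (
    "https://management.azure.com/subscriptions/<sub-id>"
    "/resourceGroups/<rg>/providers/Microsoft.Fabric/capacities/<capacity>"
    "/suspend?api-version=2023-11-01"
)
resp = requests.post(url, headers={"Authorization": "Bearer <arm-token>"})
resp.raise_for_status()  # use the corresponding /resume endpoint to restart
```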
Design Patterns
Pattern 1: Centralized Data Platform
One team manages a central Fabric capacity with shared lakehouses:
- Central data engineering team owns ingestion, transformation, and governance
- Business teams consume data through Power BI semantic models
- Simple governance but potential bottleneck on the central team
Pattern 2: Data Mesh
Each domain owns its data in dedicated workspaces:
- Sales, Finance, HR each have their own workspace with lakehouses
- Domains publish curated datasets via OneLake shortcuts
- Central governance team manages tenant-level policies and standards
- More autonomous but requires mature data culture
Pattern 3: Hub and Spoke
Hybrid approach:
- Central hub workspace manages shared reference data and enterprise-wide transformations
- Domain-specific spoke workspaces for team-level analytics
- Shortcuts connect spokes to hub data without duplication
Migration Path
For organizations running Azure Synapse, Azure Data Factory, or Databricks:
1. Assessment — inventory existing pipelines, datasets, and consumers
2. OneLake shortcuts — create shortcuts to existing ADLS Gen2 storage for immediate access in Fabric
3. Migrate pipelines — convert ADF pipelines to Fabric Data Factory pipelines (high compatibility)
4. Migrate notebooks — port Databricks or Synapse Spark notebooks to Fabric (Spark API compatible)
5. Migrate Power BI — existing Power BI workspaces can be assigned to Fabric capacities
6. Decommission — retire legacy services once Fabric workloads are validated
Next Steps
Microsoft Fabric’s unified architecture reduces the complexity and cost of enterprise analytics, but realizing its benefits requires thoughtful design — from OneLake organization and security to capacity planning and workload optimization.
Al Rafay Consulting helps organizations design, migrate to, and optimize Microsoft Fabric deployments. Whether you are evaluating Fabric for a new project or planning a migration from existing Azure data services, our team brings hands-on experience across every Fabric workload.
Al Rafay Consulting
ARC Team
AI-powered Microsoft Solutions Partner delivering enterprise solutions on Azure, SharePoint, and Microsoft 365.