Fabric ETL Framework
Stop hand-coding pipelines for every new data source. Fabric ETL is a metadata-driven, medallion-architecture framework built on Microsoft Fabric. Declare your data flow once in the Registry — the framework handles ingestion, quality, orchestration, and delivery to Power BI automatically.
Every new data source means new custom code. Every schema change breaks something. Every deployment is a manual ritual. Sound familiar?
Each new source gets a near-identical notebook. Multiply that by 10 clients and you have a maintenance nightmare: one schema change means touching every file by hand.
Data quality errors get buried. A null value slips through, a format changes upstream, and nobody knows until the report is wrong. By then the damage is done.
Promoting from DEV to production is a manual checklist. Changes are undocumented. Rollbacks are painful. Every release is a gamble.
Four layers. One consistent flow. Configured in code — not wired by hand every time.
Layer 1
The single source of truth for your entire data platform. Define each source table once — schema, column mappings, data quality rules, merge strategy, Gold transformations, dependencies — all in one Python object.
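To make the idea of "one Python object per source table" concrete, here is a minimal sketch of what a Registry entry could look like. The class name `TableConfig` and every field name are hypothetical, chosen for illustration only; the framework's actual Registry API is not shown on this page.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableConfig:
    """Hypothetical Registry entry. All field names are illustrative assumptions."""

    name: str
    source_schema: str
    load_type: str = "incremental"          # assumed options: "full" or "incremental"
    merge_keys: tuple[str, ...] = ()         # columns used to match rows on merge
    column_mappings: dict[str, str] = field(default_factory=dict)  # source -> target
    dq_rules: tuple[dict, ...] = ()          # declarative data quality rules
    depends_on: tuple[str, ...] = ()         # tables that must load first
    scd_type: int = 1                        # 2 switches on history tracking


# One declaration carries schema, mappings, quality rules and dependencies.
patients = TableConfig(
    name="patients",
    source_schema="provider_a",
    merge_keys=("patient_id",),
    column_mappings={"PAT_ID": "patient_id", "GEB_DATUM": "birth_date"},
    dq_rules=(
        {"type": "uniqueness", "column": "patient_id", "blocking": True},
        {"type": "range", "column": "age", "min": 0, "max": 120, "blocking": False},
    ),
    depends_on=("providers",),
    scd_type=2,
)
```

The point of the design is that adding a source becomes a data change rather than a code change: a new entry like `patients` is all the pipeline needs to read.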
Layers 2 & 3
Raw data lands in Bronze as-is. The Silver layer standardises, validates, and deduplicates it. The Gold layer transforms Silver into Power BI-ready dimensional models — star schemas with aggregations and business logic already applied.
Built-in
Six rule types enforced automatically on every run: type casting, regex patterns, value ranges, uniqueness checks, row count thresholds, and Dutch BSN checksum validation. Blocking rules stop bad data; non-blocking rules log warnings with a full audit trail.
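As an illustration of two of the ideas above, the sketch below shows the standard Dutch BSN "eleven test" and one possible way to treat blocking and non-blocking rules. The `Rule` class and `apply_rules` function are hypothetical names, and the choice to drop a row on a blocking failure is an assumption; the real framework may halt the load or quarantine rows instead.

```python
from dataclasses import dataclass
from typing import Callable


def is_valid_bsn(value: str) -> bool:
    """Eleven test: weights 9..2 for the first eight digits, -1 for the last."""
    if len(value) != 9 or not (value.isascii() and value.isdigit()):
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(int(digit) * weight for digit, weight in zip(value, weights))
    return total % 11 == 0


@dataclass
class Rule:
    """Hypothetical rule definition, for illustration only."""

    name: str
    check: Callable[[dict], bool]
    blocking: bool


def apply_rules(rows: list[dict], rules: list[Rule]) -> tuple[list[dict], list[tuple]]:
    """Return (rows that passed, audit log of every violation)."""
    passed, audit = [], []
    for row in rows:
        keep = True
        for rule in rules:
            if not rule.check(row):
                audit.append((rule.name, rule.blocking, row))
                if rule.blocking:
                    keep = False  # assumption: a blocking failure drops the row
        if keep:
            passed.append(row)
    return passed, audit


rules = [
    Rule("bsn_checksum", lambda r: is_valid_bsn(r["bsn"]), blocking=True),
    Rule("age_range", lambda r: 0 <= r["age"] <= 120, blocking=False),
]
rows = [
    {"bsn": "111222333", "age": 40},   # passes both rules
    {"bsn": "123456789", "age": 30},   # fails the checksum: blocked
    {"bsn": "123456782", "age": 150},  # fails the range: kept, but logged
]
passed, audit = apply_rules(rows, rules)
```

Here `passed` keeps the first and third rows, while `audit` records both violations, so the warning on the third row is never silently lost. Production validators often add further BSN checks beyond the checksum.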
Layer 4
The orchestration engine reads the Registry, builds a dependency graph automatically, and executes parallel batches of up to 8 jobs. Tables run in the right order, every time, without manual coordination. SCD Type 2 history tracking is a config option, not a custom build.
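The dependency-driven batching described above can be sketched with Python's standard `graphlib` module. This is an illustrative approximation, not the framework's engine: the function name `plan_batches` and the table names are assumptions, and only the batch limit of 8 comes from the text.

```python
from graphlib import TopologicalSorter

MAX_PARALLEL = 8  # batch size limit stated above


def plan_batches(
    dependencies: dict[str, set[str]], max_parallel: int = MAX_PARALLEL
) -> list[list[str]]:
    """Group tables into ordered batches; each batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: list[list[str]] = []
    while sorter.is_active():
        # Everything whose dependencies are already done is ready to run.
        ready = sorted(sorter.get_ready())
        for start in range(0, len(ready), max_parallel):
            batches.append(ready[start:start + max_parallel])
        sorter.done(*ready)
    return batches


# Each key lists the tables it depends on (hypothetical table names).
dependencies = {
    "silver_patients": {"bronze_patients"},
    "silver_visits": {"bronze_visits"},
    "gold_fact_visits": {"silver_patients", "silver_visits"},
}
batches = plan_batches(dependencies)
# [['bronze_patients', 'bronze_visits'],
#  ['silver_patients', 'silver_visits'],
#  ['gold_fact_visits']]
```

Because the order is derived from the declared dependencies, adding a table never requires editing a hand-written run order.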
From raw source data to a Power BI dashboard — without writing a single one-off pipeline.
Define your source system in the Registry — name, schema, load type, quality rules, transformations. One config object, everything declared.
The framework reads the Registry, resolves dependencies, and runs Bronze → Silver → Gold automatically in parallel batches. No manual wiring.
Power BI connects directly to Gold layer semantic models. Dashboards refresh on schedule. Stakeholders get accurate, up-to-date data every time.
The framework was designed with enterprise outcomes in mind — not just developer convenience.
Add a Registry entry, run the pipeline. No new notebooks, no new deployments.
Every run logged. Every data quality violation captured. Your audit trail is always complete.
Track how every record changed over time. Retroactive analysis and regulatory compliance built in.
Upstream added a column? The framework detects and applies it automatically. No emergency fixes.
Sensitive columns hashed or partially masked at the framework level. GDPR-ready (AVG in the Netherlands) by design.
DEV → TST automated via Azure Pipelines. ACC → PRD requires a manual approval gate. No surprises in production.
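The column-level protection mentioned above, hashing or partial masking, can be sketched with the standard library alone. The function names and the salting scheme are assumptions for illustration; the framework's actual masking options are not documented on this page.

```python
import hashlib


def hash_value(value: str, salt: str) -> str:
    """One-way hash: the same input and salt always give the same token,
    so hashed columns can still be joined on."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_partial(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide everything except the last `visible` characters."""
    if len(value) <= visible:
        return fill * len(value)  # too short to reveal anything safely
    return fill * (len(value) - visible) + value[-visible:]


masked_iban = mask_partial("NL91ABNA0417164300")  # '**************4300'
token = hash_value("111222333", salt="example-salt")  # 64 hex characters
```

Hashing suits identifiers that must stay joinable across tables, while partial masking suits values a person still needs to recognise, such as the tail of an account number.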
The framework powers a live healthcare data integration platform in the Netherlands, ingesting multi-format source data from provider systems, enforcing healthcare-grade data quality rules (including BSN checksum validation), and serving dimensional models for downstream analytics — all running automatically on Microsoft Fabric.
The framework is built on Microsoft Fabric and the broader Azure data platform — and Jonan is certified across the full stack.
DP-600 · DP-700 · DP-203 · PL-300 · AZ-900
Whether you're starting a new Fabric project or inheriting a brittle pipeline mess — let's talk about bringing this framework to your organisation.
Get in Touch