10+ years of experience in:
Designing and developing scalable Medallion Data Lakehouse architectures.
Expertise in data ingestion, transformation, and curation using Delta Lake and Databricks.
Experience integrating structured and unstructured data sources into star/snowflake schemas.
Building, automating, and optimizing complex ETL/ELT pipelines using Azure Data Factory (ADF), Databricks (PySpark, SQL, Delta Live Tables), and dbt.
Implementing orchestrated workflows and job scheduling in Azure environments.
Strong knowledge of relational databases (SQL Server, Synapse, PostgreSQL) and dimensional modeling.
Advanced SQL query optimization, indexing, partitioning, and data replication strategies.
Experience with Apache Spark, Delta Lake, and distributed computing frameworks in Azure Databricks.
Working with Parquet, ORC, and JSON formats for optimized storage and retrieval.
Deep expertise in Azure Data Lake Storage (ADLS), Azure Synapse Analytics, Azure SQL, Event Hubs, and Azure Functions.
Strong understanding of cloud security, RBAC, and data governance.
Proficiency in Python (PySpark), SQL, and PowerShell for data engineering workflows.
Experience with CI/CD automation (Azure DevOps, GitHub Actions) for data pipelines.
Implementing data lineage, cataloging, metadata management, and data quality frameworks.
Experience with Unity Catalog for managing permissions in Databricks environments.
Expertise in Power BI (DAX, data modeling, performance tuning).
Experience integrating Power BI with Azure Synapse and Databricks SQL Warehouses.
Familiarity with MLflow, AutoML, and embedding AI-driven insights into data pipelines.