April 23, 2025
Microsoft just dropped some major updates to Org Apps (Preview) in Power BI — and if you’re a BI leader or manager overseeing reporting at scale, this one’s for you. What Are Org Apps (Preview),
April 9, 2025
🚀 A New Way to Interact with Fabric Using Python As a data engineer working with Microsoft Fabric, you’ve probably spent time navigating APIs, managing authentication, and making repetitive API calls to fetch workspace details,
March 26, 2025
The AI industry stands at a crossroads. On one side, powerful models like GPT-4 dazzle with their capabilities but drain budgets with eye-watering costs. On the other, smaller models promise affordability but lack the sophistication
March 12, 2025
Artificial Intelligence (AI) is advancing rapidly, and Large Language Model (LLM) agents are leading this evolution. Unlike traditional AI systems limited to basic text generation, LLM agents now incorporate memory retention, strategic planning, and autonomous
February 26, 2025
SPARK IS MUCH CHEAPER AND FASTER TO EXECUTE THAN DATAFLOWS! I’m going to tell you why in this blog. Microsoft Fabric offers a powerful unified analytics platform where you can leverage various engines to design
February 12, 2025
In the data science and data engineering fields, Python is the dominant language for processing data. Most beginners are introduced to Pandas, which provides a way to work with DataFrames, allowing them to manipulate datasets
January 7, 2025
“Data is your organization’s most valuable asset – but who can access it? That’s where things get tricky.” TL;DR Problem: Broad permissions overexpose data, while fine-grained controls are hard to manage at scale, risking security
April 16, 2024
TL;DR: This article explores building a robust data lakehouse using the medallion architecture, which organizes data into three layers—Bronze for raw data ingestion, Silver for data transformation, and Gold for optimized data aggregation. Best practices
March 6, 2023
Notebooks in Synapse Azure Synapse Analytics’ most appealing feature at first glance is the Synapse Studio: one unified UX across data stores, notebooks and pipelines. The notebook experience is most appreciated among folks who read
January 29, 2023
TLDR; If you want to route all the outbound traffic from Databricks clusters to Azure Firewall, you need to create a UDR (User Defined Route) and add it to the subnet where
January 22, 2023
TLDR An IP access list is one of the ways to network-isolate Azure Databricks. It is a list of IP addresses that are allowed to access Azure Databricks. You can use this list to control access
January 8, 2023
Credit Thank you to Olivier Martin for your valuable insights and contributions to this post. Olivier Martin is a Microsoft Cloud Solution Architect for data analytics & AI. TLDR When creating a linked service to
September 6, 2022
TLDR; The documented minimum permissions required for using the synapsesql connector for Spark to read or write data from Synapse SQL Pools grant high privileges to Spark users even though the required operation is
November 5, 2021
Workload management for cloud data warehouses is one of the most important administrative tasks for DBAs/Data Engineers (depending on who manages the data warehouse). With cloud offerings fully or partially automating many other administrative tasks, the
October 26, 2021
TLDR; Don’t assume that your SQL MI database is protected from deletion by using Azure TLDR2; if you prefer video, here’s the demo Azure SQL Managed Instance (SQL MI) falls architecturally in the sweet spot
August 4, 2021
Last year I published a series of YouTube videos explaining Azure SQL Database networking with demos and whiteboarding. I used SQL DB as an example of a PaaS service; however, the same concepts apply
February 13, 2021
I recently had the pleasure of delivering a session at the Worldwide Software Architecture Summit. I see virtual-only events continuing to rise even beyond Covid, as many have now realized that the value/cost balance
October 26, 2020
Recently I needed to help a customer call the Databricks API, and since there are many ways to do this, I must start by scoping the scenario. This is Azure Databricks, not Databricks on another
September 13, 2020
In some cases you want to end the Azure Data Factory (ADF) pipeline execution based on logic in the pipeline itself. For example, when there’s no record coming from one of the input datasets
September 9, 2020
This is an updated version of my article at Medium.com, originally written in December 2019, as some changes have happened since then. An Azure Databricks workspace is a code authoring and collaboration workspace that can have one