Initially, I wanted this blog to be dedicated mainly to Power Query. Since that time many things have changed, disrupting my daily routines as a Power BI Developer. I started to work as Power BI Admin, and with introduction of Fabric, my area gets bigger and bigger every day. But the day has finally come, where I am happy to invite you to first article where I start sharing my Power Query knowledge.

In this very first article we will cover 3 Ws of Power QueryWhat is it, Why you should learn it and Where is it available. I hope you enjoy it.

How it all started for me?

Like many of you, I started my BI journey with Excel. All the data preparation was heavily dependent on Excel formulas and VBA. At that time, we’ve been using 2013 version of Excel, where Power Query was still a hidden add-on. I was taking over one of the reports from my colleague in Finance, and she showed me Power Query. Her previous manager inspired them to learn it, to help them in their daily tasks. At first, I was thinking that 3rd syntax in single tool is a bit too much (I didn’t know yet about Power Pivot either, which would be the 4th syntax), but after a short demo I quickly realized the value of that tool, and how much easier it is to handle data manipulation tasks comparing to VBA. I came back to my open space and told my manager that I start to learn Power Query. Although it wasn’t appreciated at all, it didn’t stop me from learning it and automating all my repetitive reporting. Not to mention using it on a daily basis to provide ad-hoc analysis. Even when started my journey with Power BI, I was mainly focusing on Power Query and data modelling, rather than making two ends meet using complex DAX. I am really excited that I can finally start sharing my love for this tool with others.

What is Power Query?

Power Query is a data integration engine provided by Microsoft. Integration process covers data extraction, transformation, and load from multiple sources (ETL). Power Query runs on a powerful M engine, that allows to complete complex tasks in seconds. It provides rich capabilities available to end users via User Interface (no-code approach), and over 700 functions for those who are willing to explore M language (low-code approach). At the time of this article, according to MS documentation Power Query provides 196 connectors, that allow to easily connect to various data sources. If nothing works for you, there is a Power Query SDK available, that you can use to build your own custom connector.

Why should you learn Power Query?

Imagine being given a task: you are supposed to provide the data showing top 5 customers with the highest sales records for last 5 years and present the data using a million unit. The problem is that data is stored in SharePoint (or other repository), there are over 500 excel files containing the data, data has been extracted manually by different people, therefore columns are not arranged in the same manner in all of the files and on top of that, all the excel files contain only ID information for your customers and you need to map it with data stored in another file. With Power Query you can:

  1. Quickly combine all the files together. If only you are relying on column headers, data will be combined based on column names, not based on column order.
  2. With combined data, you can quickly filter out the data that is not relevant for your analysis. Maybe you are supposed to focus only on specific Region or Country, etc. Not to forget that we must use only last 5 years of data.
  3. Using Gropy By you can quickly aggregate the sales amount for each customer.
  4. Using Merge, you can add data from other files/tables, to enrich your analysis and show Customer Name instead of Customer ID.
  5. Divide sales value by milion to satisfy the requirement how the data must be presented.

If you were using VBA, you know that this task would require quite a code to work properly. With Power Query I guarantee you could complete your analysis without writing a single line of code. And for those of you that are already experienced Power Query users, you could complete entire task in minutes. Still not convinced? There are also very important reasons why you should learn Power Query, that are related specifically to Power BI. I will cover them in next paragraph.

In next article I will cover first look at Power Query, and we will go through the demo scenario that I described in this section. Stay tuned 🙂

Why is Power Query so important for Power BI?

There are four reasons why. First one is very trivial. This is very first experience that you will use in order to get the data for your Power BI Report. Even if you are not performing any data enrichment/transformation/cleaning tasks, you still must connect to the data using one of the connectors available in Power Query. Second reason is much more important as it’s related to quality of your reports. Your end users expect not only beautiful reports, but also a good quality of data. It is extremely important to load clean, curated data to your reports, to provide maximum value for report consumers. Once you master Power Query, you will be able to accomplish all the transformation/cleaning tasks much faster. Third reason is about report performance. By delivering Power BI reports aligned with available best practices, you will find your Semantic Models working much smoother and more reliable, providing exceptional user experience. Last but not least (4th reason) we have the aspect of total cost that your organization pays for working with Power BI. It is related to Power BI Reports that are published to a Premium Workspaces (backed by a Premium or Fabric Capacity). Why is it important? Well, those are the resource that customers buy from Microsoft, and they are responsible to make sure that all your reporting runs smooth. Tenant and Capacity Admins can access a Fabric Capacity Metrics App provided by Microsoft to check how organizational Capacities are being used:

Figure 1. Example of Capacity consumption over time.

Let’s simplify this example and assume, that we talk about Power BI only for this chart (no other Fabric workloads included). What you see is marked with blue color are background activities, in nutshell everything related to your semantic models or dataflows refresh. Red color shows interactive consumption, mostly related to report consumption. While interactives are completely unpredictable, and depending on quality of reports may cause the spikes on chart as you can see, background activities are more stable, and not that easily disrupted by “noisy neighbors”. This is because Microsoft treats them differently in terms of their impact on Capacities, by smoothing the consumption over certain period of time. Interactive part is smoothed over 5 minutes (it’s not fixed time interval according to documentation), background activities are smoothed over 24 hours period. What it means for us is that it’s much easier to maintain the Capacities, as all the heavy lifting done during refreshes is spread evenly throughout the day.

Because of being more stable and predictable, background consumption increase can force you at some point of time to take actions, whether you must of think about reducing the consumption or scaling your capacities. The later means spending more money, which probably you want to avoid.

That is why this is extremely important to make sure that Power Query solutions behind your Semantic Models and Dataflows are optimized as much as possible. Even if achieved capacity saving is not big from single refresh perspective, remember that this number scales when multiplied by number of refreshes per day, and number of Semantic Models in a single Capacity.

That’s why Power Query skills are very important to keep your Power BI costs as low as possible.

Short history of Power Query

For all the geeks out there, who like to learn something about Power Query that they can share during the cocktail parties, here is a short history of this amazing tool. Many people think that it all started with Excel. I know I thought it started as an addon for Excel 2013. However, it all began in 2011, when Power Query was a project within Azure Experiments. This was an umbrella term for Microsoft’s innovation projects, aiming to explore new technologies integrated within Azure. This was also the case for Power Query – previously called as Data Explorer – as it was designed to deliver the insights, leveraging Azure capabilities. Fun fact, today we also have a tool in Azure called Azure Data Explorer, that allows to analyze large volumes of streaming data. But I am digressing.

Power Query was first introduced in Excel in 2012, so it was still within Excel 2010 version, and continued in Excel 2013 as an addon, meaning it wasn’t available out of the box, you needed to install it on your computer. As of Office 2016, Power Query was already an integral part of Excel.

In Parallel Microsoft was working on another product. In 2010, SQL Server Reporting Services Team picked up a Project Crescent. Public download was released in 2011. In 2013, project evolved and at that time it was built on top of 4 Excel addons: Power Query, Power Pivot, Power View and Power Maps. This time, name of product will already be familiar to you, as it was named Power BI for Office 365. This tool, renamed to simply Power BI, was first released to the public in 2015.

From Power BI perspective, next important step would happen around 2017-2018, when we saw first iterations of Dataflows. It was great decision to bring Power Query experience to Power BI Service, where users could build re-usable entities, instead of working in silos within individual Power BI Desktop files.

This is only small fragment of entire history, because (as you will find out in next paragraph) Power Query can be found not only in Excel and Power BI.

Where can you find Power Query?

This could be another reason why I think you should learn Power Query. It’s because this is ubiquitous tool in Microsoft Realm. Here is the full list that I managed to assemble from different sources:

  1. Microsoft Excel – although timeline for Excel and Power BI are not that far away, without any doubts Power Query was quickly adopted mainly by Excel community.
  2. Power BI Desktop – very obvious pick for a second spot on the list. Desktop tool (initially known as Power BI Designer) that couldn’t exist without Power Query.
  3. Power BI Report Server – on-prem version of Power BI for those who can’t or don’t want work in the cloud.
  4. Power BI Report Builder – for those of you who missed the info, as of March 2024, Power Query is also available for Paginated Reports, which is a big deal.
  5. Dataflows (Power BI Service) – Power Query online experience, great feature that helps create re-usable entities and avoid working in silos.
  6. Power Platform Dataflows – you often find that Power Query is available in Power Apps, Power Automate and Dynamics 365 (specifically for Customer Insights). However, if I were to point the Power Platform application that is tied to the Dataflows, it wouldn’t be any of the mentioned three. Dataflows for Power Platform is used to connect to external data sources and load them into Dataverse. Once you create a Dataflows within an Environment, it will be available for all three mentioned apps. What is interesting, we could re-use Power Platform Dataflows within Power BI as well.
  7. Azure Data Factory – as we already covered, it all started with Azure. Here, within Pipelines we have a Power Query activity, that is used to transform the data. There is also a Data flow activity, which is not related to Power Query at all.
  8. Fabric Data Factory – here we can find Dataflows Gen 2 activity, which is very similar to Dataflows. However, the most important difference is that here we can decided where we want to store the results of Dataflows transformation.
  9. SQL Server Integration Services – as a part of SSIS data flows, you can use a Power Query step to connect and transform the data before it is loaded to SQL Server.
  10. SQL Server Analysis Services – Get Data experience with the tool, similar to what now have for Paginated Reports.
  11. Power Query SDK for Visual Studio Code – extension for VS code related to development and testing Custom Connectors for Power Query.

Even if you say that point 9 and 10 are not relevant anymore, it is still a lot of areas where Power Query is used. How many did you know?

Conclusion

I hope this article helps to answer 3 questions addressed at the beginning. What is Power Query, Why should you learn it and Where is it available. This is only just the beginning and, as promised already, next article will cover the first look into Power Query in Power BI Desktop, and we will walk through basic demo scenario in Why should you learn Power Query section.

As always, thank you for reading and see you in next article 🙂

Picture of Pawel Wrona

Pawel Wrona

Lead author and founder of the blog | Works as a Power BI Architect in global company | Passionate about Power BI and Microsoft Tech

Did you enjoy this article? Share with others :)

5 8 votes
Article Rating
Comments notifications
Notify of
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Related Posts