Power BI Copilot part 4 – Dataflows Gen 2

This part may be a bit of a stretch. You might say that Dataflows Gen 2 is not part of Power BI, but another experience within Microsoft Fabric – Data Factory. Of course, you are absolutely right. Unfortunately, for now this is the only Power Query Experience, that is enhanced with the Copilot. It is not (or not yet) available for Dataflows Gen 1 or within Power BI Desktop. In one of my previous articles: Excuse me, which way to Fabric? I was describing Dataflows Gen 2 as a first natural step, that Power BI Developer may take towards MS Fabric.

As a reminder, this is a part of Copilot series:

What are we going to cover?

As it was done in previous articles, we will go through current capabilities, send few prompts, validate the accuracy and measure the impact on Capacity. As you could see with Copilot Pane in Power BI Desktop, there are certain capabilities that Copilot provides. They are surfaced for end users as selectable options, which helps to send prompts that are easier to understand for Copilot. With Dataflows Gen 2 we have the same approach. There are currently three capabilities available, as described in Microsoft’s documentation:
 
  • Generate new transformation steps for an existing query.
  • Provide a summary of the query and the applied steps.
  • Generate a new query that may include sample data or a reference to an existing query.

I will basically explore these options together with you, and I can’t wait to see the end result.  This is how it looks in the tool:

Figure 1. Copilot’s greeting message.

As you can see, we have here three mentioned capabilities, but for blank dataflow, it suggests going with Get Data first. This is where we will start this session.

Get Data

Copilot offers help with transformation, but of course we must first connect to the data source. This is what we see in welcome message from Copilot. Once you hit “Get data” button, you will see the same window with connectors available for Dataflows:
 Figure 2. Connector selection in Dataflows Gen 2.
 
I will connect to the Retail Data I am using so far for Copilot tests. Data is connected and there is a plenty of tables to work with:
 Figure 3. Dataflows instance connected to Retail Data.
 
And… basically that’s it. Get data step is done, and it seems that Copilot doesn’t have much to offer here. However, what I guess is a nice touch here, is that Microsoft assumes that Copilot is used here by someone new to Power Query and may not necessarily know where to start. Copilot not only suggests connecting to the data first, but also helps to find button, in case someone is not familiar with User Interface. I like the idea of Copilot helping to find available options quicker, so, I hope this type of support will be further developed.
 
Coming back to the topic, the data is here, let’s start asking questions.

 

Transform Data

There is some interesting data available in dimension_city table. I will start with something easy. I have there two columns of datetime type, I will ask Copilot to transform them to date types. Step is generated and is correct. We could argue that Copilot could suggest a better name for the step, but let’s not be too picky:
 
 Figure 4. Datetime column transformed to date type.
 
If you are not satisfied with the results, you can hit “Undo” button under Copilot’s response. My prompt was very specific. I mentioned exact column names and data types. I will try to send a bit more generic prompt, asking to change date columns (without specifying which ones) to combination of month and year in numeric format. Again, generated step is correct. I must say I am really happy with result:
 
 Figure 5. Date column transformed to yyyyMM representation.
 
This time, I will even show the code generated by Copilot. As well as in previous transformation, the code generated is pretty impressive:
  #"Custom 1" = let
    Source = Custom,
    #"Transformed Date" = Table.TransformColumns(
      Source,
      {
        {"ValidFrom", each Date.ToText(_, "yyyyMM"), type text},
        {"ValidTo", each Date.ToText(_, "yyyyMM"), type text}
      }
    )
  in
    #"Transformed Date"

Two things happened here. Copilot found my date type columns and applied efficient formula to change the format of displayed value. We could maybe expect to set the type to integer, but it would require additional transformation to change the text to number. But again, let’s not be picky. I am really surprised with the results so far. Another transformation I would like to apply is related to Location column:

Figure 6. Snapshot of Location column.

We have here coordinates, that we could use on Map visual in Power BI, but this format is useless for further analysis. What I would like to do here is to extract the numbers from bracket and split it into two columns using space a delimiter. Looking at the range of values, first number should be longitude, second number is the latitude. Of course, a lot depends on how precise the prompt is. Maybe I expected too much or didn’t describe my need very well, but this step failed. Here is the snapshot of my prompt:

Figure 7. Failed transformation on Location column.

The problem is that formula didn’t extract the numbers from brackets first, so first “space” in that column is found after “POINT”, so, we end up with values that can’t be converted to numbers:

Figure 8. Error on “Longitude” column.

Let’s to break it down in two steps. I will ask first to extract the values from the brackets and (in second step) to split the column in two.

Figure 9. Successful extraction of values in Location column.

As you can see, firs step worked as expected. I will now try to split the columns into Longitude and Latitude using Copilot:

Figure 10. Splitting Location column into Longitude and Latitude.

We made it. This looks really good, and not a single line of M code was required (although it’s interesting to see the code generated by Copilot :)). Next transformation I am going to run is presented in MS documentation as another capability:

  • Generate a new query that may include sample data or a reference to an existing query.

But there is no option like that available in the chat, so I will try to use “Add step that…” again, to generate such query. I am looking at my data, it seems to me that Country, Continent, Region and Subregion columns contain redundant data. Of course, in real world example you would probably more values here, and business requirement would be to keep them all. However, for the test purpose let’s assume that we could get rid of them.

Figure 11. Partially failed output. Grouping by performed within the same Query.

Based on the action I selected for Copilot to perform (Add step that…), I was actually expecting a new step to be created within the same query. Let’s try something different, I will not use the suggested start for the Copilot, but I will simply write my prompt from scratch:

Figure 12. Creating new Query with Copilot, by referencing existing query.

It’s done but the query is a simple reference, let’s try something more complex and maybe filter data for Query created from reference (I will undo the recent output):

Figure 13. Reference existing Query and perform Group by operation,

New Query is created, and I confirmed my predictions. Everything is related to a single Country, which could allow me to recommend removing the unnecessary columns. Let’s create another Query but this time with filtering:

Figure 14. Reference Query with Filter operation.

It worked again. New Query is created and filtered as expected, even though I deliberately mentioned “State” column, while available column is named “StateProvince”. Last thing I will try to do will be the most complex operation so far. I would like to use the information from fact_sale Query, and calculate Total Sales values by City, using the City name from “dimension_city” Query. This would require Group By operation using CityKey in fact_sale table, and then join the City Name information from “dimension_city” Query (or vice versa):

Figure 15. Failed Merge operation on “dimenstion_city” and “fact_sale” Queries.

Unfortunately, my prompt was too generic. Query was created based on “dimension_city” Query only, where another column required for the calculation is simply not present. Let’s try again (I will use “Undo” option”). Unfortunately, this time it didn’t work out. Copilot asked to re-phrase my request. Let’s try simple:

Figure 16. Failed Group By operation on “fact_sale” Query.

It’s not what I needed. Step is added but simply calculated TotalIncludedTax, which is even funnier, as the value in the column with exact same name is already available. But let’s give it one more try:

Figure 17. Successful Group By operation by CityKey.

Ok, this time it worked. Grouping by is done the right way. Now it’s time to add City name to this picture. Unfortunately, it didn’t work out. Copilot again asked to re-phrase my request:

Figure 18. Merge operation performed using Copilot.

It worked, even though my prompt was very specific, it is still something I would consider as quite an achievement. What is also worth to mention is that we asked Copilot to do one thing, and there is one more step created on the list that is exposed to end-user. However, if we look at the code generated by Copilot, there are of course two operations done there, merge queries + expand columns. However, for the end-user we see only one operation in the list. I am including that pice of code below:

let
    Source = #"Custom",
    #"Merged Queries" = Table.NestedJoin(
        Source, {"CityKey"},
        dimension_city, {"CityKey"},
        "dimension_city",
        JoinKind.LeftOuter
    ),
    #"Expanded dimension_city" = Table.ExpandTableColumn(
        #"Merged Queries", "dimension_city", {"City"}, {"City"}
    )
in
    #"Expanded dimension_city"

On one side, from the perspective of someone who wants to learn, building code in this manner could be harder to understand. On the other hand, for Power Query beginners, approach where we can quickly remove step generated by Copilot, by removing one applied step, is probably easier to handle.

Describe Query

Last option that we can use with Copilot within Dataflows Gen 2 is to Describe the Query. I used on “fact_sale” table, and here is the result:
Figure 19. Describe Query with Copilot – fact_sale query.

Query descriptions in Copilot focus on technical details, not business context. While this is useful, it’s important to manage expectations. Copilot streamlines query summaries, especially helpful with inherited or poorly documented Power BI solutions. It simplifies understanding complex queries, ideal for generating concise descriptions.

There is not much we can squeeze out of Describe Query feature, so, let’s have a look at the “dimension_city” table as well:

Figure 20. Describe Query with Copilot – dimension_city query.

 

How much does it cost?

Time for Capacity impact analysis. Judging by the length of this article, you might get an impression that we did a lot. But is it really? We sent couple of prompts, but in the end we did only a few transformations. What is the cost then?
Figure 21. Capacity consumption generated by Copilot.
And again, we could as a question, is this a lot? Not for a single interaction, but we could easily see it grow beyond that, when multiple developers are working on their solutions. Using our standard approach so far, to assume that we will have 20 developers during the day working with Copilot, instead of base 1.26 % we will have over 25 %. This is definitely a lot. Let’s bring the numbers from previous tests as well (all numbers for P1 Capacity):
  • Report generation
    • Single session – 0.4 %
    • 20 developers’ number – 8 %
  • DAX Query View:
    • Single session – 0.92 %
    • 20 developers’ number – 18.4 %
  • Dataflows Gen 2:
    • Single session – 1.26 %
    • 20 developers’ number – 25.2 %

Total number for 20 developers working with mentioned Copilot features sums up to 51.6 % for P1 Capacity. That’s huge number. Let’s make the number more realistic and assume 10 developers scenario only, the number we get is still 25.8 % of base Capacity. And we are not done yet. What is left is the report summary feature in Power BI Service, that we will cover in next article.

Conclusion

I must admit, the accuracy of Dataflow Gen 2 Copilot is pretty impressive. Together with DAX Query View, Copilot in Dataflows provides the best experience so far. Of course, the more you know about the transformations you want Copilot to run for you, the more precise your prompt is, and by extension, more accurate output from Copilot. I am still not convinced that Copilot is easy to use by beginners, but if you already know something, you might achieve a lot without writing a single line of code. Capacity cost continues to grow, more than expected, but let’s wait till the last part is tested to say the final word.
 
As always, thank you for staying till the end of the article, and see you in next one 🙂
Picture of Pawel Wrona

Pawel Wrona

Lead author and founder of the blog | Works as a Power BI Architect in global company | Passionate about Power BI and Microsoft Tech

Did you enjoy this article? Share with others :)

5 7 votes
Article Rating
Comments notifications
Notify of
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Related Posts