Cross-region Workspace migration using Fabric Notebooks and REST APIs

Hello everyone! We continue the topic of Cross-Region Workspace Migration, this time handling Large Semantic Models from a Fabric Notebook. I highly recommend checking the previous articles on this topic to get a better understanding of what we are trying to achieve here:

The algorithm and overall approach are very similar to the PowerShell script's, but there is one major difference in how this version works. You can get the full code from my GitHub.

Problem statement

Large Semantic Model Storage Format for Power BI Semantic Models is a very useful feature, but there are things you should know before deploying it. One of the problems you might hit occurs when you need to move your Workspaces to a different Capacity. If the new Capacity resides in a different Azure Region, the Workspace is migrated, but Large Semantic Models physically stay in the old region, causing the Power BI Report and the Semantic Model to be disconnected -> the Power BI Report won't work anymore.
 
This forces you to approach the migration topic with caution. If you are interested, here is a full article dedicated to Large Semantic Models: Large Semantic Models Storage Format

 

Prerequisites

You need the following to be able to run the script:
  • Tenant Admin role activated on your account
  • Power BI Items only in the migrated Workspaces: by definition, a Workspace will not be moved to a different Region if it contains Fabric Items -> the script works for Power BI Items only
  • The notebook imported to Microsoft Fabric
  • The Configuration section of the code filled in:
    • SCRIPT_MODE: 0 – fetch all workspaces from the specified Capacity, 1 – provide the list of workspaces manually
    • SOURCE_CAPACITY / SOURCE_WORKSPACES: depending on the Script Mode selected, provide either the Capacity ID or the IDs of the Workspaces you would like to migrate
    • TARGET_CAPACITY: the Capacity the workspaces will be migrated to
    • OUTPUT_FILE_PATH: if provided, the script will save logs to the given location
    • ADMIN_UPN: the User Principal Name of your account. It will be used to grant the Workspace access needed to handle Semantic Models
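Put together, the Configuration section can look like the sketch below. This is only an illustration: the parameter names follow the list above, but every GUID and the UPN are placeholders, not real values.

```python
# Configuration sketch -- names follow the parameters described above;
# all GUIDs and the UPN below are placeholders, not real values.
SCRIPT_MODE = 1          # 0 = fetch all workspaces from SOURCE_CAPACITY, 1 = use SOURCE_WORKSPACES
SOURCE_CAPACITY = None   # Capacity ID, used when SCRIPT_MODE == 0
SOURCE_WORKSPACES = [    # Workspace IDs, used when SCRIPT_MODE == 1
    "00000000-0000-0000-0000-000000000001",
]
TARGET_CAPACITY = "00000000-0000-0000-0000-000000000002"  # Capacity the workspaces move to
OUTPUT_FILE_PATH = None  # optional ABFS path; when set, logs are saved there as csv
ADMIN_UPN = "admin@contoso.com"  # UPN used to grant temporary Workspace access
```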

How script handles authentication

The old-school approach to calling REST APIs would be: call a REST API to get an Authorization Token, pass it in a request header, and refresh the token when it expires. Not to mention that the access token request requires quite sensitive details, and best practice is not to hardcode them in the solution. The proper way would therefore be to set up an Azure Key Vault and store your credentials there… So, a lot of work. In the Notebook I am using the Semantic Link package. It ships two REST Clients: FabricRestClient and PowerBIRestClient. When you run a notebook in Fabric, you are already authenticated with your Microsoft Account. The REST Clients take advantage of this fact and handle the token exchange automatically. For you it simply means that your code is simpler, as you don't need to care about calling another REST API to get the Authentication Token and passing it in the header of the rest of your API calls.
 
One thing to keep in mind is that your identity is established when you start the Spark Session in the notebook. Therefore, if you start your notebook and forgot to elevate your access to Tenant Admin, you must kill the session and start it again for the elevation to take effect.
 
There is one more thing about the REST Clients that makes your code easier to write. When you want to call Power BI REST APIs, the URL path looks like this:
  • https://api.powerbi.com/v1.0/myorg/xxx
The REST Client allows you to ditch the https://api.powerbi.com/ part, making the URLs shorter. To show you an example, here is the code required to do it "old school":
import requests
from azure.identity import ClientSecretCredential
from notebookutils import mssparkutils

def get_access_token():
    # Secrets are fetched from Azure Key Vault instead of being hardcoded
    client_id = mssparkutils.credentials.getSecret('https://akv-reference.vault.azure.net/', 'your_client_id')
    client_secret = mssparkutils.credentials.getSecret('https://akv-reference.vault.azure.net/', 'your_client_secret')
    tenant_id = mssparkutils.credentials.getSecret('https://akv-reference.vault.azure.net/', 'your_tenant_id')

    api = 'https://analysis.windows.net/powerbi/api/.default'
    auth = ClientSecretCredential(authority='https://login.microsoftonline.com/',
                                  tenant_id=tenant_id,
                                  client_id=client_id,
                                  client_secret=client_secret)

    token = auth.get_token(api)
    return token.token

# Get Fabric Capacities
def get_fabric_capacities(p_access_token):
    api_url = "https://api.fabric.microsoft.com/v1/admin/capacities"
    headers = {
        'Authorization': f'Bearer {p_access_token}',
        'Content-Type': 'application/json'
    }

    response = requests.get(api_url, headers=headers)
    return response.json()
And here is the same code achieved with PowerBIRestClient:
import sempy.fabric as fabric

# Initiate the client
client = fabric.PowerBIRestClient()

# Get Fabric Capacities
def get_capacities():
    api_url = "v1.0/myorg/admin/capacities"
    response = client.get(api_url)
    return response.json()
The difference is pretty big. All of this is possible thanks to importing the sempy.fabric package at the beginning of the code.
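FabricRestClient works the same way for the Fabric endpoints (https://api.fabric.microsoft.com/), so the "old school" example above shrinks in the same manner. A sketch, with the client passed in as a parameter so it is easy to swap or mock -- any object exposing a `get` method works:

```python
# In a Fabric notebook you would create the client once:
#   import sempy.fabric as fabric
#   client = fabric.FabricRestClient()

def get_fabric_capacities(client):
    # FabricRestClient already targets https://api.fabric.microsoft.com/,
    # so only the relative path is needed -- no token handling, no headers
    response = client.get("v1/admin/capacities")
    return response.json()
```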
 

Script test scenario

The Test Scenario is exactly the same as for the PowerShell Script. There are three workspaces and three different expected outcomes:
  • Workspace 1:
    • Contains Large Semantic Models
    • All Models should convert fine
    • Workspace will be migrated to new Capacity
  • Workspace 2:
    • Contains Large Semantic Models, among them one is bigger than 10 GB
    • One Semantic Model is expected to fail conversion
    • Workspace will not be migrated, due to failed conversion
  • Workspace 3:
    • Doesn’t contain Large Semantic Models
    • Workspace should be migrated to a new Capacity
I am using SCRIPT_MODE = 1, so I am expected to provide the list of Workspaces in the SOURCE_WORKSPACES parameter. I am also using OUTPUT_FILE_PATH, so my logged items are saved to csv files in OneLake. You can do this in two ways. One requires a Lakehouse to be attached to your notebook; this approach allows you to use shorter paths to store files. However, we will use the fully qualified path to OneLake:
Figure 1. Get ABFS path to OneLake Folder.
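The fully qualified path version can be sketched like this. The base path and the helper names are hypothetical (the workspace and lakehouse names are placeholders), and inside Fabric the abfss:// scheme is resolved by Spark, so the actual write would go through spark rather than plain pandas:

```python
import pandas as pd

# Hypothetical base path, copied from the Lakehouse "Copy ABFS path"
# option in Fabric (workspace and lakehouse names are placeholders):
OUTPUT_FILE_PATH = ("abfss://MigrationLogs@onelake.dfs.fabric.microsoft.com/"
                    "Logs.Lakehouse/Files/migration_logs")

def to_log_frame(items):
    # Collected log entries (a list of dicts) -> DataFrame ready to save
    return pd.DataFrame(items)

def log_path(name, base_path=OUTPUT_FILE_PATH):
    # Full OneLake destination for one csv output file
    return f"{base_path.rstrip('/')}/{name}.csv"

# Inside Fabric, the actual write could then look like:
#   spark.createDataFrame(to_log_frame(failed_models)) \
#        .write.mode("overwrite").csv(log_path("failed_datasets"))
```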

How script works

We start with the Configuration section, where all parameters must be set up as described above. The next block is responsible for Function Definitions: every single function used later in the code is defined here. The next block, where the script starts to actually do something, is Resolve Input Parameters. It checks whether the provided input parameters are correct and can be found in the Tenant: Target Capacity, Source Capacity / Source Workspaces. When validation is done, the script prints out the findings:
 
Figure 2. Resolve Input Parameters for Fabric Notebook.
Then, the program enters the Main Loop, where one workspace is processed at a time:
 
  1. Check if the End User has workspace access; grant it if needed.
  2. Generate a dataframe with all Semantic Models in the Workspace that have Large Storage Format enabled. If there are no Large Semantic Models, the Workspace is ready for migration and the script continues with step 7.
  3. Process Semantic Models one by one:
    1. Try converting the Semantic Model to Small Storage Format.
    2. If conversion is successful -> log the item in the proper collection. If conversion fails -> raise conversion_error and log the item in the proper collection.
  4. Wait for the conversion to complete. This information is reported by the basic Workspace-level API as capacityMigrationStatus. The script continues when the status changes to "Migrated".
  5. If no conversion_error was raised, the Workspace is migrated to the new Capacity.
  6. Semantic Models are restored to Large Storage Format. This happens regardless of step 5: if conversion failed in step 3.2, the script checks whether any semantic models were converted in the current workspace and restores them in this step to keep the Workspace in an "untouched" state. When conversion is successful, the script restores the Large Semantic Models right after moving the Workspace to the new Capacity.
  7. Revoke Workspace access if it was granted at the beginning.

The code prints a summary for each Workspace, allowing you to track the progress. Let's look at the summary generated for Workspace 1:

Figure 3. Migration progress for Workspace1.

As you can see, the script executed as expected. Large Semantic Models were found and processed, the Workspace was moved to the new Capacity, and the Semantic Models were reverted back to Large Storage Format (PremiumFiles in the API). Now, let's have a look at what happened in Workspace 2:

Figure 4. Migration Progress for Workspace 2.

Here we see the expected complications. First of all, one Semantic Model was too big to be converted; therefore, the script no longer processes the remaining Semantic Models. A Conversion Error is detected, so the Workspace was not moved to the new Capacity. Finally, the script noticed that one Semantic Model had been converted to Small Storage Format before conversion_error was raised, so it is now restored to its initial state.

Finally, let’s have a look at Workspace 3:

Figure 5. Migration Progress for Workspace 3.

Again, the script executed as expected. No Large Semantic Models were detected; therefore, the Workspace was moved to the new Capacity without any problems.

Lastly, the script builds dataframes from the output logs and saves them in csv format in your Fabric Lakehouse (assuming OUTPUT_FILE_PATH was provided). You can review all the Semantic Models that were converted during the migration process if you want to double-check that everything went well, or find those that caused errors and prevented a Workspace from migrating:
 
Figure 6. Output File with Failed Semantic Models.
As you can see, the script works fine and allows you to migrate Power BI Reports to a new Region. Even if you still need to resolve the models that failed in the process, it will at least take care of the rest automatically, saving you a lot of time.

 

Conclusion

There are a lot of things to consider when you plan to migrate your Workspaces to a new Capacity Region. This script will help you take care of Power BI related items and avoid potential errors related to Large Semantic Model migration. You can always ask whether it is worth investing the time to develop a script like this. As always – it depends. The more Workspaces you have under your control, the more it makes sense to automate the process. Besides, there is always one undeniable benefit – learning. To work with the Fabric / Power BI REST APIs you must go through a lot of documentation, test solutions, and see what could go wrong and how to avoid it. For this reason alone, I very much recommend automating as much as possible, as you will learn a lot along the way.
 
As always, thank you for reading till the end, and see you in the next article 🙂

Pawel Wrona

Lead author and founder of the blog | Works as a Power BI Architect in global company | Passionate about Power BI and Microsoft Tech
