I recently had a big challenge with one of my customers: due to the sheer volume of data and the network connectivity speed, I was hitting the 5-hour limit when processing data into my Premium Per User dataset.

My solution was to change the partitions from monthly to daily, and then, once all the daily partitions had been processed, merge them back into monthly partitions.

The challenge was that I now had to process daily partitions from 2019-01-01 to 2021-11-30. That is a LOT of partitions, so I had to find a way to automate the processing.

Not only that, but I had to ensure that I did not overload the source system too!

In my example below I am going to demonstrate how I created the CSV file and then the PowerShell script that runs the processing.

I wanted to process 4 partitions at once, as I found this was the sweet spot: fast enough to get through the backlog, but not so many that it overloaded the source system.

The reason for using PowerShell is that if I had used any of the cloud services there would have been the potential for it to time out. With the actual customer I was working with, it ran for 2 full days (yes, 48 hours) to complete the refresh of all the daily partitions.

One final thing to note is that on my PPU App Workspace I had already configured my Power BI Service Principal account with Admin access as shown below.
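If you would rather script that step than use the Power BI portal, the MicrosoftPowerBIMgmt module can grant the access. Here is a minimal sketch, assuming the module is installed and using placeholder GUIDs for the workspace and the Service Principal (check the cmdlet documentation for which identifier your tenant expects):

# Sign in with an account that is allowed to manage the workspace
Connect-PowerBIServiceAccount

# Grant the Service Principal Admin access on the workspace
# (both GUIDs below are placeholders)
Add-PowerBIWorkspaceUser -Id "Workspace GUID" -AccessRight Admin -PrincipalType App -Identifier "Service Principal Object ID"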

Creating the CSV File with the required partitions

The first thing I needed to do was to create a CSV file which would have the required partitions I wanted to process.

Each person might have a different way of doing this. I used TSQL to create the CSV dataset I required as shown below.

I know the format of the partition names, which always follows the pattern YYYY, then "Q" plus the quarter number, then MMDD. E.g. for the date 2020-03-04 the PartitionName would be 2020Q10304 (year 2020, quarter 1, month 03, day 04).
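If you do not want to use T-SQL, below is a minimal PowerShell sketch that builds the same kind of CSV file. The date range, the Partition1 to Partition4 column names and the output path match the processing script further down, but treat them as placeholders for your own dataset:

# Build the list of daily partition names (format: YYYY, "Q" + quarter number, MMDD)
$startDate = [datetime]"2019-01-01"
$endDate   = [datetime]"2021-11-30"

$partitionNames = for ($date = $startDate; $date -le $endDate; $date = $date.AddDays(1))
{
    $quarter = [int][math]::Ceiling($date.Month / 3)
    "{0:yyyy}Q{1}{0:MMdd}" -f $date, $quarter
}

# Group the names 4 per line to match processing 4 partitions in parallel.
# NOTE: if the number of days is not a multiple of 4, the last line will have
#       blank cells - remove those entries before running the processing script.
$rows = for ($i = 0; $i -lt $partitionNames.Count; $i += 4)
{
    [pscustomobject]@{
        Partition1 = $partitionNames[$i]
        Partition2 = $partitionNames[$i + 1]
        Partition3 = $partitionNames[$i + 2]
        Partition4 = $partitionNames[$i + 3]
    }
}

$rows | Export-Csv "D:\PPU_Partitions\DailyPartitions.csv" -NoTypeInformation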

PowerShell Script to the rescue

Below is the PowerShell script that I used, which would loop through the above CSV file and process the partitions.

The first thing was to make sure that I had the SqlServer PowerShell module installed.

If I did not have it installed, I would open PowerShell as an Administrator and then run the following command

Install-Module SqlServer

As shown below, once the installation had completed, I could see that the module was installed.
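If you are not sure whether the module is already there, a quick check is the following, which simply lists any installed versions of the SqlServer module:

Get-Module -ListAvailable SqlServer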

What this script does is connect using the Service Principal.

It then imports the CSV file into a table.

Finally, it loops through each line in the CSV file and sends a refresh command over the XMLA endpoint to process the partitions on that line.

Here is the script, with all the details commented on where you would need to put in your own details.

#1 User Account Details

# ClientID
$User = 'Service Principal Client ID'

# Secret Key
$PlainPassword = 'Service Principal Secret'

$SecurePassword = $PlainPassword | ConvertTo-SecureString -AsPlainText -Force
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $SecurePassword

# Premium or PPU XMLA End Point
# NOTE: The Workspace Connection below must be URL Encoded for the script to work.
#       You can get this from your App Settings and then Workspace Connection
$AnalysisServiceServer = "powerbi://api.powerbi.com/v1.0/myorg/PpuWorkSpaceConnection"
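# For example, if the workspace name contains spaces you could build the encoded
# connection like this (assumption: "My PPU Workspace" is a placeholder name):
# $AnalysisServiceServer = "powerbi://api.powerbi.com/v1.0/myorg/" + [uri]::EscapeDataString("My PPU Workspace")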

# Power BI Dataset Name
$AnalysisServiceDatabase = "Dataset name"

# Tenant ID
$TenantID = "Tenant ID"

# Table with Incremental Partitions
$TableName = "Sales"

# This is the CSV File with all the partitions that need to be processed.
$CSV = Import-Csv "D:\PPU_Partitions\DailyPartitions.csv"

#Looping through each of the lines in the CSV File
foreach($LINE in $CSV)
{

        $requestBody = @"
          {
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition1)"
      },
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition2)"
      },
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition3)"
      },
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition4)"
      }

    ]
  }
}
"@ 

# Uncomment the line below to view the generated refresh command before sending it
#$requestBody

        #PowerShell command to process the Partitions              
        Invoke-ASCmd -ServicePrincipal -Credential $cred -TenantId $TenantID -Server $AnalysisServiceServer -Database $AnalysisServiceDatabase -Query $requestBody
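        # Invoke-ASCmd returns the XMLA response as an XML string, so if you wanted to log it
        # or check it for errors you could capture it instead, e.g. $result = Invoke-ASCmd ...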

        #Writing out each set of partitions that has been processed to see where I am at.
        Write-Host "Processed Partitions " $($LINE.Partition1) ", " $($LINE.Partition2) ", " $($LINE.Partition3) ", " $($LINE.Partition4)

}

And this is what it looks like when it is busy processing the partitions

Please note that it will appear as though nothing is going on, but it is busy processing the data.

You can have a look with SQL Server Profiler, using my previous blog post, to view what is actually happening: How to connect to a Power BI Premium Per User using SQL Profiler – Reporting/Analytics Made easy with FourMoo and Power BI

Summary

NOTE: If you wanted to process more or fewer partitions in parallel you would need to do the following:

  • In the CSV file, ensure that the number of partition names on each line matches how many partitions you want to process in parallel.
    • In my example I was processing 4 partitions in parallel, so I had 4 partition names in one line.
    • If I wanted to process 2 in parallel, I would only have 2 partition names.
  • I would also update the requestBody in the PowerShell script and remove the last 2 partition objects to ensure that it matched the CSV file.
    • This is what the requestBody would look like with 2 partitions being processed in parallel.
        $requestBody = @"
          {
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition1)"
      },
      {
        "database": "$AnalysisServiceDatabase",
        "table": "$TableName",
        "partition": "$($LINE.Partition2)"
      }

    ]
  }
}
"@ 

In this blog post I have shown how the processing of Premium / Premium Per User partitions can be automated and run over a long period of time. I also showed how you could configure it to run for your own requirements.

Thanks for reading! Any questions or comments are welcome.