Following on from my previous Power BI Infographic blog post, I also have had the idea to create an Infographic for the Power BI Gateway.
This came about after reading the blog post by Kasper De Jonge Enterprise gateway sizing “recommendations”, and there was a lot of valuable information which I have put into the Infographic as well as detailed further below.
One of the main takeaways that I learnt from this was that I got a better understanding not only of how the Power BI Gateway works, but also how the Power BI Service works and consumes data.
Very simply put, the Power BI Gateway is currently responsible for 3 things.
- The Query Editor is where the processing takes place when you refresh your data, the Query Editor Engine is actually built into the Power BI Gateway. So if your dataset uses the Query Editor for any data shaping, then this is done within the Power BI Gateway.
- The Live/Direct Query is used to do live connections or direct queries against the various sources as shown below.
- The final and very important piece is the routing of the data between your on-premise network and the Azure Service bus, which then connects to the Power BI Service.
The takeaway for me from this was that firstly the Power BI Gateway actually has the Query Editor engine built in.
Secondly was that currently within the Power BI Service, it consumes the data that is already shaped into the required tables. And this would explain why currently when you import data directly, what you can do with the data is very limited.
Power BI Gateway Infographic
Page one is the graphical overview of the Power BI from an end to end view.
You can access the images from the following link: Power BI Gateway Infographic – Latest Version
In this section below I summarize from Kasper’s Blog post, if you are interested in more details around the Power BI Gateway, then please read below.
Data is collected from Source.
- Ideally the connection between your On-Premise Gateway and your source is as fast as possible.
- This also includes the least amount of hops as possible.
It then goes through the Power Query Engine (which is part of the On-Premise Gateway)
- So if it is using queries, where the transformations happen in the source system it will not use many resources.
- But if it does all the work in the Query Engine, then it will require CPU & Memory to do the transformations and shaping of the data.
- Once it is in the correct shape, it then needs a fast connection to transfer from the On-Premise Gateway to the Power BI Service.
Live Connect or Direct Connect
- The only data that gets moved here is the data required to render the visual.
- The On-Premise Gateway acts as in a pass along mode, passing the data to the source, getting the results and passing it back to the Power BI Service.
- There is some CPU used for routing, but not a significant amount.
To ensure that this is as seamless as possible the connection needs to be as fast as possible.
- So reducing any additional hops will possibly make the experience quicker.
Sizing – It depends
- How many data models are going to be refreshed concurrently?
How Power Query dependent / intensive is the data sources?
- If it is straight forward connections (To SQL Server, DDBMS systems)?
- Or is all the work done in the Query Editor?
- The more Cores the better, as well as memory if the current models when being refreshed in Power BI Desktop are memory instensive.
How much data is being moved?
- The more data that is being moved, the more cores required for routing.
How many users will be using Direct Query or Live Connect?
- If the underlying models use Row Level Security (RLS), as this will then require separate queries per user.
Monitoring On-Premise Gateways
It is essential to monitor the On-Premise gateways to understand how the business is using their data sources, as well as the usage on the actual server.
You can use the documentation described here: Troubleshooting the On-Premises Data Gateway
And then the section Performance Counters
This might be required if you have got a large user base and want to have specific data refreshed as a priority
Create additional dedicated On-Premise Gateway Servers.
- Have one for Corporate wide reports, used by BI Team and CFO and C’s
- Have a second one which can be used by your business users to refresh their ad-hoc reports.
- This will ensure that you can isolate any issues or errors.