What makes up a Power BI Desktop PBIX File
I have long been curious about what makes up a PBIX file, so in this blog post I will explain, from my understanding, the different parts that make it up. It is rather interesting in that the file is actually made up of several distinct pieces.
How to view the contents of a PBIX file
The starting point, and a question some people might have, is how to view the contents of a PBIX file.
From my understanding the PBIX file is loosely based on the XLSX file, in that there is a very simple way to see the underlying contents.
To view the contents of a PBIX file, do the following.
Either right click on the PBIX file and select Rename, or double click the File to get the option to rename.
Then change the file extension from PBIX to ZIP.
You will be prompted with a window asking “Are you sure you want to change it?”
- Click Yes
The file will now have the ZIP extension.
If you now double click the ZIP file you will see all the contents. I will go through the contents that I know about in this blog post. How awesome is that?
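If you would rather not rename the file at all, the same contents can be listed with a few lines of Python. This is only a minimal sketch, and the function name list_pbix_entries is my own invention for illustration. It relies on the PBIX file being a ZIP archive, as described above.

```python
import zipfile


def list_pbix_entries(pbix_path):
    """Return (name, size, compressed_size) for every entry in the PBIX.

    pbix_path may be a path to the .pbix file or an open binary file object;
    the extension does not need to be changed to .zip first.
    """
    with zipfile.ZipFile(pbix_path) as archive:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]


# Example usage (the file name is hypothetical):
# for name, size, packed in list_pbix_entries("Sales.pbix"):
#     print(f"{name}: {size} bytes ({packed} bytes compressed)")
```

The sizes reported here are also a quick way to see how large the DataModel entry is without extracting anything.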
The Report folder contains the following 2 files.
The Layout file contains all the information about the report layout, which is essentially the report sheets, the placement of the visuals, and all of their related properties.
This is a snippet of what the contents of the file look like, and as you can see it stores a lot of information that is not very friendly to read.
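To make the Layout file a little friendlier to read, it can be loaded as JSON. The sketch below rests on assumptions I have not seen officially documented: that the entry is named Report/Layout, that it is JSON encoded as UTF-16 (little endian), and that the sheets live in a "sections" list with a "displayName" on each. The function names are hypothetical, and the format may differ between Power BI Desktop versions.

```python
import json
import zipfile


def read_layout(pbix_path, entry="Report/Layout", encoding="utf-16-le"):
    """Load the Layout entry as a Python dict.

    Assumption: the entry is UTF-16 LE encoded JSON. If decoding fails,
    the file was probably written in a different format.
    """
    with zipfile.ZipFile(pbix_path) as archive:
        raw = archive.read(entry)
    text = raw.decode(encoding)
    # Drop a byte order mark if one is present.
    return json.loads(text.lstrip("\ufeff"))


def page_names(layout):
    """Return the display names of the report sheets.

    Assumption: sheets are stored under "sections", each with "displayName".
    """
    return [section.get("displayName") for section in layout.get("sections", [])]


# Example usage (the file name is hypothetical):
# print(page_names(read_layout("Sales.pbix")))
```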
The LinguisticSchema file appears to hold the contents for the Sheet names if you rename them from the default “Page 1”.
As you can see above I renamed a sheet to “Item count on Slicer” and when I open the LinguisticSchema file I see the following below.
This XML file contains all the content within the PBIX file
The DataMashup file contains all of your Query Editor information.
From my understanding it contains all of the following.
- Connection Details to your Source Data
- File Names or database names
- All the Table information
- Within each table it also contains all the steps
As you can see below, here is a snippet from the DataMashup file, and the highlighted content is where it has a step called “Calculated Week of Month”.
And here is the identical step in the Query Editor, with the same syntax.
It is important to note that you can actually copy the DataMashup file and send it to someone who is working with the same data, and get them to replace their copy with yours. This will mean that they now have all the Query Editor information in their Power BI Desktop file.
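The Query Editor steps can also be pulled out of the DataMashup file with a script. The sketch below follows my reading of the published DataMashup structure: a 4 byte version, a 4 byte little endian length, and then an embedded ZIP package that holds the queries in Formulas/Section1.m. Treat those details, and the function name read_m_queries, as assumptions. Newer PBIX files may not contain a DataMashup entry in this form at all.

```python
import io
import struct
import zipfile


def read_m_queries(pbix_path, entry="DataMashup"):
    """Return the M script text stored inside the DataMashup entry.

    Assumed binary layout: version (4 bytes), package length (4 bytes,
    little endian), then that many bytes of an embedded ZIP package.
    """
    with zipfile.ZipFile(pbix_path) as archive:
        blob = archive.read(entry)

    version, package_length = struct.unpack_from("<II", blob, 0)
    package = blob[8:8 + package_length]

    # The embedded package is itself a ZIP archive.
    with zipfile.ZipFile(io.BytesIO(package)) as inner:
        # utf-8-sig tolerates an optional byte order mark.
        return inner.read("Formulas/Section1.m").decode("utf-8-sig")


# Example usage (the file name is hypothetical):
# print(read_m_queries("Sales.pbix"))
```

The returned text should contain each query and its steps in the same syntax that the Query Editor shows.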
The DataModel file is the file that actually stores all of your data in a highly compressed format.
Essentially this is your Power BI In-Memory Analysis Services model. As you can see below it has some detail information and then the stored data. This is where I think a lot of the Power BI magic happens, because it is where the blazing fast query performance comes from.
The size of this file is also an indication of how much memory your Power BI Desktop file will consume.
In my example the file size is 486 KB, which once again shows how good the VertiPaq compression engine is.
If you are interested, there is a great book which goes into much more detail about the VertiPaq engine, and you can read a snippet on how the engine works here: The VertiPaq Engine in DAX
I would suggest getting a copy of the book if you are interested in the finer details.
The DiagramState file appears to store the information for the Table and Matrix locations, but not for the Matrix Preview from what I can gather.
It would appear that the Metadata file contains all the names with regards to what you see when in the Report View.
As you can see from my Metadata file below, I have highlighted the Table names in GREEN. I have modified the source data so that it is easier to read.
And I have highlighted the Parameter names in GREY.
Here is a list of my tables
Here is the list of my Parameters
SecurityBindings, Settings & Version Files
When I opened up the files SecurityBindings, Settings and Version they appeared not to have any meaningful content or details to talk about. Possibly someone else might have some input as to what these files are responsible for.
I do hope that having a look at the contents that make up the PBIX file has provided a bit more insight into how a PBIX file works and is pieced together.
NOTE: You can rename the file from a ZIP back to a PBIX to get back to your original file and open it again with Power BI Desktop.
Great article! A while ago I discovered a bug using this method of “unzipping” the pbix file. I’m unsure if this is fixed in the latest update – when you add images (background or in a container), they get encoded and are entered into the Layout file. This inflates file size which is how it should be. But when the images are deleted from the pbix, the encoding still remains and thus file size remains the same. Same thing happens when there is a change in picture i.e. the pbix stores both images and the size becomes huge!
Thanks. And you are indeed 100% correct that it inflates the file size and then if you remove it, it does not decrease. I am hopeful that it is a bug and will be fixed in due course.
[…] https://gqbi.wordpress.com/2017/05/02/what-makes-up-a-power-bi-desktop-pbix-file/ […]
[…] Gilbert recently written a great blog post about the structure of PBIX file, and I strongly recommend everyone to read it. For this blog post, my focus is on M script section of that. so, I’ll go through other parts of the structure briefly. A Power BI Desktop file is a file with *.PBIX extension. This file is a renamed version of a ZIP file. You can simply rename the file to *.zip to see the structure of it. Then you can simply unzip it into a folder. […]
[…] really interesting blog from Reza Rad, where he leverages off one of my existing blog posts (What makes up a Power BI Desktop PBIX File) and goes into more details around the DataMashup […]
How about an Excel file?
The format is very similar for the Excel file. You can also put it into a ZIP and see what is inside.
The one thing I have not managed to do is edit any of these files.
I would like to be able to swap data sources on the fly – Development DB / QA DB / Prod DB.
I would like to be able to have some preset colors. In Power BI you can only map colors to data values that already exist. My client wants to publish a report with data that will be updated regularly. They cannot pick “10 displays #FF11BB” if, when we create the report, there is no data value 10 to map a color to. This forces us to mock up data and swap the data source. Ideally we would like to insert these values by editing the Layout file. I’ve not succeeded at that yet.
Whilst I understand your requirement, I am not sure that there is currently a way to do this, due to the dynamic nature of any dataset. Even if your dataset is known, the only thing that springs to mind is to look at the Power BI Themes and see if it can be done in this manner.
Power BI Report Themes
I’m trying to convert SSRS reports to Power BI through automation. Firstly, is this task feasible? Secondly, if it is, where do I start?
Hi there, I would first ask whether you are trying to replicate exactly what is in the SSRS report. As I am sure you are aware, some functionality in SSRS does not exist in Power BI reports. Likewise, there are some incredible features in Power BI that cannot be done in SSRS.
I honestly would just ensure that I have the same source dataset in Power BI and then build it from scratch in Power BI.
[…] of the file, rename the extension to .zip, and unzip the contents. Gilbert Quevauvilliers wrote a great blog exploring these contents to see what is included in the unzipped .pbix file. By unpacking the contents of the .pbix file, […]
Has anyone tried to parse the report file? What is a good tool to use?
Are you referring to the PBIX file?
The DataMashup structure is documented here: https://msdn.microsoft.com/en-us/library/mt577248(v=office.12).aspx
Hi there. Thanks for the great post.
I tried to unzip the *.pbix file into a folder, modify nothing, and zip it again with the pbix extension. However, it does not work when opening it in PBI Desktop. I get the error that the file “is corrupt or invalid report file”.
Any ideas how we can zip it back to pbix if we want to do some xml hack to the files?
Thanks in advance!
I have found that I always make a copy of the PBIX before I make any changes, to ensure that I have got a working PBIX.
I have found that it is a bit of trial and error to find out which files to keep and which ones to delete. It is a bit of a tedious process but that is how I have done it in the past.
Let’s say you’ve extracted the pbix/zip into folder X. Now zip only the files inside folder X, not folder X itself.
This might be possible but always use a copied version because sometimes it can go wrong.
It seems that you also have to remove the SecurityBindings file before making the new zip file.
Thanks for letting us know!
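The suggestions above (archive the files at the root rather than the folder, and leave out the SecurityBindings file) can be sketched in Python without extracting anything to disk. The function name repack_pbix and its parameters are hypothetical, and editing a PBIX this way is not supported, so there is no guarantee that Power BI Desktop will open the result. Always work on a copy.

```python
import zipfile


def repack_pbix(src_path, dst_path, replacements=None, drop=("SecurityBindings",)):
    """Copy a PBIX entry by entry into a new archive.

    replacements maps an entry name to new bytes for that entry.
    Entries named in drop are left out of the new archive.
    Returns the list of entry names that were written.
    """
    replacements = replacements or {}
    written = []
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename in drop:
                continue
            if info.filename in replacements:
                data = replacements[info.filename]
            else:
                data = src.read(info.filename)
            # Passing the original ZipInfo keeps each entry's name at the
            # archive root and reuses its original compression method.
            dst.writestr(info, data)
            written.append(info.filename)
    return written


# Example usage (the file names are hypothetical):
# repack_pbix("Sales - Copy.pbix", "Sales - Edited.pbix")
```

Because entries are copied with their original names, the new archive has the files at the root rather than nested inside a folder, which is the mistake described above.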
[…] So I decided to try to build one myself. Based on the incredibly informative blog posts of Gilbert Quevauilliers, Reza Rad, Jese Navaranjan, Imke Feldman, and David Eldersveld, among others, I was able to create […]
The things you explained were awesome. Could you also list some applications of extracting the contents of a pbix file?
I am preparing a use case for my company about the things we can do with the pbix zip file.
You can have a look at Power BI Helper which does exactly this, where they take apart the PBIX and put it into a meaningful way to view the data. I know that Reza (the author) had to code it in Visual Studio to extract the information.
Power BI Helper – Radacad
I am currently trying to write a script that will apply themes to a folder full of pbix files by doing the following:
1. Rename the pbix files to zip and extract the zip files.
2. Replace the theme (json) files with a new theme to make all reports consistent.
3. Zip the files and rename them to pbix.
I keep getting a corrupt error when opening the files. I can do this task manually through the built in Windows zip preview (without extracting) by replacing the json files, but I get the corrupt error when trying to automate this. I have also tried deleting the SecurityBindings file before zipping the folder and have had no success. Has anyone been able to accomplish something like this without needing to do it manually?
I have found that it is quite a complex process to unzip and rezip it up without it breaking something. I am not aware of anyone being able to do this.
This post has really helped me a lot!
But now I want to extract the fields used in a sheet and the associated filters used in the sheets.
Can someone help me with which file to refer to after unzipping, and how to decode it?
Thanks in advance
Also, where can I find the relationships between the tables, i.e. the data model related information?
You can use Tabular Editor to see these details.
Thanks for the response Gilbert!
Is there a way we can manually decode the unzipped files?
I think you can do it using Base64 encoding and Visual Studio Code.
I do not know of any other way.
[…] PBIX file is really a zip file with XML files in it. To the best of my knowledge this has been reverse engineered by the community and […]
Great article, thanks! 🙂
Would this be useful in order to apply proper source control to pbix files? I would like to store all the structures on DevOps (except the DataModel of course), so I can easily see the history of the changes by using the compare function. Also, this would allow me to see the dependencies (i.e. the queries used in the model) without having to open the pbix in PowerBI Desktop…
What do you think?
(sorry I meant to write under the original post, not as a reply to this 16th July comment!)
Unfortunately I do not think you will be able to store the individual parts. You might be able to do this taking it apart with a zip file. But I am not sure on how to put it all back together successfully as this is currently not supported.
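For the read-only half of that idea (seeing the history of changes with a compare function), a one way extraction could look like the sketch below. It copies everything except the DataModel out of the PBIX into a folder that can be committed and diffed. The function name extract_for_diff is hypothetical, and this does not solve putting the parts back together.

```python
import zipfile
from pathlib import Path


def extract_for_diff(pbix_path, out_dir, skip=("DataModel",)):
    """Extract every entry except those named in skip into out_dir.

    Returns the list of extracted entry names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(pbix_path) as archive:
        members = [name for name in archive.namelist() if name not in skip]
        archive.extractall(out, members=members)
    return members


# Example usage (the paths are hypothetical):
# extract_for_diff("Sales.pbix", "repo/Sales_parts")
```

Some of the extracted files may not be plain text, so the usefulness of a compare will vary from file to file.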
For a few reports I am not able to extract the DataMashup file, and hence I am not able to find the backend details. Can you let me know the reason for this?
Thanks for the comment. I would suggest using the Feb 2021 version of Power BI Desktop.
The underlying structure has changed and you should now be able to view the mashup data.
In a PBIX file the contents are compressed and encoded. I suggest you unzip the PBIT file of the same report. In that you can easily read the backend data, like table and column details, in Notepad or Notepad++.
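As a sketch of the PBIT suggestion: from what I understand, a PBIT file (also a ZIP archive) contains an entry named DataModelSchema holding the model definition as UTF-16 (little endian) JSON, with tables under model.tables and relationships under model.relationships. Those names, and the helper functions below, are assumptions and may not match every version of Power BI Desktop.

```python
import json
import zipfile


def read_model_schema(pbit_path, entry="DataModelSchema", encoding="utf-16-le"):
    """Load the model definition from a PBIT file as a Python dict.

    Assumption: the entry is UTF-16 LE encoded JSON.
    """
    with zipfile.ZipFile(pbit_path) as archive:
        text = archive.read(entry).decode(encoding)
    return json.loads(text.lstrip("\ufeff"))


def tables_and_columns(schema):
    """Map each table name to the list of its column names."""
    tables = schema.get("model", {}).get("tables", [])
    return {
        table["name"]: [column["name"] for column in table.get("columns", [])]
        for table in tables
    }


def relationships(schema):
    """Return (from_table, from_column, to_table, to_column) tuples."""
    rels = schema.get("model", {}).get("relationships", [])
    return [
        (r.get("fromTable"), r.get("fromColumn"), r.get("toTable"), r.get("toColumn"))
        for r in rels
    ]


# Example usage (the file name is hypothetical):
# schema = read_model_schema("Sales.pbit")
# print(tables_and_columns(schema))
# print(relationships(schema))
```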
So if I have a pbix file with a DirectQuery connection or live connection, where the data is actually stored in the database and not imported into the pbix file, my DataModel file should be of minimum size, right? But that’s not the case. I also see that my pbix file size increases, i.e. my DataModel file size increases, when the data in my database increases. It shouldn’t, right?
FYI, I am using a cloud database, Azure Table Storage.
If it’s DirectQuery, only the metadata is stored. If your file size is large, then it is storing the data in import mode.
But as far as I know, when connecting to Azure Table Storage I don’t get any option to select Import, DirectQuery or live connection. I simply put in the Table Storage connection details. As far as I know, cloud databases use live connection as their mode of connectivity. Can you shed some light on what kind of connection is used with Azure Table Storage?
It would depend on how you are connecting to the dataset.
Is this in the PBI service or how?
What do you mean by how I am connecting? In Power BI Desktop, I select Azure Table Storage from the list of data sources and it asks me to put in the connection details. There is no option to select Import, DirectQuery or Live Connection.
What is happening is that, even though you are selecting Azure Table Storage, it is importing all the data into your PBIX file. That is why it is getting bigger every day: there is more data being loaded!
So Azure Table Storage uses Import mode even though I am not selecting it. That means DirectQuery or Live Connection is not supported as it is with other cloud data sources.
It would appear so.
Thanks for confirming. It’s weird that Azure Table Storage is promoted to handle large numbers of records and yet there is no DirectQuery support, which eventually puts a limit on large models, as import won’t be able to handle very large data.
Great post. Quick question, is it possible to retrieve a list of the field synonyms from one of the files?
I think it might be possible, as it should be stored in the LinguisticSchema file.
But please be aware that this is not supported and it could break your PBIX file, so always make a backup before trying anything.