How to get Sempy (Semantic-link) to run when triggered from a data pipeline that runs a Notebook in Fabric
I recently got an error when trying to run a notebook via a data pipeline and it failed. Below are the steps to get this working.
This was the error message I got:
Notebook execution failed at Notebook service with http status code – ‘200’, please check the Run logs on Notebook, additional details – ‘Error name – MagicUsageError, Error value – %pip magic command is disabled.’ :
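For context, the %pip magic the error refers to is the inline install below, which works when running the notebook interactively but is disabled when the notebook is triggered from a data pipeline. That is why the Environment approach in the following steps is needed.

# Inline install via the %pip magic command. This works in an interactive
# notebook session, but is disabled when the notebook runs from a data pipeline.
%pip install semantic-link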
I first had to make sure that in Fabric (the Power BI Service) the persona was set to Data Engineering.

I then clicked on Environment to create a new environment.

I then gave my new Environment a name and clicked Create.

I then needed to add the Semantic-Link (Sempy) library.
- I first clicked on Public Libraries.
- I then clicked on “Add from PyPI”.
- And finally, in the Library field I typed in “semantic-link”, which then automatically selected the latest version.

I then clicked on Save and Publish.

It first saved and then confirmed the pending changes before publishing. I clicked on Publish all.

I then clicked on Publish in the next screen prompt.

I then clicked on View Progress to view the progress of the publish.
NOTE: This does take some time to complete so please be patient!

Once completed, I could see my Environment in my Fabric Workspace.

I then went into my Notebook and, once it opened, I clicked on Environment and changed it to my Environment “FourMoo_Sempy” as shown below.

I then got confirmation of the environment change.

Now in the first part of the code I needed to load Sempy using the code below.

# First need to load the Semantic Link extension (it is installed by the Environment)
%load_ext sempy
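As an optional sanity check (my own addition, not part of the original steps), the installed package version can be printed to confirm the library is being picked up from the Environment:

# Optional check: confirm the semantic-link package resolved from the Environment
from importlib.metadata import version
print(version("semantic-link"))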
In my Notebook I am querying data from a semantic model and outputting it to a table called “Sales_Extract”.
# Get the Power BI Workspace and Dataset
# Workspace Name
ws = "PPU Space Testing"
# Dataset Name
ds = "WWI Sales - Azure SQL Source - PPU - 4 Years - 2 Days"

# Reference: https://learn.microsoft.com/en-us/python/api/semantic-link-sempy/sempy.fabric?view=semantic-link-python#sempy-fabric-evaluate-measure
import sempy.fabric as fabric

df = (
    fabric
    .evaluate_measure(
        workspace=ws,
        dataset=ds,
        groupby_columns=["'Date'[Yr-Mth]"],
        measure='Sales'
    )
)
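As a side note, evaluate_measure can also take a list of measures and a filters dictionary. Below is a sketch only; the filter value "2016-Jan" is hypothetical, so substitute values from your own model.

# Sketch only: multiple measures plus a filter.
# The filter value below is hypothetical - use values from your own model.
df_filtered = fabric.evaluate_measure(
    workspace=ws,
    dataset=ds,
    measure=["Sales"],  # a list of measure names also works
    groupby_columns=["'Date'[Yr-Mth]"],
    filters={"Date[Yr-Mth]": ["2016-Jan"]},  # keys take the form "Table[Column]"
)

The main example continues below, converting df to a Spark DataFrame.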
# Convert to Spark DataFrame
sparkDF = spark.createDataFrame(df)
sparkDF.show()

# Table Name
table_name = "Sales_Extract"
# Write to Table
sparkDF.write.mode("append").format("delta").save("Tables/" + table_name)
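To sanity-check the write from inside the notebook (my own addition, assuming the Lakehouse is attached as the notebook's default), the Delta table can be read straight back:

# Optional check: read the Delta table back from the Lakehouse Tables folder
check_df = spark.read.format("delta").load("Tables/" + table_name)
print(check_df.count())
check_df.show(5)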
Here is the table shown when I tested to make sure that the notebook had run successfully.

In my data pipeline I used the Notebook activity and configured it to use the notebook I created in the previous steps.
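As an optional refinement (not something this run required), the Notebook activity can pass base parameters into the notebook. If the first code cell is toggled as a parameters cell, values like the ones below can then be overridden per pipeline run:

# Hypothetical parameters cell (use "Toggle parameter cell" on this cell in the
# notebook) so the pipeline's Notebook activity can override these values at
# run time via its base parameters.
ws = "PPU Space Testing"
ds = "WWI Sales - Azure SQL Source - PPU - 4 Years - 2 Days"
table_name = "Sales_Extract"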

I then tested running my data pipeline and it ran successfully as shown below.

I then confirmed this in my Lakehouse table as shown below.

One additional item to show is that if I wanted to use this Environment as the default in my App Workspace, I would do it by going into my Workspace settings.
I then did the following to change the default Environment.
- I expanded “Data Engineering/Science”
- I then clicked on “Spark settings”
- Next, I clicked on “Environment”
- The next step was to enable the option to set the default environment.
- Finally, I then selected my Environment as shown below “FourMoo_Sempy”

Summary
In this blog post I have shown how I created the Environment that allows me to run the Sempy (Semantic-link) Python package when the notebook is triggered from a data pipeline.
I hope you found this useful and any comments or suggestions are most welcome.