SSIS – Using Date and Time Cache Lookups for faster inserts into Fact Tables

What happened is that I found that when doing inserts on larger data sets, it would take a long time to complete the date and time joins. If I excluded this the data would be returned a lot quicker.

 

So my solution to the problem was to use the Cache Transform and then the Loopkup Transform in the SSIS Data flow task. This enabled the query to be returned a lot quicker.

 

Example:

·         We are first going to create our Cache Transform for our Date and Time

·         Then we are going to insert data into our Fact table from the following Adventure Works table below using the Lookup Transformation

o    dbo.FactInternetSales

·         We will be using SSDT and SQL Server 2014

 

You can get a copy of this here: http://msftdbprodsamples.codeplex.com/releases/view/55330

·         I used the AdventureWorksDW2012 Data File and AdventureWorks Multidimensional Models SQL Server 2012

 

Creating the Cache Transform for Date and Time

NOTE: We inserted a time table into the AdventureWorksDW2012 database, which has every minute for a day which will return 1440 rows.

 

NOTE II: You will only have to create the Date and Time Cache Transform once. Unless your Date table changes every time you load your data. This is because in the later steps you will see how you just select your file.

 

1.       We will first start with setting up the Date Cache Transform.

a.        Go into your SSIS and create a new SSIS Package.

b.       Then drag in a data flow task, which we renamed to Date Cache Transform

c.        Then double click to go into the Date Cache Transform data flow task.

d.       Now in the SSIS Toolbox, drag in the ADO NET Source

                                                               i.      NOTE: The reason we use the ADO NET Source, is because it is better at getting the correct Data Types for our Cache Transform.

e.       We then configured it with the following as shown below

                                                               i.      clip_image002

                                                              ii.      NOTE: The reason that we converted our FullDateAlternateKey to Date, is because when we do our lookup later, we want it to have JUST the Date Data Type

f.         If we went into the Advanced Editor for the ADO NET Source, we can verify that the FullDate is the Data type of “Database Date

                                                               i.      clip_image004

g.        Now in the SSIS Toolbox, click under Other Transformations and drag in the Cache Transform

                                                               i.      clip_image006

h.       Then drag the blue constraint to your Cache Transform.

i.         Then double click your Cache Transform to go into the Properties

j.         Now when it opens you will see that there is no Connection Manager. Click on New

                                                               i.      clip_image008

k.        This will then open the Cache Connection Manager Editor

l.         We then configured it as shown below:

                                                               i.      clip_image010

                                                              ii.      As you can see above we put in the name.

                                                            iii.      Then we also selected to Use file cache.

1.       Then we configured the File name with a UNC

2.       NOTE: The reason that we did this is so that if you execute this package from another location it will still find the location.

3.       NOTE II: You must ensure that the account running the SSIS job has access to the folder location and file on the server.

m.      Then click on Columns. This is where you will configure which column will be used for the actual lookup.

                                                               i.      In our example we want to do the lookup on the FullDate, so next to FullDate we will put the Index Position of 1

                                                              ii.      NOTE: You can have more than one lookup, and each additional Index Position will increment in number.

                                                            iii.      clip_image012

n.       Then click Ok to go back to the Cache Transformation Editor.

o.       Now click on the Mappings in the left hand side

p.       You should now see the following:

q.       clip_image014

r.        NOTE: From above you can see because we created the Index Position of 1 for the FullDate, it has got the magnifying glass. Which represents that this can be used in the Lookup later.

s.        Click Ok.

t.         You can now run this data flow task to populate your Date Cache Transform.

u.       clip_image016

2.       Next will be setting up the Date Cache Transform.

a.        Go into your SSIS and create a new SSIS Package.

b.       Then drag in a data flow task, which we renamed to Time Cache Transform

c.        Then double click to go into the Time Cache Transform data flow task.

d.       Now in the SSIS Toolbox, drag in the ADO NET Source

                                                               i.      NOTE: The reason we use the ADO NET Source, is because it is better at getting the correct Data Types for our Cache Transform.

e.       We then configured it with the following as shown below

                                                               i.      clip_image018

f.         If we went into the Advanced Editor for the ADO NET Source, we can verify that the Time is the Data type of “database time with precision

                                                               i.      clip_image020

g.        Now in the SSIS Toolbox, click under Other Transformations and drag in the Cache Transform

                                                               i.      clip_image006[1]

h.       Then drag the blue constraint to your Cache Transform.

i.         Then double click your Cache Transform to go into the Properties

j.         Now when it opens you will see that there is no Connection Manager. Click on New

                                                               i.      clip_image021

k.        This will then open the Cache Connection Manager Editor

l.         We then configured it as shown below:

                                                               i.      clip_image023

                                                              ii.      As you can see above we put in the name.

                                                            iii.      Then we also selected to Use file cache.

1.       Then we configured the File name with a UNC

2.       NOTE: The reason that we did this is so that if you execute this package from another location it will still find the location.

3.       NOTE II: You must ensure that the account running the SSIS job has access to the folder location and file on the server.

m.      Then click on Columns. This is where you will configure which column will be used for the actual lookup.

                                                               i.      In our example we want to do the lookup on the Time, so next to Time we will put the Index Position of 1

                                                              ii.      NOTE: You can have more than one lookup, and each additional Index Position will increment in number.

                                                            iii.      clip_image025

n.       Then click Ok to go back to the Cache Transformation Editor.

o.       Now click on the Mappings in the left hand side

p.       You should now see the following:

q.       clip_image027

r.        NOTE: From above you can see because we created the Index Position of 1 for the Time, it has got the magnifying glass. Which represents that this can be used in the Lookup later.

s.        Click Ok.

t.         You can now run this data flow task to populate your Time Cache Transform.

u.       clip_image029

3.       Now if you are using SSIS 2012 or 2014, we will convert the Date and Time Cache Connections to Project connections.

a.        This is so that later we can select them from our other SSIS Packages.

b.       Right click on each Cache Connection Manager and select Convert to Project Connection

                                                               i.      clip_image031

c.        They will now look like the following in the Connection Managers

d.       clip_image033

 

Creating your TSQL Query so that it can be used for the Lookup Transformation

In the steps below we need to ensure that we create the correct data types in our TSQL Query so that when we do the lookup it will map correctly.

 

NOTE: If the data types within SSIS are not an exact match between your TSQL Query and the Lookup Transformation it will not allow you to create the mapping.

 

1.       Below is the TSQL Example from our dbo.FactInternetSales table.

SELECTTOP 1000 [ProductKey]

      ,[OrderDateKey]

      ,[DueDateKey]

      ,[ShipDateKey]

      ,[CustomerKey]

      ,[PromotionKey]

      ,[CurrencyKey]

      ,[SalesTerritoryKey]

      ,[SalesOrderNumber]

      ,[SalesOrderLineNumber]

      ,[RevisionNumber]

      ,[OrderQuantity]

      ,[UnitPrice]

      ,[ExtendedAmount]

      ,[UnitPriceDiscountPct]

      ,[DiscountAmount]

      ,[ProductStandardCost]

      ,[TotalProductCost]

      ,[SalesAmount]

      ,[TaxAmt]

      ,[Freight]

      ,[CarrierTrackingNumber]

      ,[CustomerPONumber]

      ,[OrderDate]

      ,[DueDate]

      ,[ShipDate]

      ,[DistinctCustomers]

      ,Convert(date,[OrderDate])as DateForCache

      ,Cast(convert(varchar(5),convert(time(0),[OrderDate]))+‘:00’asTime(0))as TimeForCache

  FROM [AdventureWorksDW2012].[dbo].[FactInternetSales] with (nolock)

2.       As you can see above we have converted out OrderDate to the Date data type.

3.       We have also converted our OrderDate to only show use the Time, using the data type of Time(0)

a.        NOTE: If this is not done in this way, when the time goes over 12pm it would then display it was 01pm instead of 13:00

4.       Now we can use the above query in our data flow task to Insert into our fact table.

 

Using the Lookup Transformation for inserting data into our Fact table

Below we will now use our TSQL Query, as well as our Lookups to our Date and Time Cache Transform

 

1.       Go into SSIS, and go to your SSIS Package where you want to insert your data into the Fact table.

2.       Then drag in your data flow task.

a.        If needed rename your data flow task.

3.       Now drag in an ADO NET Source.

a.        NOTE: The reason for the ADO NET Source is so that it will get the correct data types from our query.

b.       I have found in the past that the OLE DB Source at times does not find the correct data types for the Lookup Transform.

4.       We then put in the query from step 1 in the section above.

a.        NOTE: We went into the Advanced Editor to ensure that the Date Types for the DateForCache and TimeForCache columns are correct.

b.       clip_image035

c.        clip_image037

5.       Now we need to create our first lookup for our Date Cache Transform.

a.        In the SSIS Toolbox drag in the Lookup under the Common section

                                                               i.      clip_image039

b.       We then renamed the Lookup to Date Lookup

c.        Then drag the Blue constraint to our Date Lookup.

d.       Double click on the Date Lookup to go into the Properties

e.       On the General section we configured it with the following:

                                                               i.      clip_image041

                                                              ii.      NOTE: The reason we chose the Cache Connection Manager is so that we can use the cache file we created earlier.

f.         Then click on the Connection.

g.        If you configured your Date and Time Cache Connections as Project Connections you will see them in the Cache Connection Manager

                                                               i.      Click on the drop down and select Date Cache Connection

                                                              ii.      clip_image043

h.       Then click on Columns, and we configured it with the following:

i.         clip_image045

j.         As you can see above we have used our Input column which was DateForCache, and then mapped it to our Lookup Column called FullDate which we configured earlier.

k.        We then selected the DateKey as our Output Alias

l.         Then click Ok.

6.       Next we configured our  lookup for our Time Cache Transform.

a.        In the SSIS Toolbox drag in the Lookup under the Common section

                                                               i.      clip_image039[1]

b.       We then renamed the Lookup to Time Lookup

c.        Then drag the Blue constraint to our Time Lookup.

d.       When you do this it will pop up with the Input Output selection window

                                                               i.      For the Output select the Lookup Match Output

                                                              ii.      clip_image047

                                                            iii.      Click Ok.

e.       Double click on the Time Lookup to go into the Properties

f.         On the General section we configured it with the following:

                                                               i.      clip_image041[1]

                                                              ii.      NOTE: The reason we chose the Cache Connection Manager is so that we can use the cache file we created earlier.

g.        Then click on the Connection.

h.       If you configured your Date and Time Cache Connections as Project Connections you will see them in the Cache Connection Manager

                                                               i.      Click on the drop down and select Time Cache Connection

                                                              ii.      clip_image049

i.         Then click on Columns, and we configured it with the following:

j.         clip_image051

k.        As you can see above we have used our Input column which was TimeForCache, and then mapped it to our Lookup Column called Time which we configured earlier.

l.         We then selected theTimeKey as our Output Alias

m.      Then click Ok.

7.       Now finally drag in your ADO NET Destination.

a.        Drag the blue constraint to your ADO NET Destination.

b.       When you do this it will pop up with the Input Output selection window

                                                               i.      For the Output select the Lookup Match Output

                                                              ii.      clip_image052

                                                            iii.      Click Ok.

c.        Then go into our ADO NET Destination Properties and select your destination Fact table.

d.       And then ensure that your mappings are correct.

e.       If you scroll to the bottom you should see the mappings from your Lookups above to your Fact Table

f.         clip_image054

8.       Then finally run your SSIS Package and you should see the following:

SQL Server Integration Services (SSIS) – Using Send Mail Task for data validation checks

What I wanted to do was to be notified in my data if there were not enough rows of data, or if the financials were too low, indicating that there was a potential issue with the data coming in.

And if it was too low to fire off an email. If it was correct then to not send an email. SO that we would only be notified when there is an issue.

So in order to explain this I will explain it with an example below, which will make it easier to understand.

 

Example:

·         We want to compare the count of SQL Server Analysis (SSAS) Partitions for our Internet Sales Cube.

·         If the count of partitions increases then send an email.

·         If the count of partitions remains the same then do nothing.

 

You can get a copy of this here: http://msftdbprodsamples.codeplex.com/releases/view/55330

·         I used the AdventureWorksDW2012 Data File and AdventureWorks Multidimensional Models SQL Server 2012

 

You can refer to my previous blog post if you want to see how we populate the SSAS Partition Details

                PUT IN HERE

 

NOTE: We are using SQL Server 2014 and SQL Server Data Tools with Visual Studio 2013

 

 

1.       Create a new SSIS Package.

2.       Next we create the following 2 variables, which will hold our Previous and Current Partition Counts

a.        clip_image002[6]

3.       Next we will load the data into our variables using an Execute SQL Task and a SQL Server TSQL Script.

a.        NOTE: I normally would put the TSQL script into a stored procedure, but for this example I am going to put it in the package so that you can see how it got populated.

b.       From the SSIS Toolbox drag in an Execute SQL Task, then configure it with the following:

                                                               i.      As you can see we gave it a name, changed the result set to Single Row and created our connection

                                                              ii.       clip_image004[6]

c.        Then click on the Ellipses next to SQLStatement and put in the following TSQL code.

Declare @PreviousCountOfSSASPartitions asint

Set @PreviousCountOfSSASPartitions =

       (

         Selectcount(1)

         FROM [AdventureWorksDW2012].[dbo].[Mart_TD_SSAS_PartitionDetails] with (nolock)

         where [Parent_MeasureGroupID] =‘Fact Internet Sales 1’

         andconvert(date,CreatedTimeStamp)=‘2013-07-09’

       )

 

Declare @CurrentCountOfSSASPartitions asINt

Set @CurrentCountOfSSASPartitions =

       (

         Selectcount(1)

         FROM [AdventureWorksDW2012].[dbo].[Mart_TD_SSAS_PartitionDetails] with (nolock)

         where [Parent_MeasureGroupID] =‘Fact Internet Sales 1’

         andconvert(date,CreatedTimeStamp)>=‘2013-07-09’

       )

 

Select   @PreviousCountOfSSASPartitions as PreviousCountOfSSASPartitions

        ,@CurrentCountOfSSASPartitions as CurrentCountOfSSASPartitions

                                                               i.      Then click Ok.

d.       Now in the left hand pane click on Result Set and configure it so that we can populate our variables as shown below:

                                                               i.      clip_image006[6]

e.       Then click Ok.

4.       Next we are going to create a Send Mail Task so that when we get an error to send out the email.

a.        We dragged in a Send Mail Task and configured it to send via our Mail Server.

b.       Along with who to send it to, as well as the details in the email.

c.        NOTE: You can add expressions to add what you would to send dynamically.

                                                               i.      If you are going to put something in the actual email message, it is easiest to compile it in Notepad and then copy and paste it. This way it will keep your content as expected.

5.       Now we are going to use the precedence constraint in order to decide if we must send the email or now.

a.        Drag your precedence constraint from your Populate variables for Count of Partitions to our send mail task

                                                               i.      clip_image008[6]

b.       Then right click on the Success Precedence Constraint and select Edit

c.        We then configured it with the following as shown below

d.       clip_image010[6]

e.       For the Expression we configured it saying if the Previous Partition Count is less than the Current Partition Count then continue and send the email.

@[User::PreviousCountOfSSASPartitions] <  @[User::CurrentCountOfSSASPartitions]

f.         Then click Ok, and Ok again.

g.        You will now see that we have put in an expression

                                                               i.      clip_image012[6]

6.       Now when we run it when the partition counts are different it will send the email as shown below

a.        clip_image014[6]

7.       And then if we run it when the Partition counts are the same, it will NOT go to the send Mail Task

a.        clip_image016[4]

SQL Server Integration Services (SSIS) – Re-run a specific Control Flow Item if it has an error

We had an issue where we only wanted to re-run a particular control flow item when it failed or had an error. We did not want to do this for our Entire SSIS Package but just for a specific Control Flow Item.

 

Example: Inserting Data into a Staging Table and if it fails to then re-try the insert 3 times

 

You can get a copy of this here: http://msftdbprodsamples.codeplex.com/releases/view/55330

·         I used the AdventureWorksDW2012 Data File and AdventureWorks Multidimensional Models SQL Server 2012

 

NOTE: This example is using SQL Server Data Tools (SSDT) for Visual Studio 2013, and SQL Server 2014

 

1.       Create your new SSIS Package.

2.       Create the following variable:

a.        clip_image002

b.       The reason we have the LoopQuitParent variable is for the following:

                                                               i.      It will be used to exit out of our For Loop Container when it has any other value than zero

                                                              ii.      Within our For Loop Container depending on what happens to our Control Flow items, we will then set the LoopQuitParent to 1 if the Control Flow item has success.

                                                            iii.      If the Control Flow Item fails then we will the LoopQuitParent to 0 (zero) so that the For Loop Container will try again.

3.       Then drop in a For Loop Container into your Control Flow.

a.        We renamed our For Loop Container to:

Retry Staging Insert

b.       We then configured it as shown below:

c.        clip_image004

d.       What we are saying here is when our variable called LoopQuitParent is not equal to zero then exit.

4.       Next we are going to drag in what we want to test to retry within our For Loop Container.

a.        As per our example we are going to first truncate our staging table.

b.       And then Insert data into our Staging table.

c.        Below is a script to create our Staging Table

CREATETABLE [dbo].[Staging_SSAS_PartitionDetails](

       [Name] [varchar](300)NULL,

       [ID] [varchar](300)NULL,

       [CreatedTimeStamp] [varchar](300)NULL,

       [LastSchemaUpdate] [varchar](300)NULL,

       [Description] [varchar](300)NULL,

       [LastProcessed] [varchar](300)NULL,

       [State] [varchar](300)NULL,

       [Type] [varchar](300)NULL,

       [AggregationPrefix] [varchar](300)NULL,

       [StorageMode] [varchar](300)NULL,

       [CurrentStorageMode] [varchar](300)NULL,

       [StringStoresCompatibilityLevel] [varchar](300)NULL,

       [CurrentStringStoresCompatibilityLevel] [varchar](300)NULL,

       [ProcessingMode] [varchar](300)NULL,

       [ProcessingPriority] [varchar](300)NULL,

       [StorageLocation] [varchar](300)NULL,

       [RemoteDataSourceID] [varchar](300)NULL,

       [Slice] [varchar](300)NULL,

       [EstimatedRows] [varchar](300)NULL,

       [AggregationDesignID] [varchar](300)NULL,

       [EstimatedSize] [varchar](300)NULL,

       [Parent_MeasureGroupID] [varchar](300)NULL,

       [Parent_CubeID] [varchar](300)NULL,

       [Parent_DatabaseID] [varchar](300)NULL

)ON [PRIMARY]

d.       NOTE: This table is based on the previous blog post for SSAS Partition Details

SSAS (SQL SERVER ANALYSIS SERVICES) – GETTING ALL PARTITION INFORMATION FROM SSAS DATABASE

e.       Next we put in an Execute SQL Task and put in our Truncate Table Statement for the above table.

                                                               i.      clip_image006

f.         Next we put in our Data Flow Task to insert our data into our Staging Table

                                                               i.      clip_image008

g.        Next we can drag our Success Precedence Constraint between our Truncate Table and our Insert into items.

                                                               i.      clip_image010

h.       Next in order to populate our LoopQuitParent variable for when it succeeds we need to do the following:

                                                               i.      Drag in an Execute SQL Task, and we are going to configure it so that if this Execute SQL Task is run it will set the variable LoopQuitParent equal to 1, which in turn will result in the For Loop Container exiting.

                                                              ii.      We put the following in the General Section of our Execute SQL Task

                                                            iii.      clip_image012

                                                            iv.      NOTE: We renamed it to: SUCCESS – Set Variable LoopQuitParent Equal to 1

                                                              v.      Then we click on the Result Set and configured it so that our variable would be populated

                                                            vi.      clip_image014

i.         Next in order to populate our LoopQuitParent variable for when it fails we need to do the following:

                                                               i.      Drag in an Execute SQL Task, and we are going to configure it so that if this Execute SQL Task is run it will set the variable LoopQuitParent equal to 0, which in turn will result in the For Loop Container continuing.

                                                              ii.      We put the following in the General Section of our Execute SQL Task

                                                            iii.      clip_image016

                                                            iv.      NOTE: We renamed it to: FAILURE – Set Variable LoopQuitParent Equal to 0

                                                              v.      Then we click on the Result Set and configured it so that our variable would be populated

                                                            vi.      clip_image018

j.         Next we need to drag our Success Precedent Constraint to our SUCCESS – Set Variable LoopQuitParent Equal to 1 Execute SQL Task as Shown below:

                                                               i.       clip_image020

k.        Next we need to drag our Failure Precedent Constraint to our FAILURE – Set Variable LoopQuitParent Equal to 0 Execute SQL Task as Shown below:

                                                               i.      clip_image022

5.       The next part is to configure our For Loop Container so that when it does fail how many times we want it to retry.

a.        NOTE: Because we are using the MaximumErrorCount Property for our For Loop Container it works like the following:

                                                               i.      It will always run at least once.

                                                              ii.      For example, if you wanted your package to retry 3 times, you would have to set the MaximumErrorCount = 6

1.       This is because the first time it run’s it is trying to execute. And NO retry has started.

2.       Then it will retry another 3 times each time starting again, making the total times it will need to be configured to 6.

                                                            iii.      So the calculation for the retry is the following:

1.       Total Retries X 2

2.       EG: Total Retries must be 3 times

a.        3 X 2 = 6

b.       We went into the Properties for our For Loop Container and set the MaximumErrorCount = 6

                                                               i.      clip_image024

6.       The final step is if your For Loop Container does error to not propagate to the parent or SSIS Package

a.        NOTE: The reason for doing this is so that your entire SSIS Package will NOT fail during our retries.

b.       Make sure that before you do the steps below you HAVE clicked on the For Loop Container

                                                               i.      As with our example it is called Retry Staging Insert

c.        To change this setting click on the Event Handlers tab at the top of your SSIS package

                                                               i.      clip_image026

d.       Next click on the following below to enable the Event Handler

                                                               i.      clip_image028

                                                              ii.      NOTE: As you can see above our In error event handler is for our For Loop Container called ‘Retry Staging Insert’

e.       Next go to your variables, click on the Grid Options and ensure to select “Show system variables”

                                                               i.      clip_image030

f.         Now scroll down in your Variables until you get to the name called Propagate, with the Scope of On Error

                                                               i.      Change the value to False

                                                              ii.      clip_image032

g.        Now save your SSIS Package.

h.       Now your For Loop Container will not cause your entire SSIS Package to fail.

7.       Now if you run it and it works the first time, it should run through and exit as shown below

a.        clip_image034

8.       Now if we cause the data flow task to fail, when looking at the Execution results we see the following to retry 3 times

a.        clip_image036

b.       When looking at the Truncate Table Staging_SSASPartitionDetails task we see the following

c.        clip_image038

d.       As you can see above it retried a total of 3 times.

e.       And as you will also see due to our configuration not to Propagate to our SSIS Package even though it failed it did complete with success

 

 

SSIS – Using SQL Server Analysis Services (SSAS) MDX query into Variable in Execute SQL Task

What I wanted to do was to get the output from my MDX query, and then use this in SSIS and take the output and put it into an SSIS Variable which could then be used later in your SSIS Package.

 

Example:

·         We are going to get the Internet Sales Amounts for 2 days.

o    01 June 2008 – $44,650.70

o    02 June 2008 – $35,496.03

·         Then we are going to put the two values into a variable.

·         Then we are going to use a Precedence constraint to compare the values, and if the Internet Sales amount is lower when compared then continue onto the next step.

o    So if the value on 02 June 2008 is lower than the value on 01 June 2008 continue to the next step.

o    So as from the above values it should continue onto the next step.

o    If the amount is higher then do not do anything.

·         NOTE: The reason that we are using an MDX query is so that we can get our data back as quickly as possible.

 

You can get a copy of this here: http://msftdbprodsamples.codeplex.com/releases/view/55330

·         I used the AdventureWorksDW2012 Data File and AdventureWorks Multidimensional Models SQL Server 2012

 

1.       The first thing that we are going to do below is get our MDX query which connects to the AdventureWorksDW2012 SSAS Cube.

a.        This query is for Internet Sales for 01 June 2008

Selectnonempty {[Measures].[Internet Sales Amount]} on 0

,nonempty {(

                     [Date].[Date].&[20080601]

                     )} on 1

from [Adventure Works]

                                                               i.       

b.       This query is for Internet Sales for 02 June 2008

Selectnonempty {[Measures].[Internet Sales Amount]} on 0

,nonempty {(

                     [Date].[Date].&[20080602]

                     )} on 1

from [Adventure Works]

                                                               i.       

2.       Now create your new SSIS Package.

3.       Next we will create our 2 variables that will be used to populate using the MDX queries above.

4.       Below is how we have created them:

a.        clip_image002[1]

b.       NOTE: We have created them as Int64 due to us getting back a numeric value.

                                                               i.      And this can also be compared in our Precedence Constraint

5.       Next we need to create an OLE DB Connection to our SSAS Cube by doing the following below:

a.        Right click in your Connection Managers section and select New OLE DB Connection

b.       When the window opens click on New

                                                               i.      Now from the drop down select the following:

1.       clip_image004[1]

2.       NOTE: You have to select the above so that we can connect to our SSAS Instance.

                                                              ii.      Now once configured you will see the following below:

1.       clip_image006[1]

                                                            iii.      If you want you can click on Test Connection to ensure that it can connect.

                                                            iv.      Then click Ok and then Ok again.

c.        You should now see the following in your Connection Managers Window:

                                                               i.      clip_image008[1]

                                                              ii.      NOTE: If required you can make this a Project Connection if your SSIS Project is in Project Deployment mode.

6.       Now what we are going to do is to assign our MDX Query to our variables using the Execute SQL Task.

7.       Drag in an Execute SQL Task and complete the following steps for One Day back

a.        As per our example I renamed the Execute SQL Task to the following:

Get Internet Sales Amount for One Day Back into Variable

b.       I went into the Execute SQL Task Properties and configured it with the following:

                                                               i.      clip_image010[1]

c.        We then selected our OLE DB connection that we created in step 5 above:

                                                               i.      clip_image012[1]

d.       Next under SQLStatement we are then using our MDX query from Step 1a above:

                                                               i.      clip_image014[1]

                                                              ii.      Then click Ok.

e.       Next click on the Result Set and click on Add

                                                               i.      Now where it says Result Name change this to the following:

[Measures].[Internet Sales Amount]

                                                              ii.      And then ensure that the Variable Name is:

User::InternetSales_OneDayBack

                                                            iii.      clip_image016[1]

f.         Then click Ok.

g.        Now right click and select Execute Task to test that the variable will be populated.

h.       clip_image018[1]

8.       Drag in an Execute SQL Task and complete the following steps for Two Days back

a.        As per our example I renamed the Execute SQL Task to the following:

Get Internet Sales Amount for Two Days Back into Variable

b.       I went into the Execute SQL Task Properties and configured it with the following:

                                                               i.      clip_image010[2]

c.        We then selected our OLE DB connection that we created in step 5 above:

                                                               i.      clip_image019[1]

d.       Next under SQLStatement we are then using our MDX query from Step 1b above:

                                                               i.      clip_image021[1]

                                                              ii.      Then click Ok.

e.       Next click on the Result Set and click on Add

                                                               i.      Now where it says Result Name change this to the following:

[Measures].[Internet Sales Amount]

                                                              ii.      And then ensure that the Variable Name is:

User::InternetSales_TwoDaysBack

                                                            iii.      clip_image023[1]

f.         Then click Ok.

g.        Now right click and select Execute Task to test that the variable will be populated.

h.       clip_image025[1]

9.       Now the final step is to have a next step to go to if the value is lower.

10.    For our example we are going to put the next step as a Send Mail Task

a.        So we dragged in the Send Mail Task and configured it so that it can send an email if run.

b.       Next we dragged the success precedence constraints as shown below:

c.        clip_image027[1]

d.       NOTE: The reason it was done in this order is so that we can get both variables populated before we change the constraint options.

e.       Now double click on the success precedence constraint that goes from Get Internet Sales Amount for Two Days Back into Variable to the Send Mail Task

                                                               i.      clip_image029[1]

f.         Now in the Precedence Constraint Editor Window complete the following:

                                                               i.      Change the Evaluation operation from Constraint to Expression

                                                              ii.      clip_image031[1]

g.        Then where it says Expression click on the Ellipses button.

h.       Now in order to check if the value is less to continue we put the following:

@[User::InternetSales_OneDayBack] <  @[User::InternetSales_TwoDaysBack]

                                                               i.      Click on Evaluate Expression to ensure that it is valid.

                                                              ii.      clip_image033[1]

i.         Then Click Ok and Ok again to go back to the Control Flow

11.    Now finally run your SSIS Package and it should complete with the following as shown below:

12.    clip_image035[1]

13.    NOTE: If you had to change your Precedence Constraint to the following below and re-run the above in order to test if the value was higher you would get the following result:

a.        clip_image037[1]

 

SSIS (SQL Server Integration Services) – Using the Lookup Transformation and cache and how to handle NULL Values

I had a situation where I was using the Lookup transformation and then loading this into the SSIS Cache, but I wanted all rows to be inserted using the SSIS Cache, even if there was a NULL Value. Below explains how I overcame this.

 

Example:

·         In our Source data we have Products, but they might not be in our Product Dimension table, due to the source systems not having the required data.

o    NOTE: While this is not ideal and we either should fail the component or redirect it, for this example this is fine.

·         We have created a row in our Products Table for Products that we do not find.

o    EG:

§  ProductsSK ‘-1’

§  ProductsName ‘Product not Found’

·         So when we insert data we want to ensure that even when products are not found, they will still exist in the Lookup Transformation and cache.

 

1.       I created a Lookup Transformation in SSIS which I then configured as the following explained below.

2.       On the General Tab I configured it with the following as shown below:

a.        clip_image002

b.       NOTE: The reason that I selected the Redirect rows to no match output in the Specify how to handle rows with no matching entries, is because due to the configuration of our Lookup transformation all the rows WILL find a match.

3.       Next click on the Connection on the left hand side.

a.        clip_image004

b.       Now you will select your OLE DB connection to your database where your Product Dimension is located.

c.        Then select Use results of an SQL Query.

d.       Then what we did was to put in the following SQL Query which we will explain why we did this after the query

  SELECT[ProductsSK],[ProductsSK]asProductsSKToUse      

  FROM[dbo].[Mart_TD_Products]with (nolock)

 

  UnionAll

 

  Selectnullas[ProductsSK],1 asProductsSKToUse

                                                               i.      Now as you can see above we have create a duplicate column names from our Dimension Table.

                                                              ii.      The reason for this is in the second part of the query with the Union All

                                                            iii.      Now you can see that we have actually put in a NULL value, and then given this NULL value a value of -1

1.       NOTE: This corresponds to the data that we already have in our Products Dimension table.

                                                            iv.      So now when you load your query into the SSIS Cache it will also have a value of the following:

1.       clip_image006

e.       Now when the lookup is running in SSIS, when it finds a NULL value it will assign it a value of ‘-1’ to the ProductsSKToUse

4.       Then click on Columns and create the lookup

                                                               i.      clip_image008

                                                              ii.      As you can see above we dragged the ProductsSK to our ProductsSK in our Available Lookup Columns

                                                            iii.      And then below you can see that our Output Alias is ProductsSKToUse

5.       This will then be used in our insert into our Fact Table.

6.       And in this way when there is a NULL Value it will still be matched.

a.        So as you can see for our 2 rows, we had 2 matches even when 1 row was NULL

b.       clip_image010

SSIS – Running single SSIS packages in Parallel to get faster performance for multiple data loads

What we are going to show you below, is that by using ONE SSIS package, we can re-use this one SSIS package multiple times to get data from multiple sources, possibly from different databases hosted in different locations around the world. So in effect we are loading our data from one SSIS package in parallel.

 

NOTE: By completing the steps below we have seen our packages perform and complete a lot quicker due to running in parallel.

 

The thing to note with my example below is that all the database have exactly the same structure within the databases, where ever they are hosted.

 

Example Details

·         Databases are hosted in different locations around the world.

o    And there were multiple databases on each database system.

·         All the databases have exactly the same table and schema structures.

·         Within each hosting location we have different IP Addresses and database names.

 

Our requirement was to load the data as quickly as possible from our multiple locations. This is how we achieved this.

 

NOTE: I am not going to go into the details on how we got our connection details to connect to each system. If someone has a requirement for this then I can possibly put this into another blog post.

 

Creating your Control SSIS Package to run in Parallel

What I am going to explain below is how we use a control SSIS package, which controls how we run our SSIS packages in parallel.

 

1.       The first thing that we did was to create the following variable called:

a.        Name:

                                                               i.      LoopQuitParent

b.       Scope:

                                                               i.      PackageName

c.        Data Type:

                                                               i.      Int32

d.       NOTE: The reason that we create this is because of the following:

                                                               i.      This is going to be used in the steps further down in our For Loop container to know when to exit the For Loop.

                                                              ii.      It is also going to be used in the actual SSIS package that runs in parallel, and then it has completed to use a script task to populate this variable from our parent package.

2.       As with normal data warehousing practice we first truncate our staging tables.

a.        clip_image002

3.       Next is where we have our sequence container.

a.        It is within this sequence container that we put how many packages we want to run in parallel.

4.       So within our Sequence container we put the following

a.        First we dragged in a For Loop container inside our Sequence Container.

b.       We then configured our For Loop container with the following:

                                                               i.      clip_image004

                                                              ii.      What we are saying above is when our variable from step 1 @LoopQuitparent not equals zero then finish or complete.

c.        Next inside our For Loop Container we put an Execute Package Task

                                                               i.      NOTE: This is our SSIS Package that is going to run multiple times or in Parallel.

                                                              ii.      As per our example if you look at the Package Window on the left hand side you will see where we have configured our SSIS Package

                                                            iii.      clip_image006

d.       Then all that you need to do is to copy and paste the For Loop Container as many times as you want to run in Parallel.

                                                               i.      In our picture below we wanted to run it in parallel x 4

                                                              ii.      NOTE: We just renamed each For Loop container so that we know how many we have.

                                                            iii.      NOTE 2: It might be good to test how many will be the optimal number for you, as it is dependent on server resources, network interface and bandwidth to your locations of where the data is.

e.       clip_image008

5.       In the next steps we will explain how we configure our one SSIS package so that it can run in parallel as well as know when to exit.

 

Configuring our SSIS Package so that it can get the required details and run in parallel (SSIS-Parallel-LoadStaging.dtsx)

In the next steps I will explain how we get our one SSIS package (SSIS-Parallel-LoadStaging.dtsx) to get the required data into variables and then run in parallel.

 

1.       It is going to be easiest to explain if I can show you an example of what our Source Table looks like below. This is where we store the following which will be explained after the picture for clarity.

a.        clip_image010

b.       SourceSystemSK

                                                               i.      We use this to uniquely identify our Source Systems. This is used throughout our SSIS Packages and is put into our data warehouse and Cube so that we know from which system the data came from.

                                                              ii.      NOTE: This also enables us to get an overview of a particular system when it has more than one database.

c.        IPAddress

                                                               i.      This is the actual IP address that we will pass as an expression in our Connection manager in order to connect to the SQL Instance.

d.       SQLPortNumber

                                                               i.      This is the standard SQL Port number, but if for some reason it was on a different port we would specify it here.

                                                              ii.      NOTE: This can also get passed as an expression in our Connection manager.

e.       IsProcess

                                                               i.      This is used later within our SSIS package (SSIS-Parallel-LoadStaging.dtsx) and what we will explain later, is that when the row has been used and the data processed it changes the value from a 2 to a 1.

f.         DatabaseName

                                                               i.      This is the database name within the SQL Instance system we are connecting to.

                                                              ii.      NOTE: This is gets passed as an expression in our Connection Manager.

g.        Location

                                                               i.      This is used in our dimensions later if we want to know where the actual data is coming from.

h.       SQLInstanceName

                                                               i.      This is the actual SQL Instance name that we are connecting to.

i.         IsActive

                                                               i.      This is part of our process, where we test to see if we can connect to the actual SQL Instance and databasename. If we can the IsActive is set to 1, and if not it is set to 0 (Zero)

                                                              ii.      NOTE: We did not explain how we get the data into this table, as this would make this blog post a whole lot longer.

2.       So what we do in our first step is to get the SourceSystemSK into a variable.

a.        clip_image012

b.       But we need to explain what else we do within our Execute SQL Task.

c.        Below is the actual query with an explanation afterwards

— Section 1

DECLARE@ServerIDINT

 

— Section 2

BEGINTRAN

 

— Section 3

SETROWCOUNT 1

 

— Section 4

SELECT@ServerID=[SourceSystemSK]

FROM[dbo].[Mart_TD_SourceSystems]WITH (TABLOCKX,HOLDLOCK)

WHEREIsProcess= 2

 

— Section 5

UPDATE[dbo].[Mart_TD_SourceSystems]

WITH (TABLOCK)

SETIsProcess= 1

WHERE[SourceSystemSK]=@ServerID

 

— Section 6

SETROWCOUNT 0

 

— Section 7

SELECTISNULL(@ServerID, 0)asSourceSystemSK

 

— Section 8

COMMIT

                                                               i.      — Section 1

                                                              ii.      The first thing that we do is declare our @ServerID

                                                            iii.      — Section 2

                                                            iv.      Next we start a Begin Tran

1.       NOTE: The reason for this is so that we can actually lock the table whilst this transaction is happening.

2.       NOTE 2: This is so that when the SSIS Packages are running in parallel, we know for a fact that each package will only get a valid SourceSystemSK when trying to get another value. If it tries whilst another SSIS Package is busy it will wait until the transaction has been completed.

                                                              v.      — Section 3

                                                            vi.      We then set the RowCount to 1, this is so that we will only get one row back.

1.       NOTE: we are doing this because we only ever want to get back one SourceSystemSK

                                                           vii.      — Section 4

                                                         viii.      Next is our TSQL Select statement

1.       Here we are getting any SourceSystemSK from our table, where the IsProcess is set to 2.

2.       We are then putting this into our @ServerID variable.

                                                            ix.      — Section 5

                                                              x.      Next we are updating our table using the TABLOCK hint so that we can once again lock the entire table, to ensure that for our SourceSystemSK row it gets updated from a 2 to 1.

1.       NOTE: This then means that this data is and has been processed when the next package comes along looking for data to process.

                                                            xi.      — Section 6

                                                           xii.      We then set the RowCount back to 0 (zero)

                                                         xiii.      — Section 7

                                                         xiv.      Then we run a TSQL Select statement so that we can then pass our @ServerID as SourceSystemSK

1.       NOTE: This is because we are using the SourceSystemSK and passing this into a variable in our Execute SQL Task.

2.       NOTE 2: We also set the isNull to 0 (zero) so that when there are no more rows to process it defaults to zero, which is explained later.

                                                           xv.      — Section 8

                                                         xvi.      Finally we are then committing our transaction.

d.       Within our Execute SQL Task we also map the result set to our variable called:

                                                               i.      clip_image014

3.       The next thing that we need to look into is our Exit Loop

a.        clip_image016

b.       NOTE: The reason that we have this is because we need to pass a variable from our child package back to our Parent Package.

c.        So we put in a Script Task using Visual Basic 2010 and configured it with the following:

d.       NOTE: Click on Edit script to open the Script Task editor

PublicSub Main()

       

        ‘ Add your code here

       

        Dts.Variables(“LoopQuitParent”).Value = 1

 

        Dts.TaskResult = ScriptResults.Success

                                                               i.       

                                                              ii.      What the above does is pass the Variable name LoopQuitParent value of 1 back to the Parent package.

                                                            iii.      NOTE: This was configured in our For Loop Container in step 4 from the section above, where it was set to not equal to zero. So when it equals one then exit or complete

4.       Next we creating another Execute SQL Task, and in this task it then uses a Query within a variable to get the IP Address and Database name for our server that we want to connect to.

a.        clip_image018

b.       NOTE: This is based on our SourceSystemSK variable which we populated in step 2 above.

c.        This is what our Variable looks like, where we have configured it as an Expression

                                                               i.      clip_image020

                                                              ii.      As you can see from above when we run the above query, if our SourceSystemSK = 1 we would get the following details back

                                                            iii.      clip_image022

d.       Then in our Result set for our Execute SQL Task we would have the following:

                                                               i.      clip_image024

e.       Now we currently have the following in our variables for our SSIS Package

                                                               i.      SourceSystemSK, IPAddress and Database Name.

5.       Now you need to create a local OLE DB Connection Manager in your SSIS Package.

a.        NOTE: The reason that it must be a local connection, is so that when each package is running in parallel it will only populate the expressions for the current package.

b.       With our example we created a local OLEDB connection with the following name:

                                                               i.      ServerConnection.local-OLE

c.        Next right click and go into the Properties for your ServerConnection.local-OLE.

d.       Where it says Expressions click on the Ellipses button and configure it with the following:

                                                               i.      Property:

1.       InitialCatalog

2.       NOTE: This is the actual DatabaseName

                                                              ii.      Expression:

1.       @[User::DatabaseName]

                                                            iii.      Property:

1.       ServerName

2.       NOTE: This is the actual IPAddress to connect to the server.

                                                            iv.      Expression:

1.       @[User::IPAddress]

                                                              v.      IT will look like the following:

                                                            vi.      clip_image026

e.       Then click Ok.

f.         NOTE: Ensure that your DatabaseName and IPAddress variables have valid default Values in your Variables

                                                               i.      This is because when you put in your expressions above, it attempts to connect to the system to ensure that it is valid.

                                                              ii.      This will save you a lot of time waiting for it to timeout and will ensure that it is indeed valid.

6.       Now the next step is to put in what you want your SSIS Package to complete on each and every server.

a.        clip_image028

b.       With our example we would then get the Max RowID for each system which is stored in a separate table.

c.        Within our table we simply have the Max RowID, SourceSystemSK and the date it got inserted.

d.       Here is an example below:

e.       clip_image030

f.         This would then be in the Result Set

                                                               i.      clip_image032

g.        Then we have our Data Flow task which based on our Max RowID inserts the data into our Staging_tb_SalesInformation

h.       clip_image034

7.       Now the last thing to configure in this SSIS package (SSIS-Parallel-LoadStaging.dtsx) is the Precedence Constraint which can be seen in the picture below

a.        clip_image036

b.       The reason that we have this is so that we know for our package when it has completed going through our entire list of servers (and table with the connection details) and populate our LoopQuitParent variable back to our Parent Package, and let that particular For Loop Container complete.

c.        NOTE: What we did in step 2 above is your will see that we set an IsNull in section 7 and set this to 0 (zero).

d.       This is now used in our Precedence constraint and configured with the following for the “Get the IPAddress  and DatabaseName into Variable based on SourceSystemSK” which is on the left hand side.

                                                               i.      We configured the Precedence Constraint with the following:

                                                              ii.      clip_image038

                                                            iii.      And here what we are saying is if we have any value greater than zero then get our values, and get the data down.

e.       This is now used in our Precedence constraint and configured with the following for the “Exit Loop” which is on the right hand side.

                                                               i.      We configured the Precedence Constraint with the following:

                                                              ii.      clip_image040

                                                            iii.      And here what we are saying is when there are no more values for the SourceSystemSK, which will be set to zero, then exit out and populate the LoopQuitParent so that this can go back to the parent package.

8.       Now we have configured our SSIS Package which can run in parallel.

9.       This is what the entire package looks like

a.        clip_image042

 

Final Notes

·         Now when you run the Control package, and it runs, you will see all your For Loop containers running at once.

·         The thing that you will not essentially see is if you have configured it to run in parallel with 4 SSIS packages, you will only see one running, but actually all 4 are running at once.

·         Once it has all completed you will see your For Loop Container complete with the green arrow.

o    You can then go into your Staging_tb_SalesInformation table and validate it to see that you have got all your required information from your Source Systems.

 

If you have any questions or queries then please do not hesitate to contact me.

SSIS – Getting Partition Names for the past 12 months

I had a situation where I was using a conditional split, so that when data was coming in from my source tables it would dynamically go into the correct partitioned table.

 

Below is the script that I used, as well as for reference the conditional split

 

— ==================================================================================================================

— What we are doing here is to create a Proc so that we can get the PartitionNames going back 12 months.

— While for the first time this is a manual process this can be used for other Parition Schemes going foward.

— The only thing that will need to change if the Partition Naming convention is the same is the Partition name

— ==================================================================================================================

 

— ==================================================================================================================

— 1. Here we are getting our current partition naming scheme into a variable.

— Example of partition scheme: Mart_TF_InternetSales_2014_02

— ==================================================================================================================

Declare@PartitionNameasvarchar(30)

Set@PartitionName=

       (

              Select‘Mart_TF_InternetSales_’

       )

 

— ==================================================================================================================

— 2. Here we are getting our Current Month Details.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@CurrentMonthNameasvarchar(8)

Set@CurrentMonthName=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-0, 0)))

       )

 

— ==================================================================================================================

— 3. Here we are getting our 1 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@OneMonthBackasvarchar(8)

Set@OneMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-1, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 2 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@TwoMonthBackasvarchar(8)

Set@TwoMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-2, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 3 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@ThreeMonthBackasvarchar(8)

Set@ThreeMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-3, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 4 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@FourMonthBackasvarchar(8)

Set@FourMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-4, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 5 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@FiveMonthBackasvarchar(8)

Set@FiveMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-5, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 6 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@SixMonthBackasvarchar(8)

Set@SixMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-6, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 7 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@SevenMonthBackasvarchar(8)

Set@SevenMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-7, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 8 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@EightMonthBackasvarchar(8)

Set@EightMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-8, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 9 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@NineMonthBackasvarchar(8)

Set@NineMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-9, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 10 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@TenMonthBackasvarchar(8)

Set@TenMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-10, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 11 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@ElevenMonthBackasvarchar(8)

Set@ElevenMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-11, 0)))

       )

 

— ==================================================================================================================

— 4. Here we are getting our 12 Month Back.

— NOTE: We are using our Date Dimension to get our Year and Month, we are then replacing the dash “-” with an

—       underscore “_” as per our Parition Naming Scheme.

— ==================================================================================================================

Declare@TwelveMonthBackasvarchar(8)

Set@TwelveMonthBack=

       (

              SelectReplace(YearMonth,‘-‘,‘_’)

                                                       fromMart_TD_Datewith (nolock)

                                                       whereFullDate=convert(date,dateadd(ms,+0,DATEADD(mm,DATEDIFF(m,0,getdate()  )-12, 0)))

       )

Select

               @PartitionName+@CurrentMonthNameasCurrentMonthParititionName

              ,@PartitionName+@OneMonthBackasOneMonthBackPartitionName

              ,@PartitionName+@TwoMonthBackasTwoMonthBackPartitionName

              ,@PartitionName+@ThreeMonthBackasThreeMonthBackPartitionName

              ,@PartitionName+@FourMonthBackasFourMonthBackPartitionName

              ,@PartitionName+@FiveMonthBackasFiveMonthBackPartitionName

              ,@PartitionName+@SixMonthBackasSixMonthBackPartitionName

              ,@PartitionName+@SevenMonthBackasSevenMonthBackPartitionName

              ,@PartitionName+@EightMonthBackasEightMonthBackPartitionName

              ,@PartitionName+@NineMonthBackasNineMonthBackPartitionName

              ,@PartitionName+@TenMonthBackasTenMonthBackPartitionName

              ,@PartitionName+@ElevenMonthBackasElevenMonthBackPartitionName

              ,@PartitionName+@TwelveMonthBackasTwelveMonthBackPartitionName

 

And here is the conditional split

NOTE: The reason that it is in RED is because I changed the actual datetime column that I have in my dataset.

NOTE 2: Normally this would be black to indicate that there are no conditional errors

 

clip_image002

SCOM (Systems Center Operations Manager) Cube and Data warehouse

What I have done is I have found that it is very handy having all our SCOM data in a data warehouse and then having it in a SSAS (SQL Server Analysis Services).

Within the zip file which can be located here below it has all the details for the following:

·         SCOM Data Warehouse and OLAP Documentation.docx

o   This has all the details required to get up and running.

·         Restore the SQL Server Database

·         Deploy your SSIS Project

·         Deploy your SSAS Project

·         Deploy your SSRS Reports

Zip File location: http://sdrv.ms/1dwycbX 

So below is some examples of what can be achieved when the data is sitting inside SSAS.

SCOM report, to analyze and predict based on the past when disks will be running out of space.

·         This utilizes the KPIs within SSAS

clip_image002[4]

Using Excel we can get a trend to analyze what has been happening on a server.

·         As I am sure you are aware there are a whole host of other additions that can be added to the Excel file to make it easier for people to use such as:

o   Slicers

o   Timeline – If Excel 2013

clip_image004[4]

Using Excel we can also use the KPIs that have been created to look at our Goals, Status and trends

clip_image006[4]

 

I have also created additional reports where based on the past average of CPU % Time Used, we then compared this to the current days average if it is greater than 150% then display the data. I have then put this into SSAS Tabular and used Power View in SharePoint in order for guys to quickly see the data.

I have tried to ensure that everything should work as expected.

If there are any issues or something that does not work, please contact me in the contact section below.

I do hope that this will be as useful for you as it has been and continues to be for me.

Data warehousing – Bringing down data twice but avoiding deleting on the data warehouse Fact Table

This is better explained with our example below.

 

Example:

·         We had a situation where we were getting data down, but due to the nature of the data we were getting is we were missing data.

·         Also another note was the fact that we did not store the granular data, but aggregated the data, so due to this it would be difficult to try and find out the new data, or missing data

·         So what we wanted to do, was to find a way where we could reload the missing data, but avoid doing a delete on the fact table.

o    NOTE: The reason for avoiding the delete is because the fact table was large, so the delete would take a long time.

§  Along with this it means that the log increases and the loading time increases.

·         So for the solution below it is easier explained with some dates.

o    Our Current Date is:

§  2013-11-06

o    Previous Days Date is:

§  2013-11-05

o    2 Days ago Date is:

§  2013-11-04

 

Solution

Below is what we implemented so that there would be no deletes but a truncation of the data.

 

1.       Initially our starting point to get new data down was always the previous day.

a.        We changed this so that it would always be 2 days back.

b.       NOTE: This is so that we could then get our data for 2 days.

c.        So as per our example our start date would be:

                                                               i.      2013-11-04

2.       Next once the data was down we then went through all the same steps until it got to the point where we wanted to insert it into the Fact Table.

a.        Here we created a new table called Mart_TF_TableName_Yesterday

3.       What we then did within our SSIS package is we then created a conditional split and based on the date we would then transfer the data to the required table.

a.        So if the date was for the previous day it would go into the table Mart_TF_TableName_Yesterday

                                                               i.      As with our example the conditional split would take the date: 2013-11-05 and split that into the table: Mart_TF_TableName_Yesterday

b.       And then if the date was older than the previous day it would go into the Fact Table.

                                                               i.      So if the date was before 2013-11-05 it would then go into the Fact Table

                                                                  clip_image002

4.       Now what this meant is that we always had our previous day’s data in a separate table.

a.        So when the next load happens we could then just truncate the table Mart_TF_TableName_Yesterday

                                                               i.      NOTE: This is because we are loading 2 days’ worth of data, the data from 2 days ago would be going directly into the Fact table.

1.       So as with our example when the data loaded tomorrow, the data for the date 2013-11-05 would now go into the Fact Table.

5.       The final thing that we did was to then add the table Mart_TF_TableName_Yesterday into our SSAS Cube as a partitioned table.

a.        This was so that for your current data and date, even though the data would not be 100% correct it would be 97% correct.

SSIS–Conditional Split based on Date

Below is a quick blog where I am using a conditional split based on your Date Column, so that you can then output your data to different tables or destination data flow tasks.

 

I also found that there are bits and pieces of information as to how to use the Expressions within the Conditional Split. So this might help someone looking on how to achieve this.

 

Example Scenario

 

·         In my example I am going to conditionally split the data by Month, so that when the conditional split has completed it will then direct the output to either the current month or the previous month.

·         The column which contains the DateTime data type is called:

o    DateTimeCreated

 

1.       Go into your Data Flow Task and either create your query or stored Proc for your Data Flow Source

2.       Then from your SSIS Toolbox drag in the Conditional Split

a.        clip_image001[4]

3.       Now double click on the Conditional Split and you will need to configure it with the following for the Current Month

a.        Where it says Output Name, put in CurrentMonth

b.       Then where it says Condition put in the following below, which will be explained afterwards how this works:

DATEPART(“MONTH”,ImportDateTime)==DATEPART(“MONTH”,GETDATE())

                                                               i.      What we are doing above is using the DatePart and getting the Month For our DateTime column which is called DateTimeCreated, highlighted in RED

                                                              ii.      Next we are using the compare or equal to for our SSIS Condition, highlighted in PURPLE

                                                            iii.      The final part is what you are comparing it to, so here we are comparing it to the current date, highlighted in ORANGE

c.        Now to do this for the Previous Month there is only one addition to your Condition, which is shown below

                                                               i.      Put in the Output Name of PreviousMonth

                                                              ii.      DATEPART(“MONTH”,DateTimeCreated) == DATEPART(“MONTH”,GETDATE())– 1

                                                            iii.      Here we have put a -1 at the end of our comparison, because this is outside of the DatePart function which is for our Month it is subtracting one month from our Current Date, highlighted in GREEN

d.       This is what it will look like once completed:

e.       clip_image003[4]

4.       Then the final step is to then drag your Data Flow Destination flow item to your Data Flow Destination

a.        Once dropped onto your Data flow Destination you will then get the Input Output Selection Window.

b.       Click on the Drop down and depending on your Data Flow Destination select your Output

                                                               i.      In the screenshot below it was the CurrentMonth

                                                              ii.      clip_image005[4]

5.       So once completed it will look like the following:

6.       clip_image006[4]