SSIS & TSQL – Geo-location from IP Address to Country

Below are the details, firstly on getting geo-location information into a table which maps IP address ranges to countries.

And then on taking the IP addresses from your source systems and quickly resolving each one to a country, so that you can see where your users are coming from.

 

Geo-Location Information

There are a number of geo-location providers from which you can download IP address ranges that map to countries.

NOTE: Because IP address ranges change over time, it is a good idea to ensure that your IP Address Range Info table is updated at least once a month.

 

·         You can download it from the following location:

o    http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip

 

You can use the following filter to select only valid IP addresses:

o    where [IP Address] not like '%[^.0-9]%'

 

Using SSIS, tables, indexes and stored procedures to get from IP address to country as quickly as possible

Below are the additional steps to get the data from an IP address mapped to a country.

 

Source Tables

Below is a list of the source tables which you will require and which will be used later. A sketch of possible table definitions follows the list.

 

1.       IP Address Range Info Table

a. This is the table that has the IP address ranges, which map to a country (normally a CountryID).

b. NOTE: You can create this table from the download above.

c. Once you have your table, one of the most important steps is to create the index below:

CREATE CLUSTERED INDEX [IXC_Start_End_IP] ON [dbo].[tb_IPAddressRangeInfo]
(
       [StartIPAddress] ASC,
       [EndIPAddress] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100, DATA_COMPRESSION = PAGE)

 

GO

i. NOTE: In the above index we used a fill factor of 100% and we also compressed the index.

1. Also remember to update (rebuild) the index when you put new data into your IP Address Range Info table.

2.       Country Table

a. This is the table which has the mapping from CountryID to Country Name.
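
For reference, here is a minimal sketch of what these two tables could look like. The column names and data types are assumptions inferred from the queries used later in this post (the ranges are stored as BIGINT IP numbers), not the exact definitions:

CREATE TABLE [dbo].[tb_IPAddressRangeInfo]
(
       [StartIPAddress] BIGINT NOT NULL, -- start of the range, stored as an IP number
       [EndIPAddress]   BIGINT NOT NULL, -- end of the range, stored as an IP number
       [CountryId]      INT    NOT NULL  -- maps to tb_Country.ID
)
GO

CREATE TABLE [dbo].[tb_Country]
(
       [ID]          INT          NOT NULL, -- the CountryID referenced by the range table
       [CountryName] VARCHAR(100) NOT NULL  -- the country name
)
GO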

 

Converting from IP Address to IP Number

Next, what you will need to do is convert each IP address to an IP number.

NOTE: This is required because the IP Address Range Info table has the IP Ranges stored as IP Numbers.
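
As a quick worked example of the conversion (each octet is weighted by a power of 256):

192.168.1.1  =  (192 * 16777216) + (168 * 65536) + (1 * 256) + 1  =  3232235777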

 

1.       To make the process of getting from an IP address to a country quicker, we first selected only the distinct IP addresses.

a. NOTE: Here it was just a simple select. Also note the filter we used to keep only valid IP addresses:

Select Distinct([IP Address])
from staging_tb_ClientSourceInformation with (nolock)
where [IP Address] not like '%[^.0-9]%'
Option (Fast 10000)

b. We then put this into a table called:

i. [dbo].[staging_tb_DistinctIPAddresses] (a minimal sketch of the load follows)
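
Here is a minimal sketch of that load, assuming the staging table has an [IP Address] column alongside the [IPNumber] and [CountryStateSK] columns that are updated later:

Insert into dbo.[staging_tb_DistinctIPAddresses] ([IP Address])
Select Distinct([IP Address])
from staging_tb_ClientSourceInformation with (nolock)
where [IP Address] not like '%[^.0-9]%'
Option (Fast 10000)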

2.       The next step is to update our column called IPNumber.

a. We do this using the statement below, which is in a proc, to calculate the IP number:

UPDATE
-- This will be our Staging Table
       dbo.[staging_tb_DistinctIPAddresses]
SET
       [IPNumber] = (CAST(dbo.fn_Parse([IP Address],'.',1) AS BIGINT) * 16777216) + (dbo.fn_Parse([IP Address],'.',2) * 65536) + (dbo.fn_Parse([IP Address],'.',3) * 256) + dbo.fn_Parse([IP Address],'.',4)
WHERE
-- This works to ONLY get valid IP addresses
       [IP Address] not like '%[^.0-9]%' and
       [IPNumber] IS NULL and
       [CountryStateSK] IS NULL

i. NOTE: We are using the Parse function (reference at the end of this post) to convert each octet when building the IP number.

b. At the same time we also update our destination table with the IP number, so that later, when we join back to get the CountryID, we have a column to match on.

i. NOTE: The query is identical to the one above; only the table being updated changes.

3.       Now this is the part that used to take the longest: looking up each IP number in the IP Address Range Info table, finding the CountryID, and inserting the result into a new table with just the IPNumber and CountryID.

a. The table we insert into is called:

i. [dbo].[Staging_tb_DistinctIPAddressesWithCountrySK]

b. NOTE: The reason for inserting into a table is that it is quicker than doing an update once the dataset gets large.

c. NOTE 1: This is also the section where we use the T-SQL syntax below to find the CountryID from the IP address range as quickly as possible.

d. Here is the query from the proc where we do this:

Select C.ID as CountryStateSK, IPNumber

FROM
       dbo.[staging_tb_DistinctIPAddresses] A

LEFT OUTER JOIN
       [dbo].[tb_Country] C ON
              C.ID = (select CountryId from [dbo].[tb_IPAddressRangeInfo]
                      where StartIPAddress = (select max(StartIPAddress) from [tb_IPAddressRangeInfo] where StartIPAddress <= IPNumber)
                      and EndIPAddress >= IPNumber)

where
       A.[IP Address] not like '%[^.0-9]%' AND
       A.[IPNumber] IS NOT NULL

i. The correlated subquery against [dbo].[tb_IPAddressRangeInfo] above is where we get the CountryID from our IP number range.

e. We take the output of this query and insert it into an additional staging table.

i. As explained above, this uses the T-SQL syntax shown to find the CountryID from the IP address range as quickly as possible.

4.       The last part is where we use our distinct IPNumbers and CountryIDs to join back to our source table on the IP number.

a. We then update the source table's CountryStateSK with the relevant details as shown below:

Update dbo.Staging_tb_ClientSourceInformation
Set CountryStateSK = I.CountryStateSK
from [dbo].[Staging_tb_DistinctIPAddressesWithCountrySK] as I with (nolock)
       Inner join dbo.Staging_tb_ClientSourceInformation as S with (nolock)
              on I.IPNumber = S.IPNumber

-- The reason that this is set to 1580 is because that is the row we inserted for when we have no data
Update dbo.staging_tb_ClientSourceInformation
Set CountryStateSK = 1580
from dbo.staging_tb_ClientSourceInformation with (nolock)
where CountryStateSK is null

i. As you can see above, this is a simple update statement based on the IP number.

 

Reference to the Parse Function

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

Create FUNCTION [dbo].[fn_Parse] (@String VARCHAR(500), @Delimiter VARCHAR(10), @Position INTEGER)

RETURNS VARCHAR(500)

AS

-- DECLARE @String    VARCHAR(500)
-- DECLARE @Delimiter VARCHAR(10)
-- DECLARE @Position  INTEGER
-- SET @Delimiter = '|'
-- SET @Position = 3
-- SET @Counter = 1

BEGIN
       DECLARE @Counter      INTEGER
       DECLARE @Return_Value VARCHAR(500)

       SET @Counter = 1

       -- Strip off one delimited element at a time until we reach the requested position
       WHILE @Counter < @Position
       BEGIN
              SET @String = RIGHT(@String, LEN(@String) - CHARINDEX(@Delimiter, @String))
              SET @Counter = @Counter + 1
       END

       -- Find the next delimiter; everything before it is the element we want
       SET @Counter = CHARINDEX(@Delimiter, @String)

       SET @Return_Value =
              CASE
                     WHEN @Counter > 0 THEN LEFT(@String, @Counter - 1)
                     ELSE @String
              END

       RETURN (@Return_Value)
END
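
As a quick sanity check of the function (a hypothetical test value, not from the original post):

SELECT dbo.fn_Parse('192.168.1.1', '.', 2) -- returns '168', the second octet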

 

 

SQL Server 2012 Integration Services Error Creating Catalog Database

This is the error that I got when trying to create the SSIS Catalog Database in SQL Server 2012.

The certificate, asymmetric key, or private key data is invalid. Changed database context to ‘SSISDB’. (Microsoft SQL Server, Error: 15297)

Error creating SSIS Catalog Database in SQL Server 2012

Just a quick background to how I got to the error when creating the SSIS Catalog database in SQL Server 2012.

I was planning to upgrade my current installation of SQL Server 2008 R2 to SQL Server 2012. In order to do this I got a VM created. I then took across my Master, Model and MSDB databases, and restored the Master database first, then Model, and finally MSDB. There were quite a few more steps, but I got my VM into the same state as my current live server.

I then went ahead and upgraded to SQL Server 2012. Everything looked to be 100% until I tried to create the SSIS Catalog Database in SQL Server 2012.

Below are the steps that I completed in order to finally create my SSIS Catalog Database in SQL Server 2012.

  1. The first thing that I did was to change the account for SQL Server Integration Services 11.0 to the System Account.
  2. I then added the System Account to the SQLServerMSSQLUser$SQLServer$InstanceName group under Groups on the server.

Next, after much trial and error, I ran the following script in my SQL Server 2012 installation.

I opened SQL Server Management Studio

I created the script below

Create Asymmetric key MS_SQLEnableSystemAssemblyLoadingKey
FROM Executable File = 'C:\Program Files\Microsoft SQL Server\110\DTS\Binn\Microsoft.SqlServer.IntegrationServices.Server.dll'

The reason for creating this Asymmetric Key is that in my SQL Server 2008 R2 Master database I could not find this Asymmetric Key.

I then ran the following script twice to force the regeneration of the Service Master Key:

ALTER SERVICE MASTER KEY FORCE REGENERATE;

Now, what I think might have happened in my case is that when I created and installed my original SQL Server I had been running SQL Server under a domain account. Yes, I know this is best practice, and I did change the service account back at a later stage.

So I then ran the following script below

ALTER SERVICE MASTER KEY WITH OLD_ACCOUNT = 'DomainName\User1', OLD_PASSWORD = 'GuessThePassword01010'

The thing to note here is that I had to keep guessing the Old_Password until I got the following error:

Msg 15507, Level 16, State 1, Line 1
A key required by this operation appears to be corrupted.

NOTE: I am not 100% sure if this works or does not work, but in my case when I did run it and completed the following steps I could then create the SSISDB.

After completing the above steps I then restarted my Server.

Once the server has been restarted, go back into SQL Server Management Studio. I then ran the following to create the login which is used when creating the SSISDB. As you can see, this login uses the Asymmetric Key we created earlier.

USE [master]
GO

/****** Object:  Login [##MS_SQLEnableSystemAssemblyLoadingUser##]    Script Date: 04/04/2012 15:39:57:PM ******/
CREATE LOGIN [##MS_SQLEnableSystemAssemblyLoadingUser##] FROM ASYMMETRIC KEY [MS_SQLEnableSystemAssemblyLoadingKey]
GO

Once the above login had been created, I went to Integration Services Catalogs, right clicked and selected Create Catalog. I then put in my password and clicked Ok.

My Integration Services Catalog was created.

NOTE: I then made sure to back up my SSIS Master Key file for the SSISDB.

I hope that this might help someone else out if they get the above error. It took me two weeks of uninstalling, re-installing and upgrading to SQL Server 2012 to get this working.

DAX Function error – An argument of function ‘LEFT’ has the wrong data type or has an invalid value

Today I was working in SSAS Tabular Mode for SQL Server 2012.

My goal was to remove version numbers from products, so that instead of having the following in a report:

  • Microsoft .Net Framework 1.x
  • Microsoft .Net Framework 2.x
  • Microsoft .Net Framework 3.x
  • Microsoft .Net Framework 4.x

I would have one name instead:

  • Microsoft .Net Framework

I was using the DAX LEFT function, and nested inside it the SEARCH function to find the character position of the ".x" suffix.

Here is an example of my DAX function:

=LEFT([ProductName],FORMAT(SEARCH(".x",[ProductName],1,-1),0)-3)

This was then giving me the error:

An argument of function ‘LEFT’ has the wrong data type or has an invalid value

At the time I did not realize that in my DAX syntax for the SEARCH function I had specified that, if ".x" is not found, it should return the value -1. So SSAS Tabular was indeed correct in giving me the above error.
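
To illustrate with a hypothetical product name (not from my model), when the name does not contain ".x" the expression reduces to LEFT with a negative length:

=LEFT("Microsoft Silverlight",FORMAT(SEARCH(".x","Microsoft Silverlight",1,-1),0)-3)

Here SEARCH returns the -1 I specified, so LEFT is asked for -1 - 3 = -4 characters, which is invalid and produces exactly the error above.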

To fix this, it was simply a matter of adding the IFERROR DAX function:

=IFERROR(LEFT([ProductName],FORMAT(SEARCH(".x",[ProductName],1,-1),0)-3),[ProductName])

Now my Power View report shows just the specific products without their version numbers, making the report look a lot better and less cluttered.

Date Filtering in Excel from your Analysis Services Dimension

I recently had to set up another Date Dimension and went through the process of setting up all the properties in the date dimension so that when it is used in Excel, the end users would then be able to use the Date Filters built into Excel.

The Excel date filters are very handy when creating Excel documents. As an example, if you create your date filter to be for the current month, every time you open your Excel document it will automatically filter to the current month.

So below are the details on how to set this up on your Date Dimension. I have used the Adventure Works DW 2008R2 Database and Analysis Services Project.

  1. Open up your Dim Date dimension.
  2. Then click on your Date Key and select Properties.
  3. You MUST now add the following to your Date Key properties:
    1. Where it says Key Column, make sure it has the default, which should be:

i. DimDate.DateKey (Integer)

    2. Where it says Name Column, click on the ellipsis button and select the following:

i. DimDate.FullDateAlternateKey (WChar)

ii. NOTE: Make sure that it is set to WChar.

    3. Where it says Value Column, click on the ellipsis button and select the following:

i. DimDate.FullDateAlternateKey (Date)

ii. NOTE: Make sure that it is set to Date.

  4. Then, if you have a hierarchy defined, you need to create the attribute relationships in order for the date to filter correctly in Excel.
  5. In our example we had a hierarchy with the following:
    1. It went Calendar Year – Calendar Quarter – English Month Name – Full Date Alternate Key.
    2. So in the Attribute Relationships screen we created the relationships shown in the screenshot below:

Figure 2 – Attribute Relationships Example

  6. NOTE: When creating the attribute relationships it always goes from right to left, starting with the Year and ending with the Date.
  7. The final step is that, for each of the attributes used in your hierarchy, you must ensure that they ALL have a Name Column value defined.
    1. In our example, Date had TD_Date.Date (WChar) in the Name Column, as did all the others: English Month Name, Calendar Quarter and Calendar Year.
    2. Next you need to set the Type of each dimension attribute to match what the attribute represents.
      1. EG:

i. If it is Calendar Quarter then you need to set the Type for the attribute to Quarters.

  8. You do this as follows for each of the dimension attributes in your hierarchy above:

i. So it would be for Full Date Alternate Key, English Month Name, Calendar Quarter and Calendar Year.

ii. We are going to start with Full Date Alternate Key as our example, but you would need to do this for all of the attributes above.

iii. Click on Full Date Alternate Key in the Attributes pane.

    1. Then either right click and select Properties, or click on the Properties window.
    2. Under Basic, look for Type.
    3. Click on the drop down next to Type.
      1. Then click on Date, then the plus sign next to Date, and scroll down until you get to Days.
    4. Then click on Days.
    5. Once done it will then look like the following:

  9. You will then need to set the Type for each of the following:
    1. English Month Name – Type: Months
    2. Calendar Quarter – Type: Quarters
    3. Calendar Year – Type: Years

  10. You then need to set up the following so that you don't get duplicates when trying to process the date dimension.
    1. NOTE: You do not have to do this for all the levels, only the specific ones below.

i. English Month Name

    1. Go into the Properties for English Month Name and, under Key Columns, put in the key columns in order (typically Calendar Year followed by English Month Name, so that each month is unique per year).

ii. Calendar Quarter

    1. Go into the Properties for Calendar Quarter and, under Key Columns, put in the key columns in order (typically Calendar Year followed by Calendar Quarter).

  11. The final step is to set your dimension Type to Time.
    1. You do this by clicking on Dim Date at the top of your Attributes, right clicking and selecting Properties.

i. Where it says Type, change this to Time.

  12. Then finally do a Full Process on the Date dimension.
  13. Once that is done it will invalidate all the cubes associated with the Date dimension, so you will also have to do a Full Process on those cubes for the new Date dimension to pull through.
  14. Finally, go through to your Excel spreadsheet and refresh the data if it already had data.
    1. Then put the date hierarchy in your Row Labels and something in your Values; you should then see your date filters as shown below.

OLE DB error: OLE DB or ODBC error: Invalid column name 'xxx'.; 42S22

Yesterday I thought it would be a quick addition to add a new column to my fact table, then add the dimension to Analysis Services, and finally add the dimension to the cube. But I ran into the error below, and after struggling for quite a while I found the issue.

NOTE: This is possibly one of many solutions.

  1. I had added a new column to my Fact Table.
  2. I had then created and processed the new dimension and that was all working fine.
  3. I then added the Dimension to the cube, and when I tried to process the cube I got the following error:

Server: The current operation was cancelled because another operation in the transaction failed.

Errors in the OLAP storage engine: An error occurred while processing the ‘TF Alert SCOM’ partition of the ‘TF Alert SCOM’ measure group for the ‘SCOM Alerts’ cube from the SCOM database.

OLE DB error: OLE DB or ODBC error: Invalid column name ‘AlertClosedID’.; 42S22.

Internal error: The operation terminated unsuccessfully.

  1. It took me ages to figure it out, and I found that in the Partitions tab of the cube I had created a Named Query for the binding type.

 

  1. The reason for the cube failing to process was that the above query did not include my new column, which makes sense given the error message saying it cannot find the column, or invalid column name.
  2. So to fix this you could do one of two things (see the sketch after this list):
    1. Add the new column name to your query.
    2. Or change your Query Binding query to Select * from TableName.

i. This way, if you ever added new columns they would automatically be included.

  1. I then went and processed the cube after adding the new column and it processed successfully.
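
As a rough illustration of the fix (the table and other column names here are made up, apart from AlertClosedID from the error message), the partition's named query would change like this:

-- Option 1: add the new column to the explicit column list
SELECT ExistingColumn1, ExistingColumn2, AlertClosedID
FROM dbo.FactAlertSCOM

-- Option 2: select everything, so future columns are picked up automatically
SELECT * FROM dbo.FactAlertSCOM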

SSIS 2012 – Inserting data into a SQL Server Table from an MDX Query

The reason for this blog post is that I recently had a query asking whether it was possible to get data returned from an MDX query and then insert the data into a SQL Server table.

And the reason for putting it into SSIS was so that we could run the job on a schedule.

I have inserted pictures from my own documentation to save me some time publishing this post.

The reason that I also like using the ADO NET Source is that you can use Expressions for both your ADO NET Source and Destination.

For my requirement I needed to ensure that my MDX Query only got data for the previous week. So in order to do this I had to create Variables which would be populated in the previous steps in my SSIS Package.

NOTE: The Previous MDX query did not include getting the dates for the previous week.

I could then use the [ADO NET Source].[SqlCommand] expression.

And in this Expression I could then pass my variables to my MDX Query, to ensure that I could always get the previous dates data.
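
Below is a minimal sketch of the kind of expression I mean (the variable names and MDX query are illustrative assumptions, not the exact query from my package). It would be set on the Data Flow Task's [ADO NET Source].[SqlCommand] property, with StartDate and EndDate populated as string variables in earlier steps:

"SELECT NON EMPTY {[Measures].[Internet Sales Amount]} ON COLUMNS, " +
"NON EMPTY {[Date].[Date].[Date].MEMBERS} ON ROWS " +
"FROM [Adventure Works] " +
"WHERE ([Date].[Date].&[" + @[User::StartDate] + "] : [Date].[Date].&[" + @[User::EndDate] + "])"

In SSIS expression syntax the MDX is just a set of concatenated string literals, with the variables joined in using +.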

I then deployed my SSIS Project to the SSIS Server and scheduled the job as per the requirement.

And now it is running perfectly.

Date Filtering in Excel from your Analysis Services Tabular Model

Following on from how to format the date in the Analysis Services UDM model, today I found myself trying to do the same thing for the Tabular model, where Excel was viewing the dates as labels and not as actual dates. After some looking around, it is fairly simple to implement so that you can use the date functions within Excel.

  1. Go into your Tabular project.
  2. Go into the Date sheet that you have imported.
  3. Then click on any column within your Date sheet, so that it is selected.

i. NOTE: I had already set the properties of my Date column to the following:

  4. Now, at the top, click on Table, then Date, and you will see Mark as Date Table.

  5. This will then open the Mark as Date Table window.
    1. Click on the drop down and select your correctly formatted Date column.

i. In my example below it was called Date.

  6. Then click Ok.
  7. Now deploy and then process your Tabular model.
  8. Now when you go into Excel you will see the following:

 

SharePoint – Power View URL Actions

This is related to using the PowerView reports in SQL Server 2012 using Analysis Services in the tabular mode.

I thought it would be helpful to post some of the URL Actions that you can use with your Power View reports, which will add more functionality for your reports and your end users.

This will then give you more options when sending out the Power View reports to end users, depending on whether you want them to create reports or just view reports other people have created.

As I find more URL Actions I will then update them in this blog post.

The first two URL Actions below I actually found on Dan English's blog, but I have put them here as a reference for myself as well.

http://denglishbi.wordpress.com/2012/06/13/url-actions-with-reporting-services-power-view-rtm/

Before I explain the actual URL Actions it might be useful to explain where you would add the URL actions.

For any of the URL Actions below you simply need to add the URL Action to the end of your URL string. So this is what the normal URL would look like:

http://powerpivot/_layouts/ReportServer/AdHocReportDesigner.aspx?RelativeReportUrl=/PowerView/My%20Report%20Table%20Name/SQL%20Name%20Table%20Name%20Report.rdlx&ViewMode=Presentation

And after you add a URL Action, it will have the action appended at the end, as shown in the example below:

http://powerpivot/_layouts/ReportServer/AdHocReportDesigner.aspx?RelativeReportUrl=/PowerView/My%20Report%20Table%20Name/SQL%20Name%20Table%20Name%20Report.rdlx&ViewMode=Presentation&ReportSection=ReportSection

Below are the URL Actions

URL Actions

Each of the URL Actions will be explained below with a screenshot of what the expected outcome will be.

&ViewMode=Presentation

This is the default when you click on a report from SharePoint. It still shows the File, Edit Report and Full Screen Mode options, as well as enabling the user to refresh the data.

&ViewMode=Edit

If in SharePoint you click on Edit in Power View, you will go into Edit mode, which enables the user to edit the report; it shows the ribbon for editing and creating your report.

&PreviewBar=True

This is the same as the &ViewMode=Presentation.

&PreviewBar=False

This is where you will not show the Preview Bar at all. So the user will not be able to edit, go Full Screen or refresh the data once the report has loaded.

NOTE: In order to not show the PreviewBar at all, you have to ensure that your &ViewMode is set to Presentation as well. So you should add the following URL Action at the end of your URL: &ViewMode=Presentation&PreviewBar=False

If the &ViewMode is set to Edit, it will still show the PreviewBar

&ReportSection=ReportSection

This is the default view that you will see when you open the report.

NOTE: If you do not have multiple views then the ReportSection will not apply to your report

&ReportSection=ReportSection3

This is for when you create multiple views in your report and you want the report to open on a particular view. In the above example it will open on view number 4, because the first ReportSection is zero, then one, two, etc.
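
Putting it together: a hypothetical URL action string that opens the second view (ReportSection1, since the numbering is zero-based) in presentation mode with no preview bar would end with:

&ViewMode=Presentation&PreviewBar=False&ReportSection=ReportSection1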

I hope that it helps you guys out.

SSIS – Process Cube with XMLA and Analysis Services DDL

Generating XMLA Script and data from Cube

What I was looking to do was to use XMLA and the Analysis Services Execute DDL Task in SSIS to dynamically process just one required partition that had already been created in SQL Server Analysis Services (SSAS). I found that doing this in SSIS was a quick and easy way to get my data processed.

UPDATE (06 Aug 2013): I have found a way to complete the process of using the XMLA Script without having to put any files on the server or in a physical location. So this has been updated to reflect this below.

With doing this in SQL Server 2012, I could then use the new functionality in SSIS as well as the additional reporting.

Below is the example based on the Adventure Works DW2008R2 Analysis Services Cube.

1.       To generate the XMLA script, log into your SSAS instance and then browse down to the cube that you want to process.

a. In our example we have partitioned our cube, so we want to process only the current month's partition.

b. We drilled down to the following level so that we could get to the partition:

[screenshot]

2.       Now right click on the partition and select Process.

i. In the Process Partition window, make sure that you change the Process Options to Process Data.

ii. Then at the top click on Script.

[screenshot]

iii. NOTE: You can make any other changes you want in the Process Options window.

iv. This will then script the details into an XMLA file.

3.       Now you have got your XMLA script. This is what the details look like:

[screenshot]

Generating your Partition Name using TSQL and putting it into a variable in SSIS

1.       As you can see from step 2a above, we have got the PartitionID; this is the part that will change each month and that we will need to process.

2.       So next you need to create your partition name, and to do this we use TSQL to dynamically change the monthly partition as we move through the months.

3.       Here is the sample script that we used:

a. NOTE: This can also be put into a PROC.

Declare @MonthNumber as Int

Set @MonthNumber = (SELECT DATEPART(MONTH,GETDATE()-1))

DECLARE @YearNumber AS INT
Set @YearNumber = 2005

-- NOTE: You can use this below to get the current year:
-- SET @YearNumber = (SELECT DATEPART(YEAR,GETDATE()-1))

-- ==================================================================================================
-- 2. The second part then adds a leading zero to your month number, because by default
--    the leading zero is left out
-- ==================================================================================================

Declare @LeadingZeroForMonth as varchar(2)
Set @LeadingZeroForMonth = (select RIGHT(REPLICATE('0',5)+CONVERT(VARCHAR(2),@MonthNumber),2))

-- ==================================================================================================
-- 3. In the final part we put in the partition name prefix and then append the year
--    (this example uses a yearly partition; a monthly partition would also append @LeadingZeroForMonth)
-- ==================================================================================================
Declare @SSASPartitionName as varchar(40)
Set @SSASPartitionName = convert(varchar(28),'Fact Internet Sales – ') + CONVERT(VARCHAR(4),@YearNumber)

select @SSASPartitionName as SSASPartitionName

a. As you can see, with the above script we are just getting the same name as will be required in our XMLA script.

EG: Fact Internet Sales – 2005

4.       Now go into your SSIS project and into your package.

a. Drag in an Execute SQL Task, and set it up so that you can populate your variable with the output of the proc or TSQL in step 3 above.

b. NOTE: In our example we had the variable name of SSASPartitionName.

5.       So now you have got your partition name in a variable in SSIS.

Using the XMLA Script and variable to output it to a variable in our SSIS Package

1.       We do this by creating a variable query in our SSIS package, which is explained below.

a. NOTE: The variable query is used to dynamically create a query which can then be used in later steps within the SSIS package.

2.       We are going to create a new variable which will contain our dynamic query by doing the following:

1.       Click on Add Variable and configure it with the following:

1.       Name: XMLAQuery_ProcessData_CurrentPartition

2.       Data Type: String

1.       NOTE: It has to be string because we will be passing the entire XMLA Script which we generated above.

2.       Then click on the Ellipses under Expression, which is on the far right hand side of the Variable details.

1.       This will then open the Expression Builder.

2.       Now paste the following into the Expression Builder:

"Select '<Batch xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\">
       <Parallel MaxParallel=\"64\">
              <Process xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:ddl2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2\" xmlns:ddl2_2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2/2\" xmlns:ddl100_100=\"http://schemas.microsoft.com/analysisservices/2008/engine/100/100\" xmlns:ddl200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200\" xmlns:ddl200_200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200/200\" xmlns:ddl300=\"http://schemas.microsoft.com/analysisservices/2011/engine/300\" xmlns:ddl300_300=\"http://schemas.microsoft.com/analysisservices/2011/engine/300/300\">
                     <Object>
                            <DatabaseID>AdventureWorksDW2012</DatabaseID>
                            <CubeID>Adventure Works</CubeID>
                            <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
                            <PartitionID>" + (DT_WSTR, 100) @[User::SSASPartitionName] + "</PartitionID>
                     </Object>
                     <Type>ProcessData</Type>
                     <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
              </Process>
       </Parallel>
</Batch>' as XMLAScript_CurrentPartition_ProcessData"

3.       From the above you can see where we put in our variable for the SSAS partition name: the (DT_WSTR, 100) @[User::SSASPartitionName] part.

4.       You can click on the Evaluate Expression to ensure that the expression is correct.

[screenshot]

3.       Next we are going to create a variable which will contain the entire XMLA Script generated in the previous step, by doing the following below.

1.       Click on Add Variable and configure it with the following:

1.       Name: XMLAScript_ProcessData_CurrentPartition

2.       Data Type: String

4.       Now drag in an Execute SQL Task and put it below where you generated the partition name.

1.       We gave our Execute SQL Task the following name:

1.       Get XMLA Query for Process Data into Variable

5.       Next we are going to configure our Execute SQL Task to get its data from our Query we created in steps above, by doing the following:

1.       Go into the Properties of the Execute SQL Task and on the General page configure it with the following:

2.       Under Result Set changed this from None, to Single Row.

3.       Then configure the following under SQL Statement.

1.       Connection Type MUST be set to OLE DB

2.       Next to Connection, select your OLE DB Connection.

1.       In our example this was: SQL-AdventureWorks2008R2-ADO

3.       Next to SQLSourceType change this from Direct Input to Variable

4.       Next to SourceVariable, click on the Drop down and select the following Variable:

User::XMLAQuery_ProcessData_CurrentPartition

1.       NOTE: This is our Variable query which we created earlier which contains the entire XMLA Script to Process Data our current partition.

4.       The above should look like the following:

[screenshot]

5.       Now in the left hand side click on Result Set and configure it with the following:

1.       Click on the Add Button

2.       Now where it says Result Name change this to the following:

XMLAScript_ProcessData_CurrentPartition

1.       NOTE: The Result Set Name MUST match the column name that is outputted from your SQL Syntax otherwise the Result Set will fail.

3.       Now where it says Variable Name, click on the drop down and select the following:

User::XMLAScript_ProcessData_CurrentPartition

6.       The above should look like the following:

[screenshot]

7.       Click Ok to complete configuring the Execute SQL Task.

Configuring your Analysis Services Execute DDL Task to use your Variable and Process Data on your Current partition.

1.       Now we just need to configure the Analysis Services Execute DDL Task by doing the following.

2.       Drag in the Analysis Services Execute DDL Task, double click to go into the Properties

3.       Click on DDL

1.       In the right hand side where it says Connection you are going to have to create a connection to your SSAS, and Database

2.       Click on New Connection

3.       As with our example we created our connection

4.       Now where it says SourceType, click on the Drop down and select Variable

5.       Then next to Source click on the Drop down and select the Variable that we populated in the previous steps.

6.       As with our example we selected the following:

User::XMLAScript_ProcessData_CurrentPartition

7.       It should look like the following:

[screenshot]

8.       Then click Ok.

 

Now finally run the package, and it should run and process the partition.

NOTE: If you want to check whether it worked, go down to the partition level and select Properties, then look at the Status and Last Processed date time.

[screenshot]

Finally here is my entire SSIS Package from start to end so you can see how it all pieces together.

[screenshot]

 

SSIS – Fastest or quickest way to move data on a local system – Interesting testing

Over the weekend I was doing some work in order to catch up on a lot of data that I had loaded incorrectly. Whilst I was moving this data around, I happened to open up the Resource Monitor on my 2012 server. What I saw confused me quite a lot, and I will explain it after the screenshot below.

Resource Monitor

Now the interesting part about the above screenshot is that, when it was taken, the data flow was moving data on the same SQL Server instance and the same SQL Server database, just between different tables. The entire SSIS package was also running on the same physical server. And the attached storage is very quick, reaching up to about 600 MB/sec.

My confusion was how the network card could be used when data is moving locally on the server. There is no way that this should be happening, and the impact is that my data flow task would not be running as quickly and efficiently as possible. So I began testing to find the cause and a workaround.

So I began my investigation. In the past I had always used the ADO.NET source and destinations in my data flow tasks. I had read the white papers which suggested using the OLE DB source and destinations in your data flow tasks, but in earlier testing I had found the performance difference to be negligible. I am going to assume that was because, as in the screenshot above, the data was moving between two servers, so it had to go via the network anyway.

I then completed a series of tests, not only to find the quickest and most efficient way to transfer the data, but also to find which data flow source would NOT go via the network card.

Here is the outcome of my tests below (times in seconds, throughput in rows/sec):

Data Flow Source | Data Flow Destination | Commit Size (if applicable) | Destination Config Settings | Time Taken (sec) | Total Rows | Throughput (rows/sec) | Went via Network Card
OLE DB Source | OLE DB Destination | Commit Size = 0 | Only Table Lock settings | 24.22 | 3102471 | 128105.99 | No
OLE DB Source | OLE DB Destination | Commit Size = 0 | Table Lock settings, check constraints | 20.22 | 3102471 | 153450.93 | No
OLE DB Source | SQL Server Destination | Default | Defaults | 28.47 | 3102471 | 108980.99 | No
ADO Net Source | ADO Net Destination | Default | Defaults | 98.72 | 3102471 | 31427.29 | Yes
ADO Net Source | ADO Net Destination | 10000 rows | Batch Size | 98.75 | 3102471 | 31417.43 | Yes
ADO Net Source | ADO Net Destination | 10000 rows | Shared Memory | 99.49 | 3102471 | 31185.31 | No
ADO Net Source | ADO Net Destination | 10000 rows | Shared Memory | 99.51 | 3102471 | 31177.48 | No

As you can see from the above tests, by far the quickest way of transferring the data is the OLE DB source, and if possible with a commit size of zero.

And once again, I should have put into action what I had read in the Microsoft white paper. For the life of me I cannot find the link to the white paper, but when I do I will update this post with it.

What I have found out from completing my testing is the following below:

  1. I found out that, because I was configuring my ADO.NET source and destinations to use the TCP/IP network library as shown below, SSIS would transfer the data via TCP/IP. And because it was configured this way, the data went via the network card.
    1. TCP/IP as shown below:
      1. TCP/IP Network Library [screenshot]
    2. NOTE: Even though the databases were all local to SSIS, it would still transfer via the network card, thus limiting how quickly it could move the data.
  2. Even when changing the ADO.NET source to use Shared Memory instead of TCP/IP, it would still take a lot longer to move the data than the OLE DB connection.
  3. It was also apparent in my testing that OLE DB was faster than the SQL Server Destination.

So in conclusion it is definitely quicker to use OLE DB for your source and destinations in your data flow tasks when you are moving data between tables on the same database.

The reason for this is that the OLE DB connection manager is smart enough to detect that the databases and/or tables are local, and then to use Shared Memory to move the data, making the transfer quick. If the data is coming from a different server across the network, it will use TCP/IP to get the data.
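
As an aside, one easy way to test this yourself is to force a particular network library from the connection string by prefixing the Data Source. These prefixes are standard SQL Server client behaviour; the server and database names below are placeholders:

Data Source=lpc:MyServer;Initial Catalog=MyDatabase;Integrated Security=SSPI;   (forces Shared Memory, local connections only)
Data Source=tcp:MyServer;Initial Catalog=MyDatabase;Integrated Security=SSPI;   (forces TCP/IP, even when the server is local)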

And the larger the data that you want to move, the more of an impact this will have.

UPDATE: 13 May 2013

Just a quick update to the blog post above. During the past week I was moving a lot of data again, and what I found out and wanted to add here is the following:

  1. I had 2 servers one that was under memory pressure and a new server that was NOT under any memory pressure.
  2. Both of the servers were moving about 10 million rows, and both had an OLE DB Destination with the commit size set to zero.
  3. On the server with the memory pressure it kept on failing when trying to commit the 10 million rows.
  4. While on the server with NO memory pressure would commit the data fine.

What I concluded was that either SSIS or SQL Server did not have enough memory to take the entire dataset and commit it to SQL Server, due to the memory pressure on the one server. Even though during the SSIS data flow task it looks like it is inserting the rows, when it fails there are no rows in the destination table.

I hope that this helps.