My goal was to find out which combination of data flow Source and Destinations would enable me to transfer data over a network as quickly and as efficiently as possible.

·         Below is what the outcome of the tests are

·         And then finally I have only detailed, based on the fastest data flow items how to configure the data flow source and destinations in order to get the maximum throughput.

Below is tests that I completed, each with what configurations I had configured to complete the test.

Just some details on how the test was performed:

·         It was performed between 2 servers which were on a 10Gbit Connection

·         At the time, there was nothing else running on the servers, they were both idle.

·         Both the Source and Destination tables had compression enabled on the physical tables.

 

Data Flow Source Type

Data Flow Source Configuration Settings

Data Flow Destination Type

Data flow Destination Configuration Settings

 Total Rows

Time In Seconds

 Rows\Sec

Network Speed during Transfer

ADO NET Source

Set Packet Size on ADO Connection Manager to: 32768

ADO NET Destination

Set Packet Size on ADO Connection Manager to: 32768

10 Million

169,828

58 883,11

14MB\sec

Batch Size: 0

ADO NET Source

Set Packet Size on ADO Connection Manager to: 32768

ADO NET Destination

Set Packet Size on ADO Connection Manager to: 32768

10 Million

158,75

62 992,13

14Mb\sec

In TSQL Query used the Hint “Option (10000)”

ADO NET Destination Batch Size: 10 000

OLE DB Source

Configured the OLE DB Source Connection Manager to use: SQL Server Native Client 11.0

OLE DB Destination

Set the Commit Size for the OLE DB Destination to zero “0”

   10 Million

21,046

475 149,67

100Mb\sec

In TSQL Query used the Hint “Option (10000)”

Configured the OLE DB Destination Connection Manager to use: SQL Server Native Client 11.0

OLE DB Source

Configured the OLE DB Source Connection Manager to use: SQL Server Native Client 11.0

OLE DB Destination

Configured the OLE DB Destination Connection Manager to use: SQL Server Native Client 11.0

   10 Million

20,578

485 955,88

110-115Mb\sec

In TSQL Query used the Hint “Option (10000)”

Set the Commit Size for the OLE DB Destination to zero “0”

Set Packet Size on OLE DB Connection Manager to: 32767

Set Packet Size on OLE DB Connection Manager to: 32767

OLE DB Source

Configured the OLE DB Source Connection Manager to use: Microsoft OLE DB Provider for SQL Server

OLE DB Destination

Configured the OLE DB Source Connection Manager to use: Microsoft OLE DB Provider for SQL Server

   10 Million

20,031

499 226,20

120Mb\sec

In TSQL Query used the Hint “Option (10000)”

Set the Commit Size for the OLE DB Destination to zero “0”

Set Packet Size on OLE DB Connection Manager to: 32767

Set Packet Size on OLE DB Connection Manager to: 32767

 

Observations from the above tests

·         The difference between the ADO NET source and ADO NET Destinations when compared to the OLE DB Source and OLE DB Destinations is significant.

o    It was just over 8 x faster.

·         It was interesting to note, that the network card throughput was almost 10 x faster when changing from the ADO NET to OLE DB.

·         So when moving large amounts of data, this would mean that data transfers can go from hours to minutes.

·         Below is a chart to show how significant the difference is:

·         clip_image002[7]

 

Configuration of OLE DB Data Flow Source and Destinations for maximum throughput over a network.

Below are the configuration settings for the final row in the table above.

 

1.       The first part is to configure your OLE DB Source Connection.

a.        Create your new OLE DB Source Connection, and then once it is open ensure that the following is selected:

                                                               i.      Where it says provider, from the drop down select:

1.       clip_image004[6]

b.       Then put in your Server Name and Database so that it will be shown as below:

                                                               i.      clip_image006[7]

c.        Next click on All in the left hand column.

d.       Scroll down near the bottom where it says Packet Size

                                                               i.      Change the value from 4096 to:

1.       32767

2.       NOTE: It must be the above number, if you put it any higher the connection WILL fail

                                                              ii.      This is what it looks like below:

1.       clip_image008[6]

2.       Next for the OLE DB Destination you will follow all the steps in Step 1, but this will just change to your destination Server.

3.       Now what sometimes makes the source query faster is using the TSQL Hint: Option (Fast 10000)  

a.        NOTE: This sometimes makes the query faster, and other times it can also slow the query down. So test first.

b.       EG:

SelectRowID,DateAdded

Fromdbo.tb_TableNamewith (nolock)

Option (Fast 10000)

4.       The final part is to configure the OLE DB Destination in the data flow task.

a.        Double click or right click on the OLE DB Destination and go into the Properties.

b.       Then ensure that you have configured with the following as shown below:

                                                               i.      clip_image010[11]

c.        NOTE: that the Maximum insert commit size is set to zero.

                                                               i.      You must ensure that you have enough memoryin order to only commit the entire transaction once.

                                                              ii.      If you do NOT have enough memory, when SSIS tries to commit the transaction, it will then fail.

                                                            iii.      If this happens to you, then configure the Maximum insert commit size, to accommodate your memory allocation.

 

Update – Potential issue with the Maximum Insert Commit size (25 July 2013)

1.       I recently was doing a large insert of data that was roughly 640 million rows, using the above method to get the data down as quickly as possible.

2.       I ran into an issue when it then had to complete the commit, which lead me to the following findings listed below.

3.       When using the Maximum Insert Commit size of zero and there is not enough memory on your server, SSIS then stores this data in the TempDB. So you are not really getting the benefit of having your data stored in memory.

4.       Due to the data now being transferred from the Temp DB into your working database, this once again meant that the performance was really bad. And once again due to the data not coming from the memory buffers the performance was really slow.

5.       Accompanied with this is that it was now logging ever insert into your working database. So due to the fact that it was now inserting and logging every row, my Log File drive ran out of space when the log hit 340 Gb.

6.       So once it failed the entire transaction rolled back.

Lessons Learnt Are

·         Only use the Maximum Insert Commit size when you are inserting maybe no more than 10 million rows at a time.

·         If it spills over to the Temp DB the performance is really slow.

·         Rather if it is a large data set that you are moving rather change the Maximum Insert Commit size to a value which has the best performance for the speed in which you can insert the rows.

0 thoughts on “SSIS – Fastest data flow task item for transferring data over the network”

Leave a Reply

Your email address will not be published. Required fields are marked *