SQLServerCentral Article

Review: Loading Sample Data with DataTect 1.6

,

So, you’ve created a database and application and want to

see how it operates with a substantial load of data? When dealing with

development projects, a large problem I find is the lack of test data in a

database to appropriately tune an application or database for 6 months down the

road. Time after time, development groups create a large application without

testing the application with large amounts of data. When the application goes to

production, it may be ill equipped to handle the real-world stress of data.

Datatect 1.6 from Banner Software provides an easy solution

to loading large amounts of sample test data into your database. It can load

data into nearly any OLE DB data source like Oracle, Sybase and of course, SQL

Server. It can also load data into an ASCII flat file for later usage by an

extract program. Loading this type of data the old fashion way could take weeks.

Traditionally, you would have either a human type the data or create a script.

During this review, I tested Datatect 1.6 in loading a SQL

Server 2000 database with 1 million records in the parent tables and 20 million

in the children tables. All the data loads were performed against the Northwind

database. I have evaluated other data loading programs in the past and this

product was the best by far. While the other products like Quest Software’s

DataFactory had a cleaner interface, it was much harder to use and was not as

stable with SQL Server. DataFactory did work fine in loading data into Oracle

and DB2 but struggled with SQL Server 2000 (it did work fine with SQL Server

7.0). The Banner Software product, Datatect however handled the high-stress job

of loading 128 million total records into SQL Server 2000 with ease. This amount

of data equated to 50 GB in the Northwind database. 

Other products I tried locked up at after a few thousand records.

Datatect made loading data into children tables easy. As

you define how you would like to load each table, it would automatically read

the table’s properties and detect how it would recommend loading the data. For

example, if the column had a foreign key constraint on it, Datatect would

automatically detect this and load data into the column randomly from the parent

table. Of course, as you select more columns like this, the slower the product

will load data because it has to run a select statement in the parent table. As

you can see below, you can also set what the chance a column will be left NULL.

As you can see in the above screenshot, I’m loading the

Products table in the Northwind database. When you load tables that have

identity columns on them (like the Products table), make sure you uncheck the

identity column or Datatect will try to force data into the column and you will

receive an error. While Datatect does read the SQL Server properties for some

types of items like data types and nullability, it does not detect if you have

the identity property on the column.

As you select a table, you can set a column to any number

of items. Datatect ships with a number of pre-defined lists of businesses,

names, and cities (among other lists). You can set a column to a set value or a

random string of numbers or characters. You have the option to also add your own

customized lists of values. For example, you could export data from a different

source into a flat file and then import it into this system. Datatect also

allows for advanced scripting where data can be massaged as it’s entering the

system. Keep in mind that the more advanced you make your data load, the slower

it will load.

Another item to watch for as you define your tables is when

you add a new table to your project. When you click New in the Table Specs tab,

you will be given a list of tables and views that are available to add to your

project. Be careful not to add a SQL Server view on this screen or you will

receive an error when trying to load it. This option should be left in though

just in case the user wants to load a distributed partitioned view.

To load a table, right click on the table and select

Generate to Table. You will then be presented with the below screen, where

Datatect will ask you how many records you would like to load into the table and

how often you would like to commit the records. If you’re loading a large

database and you have ample RAM, it may make sense to increase the commit size

to a larger amount. In my data loads where I was loading over a million records,

I would set this setting to 10,000. You also have the option in the screen to

purge the old records before loading or just append your data to the end of the

table.

Before you click OK, make sure your database is at the

target size after the data load (or at least your best guess). If your database

is set to automatically grow, you will experience a problem with Datatect if the

database must grow during the data load. For example, I ran into the problem

where my 20 GB database was set to grow by 10%. When the autogrow feature was

triggered, my database spent 3 minutes growing by 10% and timed out the Datatect

application. This is because the database creates a series of locks that

prevents the data from being loaded while the database is growing. This is not

Datatect’s issue, but rather SQL Server locking.

If you’re looking for a fast way to load your test data,

Datatect is your answer. Because it automatically detected your foreign keys,

its interface was much easier to use than its rivals. The only interface issue I

had was that there was no easy way to load all the tables in your project in one

execution. Instead, you had to load each table individually. This prevents you

from being able to start a data load before you leave for the weekend and have

it done when you get back. There are ways around this with some of the available

scripting features. Other data loading products did have this feature built into

the GUI, but were buggy the more child tables you added.

If you expect a few million record data load to execute in

an hour, don’t hold your breath. Complicated data loads with millions of

records may take a long time to execute. For example, my 60 million record data

load took the better part of the weekend to execute. Yes, I’m afraid I have

nothing better to do on a weekend than to look at a data load.

Datatect 1.6 makes loading data easy. After trying the old

fashion method of using scripts to load my sample data for years, I was able to

save days of script writing by using this product. Datatect is sure to offer the

most advanced users ample features and time saving techniques to load your data.

By using this product, you can ensure that your application’s performance will

excel before your application ever enters production.

Datatect

Pricing  $995.00 per copy.

Discount Schedules are available for 5 packs, 10 packs etc.

Banner Software inc.

email  banner@ns.net
Phone 800-336-9969 U.S. and Canada

Outside U.S. and Canada  916-481-1157

Web Site  www.datatect.com

 

Rate

4 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

4 (1)

You rated this post out of 5. Change rating