One Big Table To Two Smaller Ones

  • I need Help!

    I have one very large table (12 million rows) with the following columns:

    (DID, datetime, origination, servicetype, serviceloc, servicecon, servicename, host, value)

    I would like to separate the table into two tables with the following columns:

    Table1

    (DID,origination, servicetype, serviceloc, servicecon, servicename, host)

    Table2

    (DID,datetime, value1)

    Currently the data is in one table and even with indexes the performance is very poor. I believe that by splitting it this way I can drastically improve the performance of the reports hitting the tables, since the number of rows in Table1 would be less than 1,000 and the number of rows in Table2 would be around 12 million.
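    Roughly, this is what I have in mind for the two tables (the column types below are just placeholders, not my real types):

        CREATE TABLE Table1 (
            DID INT NOT NULL PRIMARY KEY,   -- one row per monitored system
            origination VARCHAR(50),        -- placeholder types; substitute the real ones
            servicetype VARCHAR(50),
            serviceloc VARCHAR(50),
            servicecon VARCHAR(50),
            servicename VARCHAR(50),
            host VARCHAR(50)
        )

        CREATE TABLE Table2 (
            DID INT NOT NULL,               -- points back to Table1.DID
            [datetime] DATETIME,            -- bracketed because it matches the data type name
            value1 DECIMAL(18,2)            -- placeholder type
        )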

    I appreciate any help I can get with this.

    Thanks,

    Jamey

    - JH

  • You are almost there already. Now that you have created the tables, you can

    INSERT INTO Table1

    SELECT DISTINCT DID,origination, servicetype, serviceloc, servicecon, servicename, host

    FROM <OriginalTable> (NOLOCK)

    INSERT INTO Table2

    SELECT DID, datetime, value1

    FROM <OriginalTable> (NOLOCK)

    I would recommend leaving the indexes off Table1 and Table2 until after they have data...it will speed up the inserts.
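    For example, once the loads finish, the indexes could be added along these lines (the index names and key columns here are only a guess; pick them to match your queries):

        ALTER TABLE Table1 ADD CONSTRAINT PK_Table1 PRIMARY KEY CLUSTERED (DID)

        -- Table2 is the big one, so cluster it on the columns your reports join and filter on
        CREATE CLUSTERED INDEX IX_Table2_DID_datetime ON Table2 (DID, [datetime])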

    Good luck

    Guarddata-

  • According to your analysis, the fields DID, origination, servicetype, serviceloc, servicecon, servicename, and host are currently duplicated, and you want to normalize the table. DID will be the primary key for the new Table1 (referenced in Table2).

    Create a new table called Table1 with the fields DID, origination, servicetype, serviceloc, servicecon, servicename, host and then run the query:

    Insert into Table1 Select distinct DID,origination, servicetype, serviceloc, servicecon, servicename, host from <CurrentTable>

    Now rename <CurrentTable> to Table2 and remove the unnecessary fields.

    Create the relationship properly.
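    For example (a sketch only; the constraint name is up to you, and it assumes Table1 has its primary key on DID):

        ALTER TABLE Table2 ADD CONSTRAINT FK_Table2_Table1 FOREIGN KEY (DID) REFERENCES Table1 (DID)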

    I suggest you create a view called <CurrentName> which selects from both tables.

    You can use INSTEAD OF INSERT, INSTEAD OF UPDATE, and INSTEAD OF DELETE triggers on the view to handle any code that still refers to the old table.
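    A rough sketch of the view and an INSTEAD OF INSERT trigger (vwAll stands in for your old table name, and the column lists assume the schema above):

        CREATE VIEW vwAll AS
        SELECT t1.DID, t2.[datetime], t1.origination, t1.servicetype, t1.serviceloc,
               t1.servicecon, t1.servicename, t1.host, t2.value1
        FROM Table1 t1
        INNER JOIN Table2 t2 ON t2.DID = t1.DID
        GO

        CREATE TRIGGER trg_vwAll_Insert ON vwAll
        INSTEAD OF INSERT
        AS
        BEGIN
            -- add any description rows that Table1 does not have yet
            INSERT INTO Table1 (DID, origination, servicetype, serviceloc, servicecon, servicename, host)
            SELECT DISTINCT i.DID, i.origination, i.servicetype, i.serviceloc, i.servicecon, i.servicename, i.host
            FROM inserted i
            WHERE NOT EXISTS (SELECT 1 FROM Table1 t WHERE t.DID = i.DID)

            -- then the measurement rows
            INSERT INTO Table2 (DID, [datetime], value1)
            SELECT i.DID, i.[datetime], i.value1
            FROM inserted i
        END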

    This is only a suggestion.

    G.R. Preethiviraj Kulasingham

    Chief Technology Officer.

    Softlogic Information Systems Limited,

    14 De Fonseka Place,

    Colombo 05.

    Sri Lanka.

    Cheers,
    Prithiviraj Kulasingham

    http://preethiviraj.blogspot.com/

  • Thanks for the responses!

    I forgot to mention that the DID column is an identity field. Therefore, if I do a SELECT DISTINCT on it, it will return all 12 million rows.

    I built the following code and it seems to work; however, when running it against the 12 million rows it takes roughly 24 hours (which is a problem).

    Here's what I have so far:

    --Updates the description table with any new data

    INSERT INTO tblDescription (origination, servicetype, serviceloc, host, servicecon)

    SELECT DISTINCT 'MailStats', service, param, server, 'Residential'

    FROM tblAll a

    WHERE NOT EXISTS (SELECT a.service, a.param, a.server FROM tblAll a, tblDescription d WHERE a.service = d.servicetype and a.param = d.serviceloc and a.server = d.host)

    --Scans through the description table, inserts values into the values table, and links the description table to the values table by the DID column.

    INSERT INTO tblResults (DID,[datetime], datamonth, value1)

    SELECT d.DID, a.date, DATEPART(Month,a.date), a.value

    FROM tblAll a, tblDescription d

    WHERE EXISTS (SELECT a.service, a.param, a.server FROM tblAll a, tbldescription d WHERE a.service = d.servicetype and a.param = d.serviceloc and a.server = d.host)

    Sorry for the elementary coding here. I am VERY new to this!

    - JH

  • 12 million rows is a lot of data to cut your teeth on.

    Remove the WHERE NOT EXISTS clause from your statements. Since nothing is in the destination table to begin with, this condition should always be true.

    HOWEVER, the real problem is the duplication of the key fields. If all 6 fields define the key as you seem to indicate, try this

    CREATE TABLE tblDescription (
        DescID INT IDENTITY(1,1),
        origination VARCHAR(50),     -- add the proper data types; VARCHAR(50) here is only a placeholder
        servicetype VARCHAR(50),
        serviceloc VARCHAR(50),
        servicecon VARCHAR(50),
        servicename VARCHAR(50),
        host VARCHAR(50)
    )

    CREATE TABLE tblResults (
        DescID INT,
        ResultTime DATETIME,         -- not a good idea to use a reserved word like "datetime"...try something like ResultTime
        Value DECIMAL(18,2)          -- again, use the proper data type
    )

    INSERT INTO tblDescription (origination, servicetype, serviceloc, servicecon, servicename, host)

    SELECT DISTINCT origination, servicetype, serviceloc, servicecon, servicename, host

    from tblAll

    INSERT INTO tblResults ( DescID, ResultTime, Value )

    SELECT D.DescID, A.[datetime], A.value   -- substitute whatever the date and value columns are actually called in your old table

    FROM tblAll A

    INNER JOIN tblDescription D ON

    D.origination = A.origination

    AND D.servicetype = A.servicetype

    AND D.serviceloc = A.serviceloc

    AND D.servicecon = A.servicecon

    AND D.servicename = A.servicename

    AND D.host = A.host

    Good luck. I would recommend picking up some good books or tutorials dealing with database structure and design.

    Guarddata-

  • 24 hours to run against a 12-million-row table sounds a bit excessive if you ask me. Have you defined any indexes on the tables referenced here? Look at your JOIN columns and WHERE criteria columns and make some intelligent decisions about whether to build indexes on them. And if the answer to the above is yes, have you checked that they are not fragmented?
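    For instance, on the existing table something like this would show whether the indexes are fragmented (tblAll is the name Jamey used above):

        -- check fragmentation (look at scan density and logical fragmentation)
        DBCC SHOWCONTIG ('tblAll')

        -- if it is badly fragmented, rebuild the indexes
        DBCC DBREINDEX ('tblAll')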

    Nikki Pratt

    Development DBA

  • Jamey,

    Guarddata has given the right answer.

    I would extend his version of the Table2 insert with the following:

    INSERT INTO Table2

    SELECT DID, datetime, value1

    FROM <OriginalTable> (NOLOCK)

    WHERE datetime is not null

    As you have written, you would have around 1000 records in Table2.

    As I don't know your data, I assumed that the datetime column is the selective one (either it has a value or it does not). Of course, you should modify this query to fit your environment (check the value1 column for NOT NULL, or whatever else fits your requirements).

    By creating the indexes after the data load, the whole insert process should not take too long for 12M records.

    Bye

    Gabor


  • There is another possibility, which is what I do if I have a large table to remodel (over 100M rows):

    - I do a bcp out into a file from a view corresponding to the Table1 definition, followed by a second bcp out corresponding to the Table2 definition.

    - Create Table1 and Table2 without indexes

    - bcp in into Table1 and Table2

    - create the required indexes

    Assuming you have set the recovery model to BULK_LOGGED, there will be a minimum of logging, so this is the fastest solution overall.
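    A rough sketch of the commands (the database, server, and view names here are made up; adjust them to your environment):

        -- keep logging minimal for the bulk loads
        ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED

    Then from the command line:

        bcp MyDB.dbo.vwTable1 out table1.dat -n -T -S MyServer
        bcp MyDB.dbo.vwTable2 out table2.dat -n -T -S MyServer
        bcp MyDB.dbo.Table1 in table1.dat -n -T -S MyServer
        bcp MyDB.dbo.Table2 in table2.dat -n -T -S MyServer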

    Bye

    Gabor




  • Thanks everyone for all of your help!

    Just to give some more detail on my setup: Table1 should only have 1,000 or so rows. However, during each insert I must check that the data from the large table is already in Table1; if it is, I just grab the DID and insert it along with the datetime and value into Table2. If it is not in Table1, I insert it into Table1 first, then grab the new DID and insert it along with the datetime and value into Table2.
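    Roughly, each 30-minute load would need to do something like this (using the tblDescription/tblResults layout suggested above; tblStaging is just a made-up name for wherever the new polling data lands):

        -- add any new systems that are not described yet
        INSERT INTO tblDescription (origination, servicetype, serviceloc, servicecon, servicename, host)
        SELECT DISTINCT s.origination, s.servicetype, s.serviceloc, s.servicecon, s.servicename, s.host
        FROM tblStaging s
        WHERE NOT EXISTS (SELECT 1 FROM tblDescription d
                          WHERE d.servicetype = s.servicetype
                            AND d.serviceloc  = s.serviceloc
                            AND d.host        = s.host)

        -- then store the polled values, linked by the surrogate key
        INSERT INTO tblResults (DescID, ResultTime, Value)
        SELECT d.DescID, s.[datetime], s.value1
        FROM tblStaging s
        INNER JOIN tblDescription d
            ON d.servicetype = s.servicetype
           AND d.serviceloc  = s.serviceloc
           AND d.host        = s.host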

    The data is generated like so: we have approx. 1,000 systems being monitored (the data in Table1), and each system is polled approx. 500 times a day (the data in Table2). Currently there is only one table and all of the data is dumped into it every 30 minutes. This is causing very slow performance on the system, since I don't have time to reindex the table between updates. I have already ordered another server to set up a data warehouse on for reporting, which will give some relief to the primary server.

    Hopefully this will give you all an idea as to what I need and what the tables are used for. Again, I want to thank everyone for their help with this matter.

    Thanks,

    Jamey

    - JH
