40 million records big problem

  • Hi we have a database which contains 40 million records in a particular table.

    there are 3 tables actually i want to pick up records from 5 fields in Table A,3 fields from Table B and A single fields from Table C.

    Table C has 40 million records.When i query it takes 15-16 hours to give me the results.

    Can you please help me ouy in the database design to minimise the query time

  • Hi sachin007,

    quote:


    there are 3 tables actually i want to pick up records from 5 fields in Table A,3 fields from Table B and A single fields from Table C.

    Table C has 40 million records.When i query it takes 15-16 hours to give me the results.

    Can you please help me ouy in the database design to minimise the query time


    well, I guess, one of the first steps is to verify proper indexing. Have you yused the Index Tuning wizard from QA?

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • If you know what records you need from table C,I mean using the where clause. Try inserting those particular rows in a temp table and then try and join it with other tables.

    The 3rd table should have an index on the column invloved in the where clause.

    It would be grt8 if you could let us know the schema and the indexes

    Relationships are like Banks.You Invest the Principal and then get the Interest.


    He who knows others is learned but the wise one is one who knows himself.

  • Is the query plain vanilla inner join, etc. no group by, order by or aggregates etc.

    I'll go with the upto date indexes first(a5xo3z1).

    On average, what will be the number of rows in your result set?

  • 1. Make sure the table has an appropriate clustered index...this will physically order the records and greatly increase query performance. This site has plenty of good pointers on indexing...search for 'clustered index'.

    2. If you have 40 million records, chances are that it is very granular data. If your query needs to return totals, consider an interim or consolidation table. For example, if your data is one record for every item sold, but you need to return yearly totals by product, by store..consider rolling up the data to an interim table with one record per item, per month...Then, the table against which you run the final query would be orders of magnitude smaller (~30 * [N-items] smaller).

    3. If you have data across spans of time, consider partitioning the table by month (or year, or quarter). Then your query will only have to use the subset of the partition that it needs...Your final query would then look like

    Select C1, C2, C3 From PartitionA Where (C3 = 'X')and (Date < [some_date])

    UNION ALL

    Select C1, C2, C3 From PartitionB Where (C3 = 'X') and (Date < [some_date])

    There's always a way...

    Phil Daniels

    ciao ciao


    ciao ciao

  • Try to put the conditions to Table C and check whether you have proper indexes and relationships. If still your query takes more time, try to reindex the table C

    If all above fails try horizontal partitioning.

    G.R. Preethiviraj Kulasingham

    Chief Technology Officer.

    Softlogic Information Systems Limited,

    14 De Fonseka Place,

    Colombo 05.

    Sri Lanka.

    Cheers,
    Prithiviraj Kulasingham

    http://preethiviraj.blogspot.com/

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply