80 columns by 6 million rows

  • All,

    I am very new to SQL Server performance tuning and I was wondering what I could do to get this table performing a bit better.  I have a quad 3 GHz box with 3 GB of RAM.  My table is 80 columns by 6 million rows and has one unique clustered index.  The size of the table will not change and only BI query tools will be used to hit it.  My issue is that a simple summation of one of the columns takes almost 2 minutes on average.  This will not make users happy, considering the data that populates this table comes from a Teradata server where the same query takes 15 seconds on average.  Can someone please give me some tips or point me to some documentation to get this running faster?  Thanks in advance.

  • I had this problem where I had to get transactions from 15 countries on a monthly basis.

    My solution, pre-Analysis Services, was to create specific summary tables that pre-aggregated the data.

    For example, the transactions were by day but the users were specifically interested in transactions by region by month, or by region by week.  I simply bypassed the problem.
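
    Roughly, a summary table along these lines is what I mean (table and column names here are made up for illustration):

        -- Hypothetical names: dbo.Transactions holds the daily rows.
        -- Build a small pre-aggregated table once per load and point the BI tools at it.
        CREATE TABLE dbo.TransactionSummaryMonthly
        (
            Region      varchar(20)   NOT NULL,
            YearNo      int           NOT NULL,
            MonthNo     int           NOT NULL,
            TotalAmount decimal(18,2) NOT NULL,
            TranCount   int           NOT NULL,
            CONSTRAINT PK_TransactionSummaryMonthly PRIMARY KEY (Region, YearNo, MonthNo)
        );

        INSERT INTO dbo.TransactionSummaryMonthly (Region, YearNo, MonthNo, TotalAmount, TranCount)
        SELECT Region, YEAR(TranDate), MONTH(TranDate), SUM(Amount), COUNT(*)
        FROM   dbo.Transactions
        GROUP BY Region, YEAR(TranDate), MONTH(TranDate);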

    I would ask whether all 80 columns are actually used.

    Use SQL Profiler to log the SELECTs run against the database.  If some columns are rarely used, then think about splitting the table so that the less commonly used columns are in one table and the rest are in another.
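
    A rough sketch of that kind of split, with made-up names, would be two tables sharing the same key:

        -- Hypothetical split: the columns the BI queries actually touch stay in a
        -- narrow table; the rarely used columns move to a second table on the same key.
        CREATE TABLE dbo.FactCore
        (
            FactID   int           NOT NULL PRIMARY KEY CLUSTERED,
            Region   varchar(20)   NOT NULL,
            TranDate datetime      NOT NULL,
            Amount   decimal(18,2) NOT NULL
            -- ...the other frequently queried columns...
        );

        CREATE TABLE dbo.FactExtended
        (
            FactID int NOT NULL PRIMARY KEY CLUSTERED
                   REFERENCES dbo.FactCore (FactID)
            -- ...the remaining, rarely queried columns...
        );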

    If users commonly search on a sequential range of values, then try putting your clustered index on the column that holds that value instead of on the default primary key.
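
    For example (hypothetical names, and remember a table can have only one clustered index, so the existing one has to be dropped or rebuilt as nonclustered first):

        -- Cluster on the column users range-scan, e.g. a date, so the rows for a
        -- given range sit on contiguous pages.
        CREATE CLUSTERED INDEX IX_FactTable_TranDate
            ON dbo.FactTable (TranDate);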

    If your data is mainly static, try setting the index fill factors to a very high value, or even 100%.
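
    On SQL Server 2005 and later that looks something like this (SQL Server 2000 has DBCC DBREINDEX for the same job); table name is hypothetical:

        -- With no inserts or updates between loads, a 100% fill factor packs the
        -- maximum number of rows onto each page.
        ALTER INDEX ALL ON dbo.FactTable
            REBUILD WITH (FILLFACTOR = 100);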

  • Sorry, I forgot to add: splitting the table into more than one by columns is vertical partitioning.

    If your users tend to search for last month's transactions, then I would consider horizontal partitioning and have your main table(s) hold just the past two months' data.
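
    A very simple sketch of that, with made-up names, is a "current" table rebuilt as part of the monthly load:

        -- Hypothetical: keep only the most recent two months in a small table that
        -- the common queries hit; the full history stays in the big table.
        SELECT *
        INTO   dbo.FactTable_Current
        FROM   dbo.FactTable
        WHERE  TranDate >= DATEADD(month, -2, GETDATE());

        CREATE CLUSTERED INDEX IX_FactTable_Current_TranDate
            ON dbo.FactTable_Current (TranDate);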

    As my database was effectively read-only apart from the monthly population job, I had a script to drop and recreate the indices, and I indexed virtually everything.  The reindex job used to take 6 hours, but as this was a once-a-month job that was not a problem.
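
    A sketch of that kind of drop-and-recreate job (names are hypothetical; the DROP INDEX ... ON syntax and the parenthesised fill factor are SQL Server 2005+):

        -- Drop the nonclustered indexes, bulk load, then recreate them at 100% fill.
        DROP INDEX IX_FactTable_Region ON dbo.FactTable;

        -- ...bulk load the month's data here...

        CREATE NONCLUSTERED INDEX IX_FactTable_Region
            ON dbo.FactTable (Region, TranDate)
            WITH (FILLFACTOR = 100);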

  • You only have one index, on a large table of static data?

    Create some more!

    If you're not adding new data, then you don't need to worry about the effects they might have on data loading & updating.

    You're only limited by disk space, and I'd assume given the spec of your server that you're not short of that.

    I regularly get sub-second performance on queries against a similarly sized table with 25 million rows - appropriate indexes really DO make that much of a difference.
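
    For the summation in the original post, even a single narrow nonclustered index on the summed column can make a big difference, because the scan reads that small index instead of the full 80-column rows (names hypothetical):

        -- A narrow index on just the summed column lets
        -- SELECT SUM(Amount) FROM dbo.FactTable scan this index rather than
        -- the wide clustered index.
        CREATE NONCLUSTERED INDEX IX_FactTable_Amount
            ON dbo.FactTable (Amount);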

  • The best solution for this kind of situation is to cache the results of the queries on that table. Subsequent requests for the same query can then be satisfied from the cache. Also make sure that the cache expires when the data changes (which I hope will be infrequent).
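
    That describes caching in the application layer; inside SQL Server itself, an indexed view is one way to keep a pre-computed aggregate that stays consistent as the data changes.  Names below are hypothetical, the summed column must be NOT NULL, and indexed views carry several other restrictions (SCHEMABINDING, COUNT_BIG, required SET options):

        -- Hypothetical indexed view: the aggregate is materialized and maintained by
        -- SQL Server, so repeated summations read the small view, not the big table.
        CREATE VIEW dbo.v_FactTotalsByRegion
        WITH SCHEMABINDING
        AS
        SELECT Region,
               SUM(Amount)  AS TotalAmount,
               COUNT_BIG(*) AS RowCnt
        FROM   dbo.FactTable
        GROUP BY Region;
        GO

        CREATE UNIQUE CLUSTERED INDEX IX_v_FactTotalsByRegion
            ON dbo.v_FactTotalsByRegion (Region);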

  • You could also consider creating some new covering indexes for your SELECTs if you are frequently selecting the same columns.

    And as David told you, you could split your table into several smaller ones (vertical partitioning).  The aim is to fit as many records into a page as possible, which speeds up your queries.
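
    A covering index for a typical query might look like this (INCLUDE is SQL Server 2005+ syntax; names hypothetical):

        -- If the common query is, say, SUM(Amount) by Region over a date range,
        -- this index covers it and the 80-column base rows are never read.
        CREATE NONCLUSTERED INDEX IX_FactTable_TranDate_Covering
            ON dbo.FactTable (TranDate)
            INCLUDE (Region, Amount);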

    Bye
    Gabor

  • You can use cubes. A well-designed cube could bring that query down to about a 5-second wait!
