Building (fast) Stats in a table with 3+ Million Rows?

  • Here's a little background info first

    We have phone survey software that is SQL Server based. Because of the load on that server, we decided to create a replicated server for clients to log into and pull data from, so they aren't weighed down by activity on the main server.

    The replication is now done, but when I try to query the table of survey responses on the replicated server, which has 3 million rows in it, it takes a very long time and therefore crashes our main server. I assume it does this because the replication gets backed up, and then the survey software crashes as well. So much for the replication server being a fix-all solution, eh? hehe

    There are already proper indexes, and a lot of my queries require and perform index scans.

    Anyhow, if I use INSERT triggers, I can split that table of 3 million rows into 400 tables (the number of 'questions' in this survey). That would leave me with 400 tables of approx 7,000 rows each.
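
    As a rough illustration of the trigger idea (the table and column names here are just invented stand-ins for my actual schema), something like this would route each new row into its per-question table:

        CREATE TRIGGER trg_SplitResponses
        ON dbo.Responses
        AFTER INSERT
        AS
        BEGIN
            SET NOCOUNT ON;

            -- Copy rows for question 1 into its own table; the same block
            -- would be repeated (or generated dynamically) for all 400 questions.
            INSERT INTO dbo.Responses_Q1 (ResponseID, RespondentID, Answer)
            SELECT ResponseID, RespondentID, Answer
            FROM inserted
            WHERE QuestionID = 1;
        END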

    So now when people request stats on the survey, the results can be provided much faster. But my question is this...

    Even though my solution would work, is there no way to query large tables faster without splitting them up?

    Would I need a much faster server? Would I need a cluster of several servers in order to provide speedy access?

    It's times like these when I am in awe of how fast Google gives me results on a search, hehe...

  • "There's already proper indexes, and a lot of my queries require and perform index scans."

    The index scans are probably your problem.  Index seeks perform much better than index scans.  You may want to post your table DDL along with a sample query and maybe some sample data.
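
    To illustrate the difference (assuming a hypothetical index on Responses(QuestionID, RespondentID) - your actual schema will differ):

        -- Seek: the predicate matches the leading index column,
        -- so SQL Server jumps straight to the qualifying rows.
        SELECT RespondentID FROM dbo.Responses WHERE QuestionID = 42;

        -- Scan: the predicate skips the leading column, so the
        -- entire index has to be read from start to finish.
        SELECT QuestionID FROM dbo.Responses WHERE RespondentID = 1234;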

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden

    3 million rows is a pretty small table, to be honest. You might want to consider partitioning, but I'm not sure you'll see much benefit.
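
    If you do want to experiment with partitioning (SQL Server 2005 Enterprise onwards), a bare-bones sketch looks like this - the partition column and boundary values are made up for the example:

        -- Split rows into four ranges on a hypothetical QuestionID column.
        CREATE PARTITION FUNCTION pfQuestion (int)
        AS RANGE LEFT FOR VALUES (100, 200, 300);

        CREATE PARTITION SCHEME psQuestion
        AS PARTITION pfQuestion ALL TO ([PRIMARY]);

        -- A table created ON psQuestion (QuestionID) is then stored as
        -- four separate rowsets behind a single logical table.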

    I don't quite follow the replication thread - I'd suggest you run UPDATE STATISTICS against your table (on both servers) and see if things improve.
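
    Something along these lines (the table name is a placeholder):

        -- Refresh statistics for one table with a full scan of its rows:
        UPDATE STATISTICS dbo.Responses WITH FULLSCAN;

        -- Or refresh statistics for every table in the current database:
        EXEC sp_updatestats;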

    Having used replication for a couple of years, I don't see how a query to one will crash the other - I suspect other factors are at play!

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

    It doesn't crash the SQL Server; it crashes the software that populates the SQL Server...

  • "and therefore crashes our main server"

    Have you tried extracting the query plans for your queries to see where the performance bottleneck is? I assume your application probably has a timeout setting, and if the results don't get returned within that time it creates problems?
    Check a Profiler trace to compare RPC and SP times, just in case it's a connectivity issue.
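
    For example, wrapping the slow query like this will show its I/O and CPU cost, or SHOWPLAN will return the plan without executing anything:

        -- I/O and CPU cost of the real execution:
        SET STATISTICS IO ON;
        SET STATISTICS TIME ON;
        -- ... run the slow query here ...
        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;
        GO

        -- Or return the plan without executing the query at all
        -- (the SET must be in its own batch):
        SET SHOWPLAN_TEXT ON;
        GO
        -- ... the slow query here ...
        GO
        SET SHOWPLAN_TEXT OFF;
        GO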
     

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

    You certainly don't need a cluster of machines to handle a 3 million row table. I've got tables 100 times that size, and a few 10 times that size, all on one server.

    The index scans are very likely the cause of the slowness. A scan means that SQL is reading the entire index from beginning to end, as opposed to a seek, where SQL goes directly to the rows it needs. A scan indicates that the index doesn't satisfy the query, that the selectivity is too low, or that there are functions on the fields in the WHERE clause. ORs also cause scans if all the fields in the OR are not indexed.
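
    A typical example of the function problem, assuming a hypothetical index on a SurveyDate column:

        -- Non-sargable: the function hides the column, forcing a scan.
        SELECT * FROM dbo.Responses WHERE YEAR(SurveyDate) = 2008;

        -- Sargable rewrite: a range on the bare column allows a seek.
        SELECT * FROM dbo.Responses
        WHERE SurveyDate >= '20080101' AND SurveyDate < '20090101';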

    As a suggestion, post your table design, index design, and long-running queries, and we can offer suggestions.

    The fact that a read on your subscriber crashes the publisher says that there is something seriously wrong somewhere in your replication. I've never seen that occur in all my time working with SQL. The worst I've seen is the unreplicated transactions increasing, which increases the delay between the two servers. Are you using standard SQL replication? Transactional?
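
    One quick way to see whether transactions are piling up, assuming standard transactional replication (run this on the publisher):

        -- Reports outstanding replicated transactions and latency per database.
        EXEC sp_replcounters;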

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
    I am using standard transactional replication. But I needed a way to automate publications and subscriptions, so I used Profiler to spy on what SQL did during the GUI portion of it, and then stripped that out and made a script that automates new publications...
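
    For what it's worth, the replication stored procedures can also be called directly rather than scraped from a trace; a stripped-down sketch with placeholder names:

        -- Create a transactional publication and add one article to it.
        EXEC sp_addpublication
            @publication = N'SurveyPub',
            @repl_freq = N'continuous',
            @status = N'active';

        EXEC sp_addarticle
            @publication = N'SurveyPub',
            @article = N'Responses',
            @source_object = N'Responses';

    If you're on SQL 2005, Management Studio can also generate these scripts from an existing publication (right-click the publication, Generate Scripts), which avoids the Profiler detour.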

    I do know that indexes are my major issue. I will work on that today, and then see if I can't find a new way to replicate. Check for a new thread later on today

  • Gila,

    "As a suggestion, post your table design, index design and long running queries and we can offer suggestions"

    Dan has a second thread pertaining to this issue with the table/index design:

    http://qa.sqlservercentral.com/forums/shwmessage.aspx?forumid=9&messageid=269436

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
