Mimicking partitions in SQL 2000

Question

Post reply

Mimicking partitions in SQL 2000

GabyYYZ

SSCertifiable

Points: 7913
More actions
June 17, 2009 at 7:49 am

#73220

Hi everyone,

We have an archive table that captures sales and is over 340 million rows on our SQL 2000 box. It has NO clustered indexes, but has two nonclustered indexes, one on a critical upload date column. (No, I didn't design it like this). Long story short, the rebuild of the index requires we have 1.6x the largest index available as an unallocated cushion which is filling up the drive space.

As SQL 2000 doesn't support partitioning like SQL 2005, we are planning to create a view with the same name as this table, and separate data into smaller tables corresponding to months, clustered this time on the upload date column, that are called by the view. The total space used on these smaller tables should be less than the massive index of the original table.

Are there any issues to be aware of when creating this pseudo-partitioning, such as should the view be partitioned, despite the fact the underlying tables are as well, and other issues you may think of?

Thanks.

Gaby

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

I've done something very similar on 2000, and the data in Create View in BOL has all the rules on partitioned views. It's pretty easy to set up, and works quite well, if you follow the rules in there.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 2

GSquared (6/17/2009)

I've done something very similar on 2000, and the data in Create View in BOL has all the rules on partitioned views. It's pretty easy to set up, and works quite well, if you follow the rules in there.

Thanks GSquared. Thought that would be okay, but was curious as to the real life experience of implementing this. Also found this article that's more for data warehousing but has some practical approaches.

Wish me luck then. 😀

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein

GSquared SSC Guru Points: 260824 More actions · Answer 3

I had a three-table solution. One was currently active data, which was only a few dozen rows, but was under constant updates, in terms of up to a hundred or more transactions per second. One was recently active data, which was a few hundred rows, up to a couple of thousand, and had lots of selects but only a few dozen transactions per minute. Last was older data, with lots of selects, but only a few updates per day, and about 10-million rows.

Current table had a clustered index on an Identity PK, and no other indexing. At a few dozen rows, they weren't really needed anyway.

Recent table had clustered, and a few key indexes for the most common selects.

Archive table had a dozen or more covering indexes, pretty much whatever was needed to make the selects work well.

Had jobs that moved data between those tables based on a set of very complex business rules. Ocassionally, something would move from Archive to Recent or Current, or from Recent to Current, but mostly it was moves from Current to Recent or Recent to Archive.

Had quite a few selects that needed to pull data from all three, so did those on partitioned views.

The whole thing worked great. When I started out, I had it all in one table, and locks and such were causing major performance issues. Didn't want to use NoLock or some such, because dirty reads would have caused huge problems in this particular application. Splitting the data into three sections improved performance like you wouldn't believe (average query time went from 10 seconds to under half a second, and updates and inserts were much faster too). Partioned views as the basis for all the CRUD procs made the code very easy to build and maintain too. Since a major part of this was for a call center, really fast selects, inserts and updates were completely mission critical. Can't have the operators sitting there, waiting for their screen to display data, while they're on the phone.

Used similar solutions in a couple of other situations, but that was the most extreme one.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 4

Fortunately, this is an archive table I'll be working on. The eventual goal is to have the current view encapsulate the most recent X months, including current month (we haven't decided how many yet). We'll be designing monthly jobs that 1) create a new table for the month past, 2) add that table to the view, and 3) drop the oldest table from the view.

E.G.



create view ArchiveTable

as

  select * from Archive_current_table   -- for the current month not yet complete

  union all

  select * from Archive_2009_05

  union all

  select * from Archive_2009_04

  union all

  select * from Archive_2009_03

When this month is finished, I'll rename Archive_current_table to Archive_2009_06, move the new table to the view, remove Archive_2009_03 from the view, and create a new Archive_current_table. Assuming all the tables are indexed on the entrydate in ascending, would it be more prudent to reverse the order of the selects with the oldest tables first?

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein

GSquared SSC Guru Points: 260824 More actions · Answer 5

You could simplify the system tremendously by having tables named "CurrentMonth", "Prior1Month", "Prior2Month", ..., "ReallyOld", or something like that.

Dynamically building tables and rebuilding the view is going to be more error prone.

Another option would be tables named "DataJanuary", "DataFebruary", ..., "DataPriorYears". Makes it a little simpler than dynamic tables, and leaves the view with a static definition.

Having what year and month the data is in the table name kind of violates the idea that data should be in rows and columns, not in the meta-data for the table. You can't query, "Where TableName > dateadd(year, -1, getdate())". Better to have that data in a column, so you can query it.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Steve Jones - SSC Editor SSC Guru Points: 728159 More actions · Answer 6

The order of the SELECTs shouldn't matter, AFAIK. The query optimizer is supposed to know what is where in a partitioned view and not query those sets that don't apply.

Not sure how well this works in large implementations or in what you're proposing, but I've seen demos with 3 views and it appeared to be efficient and work as advertised.

Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 7

Steve Jones - Editor (6/17/2009)

The order of the SELECTs shouldn't matter, AFAIK. The query optimizer is supposed to know what is where in a partitioned view and not query those sets that don't apply.

Not sure how well this works in large implementations or in what you're proposing, but I've seen demos with 3 views and it appeared to be efficient and work as advertised.

Thanks, I have a lot to go with now. Only thing now is to find a test server with enough space left to restore the full backup from production. 😀

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein