Best Method for Copying Data from DB1 to DB2 Daily

  • We have a database (DB1) that gets a giant push of data every morning.  However, due to issues with the team that pushes this data, the load isn't always successful, which causes a lot of problems for our reporting.  As such, our team has been instructed to build an "isolation" layer, which is basically a second database (DB2).

    Once the load into DB1 is complete, we want to copy all of its data to DB2.  This will allow us to run reports against DB2 while DB1 is being loaded, avoiding locking and the other issues that arise.  It also allows us to continue running reports if their process fails, so we are not left with empty tables (we understand the data might be a day old, but for us, day-old data is better than no data).  DB1 and DB2 reside on the same SQL Server instance.

    However, we are not all that knowledgeable in the best methods.  Our first thought was replication, but we didn't think this would be a good fit, because DB1 is not a transactional database that changes throughout the day.  Once the data is pushed, it remains the same until the next day's push.  Also, all the tables in DB1 are TRUNCATED before the push, so we are not sure replication could be used in this case.

    At any rate, we are researching the best method (or at least a very good one) to copy GBs of data from DB1 to DB2 on a daily basis, and we are hoping for some help and guidance.  We never had an issue for a long time, but the team that pushes the data changed their routine in a way that causes a lot of problems for us, and for political reasons out of our control, it is our team's job to solve this.  We are mostly just a reporting team that creates reports from this data, not DBAs, so we are a little out of our element.  Our direction has been to "create an isolation layer" for the reports: DB1 simply gets the data, and DB2 is where the reports run.  How to get the data from DB1 to DB2 effectively and efficiently is our question.

    We also need to run a number of checks to make sure the data has actually been pushed, because it is fairly common for their process to break, leaving empty tables or only part of the data.  We have already created a set of validation checks that alert the other team when their process fails (things such as empty tables, tables whose row counts fall outside a +/- 10% tolerance, and a few others).
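
    For illustration, a minimal sketch of the empty-table and +/- 10% checks (dbo.RowCountHistory is a hypothetical history table we would populate after each load; it is not part of SQL Server or of our actual setup):

    -- Flag tables that are empty, or whose current row count is outside
    -- +/- 10% of yesterday's count stored in a (hypothetical) history table.
    WITH CurrentCounts AS
    (
        SELECT t.name AS TableName, SUM(p.rows) AS TodayRows
        FROM sys.tables t
        JOIN sys.partitions p
          ON p.object_id = t.object_id
         AND p.index_id IN (0, 1)      -- heap or clustered index only
        GROUP BY t.name
    )
    SELECT c.TableName, c.TodayRows, h.RowCnt AS YesterdayRows
    FROM CurrentCounts c
    LEFT JOIN dbo.RowCountHistory h    -- hypothetical history table
      ON h.TableName = c.TableName
     AND h.LoadDate  = CAST(DATEADD(DAY, -1, GETDATE()) AS date)
    WHERE c.TodayRows = 0
       OR c.TodayRows NOT BETWEEN h.RowCnt * 0.9 AND h.RowCnt * 1.1;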

  • Back up the DB1 database and restore the entire database as DB2?  Use backup compression, then copy the file to the restore location.  If you have objects unique to DB2, the restore will replace them; you'd need to script them out and recreate them after the restore.

    You may have to restore user permissions afterwards, but that can all be rolled into one process.

    At my previous job, we scheduled that daily, just after midnight, to update a reporting server.
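
    A minimal sketch of that nightly job (the paths and logical file names below are placeholders; adjust the MOVE clauses to DB2's actual file locations):

    -- Back up DB1 with compression, then restore it over DB2.
    BACKUP DATABASE DB1
        TO DISK = N'D:\Backups\DB1.bak'
        WITH COMPRESSION, INIT;

    RESTORE DATABASE DB2
        FROM DISK = N'D:\Backups\DB1.bak'
        WITH REPLACE,
             MOVE 'DB1_Data' TO N'D:\Data\DB2.mdf',
             MOVE 'DB1_Log'  TO N'D:\Logs\DB2.ldf';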

  • "...However, we are not all that knowledgeable in the best methods.  Our first thought was replication, but we didn't think this would be a good fit, because DB1 is not a transactional database that changes throughout the day.  ..."

    You can use snapshot replication, which replaces the entire table. But if new tables are added to DB1, you will need to maintain your replication process.

    Backup & Restore will copy everything, so no worries of forgetting to include a table.

  • One possibility is the use of SYNONYMs to help you in this.

    Assume that the synonyms 'point' to tables in DB1 on day 1.

    On day 2, DB2 is updated. If the load succeeds, switch the synonyms to point at DB2; if it fails, leave them as-is, fix the import and re-run.

    Update DB1 and DB2 on alternate days. On day 3, DB1 is updated and, if successful, the synonyms are switched back.

    All of this should be transparent to the reporting layer.
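
    A minimal sketch of the synonym flip (the database, schema and table names are illustrative, not from the post; SQL Server has no ALTER SYNONYM, so the switch is a drop-and-recreate, which can be wrapped in a transaction):

    -- Run in the database where the reporting synonym lives.
    -- dbo.SalesFact and the DB2 target are illustrative names.
    BEGIN TRAN;
    IF OBJECT_ID('dbo.SalesFact', 'SN') IS NOT NULL
        DROP SYNONYM dbo.SalesFact;
    CREATE SYNONYM dbo.SalesFact FOR DB2.dbo.SalesFact;
    COMMIT;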

  • Phil Parkin wrote:

    One possibility is the use of SYNONYMs to help you in this.

    Assume that the synonyms 'point' to tables in DB1 on day 1.

    On day 2, DB2 is updated. If the load succeeds, switch the synonyms to point at DB2; if it fails, leave them as-is, fix the import and re-run.

    Update DB1 and DB2 on alternate days. On day 3, DB1 is updated and, if successful, the synonyms are switched back.

    All of this should be transparent to the reporting layer.

    I recommend this also.  You do not need a second database with this method.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • My apologies, but I "mostly" understand.

    To a team that only knows the name "DB1" to push the data to, we would do the following:

    1).  Create two databases: DB2 and DB3.

    2).  Create an alias called "DB1".  This way, all the reports and data pushed from the other team point to "DB1".

    3).  If the push is successful, we point alias "DB1" to DB2.  Then the next day, if successful, we point alias "DB1" to DB3.  Since their code to push the data and our code to select the data both point to the alias, we can change the pointer in the back end depending on whether their push completed successfully.

  • Now that the Briefcase in SSC actually works, I'm able to find old bookmarked threads.  This was a good one:

    https://qa.sqlservercentral.com/forums/topic/replace-table-with-staging_table-approach

    No one shall be able to drive us from the paradise that Cantor created for us.

  • AMCiti wrote:

    My apologies, but I "mostly" understand.

    To a team that only knows the name "DB1" to push the data to, we would do the following:

    1).  Create two databases: DB2 and DB3.

    2).  Create an alias called "DB1".  This way, all the reports and data pushed from the other team point to "DB1".

    3).  If the push is successful, we point alias "DB1" to DB2.  Then the next day, if successful, we point alias "DB1" to DB3.  Since their code to push the data and our code to select the data both point to the alias, we can change the pointer in the back end depending on whether their push completed successfully.

    Aliases are at the client level.  If there are only a few clients, you can probably flip aliases on each client with a script.

    Synonyms are at the object level.

    You would create a synonym named "Table", initially pointing to "TableA".  Create a second, empty table, "TableB".  The data load fills TableB, and at completion you alter the synonym to point to TableB.  The next run fills TableA, and at completion you alter the synonym back to TableA.
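
    A minimal sketch of one load cycle under that scheme (the synonym and table names follow the post; the columns and the source of the INSERT are placeholders, since the post does not specify the actual load):

    -- Day N: dbo.TableA is live behind the synonym; load into dbo.TableB.
    TRUNCATE TABLE dbo.TableB;

    INSERT INTO dbo.TableB (Id, Amount)           -- hypothetical columns,
    SELECT Id, Amount FROM DB1.dbo.SourceTable;   -- hypothetical source

    -- Only after the load is verified, repoint the synonym
    -- (drop and recreate, as there is no ALTER SYNONYM).
    DROP SYNONYM dbo.[Table];
    CREATE SYNONYM dbo.[Table] FOR dbo.TableB;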

    I have seen, although it's been a while, a similar process that renames the databases.  DB1 is "live", the fill runs against DB2.  When completed, it renames DB1 to DB_Old (or whatever), and then renames DB2 to DB1.  The issue with this approach is that the connections to the databases are dropped.
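
    A sketch of that rename shuffle (the WITH ROLLBACK IMMEDIATE is what forcibly drops the connections mentioned above):

    -- Take exclusive access so the rename can proceed.
    ALTER DATABASE DB1 SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE DB1 MODIFY NAME = DB_Old;

    -- The freshly loaded database takes over the live name.
    ALTER DATABASE DB2 MODIFY NAME = DB1;

    -- Reopen the old copy for general use.
    ALTER DATABASE DB_Old SET MULTI_USER;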

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • If you do not need full-text indexing or filestream data in the reporting db, you could consider using a db snapshot as "DB2".  The initial creation of the snapshot involves almost no overhead at all.  The only extra overhead comes while DB1 is being modified, and you have some overhead there anyway, of course.

    As soon as the changes to DB1 are fully verified, you would drop the old "DB2" snapshot and create a new "DB2" snapshot.  If an update went terribly bad, you could even revert the db back to the snapshot and try the updates again.

    -- Drop the previous snapshot, if one exists.
    IF EXISTS (SELECT 1 FROM sys.databases WHERE name = 'DB2')
        DROP DATABASE DB2;
    GO

    -- Recreate the snapshot: one sparse file per data file in DB1.
    CREATE DATABASE DB2 ON
        ( NAME = logical_file_name1, FILENAME = 'os_file_name1' ) /*,
        ( NAME = logical_file_name2, FILENAME = 'os_file_name2' ), ...*/
    AS SNAPSHOT OF DB1;
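
    And the revert mentioned above, should an update go terribly bad (note that DB2 must be the only snapshot of DB1 for the restore to succeed):

    -- Revert DB1 to the state captured in the DB2 snapshot.
    RESTORE DATABASE DB1 FROM DATABASE_SNAPSHOT = 'DB2';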


    SQL DBA, SQL Server MVP (07, 08, 09).  "Money can't buy you happiness."  Maybe so, but it can make your unhappiness a LOT more comfortable!
