Optimal Disaster/Recovery Strategy for Data Warehouse (BI) application

  • Hi, this is more of a D/R question than a high-availability question, but I thought it still belongs here.

    We have a business-intelligence application consisting of several relational data warehouses and Analysis-Services databases.

    A series of scheduled jobs execute SSIS packages on a daily basis to load the data into the data warehouses.

    Then other jobs process the SSAS cubes based on data in the data warehouses.

    Everything is running on SQL Server 2008 R2.

    Now, the question is whether I have designed the most optimal D/R strategy for this application.

    On a separate D/R server I have currently set up scheduled jobs that restore the database backups (both relational and SSAS) daily from the primary (production) server.

    Should I instead be mirroring the relational data warehouses from production to the D/R server?

    I would then need to switch the RECOVERY model of the relational databases to FULL (currently set to SIMPLE), and I'm not sure how that would affect the performance of daily operations, especially of the SSIS packages that load data to the warehouse.

    Any comments/suggestions, as well as anecdotal experience with this sort of thing, is most welcome.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Well if your hardware on NetApp life becomes very easy .

    You can use NetApp's SMSQL /sanpmirror/snapvault to replicate it to DR site.

    In your case the databases are in simple and believe it should be because lot of ETL will be taking palce through out the day and don't have courage to change it to full.(even i wouldn't have)

    Use freeware SQLSAFE to backup all the databases once a week and restore it in DR site.SQLSAFE compress the db upto 95%. Write separate scripts to sync loging/jobs etc.

    DW doesn't have its own data, it pulls from various sources and puts in DW . Hope the source DR is already setup for you. Now on the D-Day when your DataCeter is inaccessible you switch to DR site. Now you can run the various etl jobs which will pull the data from source(already getting replicated in DR) and will put into DW.

  • it depends on what is an acceptable down-time for your business.

    If they require near instance recovery then I would go for Mirroring / Clustering.

  • If you mirror the data warehouse database (which you noted would require using the full recovery model), be sure to run a transaction log backup when the ETL processes complete. If you don't, the t-log will continue to grow until you run out of disk space for the log.

    This happened several times at a previous employer even though I had changed the recovery model to simple after the first time it happened (some one having changed it back).

  • You could also try out Log Shipping as a DR solution. But, since you stated the databases are in Simple recovery model, you would need to change that to Full. I suspect that you might have issues in doing this and how it will affect your ETL processing. Regardless of what you do, if you put the databases in Full recovery, you need to watch the T-Log very carefully.

  • Thank you all for your thoughts on this. At the moment I am sticking with daily backups/restores and databases in SIMPLE recovery.

    I will need to test any changes to this plan and will definitely take your comments into consideration.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Hi Marios,

    Can you tell me how you set up your SSIS packages to work on both your principal and DR servers without creating 2 versions? Did you specify a Failover Partner in the data source connections? How did you name your 2 servers and 2 SQL instances?

    Thanks,

    Martin.

  • Martin Thatcher (12/14/2011)


    Hi Marios,

    Can you tell me how you set up your SSIS packages to work on both your principal and DR servers without creating 2 versions? Did you specify a Failover Partner in the data source connections? How did you name your 2 servers and 2 SQL instances?

    Thanks,

    Martin.

    I am using the code listed here - http://blog.hoegaerden.be/2010/01/10/list-all-ssis-packages-deployed-on-your-integration-server/ - to compare SSIS-package versions between prod and DR.

    Any package versions that differ in the two envs, I redeploy on the DR using the prod versions.

    So this is low-tech, nothing fancy apart from the code in the above link.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • The use robocopy to mirror the folders(incremental) from PROD to DR system.

    These folder sometime use to host ssis pacakges,config files and other application files.

    The job runs once and the DR is updated with latest.

    In my senario the PROD and DR couldn't talk to each other they were setuped in different domains.

    So i robocopy or xcopy was failing. Later the server team gave a temprorary file share which was accessible both from PROD and DR.

    One job in PROD would copy to \\temserver\tempfolder and my other job which was scheduled 2 hrs after the first job in DR server used to pull the files to DR server

    PROD--> \\temserver\tempfolder --> DR.

    Every thing wsa automated using batch files with copylogfile retention of 60 days.

  • Marios Philippopoulos (12/15/2011)


    Martin Thatcher (12/14/2011)


    Hi Marios,

    Can you tell me how you set up your SSIS packages to work on both your principal and DR servers without creating 2 versions? Did you specify a Failover Partner in the data source connections? How did you name your 2 servers and 2 SQL instances?

    Thanks,

    Martin.

    I am using the code listed here - http://blog.hoegaerden.be/2010/01/10/list-all-ssis-packages-deployed-on-your-integration-server/ - to compare SSIS-package versions between prod and DR.

    Any package versions that differ in the two envs, I redeploy on the DR using the prod versions.

    So this is low-tech, nothing fancy apart from the code in the above link.

    The point of Martin's question was how your SSIS packages on the DR server run considering that the server(s) they run against are, presumably, no longer functioning.

    For example, if your PROD server has SSIS packages with data sources pointing to PROD, what happens on the DR server when you try to run these packages? i.e. they can't run against PROD as the disaster has caused it to stop running, so which server do they run against? Presumably, they are running against the DR server, but how have you told the SSIS packages to do so? Do you use a Failover Partner as Martin suggests?

  • We use DNS aliases with our prod servers. In a DR situation we will simply switch the aliases from prod to DR.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Marios Philippopoulos (12/20/2011)


    We use DNS aliases with our prod servers. In a DR situation we will simply switch the aliases from prod to DR.

    That's the approach we've been looking in to - thanks for the follow up 🙂

  • Michael Lysons (12/21/2011)


    Marios Philippopoulos (12/20/2011)


    We use DNS aliases with our prod servers. In a DR situation we will simply switch the aliases from prod to DR.

    That's the approach we've been looking in to - thanks for the follow up 🙂

    Np, in addition to that, aliases are more user-friendly. It's easier for someone to reference a server by, say app name, than by some incomprehensible name that the server admins coined according to a convention they only understand. Also, if the databases need to be migrated to another server - as an upgrade - none of the app connection strings need to be changed, which makes for a more trouble-free migration process.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply