Where Do Staging Tables Belong

  • Alvin Ramard (12/29/2015)


    Orlando Colamatteo (12/29/2015)


    I have had people scoff at running staging databases in FULL, taking backups and including them in Availability Groups and have had people scoff at running them in SIMPLE, not taking backups and having no automated HA for them. I have done both and variations in between. The bottom line is we must meet the customer's Recovery Point and Recovery Time Objectives and those concerns will usually trump developer-preference. There are going to be tradeoffs no matter which configuration is selected.

    I agree.

    ... but, sometimes you might have to ask the customer why they want those recovery time and points for a staging database. Make it clear to them that you're only asking so you can deliver the best solution for them.

    Agree. Ultimately the customer owns the requirement but having the conversation with all options on the table and full disclosure of all options is critical, otherwise what value are you providing?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Orlando Colamatteo (12/29/2015)


    I have had people scoff at running staging databases in FULL, taking backups and including them in Availability Groups and have had people scoff at running them in SIMPLE, not taking backups and having no automated HA for them. I have done both and variations in between. The bottom line is we must meet the customer's Recovery Point and Recovery Time Objectives and those concerns will usually trump developer-preference. There are going to be tradeoffs no matter which configuration is selected.

    Absolutely agreed. Ultimately, "It Depends". And being someone that does Point-in-time backups even on Dev databases, there's no way I scoff at running staging databases in the FULL Recovery Model even if I don't generally do such a thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • If the data can't be easily re-sourced, meaning it's accumulating change data history, and the goal is tho retain the data permanently, which I'm guessing would be the reason for full recovery and backups, then the database is not really "staging" but more like an "operational data store". If it truely is an ODS, then you need to stop calling it Staging and start treating it with all the proper design and care that an ODS deserves.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • In our case we can recreate the data going into the staging database because we have the original files used to populate the tables at the beginning of the process. For us, the simple recovery model would be fine for this staging database.

  • It truly does depend on the scenario. I do log backups on my development server, too. I also have staging databases in production that are in simple recovery model because they aren't permanent data stores and having PIT recovery simply isn't required.

  • Definitely depends. It is typically not about how easy it is to recover it is about how fast you can recover and how highly available the system needs to be. "Staging" is subjective and ambiguous but when used about databases it generally implies all contained tables are wiped and reloaded as part of some data processing routine. Staging tables can exist in any class of database however: OLTP, DW, "staging", ODS, etc.

    With AlwaysOn Groups and SSIS the OLEDB connections would not support multisubnet failover so affected Packages would need to be independently resilient or restarted in case of failure. All .Net connections from SSIS could seamlessly failover making the RTO effectively 0 and achieve any RPO due to zero data loss (assuming synchronous mode). It is tough to implement that kind of failover manually in code and some tlog activity and some backup overhead are usually viewed as a small price to pay in trade if you are after HA or PIT recovery capabilities.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • As the name implies, staging is just that. The data is not meant to be a permanent fixture, otherwise call it a operation data store. If the goal is auditing suspicious data in the warehouse, it would be ideal to have this ODS to refer to. Reloading it from the OLTP sys can add to more confusion if erroneous data is already corrected there by the time you fetch it again. Of course having this ODS means... backups and full recovery and all the goodies that come along with maintaining a permanent fixture.

    I am definitely in the SSIS cheering section just note it :-). One thing I do stay away from are linked server connections.

    ----------------------------------------------------

  • Eric M Russell (12/29/2015)


    If the data can't be easily re-sourced, meaning it's accumulating change data history, and the goal is tho retain the data permanently, which I'm guessing would be the reason for full recovery and backups, then the database is not really "staging" but more like an "operational data store". If it truely is an ODS, then you need to stop calling it Staging and start treating it with all the proper design and care that an ODS deserves.

    +1

    I am regularly confronted with clients that think of their "staging" database as some sort of archive/backup/historical reference.

    Don Simpson



    I'm not sure about Heisenberg.

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply