How to query old data while importing new data, then switch to new, without downtime

  • Hi,

    I’m using a database for read purposes only.

    The data comes in as a huge CSV feed, is imported using DTS, and is then queried until the next CSV feed arrives.

    The queries themselves are quite complicated. Thousands of lines of TSQL, joins of several tables, etc. They are embedded in User Defined Functions, since views are not adequate for the required logic.

    The problem: the data import takes several minutes, and it is impossible to take down the application that accesses the db for that long. I need a way to keep querying the old data until the import is completed, and only then switch to the new data.

    Now, CSV feeds come in often and without warning, so all this must be automated.

    It has been suggested to me that I make copies of all tables, add a flag indicating which tables to use - and then write all the code twice... So...

    IF (SELECT flag FROM flag_table) = 1

    SELECT .... FROM table_name1

    ELSE

    SELECT .... FROM table_name2
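
    Spelled out a bit more, the suggestion amounts to something like this (all object and column names below are just placeholders):

    -- Placeholder sketch: two copies of each table plus a one-row flag
    -- table saying which copy is currently "live".
    CREATE TABLE dbo.flag_table (flag int NOT NULL)
    INSERT INTO dbo.flag_table (flag) VALUES (1)
    GO

    -- Every query has to be written twice, once per table copy:
    IF (SELECT flag FROM dbo.flag_table) = 1
        SELECT col1, col2 FROM dbo.table_name1
    ELSE
        SELECT col1, col2 FROM dbo.table_name2
    GO

    -- The import loads into whichever copy is NOT live, then flips the flag:
    UPDATE dbo.flag_table SET flag = CASE flag WHEN 1 THEN 2 ELSE 1 END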

    But this would lead to terrible code, and that only after LOADS of copy-paste work...

    I also thought of using sp_renamedb: import the new data into a temporary db, then take down the actual db only for as long as it takes to rename the temporary database to the name of the normal one, replacing it. But this would still entail a small amount of downtime, and I'd need exclusive access too, which I don't think would be easy since the application is accessing the db constantly.
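
    For reference, a rough sketch of the swap I have in mind (database names are invented), which is where the exclusive-access problem comes in:

    USE master
    GO

    -- Kick active connections off both databases so they can be renamed.
    ALTER DATABASE FeedDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    ALTER DATABASE FeedDB_Staging SET SINGLE_USER WITH ROLLBACK IMMEDIATE

    -- Swap the names: the old live db moves aside, the freshly loaded
    -- staging db takes its place.
    EXEC sp_renamedb 'FeedDB', 'FeedDB_Old'
    EXEC sp_renamedb 'FeedDB_Staging', 'FeedDB'

    -- Let the application back in on the new live database; FeedDB_Old
    -- can then be dropped or reused for the next feed.
    ALTER DATABASE FeedDB SET MULTI_USER
    GO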

    Thanks for any suggestions or hints!

    Angelos

    All I can think of is to call msdb..sp_get_composite_job_info to see if the import job is still running (ACTIVE = "Y").

    You will then need code to see when the job started, e.g. STARTTIME = "2004-07-11 12:25:32".

    You can then use an IF statement to look only at records where ACTIVE = Y and STARTTIME < getdate().
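
    Something along these lines for the query side (table and column names invented):

    -- Invented names: every imported row gets a LoadTime stamp, and
    -- @JobStart would be pulled from the job info for the running import.
    DECLARE @JobStart datetime
    SET @JobStart = '2004-07-11 12:25:32'   -- placeholder for STARTTIME

    -- While the import is active, only read rows loaded before it began.
    SELECT *
    FROM dbo.FeedData
    WHERE LoadTime < @JobStart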

    It's all I can think of, but it does rely on you timestamping the data when it comes in from the CSV file.


    ------------------------------
    The Users are always right - when I'm not wrong!

  • Hi, thanks for the pointer, I'll try to figure out how this works in case it can be helpful.

    I think I should have clarified, though, that the old data is thrown away after the new data import is completed... (There is actually a lot of data duplication, but we just import the new version fully and then throw away the previous version.) The point is not to mix the data (adding the new to the old) but to be able to access the old version until the new version is ready, and then switch 100% to the new version.

    You could always WAITFOR, but I guess that effectively stops anyone from querying the tables during the update.


    ------------------------------
    The Users are always right - when I'm not wrong!

    Add a log table that you put a record in when the import starts and another when it finishes. Then simply query the log table using the NOLOCK hint. When you see that the "finished" record is in the table, you can start the rest of the process.

    SELECT * FROM Log WITH(NOLOCK)
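
    Fleshed out a little (all names below are invented):

    -- Invented names: the import writes a 'Start' row when it begins and
    -- a 'Finish' row when it completes.
    CREATE TABLE dbo.ImportLog (
        LogId     int IDENTITY(1,1) NOT NULL,
        EventType varchar(10) NOT NULL,      -- 'Start' or 'Finish'
        LoggedAt  datetime NOT NULL DEFAULT (getdate())
    )
    GO

    -- Readers poll with NOLOCK; once the latest 'Finish' row is no older
    -- than the latest 'Start' row, the new data is ready.
    IF (SELECT MAX(LoggedAt) FROM dbo.ImportLog WITH (NOLOCK) WHERE EventType = 'Finish')
       >= (SELECT MAX(LoggedAt) FROM dbo.ImportLog WITH (NOLOCK) WHERE EventType = 'Start')
        PRINT 'Import finished - safe to use the new data'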

    Gary Johnson
    Microsoft Natural Language Group
    DBA, Sr. DB Engineer

    This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
