Not able to recover database

  • Dear experts,

    Thanks for looking into my question.

    I have a primary database replicated to a secondary (AlwaysOn - readonly ).

    I had to remove the database from the availability group in order to change the recovery mode and shrink the log file of the primary database.

    After removing the database from the availability group, I am not able to bring the secondary database online.

    I issued the following command and got the following error.

    restore database xxxxx with recovery

    Msg 4303, Level 16, State 1, Line 1

    The roll forward start point is now at log sequence number (LSN) 52448000025545400001. Additional roll forward past LSN 52449000004510300001 is required to complete the restore sequence.

    Msg 3013, Level 16, State 1, Line 1

    RESTORE DATABASE is terminating abnormally.

    Please help.

  • Did you failover to the 2ndary before removing the primary?

    You know, you didn't have to remove the primary. Why didn't you just backup the log on the primary and then shrink?

    Now your best bet is probably to bring the primary backup and rebuild your entire AG. In fact, it may be your only choice.

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • Thanks For your reply.

    I had the log backups running every 10 minutes on the primary. But that did not help and the Log File size grew to the full capacity of the disk.

    So, as an emergency solution, I changed the recovery mode and shrunk the log file to keep the primary running.

    I was going to restore the new full backup from primary (after putting it on full recovery) and enable AlwaysOn for this database.

    I wanted to make sure that I handled this situation right. Thats why I raised this question in the forum.

    Thanks Again..!!

  • If your log backups weren't doing any good then you probably had a large xact that was spanning across all of them. Or maybe one was left open.

    Otherwise, there wouldn't be any reason for it to not truncate your log.

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • Did I handle it right? Or what would have been the right solution?

  • Removing the primary would have been the last resort.

    I would have done the following, probably in order:

    1. Check what's holding up the log truncation with

    select name, log_reuse_wait_desc from sys.databases

    If it wasn't log backup then I would've looked into other avenues like looking at that xact or something.

    2. If nothing could have been done about the situation, I would have expanded the drive, and barring the ability to do that, I would temporarily expand the log to another drive to give it space to grow. I would remove the log from the new drive once things went back to normal and look at maybe putting it on a bigger drive as a planned operation.

    Assuming none of that was feasible and you were going to be in dutch, there may not be much you can do. Evicting the primary from the AG doesn't do anything for you that rolling back the xact wouldn't do. And since you're here you can see why evicting it isn't a good choice.

    So next time I would investigate the matter more fully before taking a drastic measure that leaves me with a bigger problem.

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • Thanks so much for your time.

    I also looked at your Blog. Very informative. Thanks for your service to the SQL Server community.

  • Siva Ramasamy (1/6/2015)


    But that did not help and the Log File size grew to the full capacity of the disk.

    Hopefully, you've identified the code that caused that problem and are going to fix it because... it's going to happen again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I did some further research on this issue. There was a SQL Job failed during this time frame. The failure reason was

    The transaction log for database XXXX is full due to 'AVAILABILITY_REPLICA'. [SQLSTATE 42000] (Error 9002). The step failed.

    Does this give any clue?

    Thanks again..!!

  • Without looking anything up and just going with my gut I would guess that maybe your AG wasn't pushing data and it was filling up the log.

    This is where the query I gave you above would have helped you.

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • Thanks for your response.

    In that kind of a situation (Availability Replica not pushing data), what should be done?

  • it depends on what's wrong now doesn't it?

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • Hi Siva,

    The message "The transaction log for database XXXX is full due to 'AVAILABILITY_REPLICA' " can sometimes be an indicator that the AG replication is not able to send (or receive) data quickly enough to keep pace with the transaction.

    Was the secondary actually available on the network when this occurred?

    If there was a network blip or the server was down at the time then this would also lead to this issue.

    What is the bandwidth like to your secondary?

    If it's at the end of a slow WAN link, then this may well be your problem.

    Depending on the nature of the transaction, would it be possible to batch this into chunks? For example could you batch up a large delete operation into chunks of x rows, and have a breathing space between deleting each chunk to give the AG replication a chance to catch up?

    Otherwise try to find the bottleneck.

    Is it the bandwidth of the link between primary and secondary? - if so can you increase the bandwidth?

    Is it the network interfaces? - if so can you upgrade the NIC from say 1GB to 10 GB?

    Do you have a dedicated NIC for AG replication? if not could you install one?

    Hope this is in some small way helpful. Good luck.

  • Hi Andy

    Thanks for your response. I will look into the points you have mentioned.

    Thanks!

    Siva.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply