Event Id 117 - Unexpected fatal error occurred. Attempting to restart server.

  • Hi Guys

    We are running SQL Server 2000 SP3a on a Windows server 2003 Enterprise box.

    Analysis services was working fine, but I haven't used this box much (as it's our live box and I've been working on test box, live box not currently running anything).

    Today I restored an archive cube from the similar speced test box, and haven't been able to use AS since!  The restore may be a grey herring.  AS may have had a problem already.

    The application event log is showing:

    Event Id 117 - Unexpected fatal error occurred.  Attempting to restart server.  It then says started OK, then errors again.

    AS is giving errors like:

    Unable to browse the cube 'CubeName'.  Cannot open connection to Analysis server 'ServerName'.

    Could not unlock object (Shared Dim Name).  Connection to the server is lost.  You need to restart Analysis Manager.

    Unable to edit data source.  Cannot open database 'DatabaseName' on the Analysis server.

    Unable to open the dimension DimensionName.

    Cannot connect to the Analysis server on computer 'ComputerName'.

    Cannot connect to the server 'ServerName'.  The server is either not started or too busy.

    Stopping and restarting the services makes no difference.  I reinstalled AS and SP3a, which made no difference.  Then I installed SP4, which again, has made no difference.

    Any ideas?  Are the system databases or install corrupt?  My next step would be to totally uninstall SQL Server and install again.

    Any comments would be appreciated.

    Thanks

    Mark

  • I know this sound like a stupid question but you are/have been installing the AS service packs, not just the SQL ones?

    Is your repository in MSSQL or MSAccess? 

    Steve.

  • Hi Steve

    I have been installing both SP's and the repository is in MSSQL.

    This morning I uninstalled everything then reinstalled, and all appears to be working OK, it's processing a cube as I type.

    I'll just have to put it down to one of those things and be glad that it was on a server which hasn't gone live yet!

    If you or anyone has any comments as to what might have gone wrong i'd be interested.

    Thanks for the response anyway,

    Mark

  • Looks like I spoke too soon!

    On re-installing  working with Foodmart was OK, but the restored cube still gave errors so I deleted it and restored from archive again.  Then everything was working OK, the cube processed for a couiple of hours then let me browse it for a while, then the same errors started appearing again on all cubes.  Whenever I try to edit, browse, etc, I just get a cannot connect to OLAP server error as before.

    Has anyone got any ideas?

    Thanks

    Mark

  • Another daft question maybe, but are you restoring the OLAP database onto it's original server, or are you moving it form one server to another?

    If moving it from one server to another, check you've got the correct data source connection strings. 

  • Hi Peter

    I am archiving on one server and restoring on the problem server.

    Data source is the same on both as they are both getting the data directly from Oracle Data Mart.

    Thanks

    Mark

  • Both servers are on the same service pack level?

     

    Steve.

  • Hi Steve

    They were both on SP3a when all this started.  Test is still on 3a but Live is now on 4 since the re-installation.  I'm a bit dubious to apply SP4 to Test until the other server is sorted as I'll be well and truely knackered if both go down, and Sod's Law and all that.....!

    As I said before, I think the cube restore could be a red herring.  I can't even open Analysis Services on the Live box now, I just get an unable to connect to the Analysis Server error.

    That being said, when I uninstalled and re-installed SQL Server and AS on Live, the restored cube file didn't get deleted and it was there when I first opened up AS.

    Perhaps I should uninstall again and make sure the restored cube files are deleted, then reinstall and create the cube from scratch.  That would take the resored cube totally out of the equation.

    Again, any comments from you or anyone else would be grately appreciated, as I not getting very far here.

    Might also be worth mentioning that we are using AWE, as the boxes have 10Gb of RAM each.

    Thanks

    Mark

  • Hi,

    We usually only get a 117 when there is a large non empty cross join MDX statement thrown at us, however, there have been other MDX combinations that have had the same affect, and it has been known to do it on cubes with drill through (Microsoft - Q313279)

    When you first got the 117 was there any queries coming in? Or drill through going on?

    Daft question coming – when the 117 happened did you wait until Event Viewer said the OLAP services had re-started and closed all the previously open connections? and created new connections?

    117 is reserved for 2 reasons – 1 – Your data files in the data directory are corrupted or 2 – a bug. - http://msdn.microsoft.com/library/default.asp?url=/library/en-us/trblsql/tr_as_errors_01vd.asp

    You could be on the right lines in making sure files are deleted when you delete and restore. I’d recommend bringing cubes back one at a time, testing each one, and then bringing another one back. It may take some time but hopefully you could identify just one cube that is causing the problem.

    I’ve done this a couple of times and because you can only archive entire databases, my method is to copy each cube into its own database, archive it, ensure all files are deleted and then restore it.

    Of course there is reason 2; you have unfortunately stumbled across a new bug.

    Howard.

  • Hi Howard

    Thanks for the reply.

    When it first happened there was nothing going on on the server, just me starting to work on the restored cube.

    The application event log is showing the 117 then 30 secs later AS started successfully then 20 secs later 117 again.  This just loops.  The System event log is showing a different looped error, with the service manager doing the restarting of AS.  I was closing everything down and stopping and restarting Analysis Services also.

    It's looking like your reason 1 - corrupt data in the data directory was the answer.  This morning I uninstalled again and manually deleted any data files that were left.  I then reinstalled to SP4 and have created the cube from fresh rather than a restore.  So far all is working OK (touch wood!), so it is looking like it was the restored cube after all.  The server it was archived on has been having intermitant memory problems, so perhaps this caused a bit of duff data in the archive.  And they are all new HP boxes!

    Thanks for the link, if I would have seen this earlier I would have suspected the archive from the start.  I don't think my google search bought this one back.  The Microsoft site is a wealth of info, it's just a bugger to find most of it.

    So hopefully the problem has now gone away.  I will keep working on this server for the moment to make sure.

    Thanks again for your comments.

    Mark

  • I've noticed google isn't very good at returning specific microsoft searches. I always try and goto http://msdn.microsoft.com/default.aspx and search from there. You get much better results.

    I hope things stay stable for you.

  • Hello everyone!

    Sorry about bringing this issue back to life, but I am in a deep trouble now.

    Here's my situation: I had box A as the live box for a huge OLAP DB and box B as the old live box for this same OLAP (box B has other OLAP DBs).

    I changed a property at a level of dimension and tried to save. This error started to happen. Tried to restart the service and even the whole box, but nothign would make it work.

    Then I thought "I'll just go back to box B because I have the OLAP DB there". Well, I did it. Made the some change to the same dimension level and guess what? You're right, now I have two problems!

    Please, any info on this issue will be very, very welcome.

    Thanks!

  • Do you still have the back up you used to move from box B to box A? if so reuse that to get yourself back up and continue your changes in a controled test environment. If not can you find any archived stuff from the DBA's?

  • Yes, we do have backups.

    I had some help from other people from the team and we are back now.

    Here's what was done last night: we have the OLAP repository on a SQL Server. There was a lock on this DB. So we killed the process that was trying to lock some object in the repository. After that the MSOLAP Service was changed to "take no action" when failure. The box was restarted and then the service was brought up manually.

    It worked, we hope we have helped. But we will not stop trying to find out what's happened. We'll set up another box and try to make this error occur consistently.

    I'll keep you updated on this.

    Thanks!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply