Some clients cann''t connect to QSL server after restarting SQL service

  • SQL Server Cluster

    Windows Server 2003

    SQL Server SP4

    2 active nodes

    Interesting thing has happened. There was a need to restart SQL server services. Instead of restarting physical sql servers, I decided to restart only sql server services without moving them to other nodes(thats the way cluster works). Suddenly after doing that some clients started calling and complaining that they cann't connect to SQL server through one of our program which connects to sql server using ODBC and sql server alias.

    I tried to connect to this sql server using that application and Query Analyzer and connections were successful.

    Things went better when I restarted physical SQL server nodes. There were no clients calls any more

    Did someone see or heard something like that?

    Maybe someone has any info how alias works, how client gets alias info from the server?

    Thanks

  • Are you sure your node didn't failover.  That's what it sounds like to me.  It would explain why the odbc didn't work, different machine.  What steps did you take to ensure no failover.  Uncluster that instance?

    Tom

  • Yes I'm sure. In the cluster administrator I've unchecked "Affect the group" which means, that the resource could be sinply restarted without failover to another node.

    I don't understand why some of our clients couldn't connect while other ones could. The error looked something like "SQL server doesn't exist" as they told me

     

  • Had any configuration change been made before the restart that wouldn't take effect until after the restart, for example authentication mode, TCP port?  Check the users/PCs that can't connect and see if you can find something they have in common along those lines.

    By the way (Tom), even if the node had failed over, you would (should) still be able to connect, because the virtual SQL Server is the same.  That's how clustering is designed.

    John

  • Unless your odbc was setup incorrectly.  To the machine ip instead of the virtual.

    Tom

  • John,

    There were not made any changes. I had to restart these sql servers because of the couple hanged transactions, which occured when our programmer made couple steps. But that's not the case

    The Question is: why someone could connect while others couldn't after simple restart of SQL server service(not physical server). When restarted physical server, the problem was solved 

  • Well, I think that maybe you missed something. There are three moments:

    1. All clients could work normally with sql server through ODBC

    2. I restarted SQL Server service. After that one part of our clients could work and other one- not. They got error messagewhich stated "SQL server doesn't exist"

    3. I restarted physical SQL server and then all our clients became happy

     

    So what was going in the second moment?

  • did you restart the services using CLUSTER MANAGER ?

    If not, that will be your problem.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Perhaps the users who couldn't connect are all on the same part of the network, and connectivity between that part of the network and the SQL Server was lost.  Then, by the time you had restarted the server, connectivity had been restored.  Coincidental, yes, but a possibility worth considering.

    As I said before, look for something that all of them have in common.

    John

  • - How about the connection-pools ? Did they reach the timeouts ?

    - Did the users stop and start their applications ? or just tried executing some stuff with the ongoing application ??

     

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • "did you restart the services using CLUSTER MANAGER ?" is a dang good question.

    But, you are saying some clients can't connect and some can ?

    Check the problemattical clients ODBC settings. Run cliconfg on the client and check the "Alias" settings. There should be an alias defined for the server on your cluster. If not create it. Go to the Edit option and see if they are set to use 'Dynamically Determine Port'. (SQL Clusters can pick their own port)

    With cliconfg you can disable unused protocols or reorder their use.

    Also, if the version of MDAC on the client is somewhat antique you will not see the option to set 'Dynamically Determine Port'. 

    DB


    The systems fine with no users loggged in. Can we keep it that way ?br>

  • I restarted SQL Server service using SQL Server Service Manager. After that one part of our clients could work and other one- not. They got error message which stated "SQL server doesn't exist". There was such situation when one client could work, but his neighbour sitting couple steps away - not. So I don't think that there was a problem with network connection or firewall etc. 

    As our programmer told me clients use two programs, wich were written using Visual Basic:

    1.The first program code has couple of strings where it is said that program should use "SQLOLEDB.1"  provider and there is a connection string wich shows the way to the sql server.

    2.The second program connects using ODBC and  the client who works with this one has configured system DSN to SQL server. As our programmer said, clients don't have client network utility configured. They use DSN

    Also interesting thing is that adding some user to "power user" group (locally on their computers) solved the problem, but it helped only to these, who used ODBC. Those power users who used OLE DB provider couldn't connect.

    Reinstalling MDAC didn't helped. We tried that.

    Now it all works fine, but I would like to find out what had happend then

  • you should use Cluster Administrator to stop/start all clustered applications !!!

    Clustermanager loose connectivity info an try to failover ...  it the stop/start is taking more than a "isalive"-poll-interval.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I think you're safe using the SQL Server Service Manager to stop and start the services - it's the Services applet in Control Panel that you shouldn't use.  Having said that, I always use Cluster Administrator, just to be sure.

    However, given that it's the OLEDB users that couldn't connect, it sounds as if something in that connection string didn't like the restart of the services and so that's where I would start my investigation.  Perhaps those clients use Windows authentication and the DSN people use SQL authentication?  Strange that it worked again after the reboot, though...

    If everything is working OK now, you may be better off just accepting it as "one of those things", documenting the incident thoroughly and moving on.  Then if it happens again you will be able to go back to your notes and look for patterns.

    John

  • Alzdba,

    Maybe it was my mistake, but why SQL server was reachable to other clients? The Question still remains

    "you should use Cluster Administrator to stop/start all clustered applications ". How it should look like? Do I need to move these resources from one server and back in order to restart or there is another method(Taking offline\online) of doing that. There is no such comman restart in the Cluster Administrator

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply