Multi-server administration

  • Hey guys,

    Any of you support 15-20 servers or more? Would you have some tips on how to automate and speed up the administration and monitoring of a large group of servers?

    I find it tedious to go one by one and check error logs, event logs, job statuses, alerts, etc.

    SQL Mail could be useful, but I can't have it installed on most servers.

    JM

  • I wrote a SQL script that queries all my servers for the status of whatever I need, exports the results with bcp, and loads them onto the main monitoring server. I can pull job status, step status, email status (up or down), the SQL error log, and much more; I can get pretty much anything I want reported. I even check whether SQL Agent and SQL Server itself are up. The servers also watch each other: for example, server 1 monitors servers 2 to 100, server 3 monitors server 1, and server 5 monitors server 3, so we get paged if any of our servers goes offline. Of course, everything is done in T-SQL.
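
    The "servers watching each other" layout is easy to get wrong as the list grows. A minimal sketch of a sanity check, not the poster's actual T-SQL: the function name and the dictionary shape are assumptions made for illustration. It reports any server that no other server is monitoring.

```python
def unwatched_servers(monitors):
    """Return, sorted, the servers that no *other* server is monitoring.

    monitors: dict mapping a watcher to the servers it checks (hypothetical shape).
    """
    all_servers = set(monitors)
    watched = set()
    for watcher, targets in monitors.items():
        targets = list(targets)
        all_servers.update(targets)
        # A server checking itself does not count as being watched.
        watched.update(t for t in targets if t != watcher)
    return sorted(all_servers - watched)


# The layout described in the post: 1 watches 2..100, 3 watches 1, 5 watches 3.
layout = {1: range(2, 101), 3: [1], 5: [3]}
gaps = unwatched_servers(layout)
```

    With that layout `gaps` is empty, so every server has at least one watcher. Drop the "3 monitors 1" entry and server 1 shows up as unwatched.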

    mom

  • JM,

    I snickered on this one 🙂 Yes, I have the 15-20 servers and am the only DBA, and I also do some development and report writing. I am the SQL guru around here...

    Anyway, YES, YES, YES, there is a way to automate your administration. I believe there are some tools out there, but that's not what I have. Here's what I have..

    1) Extra workstation in my office with SQL Server Client.

    2) A 'DBA' database to store and work with information

    3) Lots of batch scripts with SQL scripts, and some DTS packages on the Server with the "DBA" database.

    Through Task Scheduler I have scheduled jobs that call my batch scripts, some of which execute every 4 minutes.

    Here is what I am monitoring

    1) Disk space (xp_fixeddrives); I am alerted when free space is getting low.

    2) Space in the data files and when they are going to grow; I am alerted if there is potential for growth.

    3) Transaction log space

    4) Event logs, parsed for errors

    5) All SQL Server Agent jobs; I am emailed on any failure.

    6) ping checks

    7) Database connectivity checks

    8) Cluster Swings.

    9) etc

    This may not be the best way, but it was pretty easy to create. I am alerted by pager and/or e-mail depending on the severity of my alert.
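
    As an illustration of the disk space check and the "pager or e-mail depending on severity" idea, here is a rough sketch rather than Jeff's batch scripts. It assumes the rows have already been fetched from xp_fixeddrives (which returns a drive letter and the free space in MB); the function name and both thresholds are made up for the example.

```python
def classify_drives(rows, warn_mb=5000, page_mb=1000):
    """Map each drive to 'ok', 'email' or 'page' based on free space.

    rows: (drive_letter, mb_free) pairs, the shape xp_fixeddrives returns.
    warn_mb / page_mb: hypothetical thresholds; tune them per server.
    """
    result = {}
    for drive, mb_free in rows:
        if mb_free < page_mb:
            result[drive] = "page"    # severe: wake somebody up
        elif mb_free < warn_mb:
            result[drive] = "email"   # getting close: send a mail
        else:
            result[drive] = "ok"
    return result


sample = [("C", 800), ("D", 3000), ("E", 20000)]
alerts = classify_drives(sample)
```

    Keeping the decision in one small function means the same rule can be applied to the rows collected from every server before anything is mailed or paged.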

    Jeff


    "Keep Your Stick On the Ice" ..Red Green

  • We also have a large number of servers. I have a large list of scripts that run as well.

    Things I have done to make life easier.

    One, set up a SQL Server to act as a repository for information gathered from the other servers, such as space usage details. I also point all errors from all servers to this same SQL Server, so I have one place to set up alerts. That means you only need to watch one machine from the outside to make sure email is leaving that box, or to restart the agent every so often so you know email isn't going to blow up on you. We all know how reliable SQL Mail is. It also saves you from setting up a ton of alerts and operators, or even SQL Mail, on every server.

    Two, script everything. It makes life much easier to use osql and xp_cmdshell to run scripts on a lot of servers at once when deploying them. Script out backups and jobs, to the point where the time you actually have to touch a server is much smaller. It also keeps your environment the same across the board. Anything you can do with EM you can do in a script.

    Three, don't be afraid to use an outside tool like OpenView or Microsoft Operations Manager to watch your management server and all the other servers for agent or server shutdowns. Generally, I like having the SQL Server services monitored by an outside source that performs restarts or paging as needed.
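
    The "script everything" point could look something like the sketch below. It is an assumption-laden illustration, not Wes's setup: it drives osql from a workstation instead of going through xp_cmdshell, and the server names, script name, and output file naming are hypothetical. The osql switches used are -S (server), -E (trusted connection), -b (return an error level on failure), -n (no line numbering), -i (input script) and -o (output file).

```python
import subprocess


def osql_command(server, script_path, output_path):
    """Build the osql command line that runs one script on one server."""
    return ["osql", "-S", server, "-E", "-b", "-n",
            "-i", script_path, "-o", output_path]


def deploy(servers, script_path, run=subprocess.call):
    """Run the script on every server; return the servers where osql failed.

    run: callable taking the argument list and returning an exit code.
    It defaults to subprocess.call and can be swapped for a stub in a dry run.
    """
    failures = []
    for server in servers:
        # Named instances contain a backslash, which is not safe in a file name.
        output_path = server.replace("\\", "_") + ".out"
        exit_code = run(osql_command(server, script_path, output_path))
        if exit_code != 0:
            failures.append(server)
    return failures
```

    Calling `deploy(["SQL01", "SQL02"], "create_dba_jobs.sql")` would run the same script on both machines and hand back the names of any that need a second look.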

    I'm currently working up how my team manages and monitors our server farm. I'll put out the documents and scripts when it is done.

    We currently have 35 to 40 servers attached to SAN devices and local storage. I have also used the same procedures to manage 35 servers at another job as the only DBA on staff.

    Wes

  • Howdy

    To monitor disk space

    We use a host watch to monitor disk space. Alerts are sent by email, and by SMS to a group of mobiles, if free space goes below a specified amount.

    Cheers

  • I saw a couple of you mention event logs. Do you mean Windows event logs?

    If so, how are you automating that? It's a recurring problem around here, and I keep meaning to find a way to load them to a central rollup and filter them to what is "interesting" but never have.

    Anyone done that? Got code you want to share?

  • I am also currently in a process of creating some scripts for Multi-server Admin. Although I manage only 6 servers now, the number will soon increase.

    These are some of the ideas that I am working on:

    1. Have a small DBA database on 1 server

    2. Create procedures / tables / other objects in this DBA database.

    3. Add the managed servers as linked servers on the server containing the DBA database.

    4. Schedule jobs / alerts / mails on the "DBA" server, which will in turn query / exec statements on the remote servers and process the returned results accordingly.

    Any contributions are welcome.

    The Windows event log for a remote server can be viewed, and the process automated, using "dumpel". It is a Windows Resource Kit utility, a free download from Microsoft.

    Use that with dos batch files to analyse Windows event logs.


    -- Amit



    "There is no 'patch' for stupidity."




  • The problem with dumpel is timeliness.

    First, you have to deal with overlapping dumps and filter out the duplicates (you can control this somewhat with the "last 1 day" but it's a pretty gross time filter so you need to get some overlap to ensure you do not miss anything).

    Secondly, if you want to look historically it is good, but if you want something that is going to notice (say every hour or few minutes or even few hours) some problem developing, this is not very timely.

    I keep meaning to write some routine that will dynamically and in real time see changes and load them (or selected ones) up into a database. Just one of those things on the back burner and was hoping someone had better tools.

    Or are people finding infrequent dumps with dumpel adequate?
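
    For the overlap problem in the first point, one simple approach is to remember every line already loaded and keep only the new ones. This is a sketch under assumptions: it treats each dumped line as an opaque string and makes no claim about dumpel's column layout, and the function name is invented for the example.

```python
def new_events(dump_lines, seen):
    """Return the lines not present in earlier dumps and record them in `seen`.

    dump_lines: iterable of text lines from one dump file.
    seen: set of lines already loaded; it is updated in place.
    """
    fresh = []
    for line in dump_lines:
        key = line.rstrip("\r\n")
        if key and key not in seen:
            seen.add(key)
            fresh.append(key)
    return fresh


seen = set()
first = new_events(["evt A\n", "evt B\n"], seen)
second = new_events(["evt B\n", "evt C\n"], seen)   # overlaps the first dump
```

    Here `second` contains only "evt C". Two caveats: genuinely identical events (same timestamp and text) collapse into one, and in practice `seen` would have to be persisted, for instance in a table, and trimmed as old entries age out.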

  • If you look at SQL Agent, there is an option to forward all events, or just the unhandled ones, to another machine. This sends events from the logs to the other machine's logs, so they get logged twice: once on the host machine and once on the target machine.

  • Isn't that only SQL events?

    We have as many or more issues arise that show up in the windows event logs - device problems, user authentication issues (e.g. service accounts), etc.

  • If you don't mind registering a 3rd party DLL, STMAdmin, I have a bunch of scripts that read a list of server names, get the last two days' worth of events from each, and concatenate them into a file that you can email.

    I got the control and scripts from http://cwashington.netreach.net/main/library/stmadmin.html

    Jacko


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Can you retrieve the event logs even if the workstation is not registered on the domain?

    Also, how do you monitor remote job statuses? By querying the sysjobhistory table?
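
    For reference, a sketch of what the sysjobhistory route might involve; it is not an answer given by anyone in the thread. In msdb.dbo.sysjobhistory, run_date and run_time are integers (YYYYMMDD and HHMMSS), run_status is a code, and step_id 0 is the overall job outcome. The query text and the helper name below are illustrative assumptions.

```python
from datetime import datetime

# run_status codes used by msdb.dbo.sysjobhistory
RUN_STATUS = {0: "Failed", 1: "Succeeded", 2: "Retry",
              3: "Canceled", 4: "In progress"}

# Illustrative query: failed job outcomes (step_id 0), newest first.
FAILED_JOBS_SQL = """
SELECT j.name, h.run_date, h.run_time, h.run_status
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE h.step_id = 0 AND h.run_status = 0
ORDER BY h.run_date DESC, h.run_time DESC
"""


def decode_run(run_date, run_time, run_status):
    """Turn the integer date/time and status code into readable values."""
    # run_time drops leading zeros (93005 means 09:30:05), so pad to 6 digits.
    stamp = datetime.strptime(f"{run_date:08d}{run_time:06d}", "%Y%m%d%H%M%S")
    return stamp, RUN_STATUS.get(run_status, "Unknown")


when, status = decode_run(20050314, 93005, 0)
```

    The query would be run against each remote server (through a linked server or a direct connection) and the decoded rows loaded into the central database.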

    JM
