Move 2005 System DB Files in Clustered Instance

  • I need to move system databases in a 2005 (Enterprise SP3 64bit) 2-node cluster.

    The instance uses 4 SAN drives, R:,S:,T:,U: for database files. I need to move

    the system databases from R: and S: to T: and/or U: because the R: and S: are

    going away (to reclaim SAN space no longer used). My question is: Are there

    any special considerations in the process for clustering? I've tested the move

    process on an unclustered instance and am comfortable with that. When it comes

    time to start the instance from a command prompt with /f and /T3608 flags, are there

    any special steps? Since T: and U: are already used by SQL I figure there should be no

    cluster resource permission issues. Any thing else special to think about?

    I don't have a clustered sandbox to test this on.

  • I just did this move for our cluster about 10 months ago and it was very smooth. You do need to also consider moving the Resource DB as well and making sure the new disks are cluster Resources.

    First, be sure you have reviewed MS recommendations at (which is sounds like you have):

    http://msdn.microsoft.com/en-us/library/ms345408(v=SQL.90).aspx

    Obviously you'd need to change the drive letters and paths as well as test this on a dev instance first, clustered or not, to be comfortable with it.

    Good luck with the move!

    Follow the steps outlined from link above. Once I got to this point I ended up having to use sqlcmd and connect to it via IP (noted below)

    Start the instance of SQL Server in master-only recovery mode by entering one of the following commands at the command prompt.

    The parameters specified in these commands are case sensitive. The commands fail when the parameters are not specified as shown.

    For the default (MSSQLSERVER) instance, run the following command:

    NET START MSSQLSERVER /f /T3608

    Using sqlcmd commands or SQL Server Management Studio ~*I just used sqlcmd*~, run the following statements. Change the FILENAME path to match the new location of the master data file. Do not change the name of the database or the file names.

    If single instance:

    sqlcmd -U sa -P XXXXXXXXXXXX

    If cluster then will need to use the following with IP:

    sqlcmd -S ###.###.###.###

    ALTER DATABASE mssqlsystemresource

    MODIFY FILE (NAME=data, FILENAME= 'U:\YourNewFolder\mssqlsystemresource.mdf');

    GO

    ALTER DATABASE mssqlsystemresource

    MODIFY FILE (NAME=log, FILENAME= 'U:\YourNewFolder\mssqlsystemresource.ldf');

    GO

    Move the mssqlsystemresource.mdf and mssqlsystemresource.ldf files to the new location.

    Set the Resource database to read-only by running the following statement.

    ALTER DATABASE mssqlsystemresource SET READ_ONLY;

    Exit the sqlcmd utility or SQL Server Management Studio.

    Stop the instance of SQL Server.

    Restart the instance of SQL Server.

    Verify the file change for the master database by running the following query. The Resource database metadata cannot be viewed by using the system catalog views or system tables.

    SELECT name, physical_name AS CurrentLocation, state_desc

    FROM sys.master_files

    WHERE database_id = DB_ID('master');

    GO

  • Yep. I think I have that all covered and tested. Thanks for the support. What I'm still not clear on is the starting of the instance with the NET START command. Usually, I bring SQL up and down from the Cluster Administrator. I would like to be more certain that when I log onto the primary node server and start up the local services using NET START, that the cluster will say, "Hey, fella, I'm over here on this virtual ip for the cluster and I hear you. I'll start SQL now". That happens ok, does it?

  • hayden_jones (1/23/2012)


    Yep. I think I have that all covered and tested. Thanks for the support. What I'm still not clear on is the starting of the instance with the NET START command. Usually, I bring SQL up and down from the Cluster Administrator. I would like to be more certain that when I log onto the primary node server and start up the local services using NET START, that the cluster will say, "Hey, fella, I'm over here on this virtual ip for the cluster and I hear you. I'll start SQL now". That happens ok, does it?

    I usually bring SQL up and down via Cluster Administrator as well. So, if memory serves me correctly when I issued the NET START I was on the physical node that the cluster manager said was current owner. I issued the NET START command and then the SQL Server service started. At this point I was able to connect using sqlcmd from the same physical node and using the IP address (sqlcmd -S ###.###.###.###)

  • I missed something because it didn't work. App DBs moved ok. msdb, model , and tempdb moved ok. I then changed the startup parameters in Configuration Manager to tell SQL where the Master DB resides, took down SQL, copied the files to the new location and entered NET START MSSQLSERVER /f /T3608 in a command window on the primary node. SQL said, essentially, (I'll paste error messages below) "something's wrong with your log file. you have to restore Master from a backup". That is probably bogus. I double and tripple checked my startup parameters, byte-for-byte, several times with another DBA checking my work. I tried recopying the files in case the copy caused some corruption. I did this several times. The instance ALWAYS started successfully using NET START MSSQLSERVER /f /T3608 from the command line if the old file location was used. It ALWAYS failed if the new location was used. This is not likely due to cluster resource authorization problems because the new drives have been in the cluster for years with app DBs on them, And the msdb, model, and tempdb DBs moved there just fine.

    These messages appeared in the error log in the New Location:

    2012-01-28 22:21:56.46 Server Microsoft SQL Server 2005 - 9.00.4220.00 (X64)

    Apr 2 2009 18:34:24

    Copyright (c) 1988-2005 Microsoft Corporation

    Enterprise Edition (64-bit) on Windows NT 6.0 (Build 6001: Service Pack 1)

    2012-01-28 22:21:56.46 Server (c) 2005 Microsoft Corporation.

    2012-01-28 22:21:56.46 Server All rights reserved.

    2012-01-28 22:21:56.46 Server Server process ID is 9556.

    2012-01-28 22:21:56.46 Server Authentication mode is MIXED.

    2012-01-28 22:21:56.46 Server Logging SQL Server messages in file 'T:\MSSQL\ErrLog\ERRORLOG'.

    2012-01-28 22:21:56.46 Server This instance of SQL Server last reported using a process ID of 5804 at 1/28/2012 10:21:00 PM (local) 1/29/2012 4:21:00 AM (UTC). This is an informational message only; no user action is required.

    2012-01-28 22:21:56.46 Server Registry startup parameters:

    2012-01-28 22:21:56.46 Server -d T:\MSSQL\data\master.mdf

    2012-01-28 22:21:56.46 Server -e T:\MSSQL\ErrLog\ERRORLOG

    2012-01-28 22:21:56.46 Server -l U:\MSSQL\logs\mastlog.ldf

    2012-01-28 22:21:56.47 Server SQL Server is starting at high priority base (=13). This is an informational message only. No user action is required.

    2012-01-28 22:21:56.47 Server Detected 8 CPUs. This is an informational message; no user action is required.

    2012-01-28 22:21:56.47 Server Cannot use Large Page Extensions: lock memory privilege was not granted.

    2012-01-28 22:21:56.55 Server Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.

    2012-01-28 22:21:56.60 Server Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.

    2012-01-28 22:21:56.67 spid1s A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 40988, committed (KB): 89472, memory utilization: 45%.

    2012-01-28 22:21:57.74 Server Attempting to recover in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.

    2012-01-28 22:21:57.74 Server Database mirroring has been enabled on this instance of SQL Server.

    2012-01-28 22:21:57.75 spid4s Starting up database 'master'.

    2012-01-28 22:21:57.76 spid4s Error: 9003, Severity: 20, State: 9.

    2012-01-28 22:21:57.76 spid4s The log scan number (30837:552:1) passed to log scan in database 'master' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.

    2012-01-28 22:21:57.76 spid4s Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

  • That is strange. I noticed that you have your mastlog.ldf on a different drive from your master.mdf. I kept both of those (as well as the resource DB files) all in the same location. The MSDN article eludes to the log being able to be in a different location from the data file (for master), but the example puts them in the same place. Did you try it with both the data and log in the same location (T: Drive in your case)?

  • No, I did not. I will test on an unclustered instance, having log and data in different places. I may have tested that way originally, but I don't recall

  • I tested on a non-clustered instance. Separating the Master log and data is not an issue.

    I put them on separate drives and all went well.

    HOW ABOUT THIS?

    I noticed in the comments at the bottom of http://technet.microsoft.com/en-us/library/ms345408(SQL.90).aspx that

    there can be cluster permission problems on new directories. I'll paste one here.

    You can add the cluster service permissions to the new directory with ICACLS, the service accounts do not show in the GUI.

    eg

    icacls mssql /grant "NT SERVICE\MSSQLSERVER":(F)

    Does that make sense? the SQL service id has permissions on the cluster resource drives.

    But I created some new folders to put the DB files in. Does it make sense that the service id

    might not automatically have permissions on the new folders? If so, what the heck is icacls?

    I have no clue how to use it or where to issue it. If this might be the problem, how can I check

    to see if there is a lack of permissions? I don't want to schedule a retry of this event just because this

    MIGHT be the problem.

  • I've never had to use icalcs (looks like a tool for applying permissions though - http://ss64.com/nt/icacls.html). I changed the service account for several instances last summer and our server team had to do some work with permissions on the folder level because they weren't inherited from the drive letter. I'll check with them on Monday for any insight. As a plan I would make sure all permissions are the same on all folders (I'm sure you've done this already though), schedule another outage and have Microsoft on the ready if you have support. I'll let you know if my server team has any suggestions next week.

  • hayden_jones (1/23/2012)


    What I'm still not clear on is the starting of the instance with the NET START command. Usually, I bring SQL up and down from the Cluster Administrator.

    leave all resources in the cluster group online and manually take the sql server and its agent offline, you may then start sql server using net start from the command prompt. The virtual network name and IP will be online and available for you to connect to.

    hayden_jones (1/23/2012)


    I would like to be more certain that when I log onto the primary node server and start up the local services using NET START, that the cluster will say, "Hey, fella, I'm over here on this virtual ip for the cluster and I hear you. I'll start SQL now". That happens ok, does it?

    Providing you leave all other resources online in the group except sql server and the agent

    hayden_jones (1/20/2012)


    I don't have a clustered sandbox to test this on.

    Then i highly recommend you create one, use my tutorial at this link[/url] to help you

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Cluster sandbox would be lovely, but won't happen any time soon. We're a big shop. Stuff happens slow or never happens at all. I couldn't get access and premissions to do it even if I had the time to spend on it.

    So, Shawn,

    I found a couple of deficiencies that I figure should not cause what I experienced. I want to run them by you and the group.

    1. The domain id that runs the SQL service is in the Administrators group on the Primary node, but not on the

    Secondary node. I don't know why SQL has run successfully on both nodes, but it always has. So I figure

    this is not going to have an impact here - - - especially, since I was starting SQL from the Primary node where

    the id is God.

    2. When moving the master DB files, I changed the startup parameters in the properties of the SQL Server service on the Primary node, but not

    on the Secondary node. This is an oversight, but, again, I don't see why that should impact a startup on the Primary node.

    Does any of this sound like it would cause SQL to say, "Hey, dummy, you got a bad log file. I'm not going to start up"?

  • hayden_jones (2/7/2012)


    Cluster sandbox would be lovely, but won't happen any time soon. We're a big shop. Stuff happens slow or never happens at all. I couldn't get access and premissions to do it even if I had the time to spend on it.

    LOL you only need a decent spec pc or an old out of support or decommed server 😉

    hayden_jones (2/7/2012)


    1. The domain id that runs the SQL service is in the Administrators group on the Primary node, but not on the

    Secondary node. I don't know why SQL has run successfully on both nodes, but it always has. So I figure

    this is not going to have an impact here - - - especially, since I was starting SQL from the Primary node where

    the id is God.

    The SQL Server service account is not required to be in the local admins group and in fact probably shouldn't be either. All permissions required are granted when you set the service account through sql server configuration manager

    hayden_jones (2/7/2012)


    2. When moving the master DB files, I changed the startup parameters in the properties of the SQL Server service on the Primary node, but not

    on the Secondary node. This is an oversight, but, again, I don't see why that should impact a startup on the Primary node.

    Does any of this sound like it would cause SQL to say, "Hey, dummy, you got a bad log file. I'm not going to start up"?

    Changing the startup parameters should not matter, the cluster service will replicate those registry keys amongst the nodes

    to move master and resource do the following

    ➡ stop sql server service

    ➡ modify startup parameters

    ➡ copy master, distribution and resource database files to new location

    ➡ restart sql instance in master only recovery mode using /f /T3608 flags and check the new file locations are valid

    ➡ Set the resouce db file locations and set the database read only

    ➡ delete old files after sucessful sql server start

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • hayden_jones (2/7/2012)


    1. The domain id that runs the SQL service is in the Administrators group on the Primary node, but not on the Secondary node. I don't know why SQL has run successfully on both nodes, but it always has. So I figure this is not going to have an impact here - - - especially, since I was starting SQL from the Primary node where the id is God.

    The ID that runs SQL service is in the Administrators group on only one of my nodes as well so I wouldn't think that would be your issue

    hayden_jones (2/7/2012)


    2. When moving the master DB files, I changed the startup parameters in the properties of the SQL Server service on the Primary node, but not

    on the Secondary node. This is an oversight, but, again, I don't see why that should impact a startup on the Primary node.

    Does any of this sound like it would cause SQL to say, "Hey, dummy, you got a bad log file. I'm not going to start up"?

    This could be a problem. It was changed in both locations on my cluster.

    I still don't have an answer on the icalcs either. But like I said, I did have issues when changing the service account for other, stand alone servers that the permissions didn't trickle down to the underlying folders and we had to propogate those manually.

  • My server guy hadn't used ICACLS before so I can't help there, but he did mention the same that I've said about propagating permissions down to child objects under the Advanced Security Settings for the folder/drive.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply