Database Column Tuning

  • I have developers quickly building databases to load client data. Some have built entire databases using nvarchar everywhere, which is of course a performance nightmare and bad practice. I am working with the lead developer on fixing the pre-existing tables and databases, and came up with a script to find the maximum length of the data in each field. The results from the full script are buggy, and I need some help from you wonderful people.

    If I run this script:

    use DBA_DB
    go
    select max(len(ACCTNUM)) as ACCTNUMMaxlength
    from genericleadsdemo
    GO
    select max(len('Margin 1 High for ALAP')) as Margin1Maxlength
    from genericleadsdemo
    GO
    select max(len('AGENCYCD')) as AGENCYCDMaxlength
    from genericleadsdemo

    I get the values 13, 22, and 8, respectively.
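    Note: the second and third queries take LEN() of a quoted string literal rather than of the column itself, so they return the length of the literal - 22 and 8 - for every row. To measure the actual data, the column names need brackets, e.g.:

    select max(len([Margin 1 High for ALAP])) as Margin1Maxlength
    from genericleadsdemo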

    When I run this script:

    use DBA_DB
    go
    -- to find table column datatypes and max length of data
    SELECT a.name, b.name AS ColumnName,
           c.name AS Datatype, b.length as ColLength,
           max(len(b.name)) as Maxlength
    FROM sysobjects a, syscolumns b, systypes c --, sysindexes
    WHERE a.id = b.id AND
          b.xtype = c.xtype AND
          -- to run for all tables in db run next line
          a.xtype = 'U' --and c.name <> 'sysname'
          -- comment out below line to run for all tables or modify to different table
          and a.name = 'GenericLeadsDemo'
    group by a.name, b.name, c.name, b.length
    order by a.name, b.name

    I get:

    name              ColumnName              Datatype  ColLength  Maxlength
    ----------------  ----------------------  --------  ---------  ---------
    GenericLeadsDemo  ACCTNUM                 nvarchar  100        7
    GenericLeadsDemo  ACCTNUM                 sysname   100        7
    GenericLeadsDemo  AGENCYCD                nvarchar  100        8
    GenericLeadsDemo  AGENCYCD                sysname   100        8
    GenericLeadsDemo  Margin 1 High for ALAP  varchar   50         22

  • The script as written will find you the longest column name per table. Somehow, I don't think that's what you wanted.

    If you want to find the length of the data in the column, you'll have to query the tables themselves, not the system tables.

    The query you're looking for is of the following form, for each table that you want to check:

    SELECT Max(len(column1)), max(len(column2)), ... From Table1

    You can use the system tables to build up the queries to run.
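    For the GenericLeadsDemo table above, for instance, that would be (note the brackets around the column name that contains spaces):

    SELECT MAX(LEN(ACCTNUM))                  AS ACCTNUM_MaxLen,
           MAX(LEN([Margin 1 High for ALAP])) AS Margin1_MaxLen,
           MAX(LEN(AGENCYCD))                 AS AGENCYCD_MaxLen
    FROM genericleadsdemo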

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have 30+ databases to check, each with many tables, some with 100+ columns. The script I wrote shows the column length and the max length of the data within the column (or so I thought). Sometimes the data extracted is dead on, and sometimes it is off.

    I need a way to run a scan of the table(s) to find out the values without writing a custom script for each table.

    Any ideas?

  • This snippet was created and posted by someone else who frequents this site. I wish I could remember who it was, so I could give them credit.

    It might be useful to you.

    --===== Setup the environment
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
    SET NOCOUNT ON
    SET ANSI_WARNINGS OFF

    --===== If the result table already exists, drop it
    IF OBJECT_ID('TempDB..#ColumnData') IS NOT NULL
        DROP TABLE #ColumnData

    --===== Declare the local variables
    DECLARE @Columns INT          --Total number of columns found
    DECLARE @Counter INT          --General purpose loop counter
    DECLARE @SQL     VARCHAR(600) --Contains the dynamic SQL for each column

    --===== Populate the result table with the initial table/column info
    SELECT RowNum              = IDENTITY(INT,1,1),
           TableName           = OBJECT_NAME(sc.ID),
           ColumnName          = sc.Name,
           DataType            = UPPER(TYPE_NAME(sc.XType)),
           DefinedLength       = sc.Length,
           MaxActualDataLength = CAST(NULL AS INT)
      INTO #ColumnData
      FROM dbo.SysColumns sc
     WHERE OBJECTPROPERTY(sc.ID,'IsTable') = 1
       AND OBJECTPROPERTY(sc.ID,'IsMSShipped') = 0

    --===== Remember how many columns there are
    SET @Columns = @@ROWCOUNT

    --===== Add a primary key to the result table (just 'cuz)
    ALTER TABLE #ColumnData
      ADD PRIMARY KEY CLUSTERED (RowNum) WITH FILLFACTOR = 100

    --===== Loop through the column data and find the actual max data length for each
    SET @Counter = 1
    WHILE @Counter <= @Columns
    BEGIN
        SELECT @SQL = 'UPDATE #ColumnData SET '
                    + 'MaxActualDataLength=(SELECT MAX(DATALENGTH(['+ColumnName+'])) FROM ['+TableName+'])'
                    + 'WHERE RowNum='+CAST(@Counter AS VARCHAR(10))
          FROM #ColumnData
         WHERE RowNum = @Counter

        -- PRINT @SQL
        EXEC (@SQL)

        SET @Counter = @Counter+1
    END

    --===== Display the columns not fully utilized, in order of worst usage of the length
         -- Note that NULL columns contain no data at all.
         -- Note that this does NOT find columns that have been RPadded to max length (yet).
    SELECT *, DefinedLength-MaxActualDataLength AS MinUnused
      FROM #ColumnData
     WHERE ISNULL(MaxActualDataLength,0) < DefinedLength
     ORDER BY CASE WHEN MaxActualDataLength IS NULL THEN 9999
                   ELSE DefinedLength-MaxActualDataLength
              END DESC, TableName, ColumnName
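    One note on reading the output: both DefinedLength (syscolumns.length) and MAX(DATALENGTH(...)) are measured in bytes, not characters, so for nvarchar columns each character counts twice and the two figures remain directly comparable. A quick check:

    SELECT DATALENGTH(N'abc') AS NvarcharBytes, -- 6: two bytes per character
           DATALENGTH('abc')  AS VarcharBytes,  -- 3: one byte per character
           LEN(N'abc')        AS Characters     -- 3: LEN counts characters, not bytes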

  • Lowry Kozlowski (12/20/2007)


    I have 30+ databases to check, each with many tables, some with 100+ columns. The script I wrote shows the column length and the max length of the data within the column (or so I thought).

    Not at all. You're getting the max value of LEN(syscolumns.name): the length of the column's name, not of the data in the column. The only tables you're querying there are the system tables.

    I need a way to run a scan of the table(s) to find out the values without writing a custom script for each table.

    Any ideas?

    As a starting place, try this. (not guaranteed in any way)

    EXEC sp_MSforeachtable '
    DECLARE @columns varchar(8000)
    SELECT @columns = ''''
    SELECT @columns = @columns + ''MAX(len('' + name + '')) AS Max_Len_'' + name + '', '' FROM sys.columns WHERE object_id = OBJECT_ID(''?'')
    print ''SELECT ''''?'''' AS TableName, '' + LEFT(@columns, LEN(@columns)-1) + '' FROM ?''
    '

    Take the results, paste into a query window and run them.
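    For a hypothetical table dbo.Leads with columns AcctNum and AgencyCd, the printed statement would come out roughly as:

    SELECT '[dbo].[Leads]' AS TableName, MAX(len(AcctNum)) AS Max_Len_AcctNum, MAX(len(AgencyCd)) AS Max_Len_AgencyCd FROM [dbo].[Leads]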

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for your help. The script Joel supplied gives me closest to what I need; I will just have to play with it to sort the data by table and column. GilaMonster's works, in that it creates the select statements for the tables, but since the other thing the developers did was create table names with spaces (ARGH), it has issues running without going in and editing the generated statements (a bracketed variant is sketched below). For what I need to do, the more detail the better.

    Thanks again for all the help! ;)
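    Something along these lines might handle the bracketing automatically (an untested sketch of Gail's generator with the identifiers wrapped in QUOTENAME()):

    EXEC sp_MSforeachtable '
    DECLARE @columns varchar(8000)
    SELECT @columns = ''''
    SELECT @columns = @columns + ''MAX(len('' + QUOTENAME(name) + '')) AS '' + QUOTENAME(''Max_Len_'' + name) + '', '' FROM sys.columns WHERE object_id = OBJECT_ID(''?'')
    print ''SELECT ''''?'''' AS TableName, '' + LEFT(@columns, LEN(@columns)-1) + '' FROM ?''
    '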

  • The snippet, i.e.

    --===== Setup the environment
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
    SET NOCOUNT ON
    SET ANSI_WARNINGS OFF
    --===== If the result table already exists, drop it
    IF OBJECT_ID('TempDB..#ColumnData') IS NO
    ....

    is only valid for the Master database; for any other USER database it produces an error!!

    The error is around here >>

    SELECT RowNum = IDENTITY(INT,1,1),
           TableName = OBJECT_NAME(sc.ID),

    What is wrong??

  • At the top of the script, add

    USE <your database name>
    go

    or, within Query Analyzer, change from master to the database you wish to run this against.

  • misandrew (12/21/2007)


    is only valid for the Master database; for any other USER database it produces an error!!

    What error?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You all seem to be using the old 2000 system tables instead of the 2005 DMVs!

    As to the original post, logic should dictate that the size of the stored data will not be found in the system tables; you wouldn't expect to find the largest value stored in an integer column in a system table - or would you? (not counting identity)
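    For reference, a rough 2005 equivalent of the column-definition half of this, using the catalog views (a sketch only; note max_length is in bytes, and -1 means MAX):

    SELECT t.name       AS TableName,
           c.name       AS ColumnName,
           ty.name      AS DataType,
           c.max_length AS DefinedBytes -- bytes, not characters; -1 = varchar(MAX)/nvarchar(MAX)
    FROM sys.tables t
    JOIN sys.columns c ON c.object_id = t.object_id
    JOIN sys.types ty ON ty.user_type_id = c.user_type_id
    WHERE t.is_ms_shipped = 0
    ORDER BY t.name, c.name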

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • In the case I am working on, all are SQL 2005 databases; some were migrated from 2000, some were created on 2005. The issue I am trying to resolve is removing as much nvarchar use as possible, and also optimizing the size of the new fields to meet our business needs while still allowing for the variances supplied by our customers.
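    For example, once the real maximum is known (13 for ACCTNUM above), a change might look like the following - a hypothetical sketch only; converting nvarchar to varchar silently loses any characters outside the database's code page, so verify the data first, and match the NULL/NOT NULL setting to the existing definition:

    ALTER TABLE dbo.GenericLeadsDemo
        ALTER COLUMN ACCTNUM varchar(20) NULL -- was nvarchar; 20 leaves headroom over the observed max of 13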

  • This snippet was created and posted by someone else who frequents this site. I wish I could remember who it was, so I could give them credit.

    Thanks for the compliment, Joel... nice to see that people use these things 🙂

    I gotta remember to put my name on some of these...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Always be wary of removing nvarchar/nchar, just in case they were actually put there for a reason; likewise, reducing column widths should be done with great care with regard to the client applications. Reducing a varchar column defined as 100 down to 50 doesn't save you any space, even if the largest data item is 20 chars, as shown below.
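    That's easy to demonstrate - DATALENGTH() reports the bytes actually stored, which depend on the data, not on the declared maximum (a quick sketch):

    DECLARE @wide varchar(100), @narrow varchar(50)
    SELECT @wide = 'twenty characters...', @narrow = 'twenty characters...'
    SELECT DATALENGTH(@wide)   AS WideBytes,  -- 20
           DATALENGTH(@narrow) AS NarrowBytes -- 20: same storage despite the smaller declared max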

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • They were put there because the developers were too lazy to change the import from the default build from Excel. I have been working with the team for months to educate them and to find out where nvarchar is truly needed (nowhere, in most cases). I will be working with them, with much testing in dev, before any changes go to test and then prod. As of now, no new tables are promoted with nvarchar unless there is a legitimate business reason and supervisor approval.
