SELECT All Columns from a table - without re-engineering if table stucture changes

  • Hi,

    My problem statement is very simple, all I have to do is write a stored procedure to get data from a table based on certain conditions.

    Now, the solution I am proposing is the one mentioned in Method 1 but my DBA says that I should use the one from Method 2 (i.e. dynamically create list of columns and then execute using sp_ExecuteSql). I find Method 2 code unnecessary for this simple query, he says that if any new columns are added in future we do not want to go and modify the columns, this happens lots of time that developers forget to add columns to related stored procedure.

    So, I am suggesting we use Method 3 if they do not want to modify the SPs in the future but that idea was shot down saying that you cannot use a SELECT * query.

    My questions/observations:

    1. Is there really any benefit in using Method 2 or am I missing something?

    2. In some SPs the WHERE clause is very long in that case it will be dealing with very ugly code using dynamic query.

    3. Is there any other work around anyone can think where I do not need to change SP if corresponding table has changed? In my case there is one SP per table.

    Thanks,

    ASDL

    -- Method 1. This query is SELECT <All Columns included in list> FROM Table WHERE Condition1, Condition2...

    DECLARE @PostalCode VARCHAR(15) = '98028'

    SELECT

    AddressID,

    AddressLine1,

    AddressLine2,

    City,

    StateProvinceID,

    PostalCode,

    rowguid,

    ModifiedDate

    FROM

    Person.[Address]

    WHERE

    PostalCode = @PostalCode

    GO

    -- Method 2. Get list of columns from Information_Schema.Columns view, create a CSV column name,

    -- dynamically create query and execute using sp_ExecuteSql

    DECLARE @PostalCode_Input NVARCHAR(15) = N'98028'

    DECLARE @Catalog VARCHAR(100) = 'AdventureWorks'

    DECLARE @Schema VARCHAR(100) = 'Person'

    DECLARE @Table VARCHAR(100) = 'Address'

    DECLARE @Where VARCHAR(1000) = ' WHERE PostalCode = @PostalCode'

    DECLARE @Parameters NVARCHAR(200) = '@PostalCode NVARCHAR(15)'

    DECLARE @Columns VARCHAR(MAX) = ''

    SELECT

    @Columns = @Columns + COALESCE('[' + COLUMN_NAME + ']' + ',', '')

    FROM

    INFORMATION_SCHEMA.COLUMNS

    WHERE

    TABLE_CATALOG = @Catalog

    AND TABLE_SCHEMA = @Schema

    AND TABLE_NAME = @Table

    ORDER BY

    ORDINAL_POSITION

    SET @Columns = SUBSTRING(@Columns, 1, LEN(@Columns) - 1)

    DECLARE @Sql NVARCHAR(MAX)

    SET @Sql = N'SELECT ' + @Columns + ' FROM ' + @Schema + '.' + @Table + ' ' + @Where

    EXEC sys.sp_executesql @Sql, @Parameters, @PostalCode = @PostalCode_Input

    GO

    -- Method 3. SELECT * FROM Table WHERE Condition1, Condition2...

    DECLARE @PostalCode VARCHAR(15) = '98028'

    SELECT

    *

    FROM

    Person.Address

    WHERE

    PostalCode = @PostalCode

    GO

  • I agree 100% with shooting down Method 3.

    Your DBA is a pretty smart guy for being able to do #2. Normally, I would be inclined to agree with him. However, I'll bet that whatever query that needs to have all fields included will also need to have application/report/SSIS package changes to be able to include those fields in the display/report/SSIS Package. For some of these, included the new columns could break the process. Therefore, I would end up saying that each stored procedure should be coded to have a specific list of columns (not dynamic) as per Method 1.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thanks for your reply!

    If I need to select all columns from a table, I think SELECT * should work here. In my opinion Method 2 is an overkill. Moreover, the piece of code that receives the output does not care about the number of columns i.e. the number of columns returned can vary and it will handle it well. (Receiving process is a generic SQLCLR that writes data to a file)

    So, in this case adding/removing columns will not break anything.

    My question still holds how do I write this SP so that I do not have to change it in future if columns are added or deleted from the table?

  • If the intention of the query is to return most if not all of the columns from the table, and the order nor the exact number is important to the client then in my opinion the select * method will suffice.

    IF however, you only need one or just a few columns from a table then you should explicitly specify the columns in a select list to insure you get only what you need to make the query less consumptive of resources (bandwidth etc.) as well as allow SQL to perhaps better optimize the query.

    Method #2 is overkill and there should be no need for clients to have to write complex procedures involving system tables and so forth. Having said that, if you need all columns and need them in a specific order no matter what sort of modifications are ever made to the table then #2 would be the way to go. It doesn't sound to me like that is what you need though.

    The probability of survival is inversely proportional to the angle of arrival.

  • sturner (12/14/2010)


    If the intention of the query is to return most if not all of the columns from the table, and the order nor the exact number is important to the client then in my opinion the select * method will suffice.

    What do you mean "most if not all of the columns and the order nor the exact number is important"? When I do SELECT * FROM a table I assume all the columns from that table will be returned. Is there a case when not all columns will be returned? Eg. SELECT * FROM AdventureWorks.Person.Address WHERE Condition1

    IF however, you only need one or just a few columns from a table then you should explicitly specify the columns in a select list to insure you get only what you need to make the query less consumptive of resources (bandwidth etc.) as well as allow SQL to perhaps better optimize the query.

    I understand that but in my case I need to return *all* the columns at that time in the table, order of columns does not matter to me.

    Method #2 is overkill and there should be no need for clients to have to write complex procedures involving system tables and so forth. Having said that, if you need all columns and need them in a specific order no matter what sort of modifications are ever made to the table then #2 would be the way to go. It doesn't sound to me like that is what you need though.

    When you say "if you need all columns and need them in a specific order no matter what sort of modifications are ever made to the table then #2 would be the way to go". With Method 2, if there are any columns added/deleted then it will reflect in the output of the dynamic columns list. So, how is that different from SELECT * FROM table query?

    Thanks!

  • ASDL (12/14/2010)


    What do you mean "most if not all of the columns and the order nor the exact number is important"? When I do SELECT * FROM a table I assume all the columns from that table will be returned. Is there a case when not all columns will be returned? Eg. SELECT * FROM AdventureWorks.Person.Address WHERE Condition1

    All columns will always be returned by " select * " in the order of the original CREATE TABLE statement. If the table has many and/or large columns and your client only needs a few of them, returning all of the columns is wasteful of Server (and network) resources and may also preclude the server from better optimizing your query (i.e. utilizing a covered index perhaps...)

    I understand that but in my case I need to return *all* the columns at that time in the table, order of columns does not matter to me.

    Then by all means, use the select * syntax.

    When you say "if you need all columns and need them in a specific order no matter what sort of modifications are ever made to the table then #2 would be the way to go". With Method 2, if there are any columns added/deleted then it will reflect in the output of the dynamic columns list. So, how is that different from SELECT * FROM table query?

    Thanks!

    If your client is written in such a way that it expects the columns in a certain order (for example all integers first, then char data in alphabetical order) then you could handle that by the more complex methodology of preparing the query dynamically used in #2.

    As I stated, I would find it difficult to recommend the #2 method in your case it is unnecessary.

    The probability of survival is inversely proportional to the angle of arrival.

  • Use "select *" if you really want all the columns.

    The second method is clever (but could do with some improvements), but "clever" and "should be used in a production environment" are not the same thing. It's just wasting resources, not accomplishing anything useful.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • ASDL (12/14/2010)


    My question still holds how do I write this SP so that I do not have to change it in future if columns are added or deleted from the table?

    Is whatever is consuming the data produced from this procedure capable of handling it when the number of columns changes? This answer is usually NO. I therefore stand by my previous answer - when whatever consumes the data being produced is ready to be changed, then change the underlying query at the time. The query itself should select just the necessary columns, even if it's all of them, by name.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • This actually begs the question of why the number of columns in a table is so volatile that this matters in the first place.

    In most cases, the properties of an object don't change frequently enough to require a mechanism like this to deal with it dynamically.

    So this may actually be solving the wrong problem.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I would avoid the Select * method. Are you absolutely certain that all of the code has been adapted to properly handle all of the columns? Do you perform any unions based on this select * and possibly another select *? The use of select * in code may cause performance problems as well as cause code failures that were not expected (e.g. union statements).

    I agree that columns should not be changing so much that the ability to retrieve dynamically the columns from a table would be of real use.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • sturner (12/14/2010)


    All columns will always be returned by " select * " in the order of the original CREATE TABLE statement. If the table has many and/or large columns and your client only needs a few of them, returning all of the columns is wasteful of Server (and network) resources and may also preclude the server from better optimizing your query (i.e. utilizing a covered index perhaps...)

    I understand that but in my case I have to get all columns.

    If your client is written in such a way that it expects the columns in a certain order (for example all integers first, then char data in alphabetical order) then you could handle that by the more complex methodology of preparing the query dynamically used in #2.

    As I stated, I would find it difficult to recommend the #2 method in your case it is unnecessary.

    Now, I understand what you meant. :-), in my case I want all the columns in the order they were when table was created.

    Thanks sturner.

  • WayneS (12/14/2010)


    ASDL (12/14/2010)


    My question still holds how do I write this SP so that I do not have to change it in future if columns are added or deleted from the table?

    Is whatever is consuming the data produced from this procedure capable of handling it when the number of columns changes? This answer is usually NO. I therefore stand by my previous answer - when whatever consumes the data being produced is ready to be changed, then change the underlying query at the time. The query itself should select just the necessary columns, even if it's all of them, by name.

    Wayne,

    The consumer can handle varying length (columns) of data. All it is doing is parsing the data and writing to a file. I will not have to change the consumer at all.

    I agree with you to name all columns while selecting, infact that is Method #1 from my original post but my DBA wants me to be "future" proof and I am being forced to use Method #2 (dynamic ugly code) so my answer to be "future" proof is Method #3 (SELECT * FROM table) which is again not acceptable by him.

    By future proof I mean he does not want to change the SP if table columns change.

    My idea of using Method #1 is that if table changes all we have to do it run sp_depends (I know it does not return correct result all the time) and add/remove that column from the SP and we are good.

    Thanks,

    ASDL

  • ASDL (12/14/2010)


    WayneS (12/14/2010)


    ASDL (12/14/2010)


    My question still holds how do I write this SP so that I do not have to change it in future if columns are added or deleted from the table?

    Is whatever is consuming the data produced from this procedure capable of handling it when the number of columns changes? This answer is usually NO. I therefore stand by my previous answer - when whatever consumes the data being produced is ready to be changed, then change the underlying query at the time. The query itself should select just the necessary columns, even if it's all of them, by name.

    Wayne,

    The consumer can handle varying length (columns) of data. All it is doing is parsing the data and writing to a file. I will not have to change the consumer at all.

    I agree with you to name all columns while selecting, infact that is Method #1 from my original post but my DBA wants me to be "future" proof and I am being forced to use Method #2 (dynamic ugly code) so my answer to be "future" proof is Method #3 (SELECT * FROM table) which is again not acceptable by him.

    By future proof I mean he does not want to change the SP if table columns change.

    My idea of using Method #1 is that if table changes all we have to do it run sp_depends (I know it does not return correct result all the time) and add/remove that column from the SP and we are good.

    Thanks,

    ASDL

    I can see why he wants to do that, but it allows for too many problems. I would rather know that something is being changed and have a list of procs that need changed if something is to change.

    I would certainly emphasize that the better practice is to use Option 1.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • GSquared (12/14/2010)


    This actually begs the question of why the number of columns in a table is so volatile that this matters in the first place.

    In most cases, the properties of an object don't change frequently enough to require a mechanism like this to deal with it dynamically.

    So this may actually be solving the wrong problem.

    GSquared, I agree with you that the columns of the table should not be so volatile and if it is then there is something wrong.

    Please read my reply to Wayne. All I am trying here is to make changes within the limited options that I have.

    I am for Method #1 in original post.

    Thanks for replying 🙂

  • CirquedeSQLeil (12/14/2010)


    I would avoid the Select * method. Are you absolutely certain that all of the code has been adapted to properly handle all of the columns? Do you perform any unions based on this select * and possibly another select *? The use of select * in code may cause performance problems as well as cause code failures that were not expected (e.g. union statements).

    I agree that columns should not be changing so much that the ability to retrieve dynamically the columns from a table would be of real use.

    Yes, I am absolutely certain that the consumer code can handle all of the columns. So, to me the number, name, size, data type of columns does not matter. Plus to make you at peace the result is not going to the client in anyway. 🙂

    The consumers is a SQLCLR SP that parses this result set and writes to a file. I know there would be more questions about using DTS, BCP, SSIS, etc for this task but I have very limited options here. 🙂

    No, there are no unions performed and the queries are mostly SELECT <All Columns> FROM Table WHERE Condition1, Condition2...

    Now, with this background do you still recommend not using SELECT * query? I think SELECT * will be faster than Method #2 for sure plus low maintenance. I would like to implement Method #1 though. Please see my reply to Wayne for more information.

    Thanks,

    ASDL

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply