Excluding duplicated columns after join

  • Hi,

    I am trying to join 2 tables and store the result in a new table using 'select * into from ...'

    Sounds simple, but the problem is that the result contains duplicated columns and the table creation fails.

    Example:

    table1: id, col1

    table2: id, col2

    results: id, col1, id, col2

    Does anyone know if there is a generic way to exclude the duplicated column?

    Alternatively, do you perhaps know of a better way to achieve the same result?

    Thanks for your help,

    Frank

  • It's hard to know exactly what you are trying to achieve and how much data we are talking about.

    In any case, this should work for reasonably sized tables.

    1. Create a temporary table and populate with data from both tables.

    2. Use this table to insert into the main destination table using a select with distinct values on the column.

    Alternatively, you can use EXIST to check each row before insert, but it will be slow. Also, I am guessing you do not mind which table takes precedence.

    If you supply more details i.e. how many columns match etc,  I am sure you will get many more ideas.


    ------------------------------
    The Users are always right - when I'm not wrong!

  • Thanks for your reply.

    The data or speed does not really matter, nor does the number of columns which by the way varies as my generic stored proc is supposed to run on different databases with different design for the 2 tables.

    The only thing that's guaranteed is that both tables have always 1 identical column which I can use for the join.

    So I don't know how I can create a generic output table without removing the duplicated column first. Is there a way to create a table with a duplicated column name? Is there a better way than using 'select * into...'?

    Thanks again for your help.

  • first u need to use group by clause then u can eliminate duplicate column values by using having clause

    you need

    group by col1,col2

    having count(*)=1

    <a href="http://www.websolsoftware.com"> For IT jobs click here</a>

    *Sukhoi*[font="Arial Narrow"][/font]

  • I don't think there's any easy way to generally identify and remove duplicated column names that result from a join. If you use a "SELECT * " while joining two tables then this will result in all the columns from both the tables being returned (and if column names are duplicated in the tables being joined then the duplicates will be returned)

    One (convoluted) way to make this generic would be to use dynamic SQL to seek through the values in the INFORMATION_SCHEMA.COLUMNS view for the tables being joined and generate a SELECT list that removes the duplicate columns...This however is based on the assumption that if the column names are repeated in the tables being joined then the data they contain is the same (not always what I've seen)...

    I hope that makes sense....

  • Did you try DISTINCT keyword

    SELECT DISTINCT *

     

    Alex

     

  • Are you sure you want to do a join? It almost looks like you want to do a UNION query, i.e.,

    select id, column1, '1' as tblid

    from table1

    union all

    select id, column2, '2'

    from table2

    If you just want the distinct set between the two, change

    union all to just union, and leave off the final field:

    select id, column1

    from table1

    union

    select id, column2

    from table2

    If the ID field is common between the two, then you might need to do something like:

    select id, column1, null as column2

    from table1

    union all

    select id, null, column2

    from table2

    this could give you a table like:

    1 54

    2 55

    1 1004

    2 1003

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply