Copying only rows with unique values from one table to another

  • Hi all,

    I am trying to copy rows from table Weekly into table Current. Table Weekly has 600 rows with a unique id column which contains 50 duplicates. I therefore only want 550 rows to be copied into table Current, which has the same structure.

    The query I have composed is as follows:-

    INSERT INTO current
    SELECT DISTINCT [Unique ID] AS UniqueID, Country AS Country, [Company Name] AS CompanyName, [Deal Name] AS DealName, [Line Amount] AS LineAmount, [SR Name] AS SalesRep, [Fiscal Quarter] AS FiscalQuarter, [WW Sales Stage] AS WWSalesStage

    FROM Weekly

    WHERE Country<>"Total";

    All the rows are being copied into Current. Where am I going wrong?

     

     

  • This should work with an aggregate query:

    INSERT INTO current
    SELECT [Unique ID] AS UniqueID, Country AS Country, [Company Name] AS CompanyName, [Deal Name] AS DealName, [Line Amount] AS LineAmount, [SR Name] AS SalesRep, [Fiscal Quarter] AS FiscalQuarter, [WW Sales Stage] AS WWSalesStage

    FROM Weekly

    WHERE Country<>"Total"

    GROUP BY [Unique ID], Country, [Company Name], [Deal Name], [Line Amount], [SR Name], [Fiscal Quarter], [WW Sales Stage];

     

     

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Surely, though, if it truly is a row Unique ID (I am assuming here that this is your PK), you will never be able to remove anything with SELECT DISTINCT, since the Unique ID will always be different. This will still be the case with the aggregate query, since grouping by a true Unique ID will leave you with one entry for each group. If this is the case, and you did not require the original PK from Weekly in the Current table, the first statement would be fine if you omitted the [Unique ID] column, as long as you had a PK in the Current table which allowed autogeneration of values.
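
    For example, something along these lines (just a rough sketch - the column names in the INSERT list are guesses based on the aliases in the original query, and it assumes Current's PK is an AutoNumber field):

    INSERT INTO current (Country, CompanyName, DealName, LineAmount, SalesRep, FiscalQuarter, WWSalesStage)
    SELECT DISTINCT Country, [Company Name], [Deal Name], [Line Amount], [SR Name], [Fiscal Quarter], [WW Sales Stage]
    FROM Weekly
    WHERE Country <> "Total";

    If the duplicate rows really are identical in every column other than the Unique ID, DISTINCT over the remaining columns will then collapse each set of duplicates down to a single row.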

  • Thanks very much for replies.

    The Weekly table does not have a PK, as it is linked to an Excel spreadsheet which is copied in on a weekly basis, which is why there are duplicates. There is a monetary column in each row of both tables - the aim is to sum the values of this column for each "Unique ID" in the Weekly table and then transfer just one occurrence, with the aggregated value, into the Current table, which does have a PK.

    Am I also right in thinking that in Access (unlike in Enterprise Manager) you can only include one SQL statement per query? It would be nice to have several queries execute at one time (to make life easier for the user).

  • >> ... from the Weekly table and then transfer just one occurrence with the aggregated value to the Current table which does have a PK. <<

    What is the PK in there? Can you post the structure of your Weekly and Current tables?

    >>Am I also right in thinking that in Access (unlike in Enterprise Manager) you can only include one SQL statement per query.<<

     

    Yes, you are correct! Access can nest the queries for you to implement the "subquery" feature. You do get things built in, like CrossTab queries, that SQL Server won't have until SQL 2005 goes live (with PIVOT and UNPIVOT). You also get VBA function flexibility, which you won't be able to use in SQL Server until 2005 comes out with CLR integration, etc. I am not saying that Access is superior; I am just implying that each has some good and some bad points, and you have to weigh the features you use in each one of them.
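
    For instance, a crosstab of the Line Amount by Country for each Unique ID would look something like this in Access SQL (just a sketch using the column names from earlier in this thread):

    TRANSFORM Sum([Line Amount]) AS TotalAmount
    SELECT [Unique ID]
    FROM Weekly
    WHERE Country <> "Total"
    GROUP BY [Unique ID]
    PIVOT Country;

    Each Unique ID becomes a row and each Country becomes its own column - the sort of thing that takes a lot more work in SQL Server 2000.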

     

    hth

     

     

     


    * Noel

  • Louise,

     

    I think Kathi was on the right track by using an aggregate query. You do have to keep in mind what it is that is duplicated in your Excel table and what it is that you would like to add up.

     

    It looks like you need the sum of the "Line Amount" column for each "Unique ID". If this is the case, then you only want to group by the UniqueID and not all of the other columns. For example, if you group by all the columns in the table, any rows that differ in even one of those columns will still come through as duplicates of the same Unique ID, as I think James was alluding to. So, if you have data that looks like this:

     

    UniqueID        Country           LineAmount

    1                      USA                100

    1                      France            200

    2                      USA                100

    2                      USA                300

     

    Then when you group by UniqueID and Country and Sum the LineAmount, you get this result:

     

    UniqueID        Country           LineAmount

    1                      USA                100

    1                      France            200

    2                      USA                400

     

    If what you want is to know that UniqueID 1 has a total of 300 and 2 has a total of 400, then you have to Group By just the UniqueID.

     

    So, with your data, that would look like this:

     

    SELECT [Unique ID], SUM([Line Amount]) AS TotalForUniqueID
    FROM Weekly
    WHERE Country <> 'Total'
    GROUP BY [Unique ID]

     

    Now, if, when you insert, you want to include all the other columns, you have to join the above query to a query of all your other data, like so:

     

    INSERT INTO current
    SELECT W.[Unique ID], Country, [Company Name], [Deal Name], WSums.TotalForUniqueID, [SR Name], [Fiscal Quarter], [WW Sales Stage]
    FROM Weekly AS W
    INNER JOIN (
        SELECT [Unique ID], SUM([Line Amount]) AS TotalForUniqueID
        FROM Weekly
        WHERE Country <> 'Total'
        GROUP BY [Unique ID]
    ) AS WSums
    ON W.[Unique ID] = WSums.[Unique ID]

     

    You really have to know how many of the columns are involved in the duplication in order to know what to group by to get the correct sum. If the only thing that changes with each duplicate line is the LineAmount, then you can use a query like Kathi's and just add SUM([Line Amount]); otherwise, you may have to post which columns are involved in the duplication, or more info about what your data looks like.
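
    For example, if the Line Amount really is the only column that varies between the duplicated rows, the whole thing can be collapsed into a single aggregate insert, something like this (just a sketch, assuming Current has the same columns in the same order as in the first post):

    INSERT INTO current
    SELECT [Unique ID], Country, [Company Name], [Deal Name], SUM([Line Amount]) AS LineAmount, [SR Name], [Fiscal Quarter], [WW Sales Stage]
    FROM Weekly
    WHERE Country <> 'Total'
    GROUP BY [Unique ID], Country, [Company Name], [Deal Name], [SR Name], [Fiscal Quarter], [WW Sales Stage]

    This only collapses to one row per Unique ID if none of the grouped columns differ between rows that share an ID; if any of them do differ, you are back to the join version above.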

     

  • I may have misunderstood the question, but I think she is saying that completely duplicate rows were accidentally added. If that is the case, my solution would work.

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Thanks so much for the great response. Our broadband has been down (good old BT) but we're back up now.

    The query runs OK, but I am still getting duplicates in the output table. I have isolated the rows where duplicates occur as follows:-

    Unique ID      Line Amount

    10                 200

    10                 400

    11                 300

    12                 200

    12                 200

    12                 400

    By studying the duplicate rows I have narrowed it down to only Unique ID 12 having a true duplicate, because the line amount is the same in 2 of its rows; for the other IDs the line amounts differ, so those rows are not exact duplicates.
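
    (If I instead group on just the Unique ID and sum the line amounts, as suggested above, I assume these rows would come through as 10 = 600, 11 = 300 and 12 = 800, which is the result I am after.)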
