Find duplicates in datasets

  • Hello,

    Please note that I am all new to this, so my question can look a little dum... but here it is...

    I WOULD LIKE TO ADD THE FOLLOWING CONDITION (find duplicates):

    GROUP BY ISF.OperatingEntityNumber, ISF.QuestionnaireTypeCodeId

    HAVING COUNT(*) > 1;

    TO THE FOLLOWING QUERY:

    INSERT INTO SurveyInterface.tblLoadISFNotification (OperatingEntityNumber, SDDS, SurveyCodeId, QuestionnaireTypeCodeId, ReferencePeriod, DataReplacementIndicator, PrecontactFlag, SampledUnitPriority)

    SELECT ISF.OperatingEntityNumber

    ,[SDDS]

    ,[SurveyCodeId]

    ,[QuestionnaireTypeCodeId]

    ,[ReferencePeriod]

    ,[DataReplacementIndicator]

    ,[PrecontactFlag]

    ,[SampledUnitPriority]

    FROM dbo.tblISF40201507 ISF

    JOIN Operating.tblOperating O

    ON O.OperatingEntityNumber = ISF.OperatingEntityNumber

    JOIN Operating.tblAccountingTypeCharacteristic ATC

    ON O.OperatingId = ATC.Operatingid

    WHERE atc.AccountingTypeCharacteristicCodeId = 3 -- Allocation Entity

    IT’S JUST NOT WORKING!! Can you help me with this?

  • In a query that contains GROUP BY clause, all the columns in the select list must be in the GROUP BY clause on in an aggregate function.

    Example:

    SELECT ISF.OperatingEntityNumber, ISF.QuestionnaireTypeCodeId, COUNT(*)

    FROM dbo.tblISF40201507 ISF

    JOIN Operating.tblOperating O

    ON O.OperatingEntityNumber = ISF.OperatingEntityNumber

    JOIN Operating.tblAccountingTypeCharacteristic ATC

    ON O.OperatingId = ATC.Operatingid

    WHERE atc.AccountingTypeCharacteristicCodeId = 3 -- Allocation Entity

    GROUP BY ISF.OperatingEntityNumber, ISF.QuestionnaireTypeCodeId

    HAVING COUNT(*) > 1;

    -- Gianluca Sartori

  • Sorry... maybe I wasn't clear enough. English is not my primary language. Let me rephrase that...

    I need to make a selection on join datasets with 2 conditions and populate the results in another dataset(Report).

    It is working with the fist condition "AccountingTypeCharacteristicCodeId = 3"...

    INSERT INTO SurveyInterface.tblLoadISFNotification (OperatingEntityNumber, SDDS, SurveyCodeId, QuestionnaireTypeCodeId, ReferencePeriod, DataReplacementIndicator, PrecontactFlag, SampledUnitPriority)

    SELECT ISF.OperatingEntityNumber

    ,[SDDS]

    ,[SurveyCodeId]

    ,[QuestionnaireTypeCodeId]

    ,[ReferencePeriod]

    ,[DataReplacementIndicator]

    ,[PrecontactFlag]

    ,[SampledUnitPriority]

    FROM dbo.tblISF40201507 ISF

    JOIN Operating.tblOperating O

    ON O.OperatingEntityNumber = ISF.OperatingEntityNumber

    JOIN Operating.tblAccountingTypeCharacteristic ATC

    ON O.OperatingId = ATC.Operatingid

    WHERE atc.AccountingTypeCharacteristicCodeId = 3 -- Allocation Entity

    Know I also want to add in that new dataset(report) all the duplicates of concatenated variables ISF.OperatingEntityNumber/ISF.QuestionnaireTypeCodeId

    If I try what you suggested, it gives me the following error:

    The select list for the INSERT statement contains more items than the insert list. The number of SELECT values must match the number of INSERT columns.

    ... talking about the count(*) for sure!! But I don't want to output the count...

  • I suppose this should do:

    INSERT INTO SurveyInterface.tblLoadISFNotification (OperatingEntityNumber, SDDS, SurveyCodeId, QuestionnaireTypeCodeId, ReferencePeriod, DataReplacementIndicator, PrecontactFlag, SampledUnitPriority)

    SELECT

    OperatingEntityNumber

    ,[SDDS]

    ,[SurveyCodeId]

    ,[QuestionnaireTypeCodeId]

    ,[ReferencePeriod]

    ,[DataReplacementIndicator]

    ,[PrecontactFlag]

    ,[SampledUnitPriority]

    FROM (

    SELECT

    cnt = COUNT(*) OVER (PARTITION BY ISF.OperatingEntityNumber, ISF.QuestionnaireTypeCodeId),

    ISF.OperatingEntityNumber

    ,[SDDS]

    ,[SurveyCodeId]

    ,[QuestionnaireTypeCodeId]

    ,[ReferencePeriod]

    ,[DataReplacementIndicator]

    ,[PrecontactFlag]

    ,[SampledUnitPriority]

    FROM dbo.tblISF40201507 ISF

    JOIN Operating.tblOperating O

    ON O.OperatingEntityNumber = ISF.OperatingEntityNumber

    JOIN Operating.tblAccountingTypeCharacteristic ATC

    ON O.OperatingId = ATC.Operatingid

    WHERE atc.AccountingTypeCharacteristicCodeId = 3 -- Allocation Entity

    ) AS duplicates

    WHERE cnt > 1

    -- Gianluca Sartori

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply