Group By help to remove duplicates.

  • Having written the below pieces of code I'm struggling to isolate duplicates from my results.

    I have created 2 temporary tables called #AMPpos and #AMPneg which contain key values and payment amounts for specific policy keys.

    When I run this piece of code:

    SELECT AccountsModulePaymentKeyNeg,

    PaymentAmountOriginalCurrencyNeg ,

    AbsolutePaymentNeg ,

    PolicyKeyNeg ,

    AccountsModulePaymentKeyPos ,

    PaymentAmountOriginalCurrencyPos ,

    AbsolutePaymentPos ,

    PolicyKeyPos

    FROM #AMPneg

    JOIN #AMPpos

    ON #AMPneg.PolicyKeyNeg=#AMPpos.PolicyKeyPos

    AND #AMPneg.AbsolutePaymentNeg = #AMPpos.AbsolutePaymentPos

    WHERE #AMPneg.PolicyKeyNeg <> -1

    and #AMPpos.PolicyKeyPos <> -1

    ORDER BY PolicyKeyNeg, PolicyKeyPos, AccountsModulePaymentKeyNeg, AccountsModulePaymentKeyPos

    I am close to what I require however I am trying to isolate 1 row id for a negative premium value joined to a row id for a positive premium value based on a join of the absolute values of these premiums

    108266 -75073.5000 75073.5000 311517 63554 75073.5000 75073.5000 311517

    108266 -75073.5000 75073.5000 311517 93479 75073.5000 75073.5000 311517

    108266 -75073.5000 75073.5000 311517 93986 75073.5000 75073.5000 311517

    108266 -75073.5000 75073.5000 311517 108269 75073.5000 75073.5000 311517

    As you can see in the data row id 108266 is appearing 4 times becuase there are 4 row id's where the value 75073.5000 exists.

    How can I amend my sql so I only return 1 row id for the negative value and 1 row id for the positive value (or 2 and 2 if there are 2 negative and positive absolute values which match)to allow me to have matching transactions which I can then remove from a reporting query.

    The problem I'm having is that there might be the same value in 10 rows as a positive figure but only 6 negative ones. In that case I need to match 6 row id's up so I'm left with only 4 true positive values to show on the report.

    I had thought it might be to do with the join or group by but I just can't get it to work.

    Any help is hugely appreciated!!! ๐Ÿ™‚

    Regards

    Graeme

  • Yes thanks for the advice on posting. Just started a new job last week have been thrown in the deepend in a new industry working on helpdesk issues to get an understanding of the data structure.

    From an ETL SSIS background it's been a while since I've written code like this.

    Ok what I have written so far is this based on your advice. Which is working nicely however I am still confused on how I isolate within a specific policy key. If a policy key does not have a negative value then I don't need it at all.

    If it does have a negative then I need to select that and a positive value, then get the accountspaymentkey for the 2 values.

    Otherwise I just have a table which is flagging where the value is negative and positive. I need to match these neg/pos values into pairs (based on them having the same policyid and a matching payment amount (but one pos one neg) and get their row id's(AccountModulePaymentKey).

    SELECT AMP.accountsmodulepaymentkey,

    AMP.paymentAmountOriginalCurrency,

    ABS(AMP.PaymentAmountOriginalCurrency) AS absolutepremium,

    P.policykey ,

    AMP.PaymentDate,

    P.UnderwriterReference,

    SUM(CASE WHEN AMP.PaymentAmountOriginalCurrency < 0 THEN 1 ELSE 0 END)

    OVER (PARTITION BY AMP.AccountsModulePaymentKey, AMP.PaymentAmountOriginalCurrency,ABS(AMP.PaymentAmountOriginalCurrency), P.policykey

    )

    AS negative_amt_cnt,

    SUM(CASE WHEN AMP.PaymentAmountOriginalCurrency > 0 THEN 1 ELSE 0 END)

    OVER (PARTITION BY AMP.AccountsModulePaymentKey, AMP.PaymentAmountOriginalCurrency, ABS(AMP.PaymentAmountOriginalCurrency), P.policykey

    )

    AS positive_amt_cnt

    FROM dbo.tbl_AccountsModulePayment AMP

    JOIN dbo.tbl_AccountsModuleHeader AMH

    ON AMH.AccountsModuleHeaderKey=AMP.AccountsModuleHeaderKey

    JOIN dbo.tbl_Policy P

    ON P.PolicyKey=AMH.PolicyKey

    WHERE P.policykey <> -1

    AND amp.PaymentAmountOriginalCurrency <> 0

    AND AMP.PaymentCategoryKey IN (1,2,3)

    GROUP BY AMP.AccountsModulePaymentKey, AMP.PaymentAmountOriginalCurrency, P.policykey, AMP.PaymentDate, P.UnderwriterReference

    ORDER BY p.PolicyKey

    Eg below

    7748 -3638.8600 3638.8600 262743 1999-02-22 00:00:00.000 13975Z99A 1 0

    84735 -3638.8600 3638.8600 262743 1999-02-22 00:00:00.000 13975Z99A 1 0

    52345 3638.8600 3638.8600 262743 1999-02-22 00:00:00.000 13975Z99A 0 1

    21081 289.3300 289.3300 262743 2000-04-19 00:00:00.000 13975Z99A 0 1

    9612 3559.0000 3559.0000 262743 1999-07-26 00:00:00.000 13975Z99A 0 1

    92654 3559.0000 3559.0000 262743 1999-07-26 00:00:00.000 13975Z99A 0 1

    79692 3559.0000 3559.0000 262743 1999-02-22 00:00:00.000 13975Z99A 0 1

    2468 289.3300 289.3300 262743 2000-04-19 00:00:00.000 13975Z99A 0 1

    2046 3559.0000 3559.0000 262743 1999-02-22 00:00:00.000 13975Z99A 0 1

    7453 3638.8600 3638.8600 262743 1999-02-22 00:00:00.000 13975Z99A 0 1

    All these are payments within the same policyid but there are 2 negative values and I want to do a further select to match those against 2 positive rows where the absolute values are the same.

    Thats why I have been using the ABS function as how else am I going to match the values together if they are pos/neg.

    Essentially I am trying to end up with a solution where I have a table of data with pairs of positive and negative values which are the same within each policy key. Then I can flag in the report that these values have to be contra'd off against each other and they won't show in the SSRS report.

    Thanks

  • Sounds like this to me:

    ;WITH MyStructuredData AS (

    SELECT RowID = ROW_NUMBER() OVER(PARTITION BY PolicyKey, PaymentAmountOriginalCurrency

    ORDER BY PolicyKey, PaymentAmountOriginalCurrency, accountspaymentkey),

    PolicyKey, PaymentAmountOriginalCurrency, accountspaymentkey

    FROM MyTable

    )

    SELECT

    pos.PolicyKey,

    pos.accountspaymentkey,

    neg.accountspaymentkey,

    pos.PaymentAmountOriginalCurrency,

    neg.PaymentAmountOriginalCurrency

    FROM MyStructuredData pos

    INNER JOIN MyStructuredData neg

    ON neg.PolicyKey = pos.PolicyKey

    AND neg.accountspaymentkey = pos.accountspaymentkey

    AND neg.RowID = pos.RowID

    AND neg.PaymentAmountOriginalCurrency < 0

    WHERE pos.PaymentAmountOriginalCurrency > 0

    AND (neg.PaymentAmountOriginalCurrency + pos.PaymentAmountOriginalCurrency) = 0

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for everybody's help.

    I ended up using the RANK statement and attaching a Rank column and then joining the positive and negative values on the policykey, absolute value and rank.

    This meant that when there are 3 negatives but 5 positives they would match up.

    That allowed me to get those key values of the rows which needed updated.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply