JOIN screwing things up.

  • I have a query(below) that contains a LEFT OUTER JOIN. This JOIN is needed to get all the data I need. However, it is causing some totals to be skewed because some table IDs are in the JOINED table multiple(not duplicates) times.

    The problem is, the total for totMileage is skewed because there are 2 WCid's that have mulitple(not duplicate) entries in the workexpense table.

    One was in there 5 times, and the other 2.

    So essentially, the WCmileage for those WCid's was being multiplied by 5 and 2 respectively.

    So that is causing the totMileage to be off.

    Is there a way to fix this so the numbers are correct? -Thanks!

    -----

    select Uname, UnameFirst, UnameLast, WXlinkItemID, WCid,

    count(WCid) as totWorkedDays,

    sum(WCtaskAmount) as totWorkedUnits,

    sum(WCbillableAmount) as totBilledUnits,

    round(sum(WCtaskAmount * Urate),2) as totWages,

    round(sum(WCmileage), 2) as totMileage,

    round(sum(WXamount), 2) as totExpenses

    from workrequest, subjectrecord, worktask, users, internaloffice, workcompleted

    left outer join workexpense on WXlinkItemID = WCid

    where WCcompletedDate between '1/04/2007 00:00:00' and '2/2/2007 23:59:59'

    and WCcompletedDate is not null

    and WCemployeeID = Uname

    and WCtaskTypeID <> 'BIWTNON'

    and WClinkItemID = WTid

    and WTlinkItemID = WRid

    and UinternalOfficeID = IOid

    and UworkgroupTypeID <> 'TRUSXXX'

    and WRsubjectRecordID = SRid

    and Uname = 'sdowdell'

    group by Uname, UnameLast, UnameFirst, WXlinkItemID, WCid ORDER BY WCid

  • There is a way to fix it, but it requires the business rules.

    What does it *mean* to your business if there are 2 or more records in WorkExpense for the same WCid ? Is this a valid state, or does it represent a bug in whatever application created the data ? If it's valid, which 1 of the multiple records is the correct 1 to use in this case ? The most recent ? The 1 with the highest/lowest mileage ?

  • It looks like you have multiple tables, (subjectrecord, worktask, etc.) with no alias and no join. 

    Could you write this using alias's so we can tell which columns are attached to which table and you may need to join these other tables...

    Second, you may want to do a "Derived Table" for your summations.  LEFT JOIN( SELECT SUM( ColumnName) AS ColumName.... FROM workexpense) AS WE ON( WR.WXlinkItemID = WE.WCid) to accomplish the specific records to sum. 

    I wasn't born stupid - I had to study.

  • Thanks for your responses.

    The workexpense table does not actually contain any mileage data. It just contains data that I need for the totExpenses column. All my mileage data, WCmileage, is in the workcompleted table.

    But the WCid column from workcompleted is linked in the workexpense table as WXlinkItemId. This is needed to show that these expenses are related to the data in workcompleted table.

    So would a "derived table" still work in this instance?

  • Could you post your query without the extraneous tables and alias the names so we will know what columns belong to what tables? 

    Also, you may want to provide a bit of data so we can determine if we are getting the correct results. 

    Thanks

    I wasn't born stupid - I had to study.

  • Here is a simplified query that returns the mileage and WCid(for reference).

    If I take out the LEFT OUTER JOIN, the query returns the correct #'s. But I need that in the larger query for another column of data.

    SELECT sum(workcompleted.WCmileage) as totMiles, workcompleted.WCid

    FROM users, worktask, workrequest, workcompleted

    left outer join workexpense on workexpense.WXlinkItemID = workcompleted.WCid

    where 1=1 and users.Uname = 'sdowdell'

    and workcompleted.WCcompletedDate between '1/04/2007 00:00:00' and '2/2/2007 23:59:59'

    and Uname = workcompleted.WCemployeeID

    and workcompleted.WClinkItemID = WTid

    and worktask.WTlinkItemID = workrequest.WRid

    and workcompleted.WCmileage > 0

    GROUP BY workcompleted.WCid ORDER BY workcompleted.WCid

    Here is the data that it returns:

    40.0 --- WC70676

    45.0 --- WC70738

    48.0 --- WC70750

    108.0 --- WC70804

    74.0 --- WC70861

    32.0 --- WC70901

    76.0 --- WC70928

    43.0 --- WC70996

    37.0 --- WC71056

    42.0 --- WC71062

    36.0 --- WC71128

    1015.0 --- WC71146 --incorrect, multiplied 5 times due to the JOIN (should be 203.0)

    103.0 --- WC71203

    63.0 --- WC71235

    44.0 --- WC71262

    300.0 --- WC71287 --incorrect, multiplied 2 times due to the JOIN(should be 150.0)

    107.0 --- WC71356

    94.0 --- WC71398

    149.0 --- WC71516

    50.0 --- WC71538

    Here is the data that it SHOULD return.

    40.0 --- WC70676

    45.0 --- WC70738

    48.0 --- WC70750

    108.0 --- WC70804

    74.0 --- WC70861

    32.0 --- WC70901

    76.0 --- WC70928

    43.0 --- WC70996

    37.0 --- WC71056

    42.0 --- WC71062

    36.0 --- WC71128

    203.0 --- WC71146 --correct

    103.0 --- WC71203

    63.0 --- WC71235

    44.0 --- WC71262

    150.0 --- WC71287 --correct

    107.0 --- WC71356

    94.0 --- WC71398

    149.0 --- WC71516

    50.0 --- WC71538

  • I have no way of testing this, but see if this works.  BTW WHERE 1 = 1 is totally unnecssary. 

     

    SELECT SUM( WC.WCmileage) AS totMiles, WC.WCid

    FROM workcompleted WC

       INNER JOIN users U ON( U.Uname = WC.WCemployeeID) -- ?

       INNER JOIN worktask WT ON( WC.WClinkItemID = WT.WTid) -- ?

       INNER JOIN workrequest WR ON( WT.WTlinkItemID = WR.WRid)

       LEFT JOIN( SELECT DISTINCT WXlinkItemID FROM workexpense) WE ON( WC.WCid = WE.WXlinkItemID)

      AND U.Uname = 'sdowdell'

      AND WC.WCcompletedDate BETWEEN '1/04/2007 00:00:00' AND '2/2/2007 23:59:59'

    GROUP BY WC.WCid

    ORDER BY WC.WCid

    I wasn't born stupid - I had to study.

  • wow that generated over 46,000 rows! It should only be 20 or so. I am not sure why it would do that.

  • I missed one of your conditions:

    SELECT SUM( WC.WCmileage) AS totMiles, WC.WCid

    FROM workcompleted WC

       INNER JOIN users U ON( U.Uname = WC.WCemployeeID) -- ?

       INNER JOIN worktask WT ON( WC.WClinkItemID = WT.WTid) -- ?

       INNER JOIN workrequest WR ON( WT.WTlinkItemID = WR.WRid)

       LEFT JOIN( SELECT DISTINCT WXlinkItemID FROM workexpense) WE ON( WC.WCid = WE.WXlinkItemID)

     WHERE U.Uname = 'sdowdell'

      AND WC.WCcompletedDate BETWEEN '1/04/2007 00:00:00' AND '2/2/2007 23:59:59'

      AND WC.WCmileage > 0

    GROUP BY WC.WCid

    ORDER BY WC.WCid

    Because I have no data, I am "assuming" that the summations are doubled or tripled because workexpense has multiple records for some WXlinkItemID's and no records for others, hence the LEFT JOIN and my use of DISTINCT.  (I am not wont to use DISTINCT as a general rule). 

    I wasn't born stupid - I had to study.

  • Thanks! That works on my parred down query, but when I try to implement it on my real query, I get this error:

    Invalid column name 'WXamount'

    Here is the full query using what you wrote:

    select users.Uname, users.UnameFirst, users.UnameLast, WE.WXlinkItemID, WC.WCid,

    count(WC.WCid) as totWorkedDays,

    sum(WC.WCtaskAmount) as totWorkedUnits,

    sum(WC.WCbillableAmount) as totBilledUnits,

    round(sum(WC.WCtaskAmount * Urate),2) as totWages,

    round(sum(WC.WCmileage), 2) as totMileage,

    round(sum(WE.WXamount), 2) as totExpenses

    from workrequest, subjectrecord, worktask, users, internaloffice, workcompleted WC

    LEFT JOIN( SELECT DISTINCT WXlinkItemID FROM workexpense) WE ON( WC.WCid = WE.WXlinkItemID)

    where WC.WCcompletedDate between '1/04/2007 00:00:00' and '2/2/2007 23:59:59'

    and WC.WCcompletedDate is not null

    and WC.WCemployeeID = Uname

    and WC.WCtaskTypeID 'BIWTNON'

    and WC.WClinkItemID = WTid

    and worktask.WTlinkItemID = workrequest.WRid

    and users.UinternalOfficeID = internaloffice.IOid

    and users.UworkgroupTypeID 'TRUSXXX'

    and workrequest.WRsubjectRecordID = subjectrecord.SRid

    and users.Uname = 'sdowdell'

    group by users.Uname, users.UnameLast, users.UnameFirst, WE.WXlinkItemID, WC.WCid ORDER BY WC.WCid

  • Boy, this keeps changing...    Try this, (it uses the Derived Table concept):

     

    SELECT U.Uname, U.UnameFirst, U.UnameLast, WE.WXlinkItemID, WC.WCid,

                 COUNT( WC.WCid) AS totWorkedDays,

                 SUM( WC.WCtaskAmount) AS totWorkedUnits,

                 SUM( WC.WCbillableAmount) AS totBilledUnits,

                 ROUND( SUM( WC.WCtaskAmount * Urate), 2) AS totWages,

                 ROUND( SUM( WC.WCmileage), 2) AS totMileage,

                 WE.totExpenses

    FROM workcompleted WC

       INNER JOIN users U ON( WC.WCemployeeID = U.Uname)

       INNER JOIN internaloffice O ON( U.UinternalOfficeID = O.IOid)

       INNER JOIN worktask WT ON( WC.WClinkItemID = WT.WTid)

       INNER JOIN workrequest WR ON( WT.WTlinkItemID = WR.WRid)

       INNER JOIN subjectrecord SR ON( WR.WRsubjectRecordID = SR.SRid)

    LEFT JOIN( SELECT WXlinkItemID ROUND( SUM( WXamount), 2) AS totExpenses

                      FROM workexpense GROUP BY WXlinkItemID) WE ON( WC.WCid = WE.WXlinkItemID)

    WHERE WC.WCcompletedDate BETWEEN '1/04/2007 00:00:00' AND '2/2/2007 23:59:59'

      AND WC.WCcompletedDate IS NOT NULL

      AND WC.WCtaskTypeID <> 'BIWTNON'

      AND U.UworkgroupTypeID <> 'TRUSXXX'

      AND U.Uname = 'sdowdell'

    GROUP BY U.Uname, U.UnameLast, U.UnameFirst, WE.WXlinkItemID, WC.WCid

    ORDER BY WC.WCid

    I wasn't born stupid - I had to study.

  • Yeah, this is getting confusing

    I tried this, but it said Line 14: Incorrect syntax near '('.

    I looked over and over the statement, but all the parentheses look good to me. I can't figure out where it is erroring.

  • Oooops... Forgot a comma. 

    SELECT U.Uname, U.UnameFirst, U.UnameLast, WE.WXlinkItemID, WC.WCid,

                 COUNT( WC.WCid) AS totWorkedDays,

                 SUM( WC.WCtaskAmount) AS totWorkedUnits,

                 SUM( WC.WCbillableAmount) AS totBilledUnits,

                 ROUND( SUM( WC.WCtaskAmount * Urate), 2) AS totWages,

                 ROUND( SUM( WC.WCmileage), 2) AS totMileage,

                 WE.totExpenses

    FROM workcompleted WC

       INNER JOIN users U ON( WC.WCemployeeID = U.Uname)

       INNER JOIN internaloffice O ON( U.UinternalOfficeID = O.IOid)

       INNER JOIN worktask WT ON( WC.WClinkItemID = WT.WTid)

       INNER JOIN workrequest WR ON( WT.WTlinkItemID = WR.WRid)

       INNER JOIN subjectrecord SR ON( WR.WRsubjectRecordID = SR.SRid)

    LEFT JOIN( SELECT WXlinkItemID, ROUND( SUM( WXamount), 2) AS totExpenses

                      FROM workexpense GROUP BY WXlinkItemID) WE ON( WC.WCid = WE.WXlinkItemID)

    WHERE WC.WCcompletedDate BETWEEN '1/04/2007 00:00:00' AND '2/2/2007 23:59:59'

      AND WC.WCcompletedDate IS NOT NULL

      AND WC.WCtaskTypeID <> 'BIWTNON'

      AND U.UworkgroupTypeID <> 'TRUSXXX'

      AND U.Uname = 'sdowdell'

    GROUP BY U.Uname, U.UnameLast, U.UnameFirst, WE.WXlinkItemID, WC.WCid

    ORDER BY WC.WCid

    I wasn't born stupid - I had to study.

  • FYI... the WHERE 1=1 thing is what a lot of GUI programmers do when writing embedded SQL with dynamic WHERE clauses...  they will always have a valid WHERE clause and they can add conditions on a whim...

     OR it can be an indication that someone is trying to do a little SQL Injection depending on where it shows up.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Interesting!  Is this because a NULL returned to some GUI's causes a problem? 

    It strikes me as a poor programming technique and a NULL should be handled in the GUI better.  Can you explain?  [ SQL Injection is a serious issue if that is occurring... ]

    I wasn't born stupid - I had to study.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply