In what twisted SQL world will a JOIN result in more records than a LEFT JOIN

  • Thanks for the support and suggestions, I will see if I can cobble together a simplified version of the code to show the issue.  The code has so much clean-up work to get it into a joinable state that most of it would be meaningless.  I am on a different computer today so it will be Monday before I can touch the code again.
    Off topic but as an example of what I am having to do to clean the code.
    Table B has two fields: codes and matching values in two string fields which use CSV formatting
    Code = "123,234,345,456" Value ="100,15.85,85.22,17" 
    so I have to CROSS APPLY both strings through Jeffs (awesome) DelimitedSpilk8K and then remove the Cartesian joins in the WHERE clause.  The next challenge is that the genius that put the original tables together though it would be a good idea to take the multi-part business key and create a single compound key on each table.  Table A has a key in the format 12345*XYX*F*S*nnn*7890
    Table B has the key 12345*XYZ*F*S*7890 and the nnn is one of the values derived from the DelimitedSpik8K so I have to inject this into the middle of the record key to get the two keys to match.  I can't use the natural multipart key is not indexed so ends up with table scans  - ARRRGH!!!.  the derived Table B actually gets stored as a temp table so that I can index it.

    I think the problem is going to be some null records in the WHERE clause and may indicate an error further up the code, but I still don't see how logically an inner join can return more records than a left join.

  • Holy cr@p. What a nightmare.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • I would often expect a left join on some criteria to deliver more rows than an inner join on the same criteria.  All the rows that are delivered by the inner join are also delivered by the left join.   If there are rows in the left-had table that don't match anything in the RH table, there is nothing delivered by the inner join that takes data from any of those rows, but there is a row delivered by the left join for each of those rows (each with NULLs for the columns that would have come from the RH rable if there had been a match for this LH row).   So the number of extra rows delivered by the left join is the number of rows in the LH table that contribute nothing to the inner join (unless some of these rows are discarded by the where clause).  The whole point of left join as opposed to inner join is to deliver extra rows to indicate rows in the LH table that don't match.
    But a WHERE clause can easily throw way all the extra rows, for example
      select  [come columns] from  L  left  join   R on [join criteria] where R.x <> L.y
    will throw away all rows that have data only from the LH table.
    And a where clause can easily throw away other rows instead:-
      select  [some columns] from L left join R on [join criteria] where R.x IS NULL or L.y between 0 and 2
    throws away nothing from teh left join, while
      select  [some columns] from L      join R on [join criteria] where R.x IS NULL or L.y between 0 and 2
    will throw away any row in which the colum y of the LH component as a value other than 0, 1, or 2 that has matched a row in R where the x value is not NULL.
    So probably it's something like that as one of the conditions of the where clause that's causing the unusual situation of teh inner join having few rows than the left join.

    Tom

  • Are there additional joins in this query that could also be affecting the number of rows returning? It does sound like it's a bit of a mess. Best of luck with it.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • I've gone back on to the system and can't replicate the error today so maybe it was a friday brain fug. 🙂  Thanks to everyone who contributed to the thread,  - Moderators, please can you look at deleting the thread as I think it is now irrelevant.

  • Heh... Moderators? 😀

    I wouldn't delete this thread if I could.  It has a lot of good thoughts in it including the thought that sometimes things happen that might cause someone to go down a rabbit hole or two and then, inexplicably, the problem simply disappears or can't be duplicated.  It's all good stuff.  Thanks for taking the time to ask the questions and share what you've gone through.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply