How many JOINs is too many?

  • I'm having an argument with a co-worker (actually my boss) about how many JOINs is too many.

    He claims anything over 4 tables and he starts to see performance degradation. Let's assume that all the JOINs are based on NC or CL indexes, and the tables range from 100 rows to a few thousand.

    My stance on this subject is that 7 or 8 JOINs are better than using Temp tables.

    What do you all think?


    Rick Todd

  • I'd vote for more joins, less tables.

    Interesting topic and something that is worthy of a few tests. Look for something soon.

    Steve Jones

    sjones@sqlservercentral.com

    http://qa.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net

  • I would have to venture it would also depend on the data types joined, and how wide the tables were, and indexes, etc.....

    Tim C.

    //Will write code for food


    Tim C //Will code for food

  • Typically the fields joined would be INTs, perhaps with a sporadic varchar(32) thrown in. I would say an average table's width would be around...500 bytes. Do the indexes matter significantly if you're always JOINing on an index?


    Rick Todd

  • Also prefer using joins. Indexes on joins are very important for performance.

  • This doesnt say don't do it, but in there it does warn against using 4 or more joins, plus it has a couple other good join tips :

    http://www.sql-server-performance.com/tuning_joins.asp

    Tim C.

    //Will write code for food


    Tim C //Will code for food

  • When you deal with an extreme number of joins performance can degrade exponentially. 4 is common to see but I have situations with 12 or more joins and some are to the same table for multiple references. When he states he sees a degraded performance exactly what is he seeing that shows this problem and give a case in point. However, some degradation may be acceptable depending on the type of system and the requirements of data access. If the performance is a high concenr and the degradation is to high then you may consider denormalization. But first make sure ou have good indexing and that the joins are all required every time. If the join is to make a single view of the data and you don't need all the data on every piece of output then cater to the output not the concept of a single view (which is a common mistake).

  • Honestly, I'm not sure what his specific "degradation" is. I was hoping to discuss more general concepts in the thread, but general never seems to work, because performance depends so much on the specifics of the situation.

    I'll try to convince him to use more JOINs unless he can show me that temp tables are actually slower, rather than the other way around, which is how he does it now.

    Antares could you expand on this statement:

    If the join is to make a single view of the data and you don't need all the data on every piece of output then cater to the output not the concept of a single view (which is a common mistake).

    I'm not sure what you mean, but would like to.


    Rick Todd

  • It is common to see folks create a single view of the data from which to query and that can be more overhead than you need each time it runs. Here is an example:

    Table 1 = tblEmployees

    EmpID_PK,

    EmpName,

    HireDate,

    TerminationDate,

    TerminationReason

    Table2 = tblEmpPositionsSalary

    EmpID_FK,

    PosID_FK,

    StartDate,

    SalaryEffectiveDate,

    Salary

    Table3 = tblPositions

    PosID_PK,

    PosTitle,

    PayRangeMin,

    PatyRangeMax

    View = vwEmployeeData

    EmpID_K,

    EmpName,

    HireDate,

    TerminationDate,

    TerminationReason,

    EmpID_PK AS ManagerEmpID_K,

    EmpName AS ManaagerEmpName,

    PosTitle,

    StartDate,

    SalaryEffectiveDate,

    Salary

    Joining tblEmployees to tblEmployees and to tblEmpPositionsSalary to tblPositions)

    Then on a screen to list an Employee and his salary history (not showing position, just Employe Name on screen top and salary with salary effective date) the query may be written against the view which means 4 joins are done every time.

    Instead it is better for that screen to write either a new view or do a direct query joining tblEmployees directly to tblEmpPositionsSalary.

    Just make sure you are doing the minimal work needed for each output as you can.

    Probably even a non-issue but just wanted to point out. Reusable views are fine but make sure they don't cause more work than you need to perform.

  • Your boss is right, no more than 4 joins (unless if each table reside on a different drive), use declare myTable as Table instead of #tempTable, this is new in SQL 2k.

    quote:


    Honestly, I'm not sure what his specific "degradation" is. I was hoping to discuss more general concepts in the thread, but general never seems to work, because performance depends so much on the specifics of the situation.

    I'll try to convince him to use more JOINs unless he can show me that temp tables are actually slower, rather than the other way around, which is how he does it now.

    Antares could you expand on this statement:

    If the join is to make a single view of the data and you don't need all the data on every piece of output then cater to the output not the concept of a single view (which is a common mistake).

    I'm not sure what you mean, but would like to.


  • I have a query that includes a total of 11 joins all of which are joining on int. 3 of the JOINS are on different databases (same server)and one table has over 85000 records. Granted there are only 10 users at any one given time the system screams and we do not see any performance issues.

  • Like so many other issues - I would say the answer is "it depends". The biggest challenge is to figure out a viable test. For a specific selection, one method might be faster than another, but the impact to the system could be more debilitating.

    I've never had problems with joining many tables. I have also seen performance flip-flop when volumes and database size gets large. I have seen some temp tables cause big performance problems and other improve the performance. This is where you get to prove you are worth your position.

    My personal preference is to use more joins.

    Guarddata

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply