inner join gives me a huge result set when packageId equals packageId from another table

  • I work on SQL Server 2014 and I am facing an issue when joining two tables with an inner join.

    Table package has 100,000 rows

    and table package2 has 100,000 rows,

    so the join can produce a result of up to

    100,000 * 100,000 rows

    and it takes too much time.

    So how do I handle that big-data issue?

    Is there any way to get the data quickly?

    create table #package
    (
        packageId int,
        PartId int
    )

    insert into #package (packageId, PartId)
    values
        (1121, 2311),
        (1121, 9911),
        (1121, 2020),
        (1121, 5052),
        (1194, 4311),
        (1194, 9812),
        (1194, 2391),
        (1194, 5541)

    create table #package2
    (
        packageId int,
        PartId int
    )

    insert into #package2 (packageId, PartId)
    values
        (1121, 2377),
        (1121, 2111),
        (1121, 3420),
        (1121, 5057),
        (1194, 4388),
        (1194, 9912),
        (1194, 1091),
        (1194, 6441)

    select p.PartId as partc, p2.PartId as partx
    from #package p
    inner join #package2 p2 on p.packageId = p2.packageId
  • "It takes too much time" -- do you have any indexes on the tables, like on the primary/foreign keys that you're doing your joins on? Did you look at the query's execution plan? You might want to download Grant Fritchey's book on reading execution plans; it should help a lot.
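
    For the temp tables in the sample code, covering the join column with an index might look like this (the index names and the choice of clustered indexes here are illustrative, not prescriptive):

    create clustered index IX_package_packageId on #package (packageId, PartId);
    create clustered index IX_package2_packageId on #package2 (packageId, PartId);

    With both tables indexed on packageId, the optimizer can choose a merge join instead of scanning and hashing everything.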

  • After creating indexes,

    is there anything else remaining that can enhance performance?

  • Add RAM & SSD (especially for tempdb).

    Johan


  • The example query doesn't have a WHERE clause, so you're moving every single bit of the data. Indexes will help a little bit, but not much. A query that moves simply everything is only assisted by hardware. More, bigger, faster CPU. More, faster, disk. More, more, more, faster, memory. When you're simply moving all the data, all the time, you have no other choice for performance.

    Now, toss a WHERE clause or some other filtering mechanism in there, and we can talk.
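
    For instance, a filter as simple as restricting to one package (using the sample tables from the question) gives an index something to work with:

    select p.PartId as partc, p2.PartId as partx
    from #package p
    inner join #package2 p2 on p.packageId = p2.packageId
    where p.packageId = 1121;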

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Can I use a while loop to divide the data into batches,

    then insert them into a table,

    or is that a bad solution?
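
    A batching loop of the kind being asked about might be sketched like this (assuming a hypothetical target table #results; note that the same total work is still done, so this mainly spreads the load rather than reducing it):

    declare @minId int, @maxId int, @batch int = 1000;
    select @minId = min(packageId), @maxId = max(packageId) from #package;

    while @minId <= @maxId
    begin
        insert into #results (partc, partx)
        select p.PartId, p2.PartId
        from #package p
        inner join #package2 p2 on p.packageId = p2.packageId
        where p.packageId between @minId and @minId + @batch - 1;

        set @minId = @minId + @batch;
    end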

  • Why are you only joining on packageId when both tables have packageId and PartId? If you join on only one part of the key, then you will get this type of issue.

    Based on your example, though, you wouldn't get anything because you don't have matching package and part values in both tables.

    In your example, for packageId 1121 you will get 4 rows from #package2 for each row in #package.
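
    Joining on both columns of the key would look like this; against the sample data it returns zero rows, because no (packageId, PartId) pair exists in both tables:

    select p.PartId as partc, p2.PartId as partx
    from #package p
    inner join #package2 p2
        on p.packageId = p2.packageId
        and p.PartId = p2.PartId;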

    Jeffrey Williams

  • I suspect that the amount of time you are experiencing is the time for the data to be shown on the screen, not the time for SQL to find it.

    What are you trying to do with all the data?

    If it is for display, NOBODY is going to read 10mil rows of data.


  • To add to what everyone else has said, network performance could be a bottleneck too. Those tables don't look like they hold that much data in terms of MB or GB, so I doubt the network is the bottleneck, but if you have a slow network between your workstation and the SQL server, a faster network connection MAY help.

    I do agree with everyone, though: better hardware will help more than indexes. That said, indexes on the joined columns should offer a benefit as well if they don't already exist.

    It is not shown here, but just in case - doing cross database queries or using a linked server can cause slowness due to the query optimizer guessing the wrong number of rows.  Cross database queries and linked servers don't share their statistics with the optimizer, so the optimizer guesses there will be only 1 row which can result in a poor plan.  Pulling all of the cross database/linked server data into a temp table (or table variable) can help with performance.
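
    A sketch of that approach, using a hypothetical linked server name LinkedSrv and remote database RemoteDb (the names are placeholders, not from the original post):

    select packageId, PartId
    into #remotePackage2
    from LinkedSrv.RemoteDb.dbo.package2;

    select p.PartId as partc, r.PartId as partx
    from #package p
    inner join #remotePackage2 r on p.packageId = r.packageId;

    Once the remote data is local, the optimizer has real statistics on the temp table and can build a sensible plan.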

