Join Types & Execution Plans

Question

Post reply

Join Types & Execution Plans

Nic Hopper-586355

Ten Centuries

Points: 1131
More actions
February 12, 2008 at 3:51 am

#71566

Hi,

I have spent some time recently tweaking a pretty intensive procedure, one thing i noticed that raised a question is with regards to Hash matches in the execution plan, do these always require space to be used on tempdb (and then given back) for the hash table, or does the execution plan ever do it in memory.

Does anyone have a link for some detailed information regarding all the join types/Execution methods?

Thanks,

Nic

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 1

Hash matches happen in memory, unless the tash table gets too big. In that case, it spils to disk. It's called a hash bailout. There's an event in profiler that records this.

SQL 2000 Books Online

In-Memory Hash Join

The hash join first scans or computes the entire build input and then builds a hash table in memory. Each row is inserted into a hash bucket depending on the hash value computed for the hash key. If the entire build input is smaller than the available memory, all rows can be inserted into the hash table. This build phase is followed by the probe phase. The entire probe input is scanned or computed one row at a time, and for each probe row, the hash key's value is computed, the corresponding hash bucket is scanned, and the matches are produced.

Grace Hash Join

If the build input does not fit in memory, a hash join proceeds in several steps. Each step has a build phase and probe phase. Initially, the entire build and probe inputs are consumed and partitioned (using a hash function on the hash keys) into multiple files. The number of such files is called the partitioning fan-out. Using the hash function on the hash keys guarantees that any two joining records must be in the same pair of files. Therefore, the task of joining two large inputs has been reduced to multiple, but smaller, instances of the same tasks. The hash join is then applied to each pair of partitioned files.

Recursive Hash Join

If the build input is so large that inputs for a standard external merge sorts would require multiple merge levels, multiple partitioning steps and multiple partitioning levels are required. If only some of the partitions are large, additional partitioning steps are used for only those specific partitions. In order to make all partitioning steps as fast as possible, large, asynchronous I/O operations are used so that a single thread can keep multiple disk drives busy.

As for the multiple join types, there's a summary here[/url] (Blatent self-promotion) with links to far more in depth articles. If you have questions after, please post them here.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass