![]() ![]() It then tries to rewrite the apply as a regular join to maximize the chances of finding a good plan. The important takeaway at this stage is that the optimizer always removes subqueries, replacing them with an apply. Anyway, the interested reader is encouraged to explore the above simplification activities with other logically-equivalent ways of writing this semi join in T-SQL. It would be nice to be able to express queries more directly like this. Indeed, this is exactly the same tree we would get immediately if T-SQL supported syntax like: Not every apply can be rewritten as a join, but the current example is straightforward and succeeds: This is something the optimizer always tries to do, because it has more exploration rules for joins than it does for apply. The next step is to rewrite the apply as a regular join using the ApplyHandler rule family. (The same subquery-removal rule produces the same output for the SOME input tree as well). The result is an apply (a.k.a correlated or lateral join): The optimizer always does this, since it cannot operate on subqueries directly. The next thing the optimizer does is to unnest the subquery in the relational selection (= filter) using rule RemoveSubqInSel. That tree is a pretty direct translation of the query text though note that the SELECT * has already been replaced by a projection of the constant integer value 1 (see the penultimate line of text). ProductID ) OPTION ( QUERYTRACEON 3604, QUERYTRACEON 8606, QUERYTRACEON 8621 ) We can explore early this optimization activity with a few undocumented trace flags, which send information about optimizer activity to the SSMS messages tab.įor example, the semi join we have been using so far can also be written using IN: The ANY/SOME/IN representations are all interpreted as a SOME operation. Both EXISTS and INTERSECT are parsed as an EXISTS with correlated subquery. The EXISTS and INTERSECT cases differ only in that the latter comes with an implicit DISTINCT (grouping on all projected columns). The two main workaround syntax groups are EXISTS/INTERSECT, and ANY/SOME/IN. The common workaround semi join syntaxes are transformed to a "real" internal semi join early in the query compilation process (well before even a trivial plan is considered). Early OptimizationsĪlthough T-SQL lacks direct SEMI JOIN syntax, the optimizer knows all about semi joins natively, and can manipulate them directly. Naturally, the final plan choice is still a cost-based decision among the explored alternatives. The transformation to inner join is explored early because the optimizer knows more tricks for inner equijoins than it does for semi joins, potentially leading to more optimization opportunities. The execution plan shows that the optimizer introduced an aggregate (grouping on INV.ProductID) to ensure that the inner join can only return Product rows once, or not at all (as required to preserve the semi join semantics): The AdventureWorks example below shows a semi join being removed entirely, due to a trusted foreign key relationship: Transformation and simplificationĪ logical semi join might be simplified away or replaced with something else during query compilation and optimization. Nevertheless, a logical semi join expressed in T-SQL might not lead to an execution plan using a row goal for several reasons, which we will unpack next. ![]() The description of a semi join above naturally hints at the application of a row goal, since we are interested in finding any matching row in B, not all such rows. T-SQL currently lacks support for direct syntax like FROM A SEMI JOIN B ON A.x = B.y, so we need to use indirect forms like EXISTS, SOME/ANY (including the equivalent shorthand IN for equality comparisons), and set INTERSECT. ![]() Regular join may return columns from either (or both) join inputs.Semi join is defined to only return columns from input A.Regular join duplicates rows if there are multiple matches on the join predicate.Semi join either returns each row from input A, or it does not.The essential differences between a semi join and a regular join are: This article will help you understand when, and why, a semi join invokes the optimizer's row goal logic.Ī semi join returns a row from one join input (A) if there is at least one matching row on the other join input (B). It is rather less commonly appreciated that semi joins (and anti joins) can introduce a row goal as well, though this is somewhat less likely than is the case for TOP, FAST, and SET ROWCOUNT. It is relatively well-known that using TOP or a FAST n query hint can set a row goal in an execution plan (see Setting and Identifying Row Goals in Execution Plans if you need a refresher on row goals and their causes). Part 1: Setting and Identifying Row Goals.This post is part a series of articles about row goals. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |