SQL SERVER执行计划选择原理
一直遇到多张表进行连接后执行计划随着连接数目的变化而变化的现象,最初以为跟连接的数目有关,后来发现是不对的。求教老大。
一张大表(132W数据行),分别与4、5、6张纬度表相连,4张表时0-1秒,5张表时10-16秒,6张表时0-1秒。由于这几张表调用都很频繁,所以都在缓存中。
我使用了set statistics profile on和set statistics io on等选项进行了观察。由于外侧表都非常小,只有几行或是几行,个别表只有1行,所以大部分外侧表都没有建立索引,所以连接大多执行的是hash join。但是经过观察后并不是因为sql server选择hash join 而不是nested loop所造成的,而是其中一张只有一行的表坐在的缓存页被重复读了132W次。
示例如下:
5张表连接:
select count(1)
from Fact_test_MS_brand a11
inner join LU_region a16 on (a11.Region_id = a16.Region_id)
inner join LU_POPDB a15 on (a11.POPDB_id = a15.POPDB_id)
inner join LU_category a14 on (a11.Category_id = a14.Category_id)
inner join LU_brand a13 on (a11.Brand_id = a13.Brand_id)
Table 'LU_brand'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_category'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_POPDB'. Scan count 17, logical reads 428, physical reads 0, read-ahead reads 52, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_region'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Fact_test_MS_brand'. Scan count 17, logical reads 34933, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(19 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 3019 ms, elapsed time = 313 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
--------------------------------------
6张表连接:
select count(1)
from Fact_test_MS_brand a11
inner join LU_region a16 on (a11.Region_id = a16.Region_id)
inner join LU_POPDB a15 on (a11.POPDB_id = a15.POPDB_id)
inner join LU_category a14 on (a11.Category_id = a14.Category_id)
inner join LU_brand a13 on (a11.Brand_id = a13.Brand_id)
inner join lu_city a12 on a11.city_id=a12.city_id
Table 'LU_POPDB'. Scan count 17, logical reads 428, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_brand'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_city'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_region'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_category'. Scan count 16, logical reads 1327428, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Fact_test_MS_brand'. Scan count 17, logical reads 34933, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(22 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 123233 ms, elapsed time = 10659 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
-------------------------------------
8张表连接:
select count(1)
from Fact_test_MS_brand a11
inner join LU_region a16 on (a11.Region_id = a16.Region_id)
inner join LU_POPDB a15 on (a11.POPDB_id = a15.POPDB_id)
inner join LU_category a14 on (a11.Category_id = a14.Category_id)
inner join LU_brand a13 on (a11.Brand_id = a13.Brand_id)
inner join lu_city a12 on a11.city_id=a12.city_id
inner join lu_chain a17 on a11.chain_id=a17.chain_id
inner join dbo.LU_area a18 on a11.area_id=a18.area_id
Table 'LU_POPDB'. Scan count 17, logical reads 428, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_brand'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_chain'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_area'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_category'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_city'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LU_region'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Fact_test_MS_brand'. Scan count 17, logical reads 34933, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(28 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 6218 ms, elapsed time = 514 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
-----------------------
注明:'LU_category'表只有1行数据。
可以明显看出,在6-7张表连接时,'LU_category'表的页被重复读了内侧表的行数,其他时候都只有1次。
各位请不要在反复贴关于join的3种连接原理的帖子,我都看过了,请谈谈自己的理解。
恳请老大指点,谢谢。
[解决办法]
[解决办法]
看看查询优化方面的内容吧。主要是生成多个执行计划,然后计算每个计划所花费的资源,然后选择花费资源最小的那个
[解决办法]
关注...
[解决办法]
有时间再看。
[解决办法]
要搞清楚它的优化逻辑,还是比较难啊。可以看看数据库理论方面的内容,另外可以看看sql server2005技术内幕一套书 Inside SQL Server 2005: Query Tuning and Optimization 这一本,还有胡百敬写的一本sql server2005性能调教
[解决办法]
微软的执行计划也有算不对的时候.. 这个是比较郁闷的事
[解决办法]
这是高难度研究.精神值得鼓励.
[解决办法]
>>而是其中一张只有一行的表坐在的缓存页被重复读了132W次。
不是吧,这不成了子查询了?
[解决办法]
帮顶。
学习
------解决方案--------------------
.
还有 sql server 误算成本等等.
[解决办法]
补充2点:
1.因为Fact_test_MS_brand和LU_*数据量悬殊很大,SQL Server很容易就可以决定让LU_*表作为HJ的组建输入,
这样查询所要消耗memroy很少。如11楼的查询计划所示。
2.当SQL Server会选择用NL,并将LU_*作为内部输入表,而不是外部输入表。
可以检查一下查询计划,如20楼8张表联接的查询计划,加粗部分操作符的Estimated Number of Rows是否偏小,
或是和Actual Number of Rows相差较大
select count(1) from Fact_test_MS_brand a11 inner join LU_region a16 on (a11.Region_id = a16.Region_id) inner join LU_POPDB a15 on (a11.POPDB_id = a15.POPDB_id) inner join LU_category a14 on (a11.Category_id = a14.Category_id) inner join LU_brand a13 on (a11.Brand_id = a13.Brand_id) inner join lu_city a12 on a11.city_id=a12.city_id inner join lu_chain a17 on a11.chain_id=a17.chain_id inner join dbo.LU_area a18 on a11.area_id=a18.area_id
|--Compute Scalar(DEFINE:([Expr1023]=CONVERT_IMPLICIT(int,[globalagg1025],0)))
|--Stream Aggregate(DEFINE:([globalagg1025]=SUM([partialagg1024])))
|--Hash Match(Inner Join, HASH:([a17].[Chain_id])=([a11].[Chain_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Chain_id] as [a11].[Chain_id]=[TPMMonitors].[dbo].[LU_chain].[Chain_id] as [a17].[Chain_id]))
|--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_chain] AS [a17]))
|--Hash Match(Aggregate, HASH:([a11].[Chain_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Chain_id] as [a11].[Chain_id] = [TPMMonitors].[dbo].[Fact_test_MS_brand].[Chain_id] as [a11].[Chain_id]) DEFINE:([partialagg1024]=COUNT(*)))
|--Hash Match(Inner Join, HASH:([a15].[POPDB_id])=([a11].[POPDB_id]))
|--Clustered Index Scan(OBJECT:([TPMMonitors].[dbo].[LU_POPDB].[PK_LU_POPDB] AS [a15]))
|--Nested Loops(Inner Join, WHERE:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Category_id] as [a11].[Category_id]=[TPMMonitors].[dbo].[LU_category].[Category_id] as [a14].[Category_id]))
|--Hash Match(Inner Join, HASH:([a12].[City_id])=([a11].[City_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[City_id] as [a11].[City_id]=[TPMMonitors].[dbo].[LU_city].[City_id] as [a12].[City_id]))
| |--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_city] AS [a12]))
| |--Hash Match(Inner Join, HASH:([a16].[Region_id])=([a11].[Region_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Region_id] as [a11].[Region_id]=[TPMMonitors].[dbo].[LU_region].[Region_id] as [a16].[Region_id]))
| |--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_region] AS [a16]))
| |--Hash Match(Inner Join, HASH:([a18].[Area_id])=([a11].[Area_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Area_id] as [a11].[Area_id]=[TPMMonitors].[dbo].[LU_area].[Area_id] as [a18].[Area_id]))
| |--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_area] AS [a18]))
| |--Hash Match(Inner Join, HASH:([a13].[Brand_id])=([a11].[Brand_id]), RESIDUAL:([TPMMonitors].[dbo].[Fact_test_MS_brand].[Brand_id] as [a11].[Brand_id]=[TPMMonitors].[dbo].[LU_brand].[Brand_id] as [a13].[Brand_id]))
| |--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_brand] AS [a13]))
| |--Table Scan(OBJECT:([TPMMonitors].[dbo].[Fact_test_MS_brand] AS [a11]))
|--Table Scan(OBJECT:([TPMMonitors].[dbo].[LU_category] AS [a14]))
[解决办法]
这个问题讨论到此也差不多了,再继续个人认为意义不大。而从这个例子我们只要能明白以下几点就可以了:
1.
在某些场景中,NL带来的大量的内部表index scan(seek)操作,性能上不如HJ一次Table(CLustered)scan来的
有效率,毕竟后者可以大大减少逻辑I/O的数量,其实也就是在降低cpu的消耗。而在一些其他的
场景情况却是相反的,HJ在减少逻辑I/O的同时,消耗的CPU却变的大了,原因可能是两种逻辑I/O数相差的不是那么的
"大"。
SET STATISTICS TIME ONSET STATISTICS IO ONGOUSE NorthwindGOSELECT * FROM dbo.[Orders] INNER LOOP JOIN dbo.[Order Details] ON dbo.Orders.OrderID = dbo.[Order Details].OrderIDSELECT * FROM dbo.[Orders] INNER HASH JOIN dbo.[Order Details] ON dbo.Orders.OrderID = dbo.[Order Details].OrderIDGO