内容纲要
概要描述
什么是执行计划?
所谓执行计划,通俗点说就是,数据库服务器在执行sql语句的时候,会准备几套方案,最后选择消耗资源最小的那个方案。这就是执行计划。
对于sql而言,我们仅仅是描述出sql的动作,至于如何执行,是由数据库来决定的。
详细说明
如何获取执行计划?
-- 使用语法
EXPLAIN [COST|FORMATTED|EXTENDED|DEPENDENCY] hql_query
相比来说EXPLAIN
和EXPLAIN COST
这两个较为常用。
以下面的语句为例:
EXPLAIN
INSERT OVERWRITE TABLE
access_log_temp2
select
a.USER
,
a.prono,
p.maker,
p.price
FROM
access_log_hbase a
JOIN
product_hbase p
ON (a.prono = p.prono);
执行计划包含哪些内容?
· The Abstract Syntax Tree for the query
· The dependencies between the different stages of the plan
· The specific description of each of the stages
如何分析执行计划?
分析顺序:从下至上
-- explain cost 执行结果
INCEPTOR QUERY PLAN #0:
**FileSinkOperator
**SelectOperator
**JoinOperator (No numRows)
non-Cumulative-Cost: No estimation; CumulativeCost: No estimation;
**ReduceSinkOperator
**TableScanOperator (table name: default.access_log_hbase; alias: a)
**ReduceSinkOperator
**TableScanOperator (table name: default.product_hbase; alias: p)
Note: CBO failed, due to:
null
-- explain 执行结果
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME access_log_hbase) a) (TOK_TABREF (TOK_TABNAME product_hbase) p) (= (. (TOK_TABLE_OR_COL a) prono) (. (TOK_TABLE_OR_COL p) prono)))) (TOK_INSERT (TOK_DESTINATION (TOK_TAB (TOK_TABNAME access_log_temp2))) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) USER
)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) prono)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL p) maker)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL p) price)))))
STAGE DEPENDENCIES:
Stage-0 is a root stage [MAPRED]
Stage-1 depends on stages: Stage-0 [MOVE]
STAGE PLANS:
Stage: Stage-0
Stage: Stage-1
Move Operator
tables:
replace: true
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.access_log_temp2
INCEPTOR QUERY PLAN #0:
**io.transwarp.inceptor.execution.FileSinkOperator **************************
File Output Operator
compressed: false #表示文件输出的结果是否进行压缩,false表示不进行输出压缩
GlobalTableId: 1
table:
input format: org.apache.hadoop.mapred.TextInputFormat #分别表示文件输入和输出的文件类型
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe #表示读取表数据的序列化和反序列化方式
name: default.access_log_temp2
**io.transwarp.inceptor.execution.SelectOperator **************************
Select Operator
expressions: #表示需要筛选的列
expr: _col1
type: string
expr: _col0
type: int
expr: _col9
type: int
expr: _col10
type: int
outputColumnNames: _col0, _col1, _col2, _col3 #表示输出的列名
**io.transwarp.inceptor.execution.JoinOperator **************************
Join Operator
condition map:
Inner Join 0 to 1 #表示join的类型,这里是内连接
join expressions:
0 [Column[VALUE._col0], Column[VALUE._col1]]
1 [Column[VALUE._col1], Column[VALUE._col2]]
condition expressions: #表示join中两表分别包含的字段
0 {VALUE._col0} {VALUE._col1}
1 {VALUE._col1} {VALUE._col2}
handleSkewJoin: false: #inceptor中默认是false(set hive.optimize.skewjoin)在开启时,会对数据进行扫描并监测哪个key会出现倾斜,对于倾斜的key,用map join处理,不倾斜的key正常处理
outputColumnNames: _col0, _col1, _col9, _col10
**io.transwarp.inceptor.execution.ReduceSinkOperator **************************
Reduce Output Operator
key expressions: prono (type: int) #RS组合字段之key expression
sort order: + #按照prono正排
Map-reduce partition columns: prono (type: int) #按照prono求hash值分配reduce
value expressions: prono (type: int), user (type: string) #RS组合字段之value expression
**io.transwarp.inceptor.execution.FilterOperator **************************
Filter Operator
predicate:
expr: prono is not null
type: boolean
error forward: false
error tolerant: false
**io.transwarp.inceptor.execution.TableScanOperator **************************
TableScan
alias: a
**io.transwarp.inceptor.execution.ReduceSinkOperator **************************
Reduce Output Operator #这里描述map的输出,也就是reduce的输入。比如key,partition,sort等信息
key expressions: prono (type: int) #reduce job的key
sort order: + #这里表示按一个字段排序,如果是按两个字段排序,那么就会有两个+(++),更多以此类推
Map-reduce partition columns: prono (type: int) #partition的信息,由此也可以看出hive在join的时候会以join on后的列作为partition的列,以保证具有相同此列的值的行被分到同一个reduce中去
value expressions: maker (type: int), price (type: int)
**io.transwarp.inceptor.execution.FilterOperator **************************
Filter Operator
predicate:
expr: prono is not null
type: boolean
error forward: false
error tolerant: false
**io.transwarp.inceptor.execution.TableScanOperator **************************
TableScan
alias: p