对PostgreSQL Merge Join 的理解

开始

伪代码:

http://momjian.us/main/writings/pgsql/optimizer.pdf



sort(outer);            

sort(inner);            

i = 0;            

j = 0;            

save_j = 0;            

while (i < length(outer))            

{            

    if (outer[i] == inner[j])        

        output(outer[i], inner[j]);    

            

            

    if (outer[i] <= inner[j] && j < length(inner))        

    {        

        j++;    

        if (outer[i] < inner[j])    

            save_j = j;

    }        

    else        

    {        

        i++;    

        j = save_j;    

    }        

}            

上述描述中,可以把两列排序好的数组看成 由大到小排列。

 Merge Join 先要对各表各自排序,然后从各自的排序表中抽取数据,到另一个排序表中做匹配。
通常来讲,能够使用merge join的地方,hash join 更快。

验证:

postgres=# EXPLAIN SELECT relname,nspname FROM pg_class join

        pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);

                               QUERY PLAN                                

-------------------------------------------------------------------------

 Hash Join  (cost=1.14..16.02 rows=290 width=128)

   Hash Cond: (pg_class.relnamespace = pg_namespace.oid)

   ->  Seq Scan on pg_class  (cost=0.00..10.90 rows=290 width=68)

   ->  Hash  (cost=1.06..1.06 rows=6 width=68)

         ->  Seq Scan on pg_namespace  (cost=0.00..1.06 rows=6 width=68)

(5 rows)



postgres=# 







postgres=# set session enable_hashjoin=false;

SET

postgres=# EXPLAIN SELECT relname,nspname FROM pg_class join

        pg_namespace ON (pg_class.relnamespace = pg_namespace.oid);

                               QUERY PLAN                                

-------------------------------------------------------------------------

 Merge Join  (cost=23.90..28.28 rows=290 width=128)

   Merge Cond: (pg_namespace.oid = pg_class.relnamespace)

   ->  Sort  (cost=1.14..1.15 rows=6 width=68)

         Sort Key: pg_namespace.oid

         ->  Seq Scan on pg_namespace  (cost=0.00..1.06 rows=6 width=68)

   ->  Sort  (cost=22.76..23.49 rows=290 width=68)

         Sort Key: pg_class.relnamespace

         ->  Seq Scan on pg_class  (cost=0.00..10.90 rows=290 width=68)

(8 rows)



postgres=# 

[作者:技术者高健@博客园  mail: [email protected] ]

改了 join 的顺序 对 结果也没有影响。

postgres=# EXPLAIN SELECT relname,nspname FROM pg_namespace join

        pg_class ON (pg_class.relnamespace = pg_namespace.oid);

                               QUERY PLAN                                

-------------------------------------------------------------------------

 Merge Join  (cost=23.90..28.28 rows=290 width=128)

   Merge Cond: (pg_namespace.oid = pg_class.relnamespace)

   ->  Sort  (cost=1.14..1.15 rows=6 width=68)

         Sort Key: pg_namespace.oid

         ->  Seq Scan on pg_namespace  (cost=0.00..1.06 rows=6 width=68)

   ->  Sort  (cost=22.76..23.49 rows=290 width=68)

         Sort Key: pg_class.relnamespace

         ->  Seq Scan on pg_class  (cost=0.00..10.90 rows=290 width=68)

(8 rows)



postgres=# 
postgres=# EXPLAIN SELECT relname,nspname FROM pg_namespace,

        pg_class where (pg_class.relnamespace = pg_namespace.oid);

                               QUERY PLAN                                

-------------------------------------------------------------------------

 Merge Join  (cost=23.90..28.28 rows=290 width=128)

   Merge Cond: (pg_namespace.oid = pg_class.relnamespace)

   ->  Sort  (cost=1.14..1.15 rows=6 width=68)

         Sort Key: pg_namespace.oid

         ->  Seq Scan on pg_namespace  (cost=0.00..1.06 rows=6 width=68)

   ->  Sort  (cost=22.76..23.49 rows=290 width=68)

         Sort Key: pg_class.relnamespace

         ->  Seq Scan on pg_class  (cost=0.00..10.90 rows=290 width=68)

(8 rows)



postgres=# 

结束

你可能感兴趣的:(PostgreSQL)