Sybase NNTP forums - End Of Life (EOL)

The NNTP forums from Sybase - forums.sybase.com - are now closed.

All new questions should be directed to the appropriate forum at the SAP Community Network (SCN).

Individual products have links to the respective forums on SCN, or you can go to SCN and search for your product in the search box (upper right corner) to find your specific developer center.

Merge Join over Nested Loop Join

7 posts in General Discussion Last posting was on 2010-02-17 16:10:26.0Z
Yaniv C. Posted on 2010-02-17 08:27:06.0Z
Sender: 65de.4b7ba750.1804289383@sybase.com
From: Yaniv C.
Newsgroups: sybase.public.ase.general
Subject: Merge Join over Nested Loop Join
X-Mailer: WebNews to Mail Gateway v1.1t
Message-ID: <4b7ba85a.6620.1681692777@sybase.com>
NNTP-Posting-Host: 10.22.241.41
X-Original-NNTP-Posting-Host: 10.22.241.41
Date: 17 Feb 2010 00:27:06 -0800
X-Trace: forums-1-dub 1266395226 10.22.241.41 (17 Feb 2010 00:27:06 -0800)
X-Original-Trace: 17 Feb 2010 00:27:06 -0800, 10.22.241.41
Lines: 11
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub sybase.public.ase.general:28970
Article PK: 78210

Hi
I am using ASE 15.0.2 ESD 6.
I would like to know, when merge join ( or hash join) is
better,faster than nested loop join.
For each tests that I made, NLJ is faster than Merge Join.
Can someone give me the minimum schema,data and query that
merge join for sure will be faster than NLJ

Thanks,

Yaniv


Rob V [ Sybase ] Posted on 2010-02-17 09:27:54.0Z
Reply-To: "Rob V [ Sybase ]" <robv@DO.NOT.SPAM.sypron.nl.REMOVE.THIS.DECOY>
From: "Rob V [ Sybase ]" <robv@DO.NOT.SPAM.sypron.nl.REMOVE.THIS.DECOY>
Newsgroups: sybase.public.ase.general
References: <4b7ba85a.6620.1681692777@sybase.com>
Subject: Re: Merge Join over Nested Loop Join
Lines: 66
Organization: Sypron BV / TeamSybase / Sybase
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5843
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579
X-RFC2646: Format=Flowed; Original
NNTP-Posting-Host: vip152.sybase.com
X-Original-NNTP-Posting-Host: vip152.sybase.com
Message-ID: <4b7bb69a$1@forums-1-dub>
Date: 17 Feb 2010 01:27:54 -0800
X-Trace: forums-1-dub 1266398874 10.22.241.152 (17 Feb 2010 01:27:54 -0800)
X-Original-Trace: 17 Feb 2010 01:27:54 -0800, vip152.sybase.com
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub sybase.public.ase.general:28971
Article PK: 78214

An example is a query that hit many rows in both tables being joined: those
tend to benefit fro MJs.
For example, let's say we have a table t1 with customer IDs and order
numbers, and table t2 with the same customer ID and address details.
If you want to make a list of all the order of customers in a particular
area (or if you want to determine the collective value of orders in an
area), you need to join both tables on customer ID: in t1 you need to access
the order details, and in t2 you need to access the address data.
With an MJ, both tables will be scanned only once, whereas with an NLJ the
inner table will be accessed by walking down its index for every row that
qualifies in the outer table. So if you'd hit 1000 rows in the outer table,
an NLJ would cause 1000*H logical I/Os for the inner table (where H is the
height of the index). If an MJ would be used, you'd only have 1000/P LIOs
(where P is the number of rows per page). This is somewhat simplified but it
shows how MJs can be more efficient than NLJs. Note than I'm assuming a 1-1
match here for simplicity of the example, whereas in reality there may be
multiple matching rows in the inner table (amplifying the benefit of an MJ).
As another example, imagine you have a large customer table which has been
split vertically (i.e. one half of the attributes in each table with the
primary key in both tables). To join them back together you need to match
each and every row. An MJ could be more efficient here.

MJs will work best when there is an index on the join columns, since it
requires that the rows in both tables can be accessed in sorted order. If no
such index exists, either reformatting (or a SORT step) may be inserted by
the optimizer, or a hash join may be more efficient.
MJs have some internal overhead, so they tend to become more efficient only
when more than a handful of rows are accessed -- for a typical OLTP query
which hits only one row (or a few at most), an MJ will likely be slower than
an NLJ even though the logical I/O count may still be lower.

HTH,

Rob V.
-----------------------------------------------------------------
Rob Verschoor

Certified Sybase Professional DBA for ASE 15.0/12.5/12.0/11.5/11.0
and Replication Server 15.0.1/12.5 // TeamSybase

Author of Sybase books (order online at www.sypron.nl/shop):
"Tips, Tricks & Recipes for Sybase ASE" (ASE 15 edition)
"The Complete Sybase ASE Quick Reference Guide"
"The Complete Sybase Replication Server Quick Reference Guide"

mailto:rob@YOUR.SPAM.sypron.nl.NOT.FOR.ME
http://www.sypron.nl
Sypron B.V., Amersfoort, The Netherlands
Chamber of Commerce 27138666
-----------------------------------------------------------------

<Yaniv C.> wrote in message news:4b7ba85a.6620.1681692777@sybase.com...
> Hi
> I am using ASE 15.0.2 ESD 6.
> I would like to know, when merge join ( or hash join) is
> better,faster than nested loop join.
> For each tests that I made, NLJ is faster than Merge Join.
> Can someone give me the minimum schema,data and query that
> merge join for sure will be faster than NLJ
>
> Thanks,
>
> Yaniv


Rob V [ Sybase ] Posted on 2010-02-17 12:45:41.0Z
Reply-To: "Rob V [ Sybase ]" <robv@DO.NOT.SPAM.sypron.nl.REMOVE.THIS.DECOY>
From: "Rob V [ Sybase ]" <robv@DO.NOT.SPAM.sypron.nl.REMOVE.THIS.DECOY>
Newsgroups: sybase.public.ase.general
References: <4b7bb69a$1@forums-1-dub> <4b7bd3a9.6e90.1681692777@sybase.com>
Subject: Re: Merge Join over Nested Loop Join
Lines: 270
Organization: Sypron BV / TeamSybase / Sybase
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5843
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579
X-RFC2646: Format=Flowed; Original
NNTP-Posting-Host: vip152.sybase.com
X-Original-NNTP-Posting-Host: vip152.sybase.com
Message-ID: <4b7be4f5@forums-1-dub>
Date: 17 Feb 2010 04:45:41 -0800
X-Trace: forums-1-dub 1266410741 10.22.241.152 (17 Feb 2010 04:45:41 -0800)
X-Original-Trace: 17 Feb 2010 04:45:41 -0800, vip152.sybase.com
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub sybase.public.ase.general:28977
Article PK: 78215

Here you go.... it's just a quick hack, but the NLJ takes about 90-100
millisec on my server, wheras the MJ takes 20-30 millisec. Of course, those
time may be different on your system.



create table t1 (pk1 int, c1 int, fill1 char(100))
go
create table t2 (pk2 int, c2 int, fill2 char(100))
go

set rowcount 10000
go
insert t1 select rand2()*100000, rand2()*100, '' from syscolumns c1,
syscolumns c2, syscolumns c3
go
insert t2 select pk1, rand2()*100, '' from t1
go
set rowcount 0
go

create index ix1 on t1(pk1, c1)
go
create index ix2 on t2(pk2, c2)
go


-- this is the query....
select s=sum(c2) into #t from t1, t2 where pk1=pk2 group by c1
go



HTH,

Rob V.
-----------------------------------------------------------------
Rob Verschoor

Certified Sybase Professional DBA for ASE 15.0/12.5/12.0/11.5/11.0
and Replication Server 15.0.1/12.5 // TeamSybase

Author of Sybase books (order online at www.sypron.nl/shop):
"Tips, Tricks & Recipes for Sybase ASE" (ASE 15 edition)
"The Complete Sybase ASE Quick Reference Guide"
"The Complete Sybase Replication Server Quick Reference Guide"

mailto:rob@YOUR.SPAM.sypron.nl.NOT.FOR.ME
http://www.sypron.nl
Sypron B.V., Amersfoort, The Netherlands
Chamber of Commerce 27138666
-----------------------------------------------------------------

<Yaniv C.> wrote in message news:4b7bd3a9.6e90.1681692777@sybase.com...
>I took your example and implelent it,still , NLJ wins:
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXX With Merge Join
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>
> C:\Documents and Settings\YanCohen>isql -Udbs -PXXX -w300
> -Dyaniv -p
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> set nl_join off
> 3> set merge_join on
> 4> set statistics plancost on
> 5> go
> Execution Time (ms.): 15 Clock Time (ms.):
> 15
> 1>
> 2> select count(big1.c)
> 3> from big1 , big2
> 4> where big1.a = big2.a
> 5> and big2.b = 213
> 6> go
>
> -----------
> 684
>
> ==================== Lava Operator Tree ====================
>
>
> Emit
> (VA = 5)
> r:1 er:1
> cpu: 0
>
>
> /
> ScalarAgg
> Count
> (VA = 4)
> r:1 er:1
> cpu: 700
>
> /
> MergeJoin
> Inner Join
> (VA = 3)
> r:684 er:685
>
>
> / \
> Sort IndexScan
> (VA = 1) nc1Big1
> r:684 er:685 (VA = 2)
> l:6 el:6 r:599215 er:600000
> p:0 ep:0 l:2529 el:2534
> cpu: 0 bufct: 24 p:0 ep:2534
> /
> IndexScan
> nc2Big2
> (VA = 0)
> r:684 er:685
> l:692 el:692
> p:0 ep:659
>
> ============================================================
>
> (1 row affected)
> Execution Time (ms.): 750 Clock Time (ms.):
> 766
>
>
>
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXX With NLJ
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> C:\Documents and Settings\YanCohen>isql -Udbs -PXXX -w300
> -Dyaniv -p
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> set nl_join on
> 3> set merge_join off
> 4> set statistics plancost on
> 5> go
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> select count(big1.c)
> 3> from big1 , big2
> 4> where big1.a = big2.a
> 5> and big2.b = 213
> 6> go
>
> -----------
> 684
>
> ==================== Lava Operator Tree ====================
>
>
> Emit
> (VA = 4)
> r:1 er:1
> cpu: 0
>
>
> /
> ScalarAgg
> Count
> (VA = 3)
> r:1 er:1
> cpu: 0
>
> /
> NestLoopJoin
> Inner Join
> (VA = 2)
> r:684 er:685
>
>
> / \
> IndexScan IndexScan
> nc2Big2 nc1Big1
> (VA = 0) (VA = 1)
> r:684 er:685 r:684 er:685
> l:692 el:692 l:1421 el:2053
> p:0 ep:659 p:0 ep:687
>
> ============================================================
>
> (1 row affected)
> Execution Time (ms.): 47 Clock Time (ms.):
> 47
> 1>
>
>
>
>> An example is a query that hit many rows in both tables
>> being joined: those tend to benefit fro MJs.
>> For example, let's say we have a table t1 with customer
>> IDs and order numbers, and table t2 with the same customer
>> ID and address details. If you want to make a list of all
>> the order of customers in a particular area (or if you
>> want to determine the collective value of orders in an
>> area), you need to join both tables on customer ID: in t1
>> you need to access the order details, and in t2 you need
>> to access the address data. With an MJ, both tables will
>> be scanned only once, whereas with an NLJ the inner table
>> will be accessed by walking down its index for every row
>> that qualifies in the outer table. So if you'd hit 1000
>> rows in the outer table, an NLJ would cause 1000*H logical
>> I/Os for the inner table (where H is the height of the
>> index). If an MJ would be used, you'd only have 1000/P
>> LIOs (where P is the number of rows per page). This is
>> somewhat simplified but it shows how MJs can be more
>> efficient than NLJs. Note than I'm assuming a 1-1 match
>> here for simplicity of the example, whereas in reality
>> there may be multiple matching rows in the inner table
>> (amplifying the benefit of an MJ). As another example,
>> imagine you have a large customer table which has been
>> split vertically (i.e. one half of the attributes in each
>> table with the primary key in both tables). To join them
>> back together you need to match each and every row. An MJ
>> could be more efficient here.
>>
>> MJs will work best when there is an index on the join
>> columns, since it requires that the rows in both tables
>> can be accessed in sorted order. If no such index exists,
>> either reformatting (or a SORT step) may be inserted by
>> the optimizer, or a hash join may be more efficient. MJs
>> have some internal overhead, so they tend to become more
>> efficient only when more than a handful of rows are
>> accessed -- for a typical OLTP query which hits only one
>> row (or a few at most), an MJ will likely be slower than
>> an NLJ even though the logical I/O count may still be
>> lower.
>>
>> HTH,
>>
>> Rob V.
>> ----------------------------------------------------------
>> ------- Rob Verschoor
>>
>> Certified Sybase Professional DBA for ASE
>> 15.0/12.5/12.0/11.5/11.0 and Replication Server
>> 15.0.1/12.5 // TeamSybase
>>
>> Author of Sybase books (order online at
>> www.sypron.nl/shop): "Tips, Tricks & Recipes for Sybase
>> ASE" (ASE 15 edition) "The Complete Sybase ASE Quick
>> Reference Guide" "The Complete Sybase Replication Server
>> Quick Reference Guide"
>>
>> mailto:rob@YOUR.SPAM.sypron.nl.NOT.FOR.ME
>> http://www.sypron.nl
>> Sypron B.V., Amersfoort, The Netherlands
>> Chamber of Commerce 27138666
>> ----------------------------------------------------------
>> -------
>>
>> <Yaniv C.> wrote in message
>> > news:4b7ba85a.6620.1681692777@sybase.com... Hi
>> > I am using ASE 15.0.2 ESD 6.
>> > I would like to know, when merge join ( or hash join) is
>> > better,faster than nested loop join.
>> > For each tests that I made, NLJ is faster than Merge
>> > Join. Can someone give me the minimum schema,data and
>> > query that merge join for sure will be faster than NLJ
>> >
>> > Thanks,
>> >
>> > Yaniv
>>
>>
>>


"Mark A. Parsons" <iron_horse Posted on 2010-02-17 12:51:38.0Z
From: "Mark A. Parsons" <iron_horse@no_spamola.compuserve.com>
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221)
MIME-Version: 1.0
Newsgroups: sybase.public.ase.general
Subject: Re: Merge Join over Nested Loop Join
References: <4b7bb69a$1@forums-1-dub> <4b7bd3a9.6e90.1681692777@sybase.com>
In-Reply-To: <4b7bd3a9.6e90.1681692777@sybase.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Antivirus: avast! (VPS 100126-1, 01/26/2010), Outbound message
X-Antivirus-Status: Clean
NNTP-Posting-Host: vip152.sybase.com
X-Original-NNTP-Posting-Host: vip152.sybase.com
Message-ID: <4b7be65a$1@forums-1-dub>
Date: 17 Feb 2010 04:51:38 -0800
X-Trace: forums-1-dub 1266411098 10.22.241.152 (17 Feb 2010 04:51:38 -0800)
X-Original-Trace: 17 Feb 2010 04:51:38 -0800, vip152.sybase.com
Lines: 245
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub sybase.public.ase.general:28978
Article PK: 78216

Another perspective re: big1 table accesses ...

In both cases it looks like index covering is being used to access the big1 table (only 1 column - b - reference from
big1; index nc1Big1 used to access big1).

MJ:
========

- 599,215 records accessed
- 2,529 logical ios required (ie, scan entire index)
- 0.00422 ios required per record accessed


NLJ:
========

- 684 records accessed
- 1,421 logical ios required
- approx 2 ios required per record accessed


Using the above numbers to generate rough io estimates ...

Accessing 599,215 records via NLJ would require approx 1.2 million ios (599,215 * 2); so 2,529 ios for a MJ makes more
sense. [Obviously (?) a MJ may require other overhead, eg, worktable space in tempdb to hold intermediate records for
sorting, cpu for sorting said worktable.]

Accessing 684 records via MJ (via full index scan) would require 2,529 ios; so 1,421 ios for a NLJ makes more sense.

Yaniv C. wrote:
> I took your example and implelent it,still , NLJ wins:
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXX With Merge Join
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>
> C:\Documents and Settings\YanCohen>isql -Udbs -PXXX -w300
> -Dyaniv -p
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> set nl_join off
> 3> set merge_join on
> 4> set statistics plancost on
> 5> go
> Execution Time (ms.): 15 Clock Time (ms.):
> 15
> 1>
> 2> select count(big1.c)
> 3> from big1 , big2
> 4> where big1.a = big2.a
> 5> and big2.b = 213
> 6> go
>
> -----------
> 684
>
> ==================== Lava Operator Tree ====================
>
>
> Emit
> (VA = 5)
> r:1 er:1
> cpu: 0
>
>
> /
> ScalarAgg
> Count
> (VA = 4)
> r:1 er:1
> cpu: 700
>
> /
> MergeJoin
> Inner Join
> (VA = 3)
> r:684 er:685
>
>
> / \
> Sort IndexScan
> (VA = 1) nc1Big1
> r:684 er:685 (VA = 2)
> l:6 el:6 r:599215 er:600000
> p:0 ep:0 l:2529 el:2534
> cpu: 0 bufct: 24 p:0 ep:2534
> /
> IndexScan
> nc2Big2
> (VA = 0)
> r:684 er:685
> l:692 el:692
> p:0 ep:659
>
> ============================================================
>
> (1 row affected)
> Execution Time (ms.): 750 Clock Time (ms.):
> 766
>
>
>
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> XXX With NLJ
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> C:\Documents and Settings\YanCohen>isql -Udbs -PXXX -w300
> -Dyaniv -p
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> set nl_join on
> 3> set merge_join off
> 4> set statistics plancost on
> 5> go
> Execution Time (ms.): 0 Clock Time (ms.):
> 0
> 1>
> 2> select count(big1.c)
> 3> from big1 , big2
> 4> where big1.a = big2.a
> 5> and big2.b = 213
> 6> go
>
> -----------
> 684
>
> ==================== Lava Operator Tree ====================
>
>
> Emit
> (VA = 4)
> r:1 er:1
> cpu: 0
>
>
> /
> ScalarAgg
> Count
> (VA = 3)
> r:1 er:1
> cpu: 0
>
> /
> NestLoopJoin
> Inner Join
> (VA = 2)
> r:684 er:685
>
>
> / \
> IndexScan IndexScan
> nc2Big2 nc1Big1
> (VA = 0) (VA = 1)
> r:684 er:685 r:684 er:685
> l:692 el:692 l:1421 el:2053
> p:0 ep:659 p:0 ep:687
>
> ============================================================
>
> (1 row affected)
> Execution Time (ms.): 47 Clock Time (ms.):
> 47
> 1>
>
>
>
>> An example is a query that hit many rows in both tables
>> being joined: those tend to benefit fro MJs.
>> For example, let's say we have a table t1 with customer
>> IDs and order numbers, and table t2 with the same customer
>> ID and address details. If you want to make a list of all
>> the order of customers in a particular area (or if you
>> want to determine the collective value of orders in an
>> area), you need to join both tables on customer ID: in t1
>> you need to access the order details, and in t2 you need
>> to access the address data. With an MJ, both tables will
>> be scanned only once, whereas with an NLJ the inner table
>> will be accessed by walking down its index for every row
>> that qualifies in the outer table. So if you'd hit 1000
>> rows in the outer table, an NLJ would cause 1000*H logical
>> I/Os for the inner table (where H is the height of the
>> index). If an MJ would be used, you'd only have 1000/P
>> LIOs (where P is the number of rows per page). This is
>> somewhat simplified but it shows how MJs can be more
>> efficient than NLJs. Note than I'm assuming a 1-1 match
>> here for simplicity of the example, whereas in reality
>> there may be multiple matching rows in the inner table
>> (amplifying the benefit of an MJ). As another example,
>> imagine you have a large customer table which has been
>> split vertically (i.e. one half of the attributes in each
>> table with the primary key in both tables). To join them
>> back together you need to match each and every row. An MJ
>> could be more efficient here.
>>
>> MJs will work best when there is an index on the join
>> columns, since it requires that the rows in both tables
>> can be accessed in sorted order. If no such index exists,
>> either reformatting (or a SORT step) may be inserted by
>> the optimizer, or a hash join may be more efficient. MJs
>> have some internal overhead, so they tend to become more
>> efficient only when more than a handful of rows are
>> accessed -- for a typical OLTP query which hits only one
>> row (or a few at most), an MJ will likely be slower than
>> an NLJ even though the logical I/O count may still be
>> lower.
>>
>> HTH,
>>
>> Rob V.
>> ----------------------------------------------------------
>> ------- Rob Verschoor
>>
>> Certified Sybase Professional DBA for ASE
>> 15.0/12.5/12.0/11.5/11.0 and Replication Server
>> 15.0.1/12.5 // TeamSybase
>>
>> Author of Sybase books (order online at
>> www.sypron.nl/shop): "Tips, Tricks & Recipes for Sybase
>> ASE" (ASE 15 edition) "The Complete Sybase ASE Quick
>> Reference Guide" "The Complete Sybase Replication Server
>> Quick Reference Guide"
>>
>> mailto:rob@YOUR.SPAM.sypron.nl.NOT.FOR.ME
>> http://www.sypron.nl
>> Sypron B.V., Amersfoort, The Netherlands
>> Chamber of Commerce 27138666
>> ----------------------------------------------------------
>> -------
>>
>> <Yaniv C.> wrote in message
>>> news:4b7ba85a.6620.1681692777@sybase.com... Hi
>>> I am using ASE 15.0.2 ESD 6.
>>> I would like to know, when merge join ( or hash join) is
>>> better,faster than nested loop join.
>>> For each tests that I made, NLJ is faster than Merge
>>> Join. Can someone give me the minimum schema,data and
>>> query that merge join for sure will be faster than NLJ
>>>
>>> Thanks,
>>>
>>> Yaniv
>>
>>


Yaniv C. Posted on 2010-02-17 16:10:26.0Z
Sender: 65de.4b7ba750.1804289383@sybase.com
From: Yaniv C.
Newsgroups: sybase.public.ase.general
Subject: Re: Merge Join over Nested Loop Join
X-Mailer: WebNews to Mail Gateway v1.1t
Message-ID: <4b7c14f2.7961.1681692777@sybase.com>
References: <4b7be65a$1@forums-1-dub>
NNTP-Posting-Host: 10.22.241.41
X-Original-NNTP-Posting-Host: 10.22.241.41
Date: 17 Feb 2010 08:10:26 -0800
X-Trace: forums-1-dub 1266423026 10.22.241.41 (17 Feb 2010 08:10:26 -0800)
X-Original-Trace: 17 Feb 2010 08:10:26 -0800, 10.22.241.41
Lines: 269
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub sybase.public.ase.general:28979
Article PK: 78217

Hi Mark and Rob

I understand now.
I took Rob's code and test it.
Correct, the merge join run faster than NLJ.

Thanks for the effort.

Yaniv C.

> Another perspective re: big1 table accesses ...
>
> In both cases it looks like index covering is being used
> to access the big1 table (only 1 column - b - reference
> from big1; index nc1Big1 used to access big1).
>
> MJ:
> ========
>
> - 599,215 records accessed
> - 2,529 logical ios required (ie, scan entire index)
> - 0.00422 ios required per record accessed
>
>
> NLJ:
> ========
>
> - 684 records accessed
> - 1,421 logical ios required
> - approx 2 ios required per record accessed
>
>
> Using the above numbers to generate rough io estimates ...
>
> Accessing 599,215 records via NLJ would require approx 1.2
> million ios (599,215 * 2); so 2,529 ios for a MJ makes
> more sense. [Obviously (?) a MJ may require other
> overhead, eg, worktable space in tempdb to hold
> intermediate records for sorting, cpu for sorting said
> worktable.]
>
> Accessing 684 records via MJ (via full index scan) would
> require 2,529 ios; so 1,421 ios for a NLJ makes more
> sense.
>
>
>
> Yaniv C. wrote:
> > I took your example and implelent it,still , NLJ wins:
> >
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXXX XXX With Merge Join
> >
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXXX
> > C:\Documents and Settings\YanCohen>isql -Udbs -PXXX
> > -w300 -Dyaniv -p
> > Execution Time (ms.): 0 Clock Time
> > (ms.): 0
> > 1>
> > 2> set nl_join off
> > 3> set merge_join on
> > 4> set statistics plancost on
> > 5> go
> > Execution Time (ms.): 15 Clock Time
> > (ms.): 15
> > 1>
> > 2> select count(big1.c)
> > 3> from big1 , big2
> > 4> where big1.a = big2.a
> > 5> and big2.b = 213
> > 6> go
> >
> > -----------
> > 684
> >
> > ==================== Lava Operator Tree
> > ====================
> >
> > Emit
> > (VA = 5)
> > r:1 er:1
> > cpu: 0
> >
> >
> > /
> > ScalarAgg
> > Count
> > (VA = 4)
> > r:1 er:1
> > cpu: 700
> >
> > /
> > MergeJoin
> > Inner Join
> > (VA = 3)
> > r:684 er:685
> >
> >
> > / \
> > Sort IndexScan
> > (VA = 1) nc1Big1
> > r:684 er:685 (VA = 2)
> > l:6 el:6 r:599215 er:600000
> > p:0 ep:0 l:2529 el:2534
> > cpu: 0 bufct: 24 p:0 ep:2534
> > /
> > IndexScan
> > nc2Big2
> > (VA = 0)
> > r:684 er:685
> > l:692 el:692
> > p:0 ep:659
> >
> >
> ==========================================================
> > ==
> > (1 row affected)
> > Execution Time (ms.): 750 Clock Time
> > (ms.): 766
> >
> >
> >
> >
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXXX XXX With NLJ
> >
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXXX C:\Documents and Settings\YanCohen>isql -Udbs
> > -PXXX -w300 -Dyaniv -p
> > Execution Time (ms.): 0 Clock Time
> > (ms.): 0
> > 1>
> > 2> set nl_join on
> > 3> set merge_join off
> > 4> set statistics plancost on
> > 5> go
> > Execution Time (ms.): 0 Clock Time
> > (ms.): 0
> > 1>
> > 2> select count(big1.c)
> > 3> from big1 , big2
> > 4> where big1.a = big2.a
> > 5> and big2.b = 213
> > 6> go
> >
> > -----------
> > 684
> >
> > ==================== Lava Operator Tree
> > ====================
> >
> > Emit
> > (VA = 4)
> > r:1 er:1
> > cpu: 0
> >
> >
> > /
> > ScalarAgg
> > Count
> > (VA = 3)
> > r:1 er:1
> > cpu: 0
> >
> > /
> > NestLoopJoin
> > Inner Join
> > (VA = 2)
> > r:684 er:685
> >
> >
> > / \
> > IndexScan IndexScan
> > nc2Big2 nc1Big1
> > (VA = 0) (VA = 1)
> > r:684 er:685 r:684 er:685
> > l:692 el:692 l:1421 el:2053
> > p:0 ep:659 p:0 ep:687
> >
> >
> ==========================================================
> > ==
> > (1 row affected)
> > Execution Time (ms.): 47 Clock Time
> > (ms.): 47
> > 1>
> >
> >
> >
> >> An example is a query that hit many rows in both tables
> >> being joined: those tend to benefit fro MJs.
> >> For example, let's say we have a table t1 with customer
> >> IDs and order numbers, and table t2 with the same
> customer >> ID and address details. If you want to make a
> list of all >> the order of customers in a particular area
> (or if you >> want to determine the collective value of
> orders in an >> area), you need to join both tables on
> customer ID: in t1 >> you need to access the order details
> , and in t2 you need >> to access the address data. With
> an MJ, both tables will >> be scanned only once, whereas
> with an NLJ the inner table >> will be accessed by walking
> down its index for every row >> that qualifies in the
> outer table. So if you'd hit 1000 >> rows in the outer
> table, an NLJ would cause 1000*H logical >> I/Os for the
> inner table (where H is the height of the >> index). If an
> MJ would be used, you'd only have 1000/P >> LIOs (where P
> is the number of rows per page). This is >> somewhat
> simplified but it shows how MJs can be more >> efficient
> than NLJs. Note than I'm assuming a 1-1 match >> here for
> simplicity of the example, whereas in reality >> there may
> be multiple matching rows in the inner table >>
> (amplifying the benefit of an MJ). As another example, >>
> imagine you have a large customer table which has been >>
> split vertically (i.e. one half of the attributes in each
> >> table with the primary key in both tables). To join
> them >> back together you need to match each and every
> row. An MJ >> could be more efficient here.
> >>
> >> MJs will work best when there is an index on the join
> >> columns, since it requires that the rows in both tables
> >> can be accessed in sorted order. If no such index
> exists, >> either reformatting (or a SORT step) may be
> inserted by >> the optimizer, or a hash join may be more
> efficient. MJs >> have some internal overhead, so they
> tend to become more >> efficient only when more than a
> handful of rows are >> accessed -- for a typical OLTP
> query which hits only one >> row (or a few at most), an
> MJ will likely be slower than >> an NLJ even though the
> logical I/O count may still be >> lower.
> >>
> >> HTH,
> >>
> >> Rob V.
> >>
> ----------------------------------------------------------
> >> ------- Rob Verschoor >>
> >> Certified Sybase Professional DBA for ASE
> >> 15.0/12.5/12.0/11.5/11.0 and Replication Server
> >> 15.0.1/12.5 // TeamSybase
> >>
> >> Author of Sybase books (order online at
> >> www.sypron.nl/shop): "Tips, Tricks & Recipes for Sybase
> >> ASE" (ASE 15 edition) "The Complete Sybase ASE Quick
> >> Reference Guide" "The Complete Sybase Replication
> Server >> Quick Reference Guide"
> >>
> >> mailto:rob@YOUR.SPAM.sypron.nl.NOT.FOR.ME
> >> http://www.sypron.nl
> >> Sypron B.V., Amersfoort, The Netherlands
> >> Chamber of Commerce 27138666
> >>
> ----------------------------------------------------------
> >> ------- >>
> >> <Yaniv C.> wrote in message
> >>> news:4b7ba85a.6620.1681692777@sybase.com... Hi
> >>> I am using ASE 15.0.2 ESD 6.
> >>> I would like to know, when merge join ( or hash join)
> is >>> better,faster than nested loop join.
> >>> For each tests that I made, NLJ is faster than Merge
> >>> Join. Can someone give me the minimum schema,data and
> >>> query that merge join for sure will be faster than NLJ
> >>>
> >>> Thanks,
> >>>
> >>> Yaniv
> >>
> >>