What is the exact difference between the Join, Merge and Lookup stages?

prasad duvasi

  • Jan 24th, 2006
 

The Join, Merge and Lookup stages differ mainly in the amount of memory they use.

DataStage doesn't know how large your data is, so it cannot make an informed choice about whether to combine data using a Join stage or a Lookup stage. Here's how to decide which to use:

If the reference datasets are big enough to cause memory trouble, use a Join. A Join does a high-speed sort on the driving and reference datasets. This can involve I/O if the data is big enough, but the I/O is all highly optimized and sequential. Once the sort is over, the join processing is very fast and never involves paging or other I/O.
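
To illustrate the trade-off described above, here is a minimal Python sketch. It is not DataStage code: the function names and sample rows are made up for illustration, and Python's in-memory `sorted` only stands in for the engine's sort. The lookup-style function holds the whole reference set in memory, while the join-style function sorts both inputs and then combines them in one sequential pass. The sketch assumes each key appears at most once in the reference data.

```python
def lookup_join(driving, reference, key):
    """Lookup-style: keep the whole reference set in memory as a dict."""
    table = {row[key]: row for row in reference}  # memory grows with reference size
    out = []
    for row in driving:
        match = table.get(row[key])
        if match is not None:
            out.append({**row, **match})
    return out


def sort_merge_join(driving, reference, key):
    """Join-style: sort both inputs on the key, then merge them in one pass."""
    left = sorted(driving, key=lambda r: r[key])
    right = sorted(reference, key=lambda r: r[key])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair this left row with every right row that has the same key.
            k = j
            while k < len(right) and right[k][key] == lk:
                out.append({**left[i], **right[k]})
                k += 1
            i += 1  # j stays put: the next left row may share this key
    return out


# Hypothetical sample data
orders = [
    {"id": 2, "amt": 20},
    {"id": 1, "amt": 10},
    {"id": 3, "amt": 30},
    {"id": 1, "amt": 15},
]
customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]

by_lookup = lookup_join(orders, customers, "id")
by_join = sort_merge_join(orders, customers, "id")
```

Both functions return the same matched rows. Only the order differs: the lookup keeps the driving order, while the join output is sorted by key.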

Unlike the Join and Lookup stages, the Merge stage allows you to specify several reject links, as many as there are update input links.

srinivas

  • Apr 14th, 2006
 

As far as I know, Join and Merge are both used to combine two files with the same structure, whereas Lookup is mainly used to compare previous data with current data.

joinmerge

  • Apr 20th, 2006
 

Then how do we perform a join in server jobs?

Madan

  • Apr 29th, 2006
 

In server jobs, two relational tables can be joined only by using a hashed file. The Merge stage works only with flat files.

uday

  • May 23rd, 2006
 

Join takes at most two input datasets and produces a single output, but Merge can take more than two input datasets and produce a single output.

Ronaldo

  • Aug 5th, 2006
 

Also remember that to use the Merge stage, the key field names MUST be identical in both input files (master and updates).

  Was this answer useful?  Yes

Data selection is very easy in Join.
Data selection is difficult in Merge.
Join doesn't support reject links.
Merge supports reject links.
Join supports only 2 files or 2 databases.
Merge supports any number of inputs.
So, performance-wise, Join is better than Merge.

Lookup is used just for comparison purposes.

  Was this answer useful?  Yes

Hope the following helps you.

Join Stage:
1.) It has n input links (one primary link and the rest secondary links), one output link and no reject link
2.) It supports 4 join operations: inner join, left outer join, right outer join and full outer join
3.) Join uses less memory, hence performance is high in the Join stage
4.) The default partitioning technique is Hash
5.) Prerequisite: the data must be sorted before the join operation is performed
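
The four join operations listed above can be sketched in plain Python as follows. This only illustrates the semantics and is not DataStage code: the function name `join_rows` and the sample rows are hypothetical, keys are assumed unique on each side, and a missing side simply contributes no columns instead of null values.

```python
def join_rows(left, right, key, how="inner"):
    """Combine two row lists on `key` using one of four join types."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    if how == "inner":
        keys = left_by_key.keys() & right_by_key.keys()
    elif how == "left":
        keys = set(left_by_key)
    elif how == "right":
        keys = set(right_by_key)
    elif how == "full":
        keys = left_by_key.keys() | right_by_key.keys()
    else:
        raise ValueError(f"unknown join type: {how}")

    # Output in key order, as a join over sorted inputs would produce.
    return [
        {**left_by_key.get(k, {}), **right_by_key.get(k, {})}
        for k in sorted(keys)
    ]


# Hypothetical sample data
employees = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
salaries = [{"id": 2, "pay": 200}, {"id": 3, "pay": 300}]

inner = join_rows(employees, salaries, "id", "inner")
left_outer = join_rows(employees, salaries, "id", "left")
right_outer = join_rows(employees, salaries, "id", "right")
full_outer = join_rows(employees, salaries, "id", "full")
```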

Lookup Stage:
1.) It has n input links, one output link and 1 reject link
2.) It can perform only 2 join operations: inner join and left outer join
3.) Lookup uses more memory, hence performance is reduced
4.) The default partitioning technique is Entire
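
As a rough illustration of the Lookup behaviour described above, the Python sketch below keeps the reference data in memory and routes unmatched rows according to a failure setting. The function name and the `on_failure` values are assumptions made for this sketch, not the stage's actual API: "continue" behaves like a left outer join, "drop" like an inner join, and "reject" sends unmatched rows to a separate reject list.

```python
def lookup_stage(primary, reference, key, on_failure="continue"):
    """Look up each primary row in an in-memory reference table."""
    if on_failure not in ("continue", "drop", "reject"):
        raise ValueError(f"unknown on_failure setting: {on_failure}")

    table = {row[key]: row for row in reference}  # whole reference set in memory
    ref_columns = sorted({c for row in reference for c in row if c != key})

    output, rejects = [], []
    for row in primary:
        match = table.get(row[key])
        if match is not None:
            output.append({**row, **match})
        elif on_failure == "continue":
            # Left-outer behaviour: keep the row, reference columns are empty.
            output.append({**row, **{c: None for c in ref_columns}})
        elif on_failure == "reject":
            rejects.append(row)
        # "drop": inner-join behaviour, the unmatched row is discarded
    return output, rejects


# Hypothetical sample data
sales = [{"id": 1, "amt": 10}, {"id": 4, "amt": 40}]
products = [{"id": 1, "name": "A"}]

kept, _ = lookup_stage(sales, products, "id", "continue")
dropped, _ = lookup_stage(sales, products, "id", "drop")
matched, rejected = lookup_stage(sales, products, "id", "reject")
```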

Merge Stage:
1.) It has n input links (one master link and n-1 update links) and n-1 reject links
2.) It can also perform 2 join operations: inner join and left outer join
3.) The default partitioning technique is Hash
4.) It uses very little memory, hence performance is high
5.) Sorted data on the master and update links is mandatory
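
The master/update/reject arrangement described above can be sketched in Python like this. It is a simplified illustration, not DataStage code. It assumes that keys are unique within each input, that each update input has its own reject list for update rows matching no master row, and that master values win when column names clash. The function name and the `keep_unmatched_masters` flag are hypothetical.

```python
def merge_stage(master, updates, key, keep_unmatched_masters=True):
    """Merge several update inputs into one master input.

    Returns (output_rows, rejects), where rejects[i] holds the rows of
    updates[i] that matched no master row.
    """
    # Inputs are sorted on the key first, mirroring the sorted-input requirement.
    master_rows = {
        row[key]: dict(row) for row in sorted(master, key=lambda r: r[key])
    }
    matched_keys = set()
    rejects = []

    for update in updates:
        rejected = []
        for row in sorted(update, key=lambda r: r[key]):
            target = master_rows.get(row[key])
            if target is None:
                rejected.append(row)  # goes to this update's reject list
            else:
                for column, value in row.items():
                    target.setdefault(column, value)  # assumption: master wins
                matched_keys.add(row[key])
        rejects.append(rejected)

    output = [
        row
        for k, row in master_rows.items()
        if keep_unmatched_masters or k in matched_keys
    ]
    return output, rejects


# Hypothetical sample data
master = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
update_1 = [{"id": 1, "city": "X"}, {"id": 9, "city": "Z"}]
update_2 = [{"id": 2, "tier": "gold"}]

kept_all, rejects = merge_stage(master, [update_1, update_2], "id")
matched_only, _ = merge_stage(
    master, [update_1, update_2], "id", keep_unmatched_masters=False
)
```

Keeping unmatched master rows gives left-outer behaviour, and dropping them gives inner-join behaviour.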


If I am wrong, please let me know.
Thanks & Regards,
Mallika

vij

  • Jan 6th, 2014
 

The default partitioning technique is Auto in all three stages; please check.
