How to eliminate duplicate rows in DataStage?

Showing Answers 1 - 6 of 6 Answers

Anwar

  • Oct 2nd, 2006
 

There is more than one way to remove duplicate rows:

1. DataStage has a stage called "Remove Duplicates" in which you specify the key.

2. Alternatively, you can specify the key on the stage you are already using, so that the stage itself removes duplicate rows based on that key during processing.
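
As a rough illustration of what key-based duplicate removal does, here is a small sketch. It is plain Python, not DataStage code, and the function name `remove_duplicates`, the `retain` parameter, and the sample columns are made up for the example. It keeps either the first or the last row seen for each key value.

```python
def remove_duplicates(rows, key, retain="first"):
    """Keep one row per key value; `retain` picks the first or last row seen."""
    kept = {}
    for row in rows:
        k = row[key]
        # "last": always overwrite; "first": only store a key the first time it appears
        if retain == "last" or k not in kept:
            kept[k] = row
    return list(kept.values())


# Hypothetical sample data: two rows share id 1
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
first_kept = remove_duplicates(rows, "id", retain="first")
last_kept = remove_duplicates(rows, "id", retain="last")
```

Which duplicate survives matters whenever the non-key columns differ, so it is worth deciding that up front.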

Hope this helps.

Anwar


Ravishankar

  • Oct 4th, 2006
 

In DataStage Server jobs, duplicates can be eliminated by using a Hashed File stage.
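
A hashed file behaves much like a keyed lookup table: writing a row whose key already exists overwrites the earlier row, so only one row per key survives. A minimal Python sketch of that behavior follows. A dict stands in for the hashed file, and the function name and column names are invented for illustration.

```python
def write_to_hashed_file(rows, key_columns):
    """Simulate keyed writes: a later row with the same key replaces the earlier one."""
    hashed_file = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        hashed_file[key] = row  # overwrite on key collision
    return hashed_file


# Hypothetical sample data: cust_id 10 appears twice
rows = [
    {"cust_id": 10, "city": "Pune"},
    {"cust_id": 20, "city": "Delhi"},
    {"cust_id": 10, "city": "Mumbai"},
]
result = write_to_hashed_file(rows, ["cust_id"])
```

Note that in this sketch the last row written for a key is the one that remains.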


Ram

  • Oct 11th, 2006
 

Use a Sort stage and set the property Allow Duplicates to False.

OR

In any stage, go to the Input tab, choose Hash partitioning, specify the key, and check the Unique checkbox.
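
The reason for hash partitioning on the key is that all rows sharing a key value must land in the same partition before a per-partition unique sort can catch them. Here is a hedged Python sketch of the idea. The partition count, function names, and data are assumptions for illustration, and Python's built-in `hash` stands in for whatever hashing the engine actually uses.

```python
from itertools import groupby


def hash_partition(rows, key, num_partitions):
    """Send each row to a partition chosen by hashing its key value."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def sort_unique(partition, key):
    """Sort one partition on the key (stable) and keep the first row of each key group."""
    ordered = sorted(partition, key=lambda r: r[key])
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[key])]


# Hypothetical sample data: 12 rows with only 5 distinct ids
rows = [{"id": i % 5, "seq": i} for i in range(12)]
deduped = [
    row
    for part in hash_partition(rows, "id", 3)
    for row in sort_unique(part, "id")
]
```

If the rows were spread across partitions without regard to the key, two copies of the same key could sit in different partitions and both would survive.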


shrini

  • Oct 17th, 2006
 

Hi,

If you are working with Server jobs, you can use a hashed file to eliminate duplicate rows.

Thanks

Shrini


ravi

  • Dec 11th, 2006
 

There are two methods for eliminating duplicate rows in DataStage:

1. Use a Hashed File stage. Specify the keys and check the Unique checkbox; a unique key does not allow duplicate values.

2. Use a Sort stage linked to a Remove Duplicates stage.

Thanks


Some of the methods to remove duplicates:

1. Remove Duplicates stage

2. Hashed File

3. In the Sort stage there are three ways:
     a. Set the CHANGECAPTURE property to True. You have to specify a key, and then use a Filter to drop the records whose changecapture value is '0', as these are the duplicate records.
     b. Set the Hash partition method and select "Unique" sorting.
     c. Set ALLOW DUPLICATES to False.
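
Method 3a can be pictured as follows: after sorting on the key, each row gets a flag that is 1 when its key differs from the previous row's key and 0 otherwise, and a filter then drops the 0 rows. A small Python sketch of that logic is shown here. It is not DataStage code, and the column name `key_change`, the function name, and the sample data are assumptions.

```python
def add_key_change(rows, key):
    """Sort on the key and flag the first row of each key group with 1, the rest with 0."""
    ordered = sorted(rows, key=lambda r: r[key])
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in ordered:
        flagged.append({**row, "key_change": 1 if row[key] != previous else 0})
        previous = row[key]
    return flagged


# Hypothetical sample data: id 2 appears twice
rows = [{"id": 2, "v": "x"}, {"id": 1, "v": "y"}, {"id": 2, "v": "z"}]
flagged = add_key_change(rows, "id")

# The filter step: keep only rows flagged as the first of their key group
unique_rows = [r for r in flagged if r["key_change"] == 1]
```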

