Duplicate Record in Datastage

I have one scenario below, how can we achieve.
City1,City2,Distance
==============
blr, pune,1000 km
pune, blr, 1000 km
As we have same data in 2 records but need to delete any one of the duplicate record.

Showing Answers 1 - 8 of 8 Answers

Ranajoy Gupta

  • Mar 20th, 2017
 

Use sort stage, put city1, city2 and distance as key columns, set allow duplicates as false.

  Was this answer useful?  Yes

Karan

  • May 4th, 2017
 

Use Remove duplicate Stage..And Select Distance as Key coloum.. one record will be deleted.

  Was this answer useful?  Yes

Naresh

  • May 11th, 2017
 

By using the sort stage we cant remove the duplicates. It sorts the values by mentioned columns and it wont delete the records such scenario

  Was this answer useful?  Yes

Bosemani

  • Jun 13th, 2017
 

Hi Priya,

In Transformer stage,

Right(City1,1)

  Was this answer useful?  Yes

Prashanth

  • Jun 13th, 2017
 

1. Create 2 Stage variables in transformer stage as NewCity1 & NewCity2
2. Give value for NewCity1 = if City1 < City2 then City1 else City2 end
3. Give value for NewCity2 = if City1 < City2 then City2 else City1 end
4. Use remove duplicates stage and give Key as NewCity1 & NewCity2

  Was this answer useful?  Yes

siva krishna

  • Jun 16th, 2017
 

Hi, Ive solved this as mentioned below.
seq stage (source)----->sort stage ------> transformer stage -----> remove dupli stage ---->seq stage(target)
read the source and add FILENAME COLUMN column and sort the 1st column in SORT STAGE and in TRANSFORMER stage take in StageVariable and write an expression like "if DSLink18.COL1 < DSLink18.COL2 then DSLink18.COL1 : DSLink18.COL2 else DSLink18.COL1 : DSLink18.COL2".
and assign this sv to a new column COl4 and in REMOVEDUPLCATE stage use COL4.
thats it. irreseptive of no of records this job will run.
thanks.
siva

  Was this answer useful?  Yes

Sirisha d

  • Aug 8th, 2017
 

Job Design :
Sequential file(I/p) --> Transformer --> Sequential file (o/p)
In the Transformer stage , define 2 stage varibles st1 and st2.
Derivation for st1 => If city2:city1=st2 Then 1 Else 0
Derivation for st2 => city1:city2
Now, in the transformer give a constraint => st1=0.
This should remove duplicate records in this scenario.

  Was this answer useful?  Yes

kalyanreddy

  • Jan 30th, 2024
 

use the remove duplicate stage and use the hash partition, select the key column as 3 rd column (1000 km).
and select record retained = first , it will remove the duplicate record, you can also use sort stage for this.

  Was this answer useful?  Yes

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

 

Related Answered Questions

 

Related Open Questions