How to Separate Repeating & Non-Repeating Data

I have data in a table as follows:
ID
1
2
3
1
4
3
3
5
6
6
7
Now I want the output to have the repeating records in one table and the non-repeating records in another table:
Table 1:
ID
2
4
5
7
Table 2:
ID
1
1
3
3
3
6
6
Thanks,
Sushil


James

  • Sep 10th, 2013
 

Write ID, count(*) to a hashed file from the same source, look up the hashed file through the Transformer, and redirect the output based on match/mismatch.


madhu

  • Sep 11th, 2013
 

Aggregate the data on the ID column. If count > 1, send the record to Table 2; if count = 1, send it to Table 1.


Ramesh M

  • Oct 4th, 2013
 

First sort the data using the Sort stage with Create Cluster Key Change Column = True, then use a Filter to send all duplicates to one file and the unique data to another file. Now, by using a Lookup stage in between (inner), we can collect the rejects as our outputs.
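As a rough illustration of the sort and key-change idea outside DataStage, here is a minimal Python sketch. It assumes the sample IDs from the question; the names `flagged`, `repeated_keys`, `non_repeating` and `repeating` are made up for this example and are not DataStage properties. It only mimics the logic (flag the first row of each sorted group, then use the flagged duplicates as a lookup) and is not a job design.

```python
ids = [1, 2, 3, 1, 4, 3, 3, 5, 6, 6, 7]  # sample data from the question

# Sort, then flag each row: 1 when the key changes (first row of a group), else 0.
sorted_ids = sorted(ids)
flagged = []
previous = None
for value in sorted_ids:
    key_change = 1 if value != previous else 0
    flagged.append((value, key_change))
    previous = value

# Any key that shows up with flag 0 occurs more than once.
repeated_keys = {value for value, key_change in flagged if key_change == 0}

# Lookup step: route every sorted row by whether its key is in the repeated set.
non_repeating = [v for v in sorted_ids if v not in repeated_keys]
repeating = [v for v in sorted_ids if v in repeated_keys]

print(non_repeating)  # [2, 4, 5, 7]
print(repeating)      # [1, 1, 3, 3, 3, 6, 6]
```

The key-change flag alone only marks the second and later rows of a group, which is why the sketch needs the extra lookup to pull the first occurrence of each repeated ID into the same output.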


Manasa Parimi

  • Nov 19th, 2013
 

Use a Copy stage to create a copy of the input.

Give one link from the Copy stage as input to an Aggregator stage and get the count.

Give the other link from the Copy stage as input to a Lookup stage (the reference for the Lookup stage should be the Aggregator output).
Give the output of the Lookup stage to a Filter with the where conditions count = 1 and count > 1.
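The same copy, aggregate, lookup and filter flow can be sketched in plain Python to check the logic against the sample data. This is only an illustration under the assumption that the input is the ID list from the question; `counts`, `rows_with_count`, `table_1` and `table_2` are names invented for the sketch.

```python
from collections import Counter

ids = [1, 2, 3, 1, 4, 3, 3, 5, 6, 6, 7]  # sample data from the question

# "Aggregator": one count per ID.
counts = Counter(ids)

# "Lookup": attach the count to every row of the copied input stream.
rows_with_count = [(value, counts[value]) for value in ids]

# "Filter": split the rows on the looked-up count.
table_1 = [value for value, count in rows_with_count if count == 1]
table_2 = [value for value, count in rows_with_count if count > 1]

print(table_1)          # [2, 4, 5, 7]
print(sorted(table_2))  # [1, 1, 3, 3, 3, 6, 6]
```

Because the count is joined back onto the original rows, every occurrence of a repeated ID reaches the second output, not just one row per ID.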

Rohit K A

  • Nov 28th, 2013
 

Define three stage variables in the Transformer stage. StageVar1 holds the input field ID value. StageVar2 holds the StageVar1 value. Then write a condition in StageVar3: If StageVar1 = StageVar2 Then "Repeating" Else "NotRepeating". Have two output links, with the constraint StageVar3 = "Repeating" on one link and StageVar3 = "NotRepeating" on the other link.

bhargav

  • Nov 28th, 2013
 

First take a Copy stage, then take two links from the Copy at the same time: one to a Lookup and one to an Aggregator. In the Aggregator, take the count by ID. After that, Filter1 (count = 1) outputs to the Lookup as the reference link, and the stream link comes from the Copy stage. Then take Filter2: count = 1 goes to target 1 and count <> 1 goes to target 2.


Ram

  • Mar 7th, 2016
 

Method:
Src --> Copy (use link sort) --> Aggregator (count).
2nd link of Copy --> Join (Copy & Aggregator) --> Filter (count = 1 for the 1st target and count > 1 for the other target) --> 2 targets.



Ram

  • Mar 9th, 2016
 

Method:
Src --> Copy (use link sort) --> Aggregator (count).
2nd link of Copy --> Join (Copy & Aggregator) --> Filter (count = 1 for the 1st target and count > 1 for the other target) --> 2 targets.


Rajasekhar Reddy Balu

  • Mar 15th, 2016
 


                                   Count=1
Seq File---->>Copy---->>Filter--------------->>File1 (2 4 5 7) Unique Rec
               |          /
               |         /  Count>1
               V        /
               V       /
           Join Stage /  ----->>File2 (1 1 3 3 3 6 6) Duplicate Rec


Neha

  • Apr 7th, 2017
 

Seq File -> Aggregator (count grouped by ID) -> Transformer with two output links. Use the loop condition @ITERATION <= Cnt. For OpLnk1 put the constraint Cnt = 1, and for OpLnk2 put the constraint Cnt > 1 and put the @ITERATION value in OpLnk2.
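To see why the loop is needed, note that the Aggregator leaves one row per ID, so the repeated rows have to be regenerated from the count. A minimal Python sketch of that idea follows, assuming the sample IDs from the question; `op_lnk1`, `op_lnk2` and `iteration` are illustrative stand-ins for the two output links and the loop counter, not actual DataStage objects.

```python
from collections import Counter

ids = [1, 2, 3, 1, 4, 3, 3, 5, 6, 6, 7]  # sample data from the question

# "Aggregator": one (ID, Cnt) row per ID.
counts = Counter(ids)

op_lnk1 = []  # rows with Cnt = 1
op_lnk2 = []  # rows with Cnt > 1, stored as (ID, iteration) pairs

for value, cnt in sorted(counts.items()):
    iteration = 1
    # Loop while iteration <= Cnt, emitting one output row per pass.
    while iteration <= cnt:
        if cnt == 1:
            op_lnk1.append(value)
        else:
            op_lnk2.append((value, iteration))
        iteration += 1

print(op_lnk1)  # [2, 4, 5, 7]
print(op_lnk2)  # [(1, 1), (1, 2), (3, 1), (3, 2), (3, 3), (6, 1), (6, 2)]
```

Dropping the iteration number from each pair in `op_lnk2` gives back 1 1 3 3 3 6 6, the repeating table asked for in the question.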


anirudh

  • May 9th, 2017
 

You can use Sort -> Aggregator (count rows) -> Transformer stage. Write constraints to send row_count = 1 to one output file and row_count > 1 to another output file.

