The Remove Duplicates stage in DataStage drops duplicate records from the input data link. It can have one input link and one output link. It also enables us to retain either the first or the last record from each set of duplicates.
Mandatory: The data source must be pre-sorted on the key columns, so that duplicate records sit next to each other, to get accurate results.
Removing duplicates is a common practice, done as part of the data cleansing process before data loading starts.
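To see why pre-sorting matters, here is a minimal Python sketch of the idea, not DataStage code. It assumes the stage only compares neighbouring records; the function name `remove_adjacent_duplicates` and the sample rows are hypothetical and used purely for illustration.

```python
from itertools import groupby


def remove_adjacent_duplicates(records, key):
    """Keep the first record of each run of adjacent records sharing a key."""
    return [next(group) for _, group in groupby(records, key=key)]


def by_id(record):
    """Key function: the first field of the record (hypothetical customer id)."""
    return record[0]


# Hypothetical sample rows: (customer_id, name)
rows = [(2, "Bob"), (1, "Ann"), (2, "Bob"), (1, "Ann")]

# Unsorted input: equal keys are never adjacent, so nothing is dropped.
unsorted_result = remove_adjacent_duplicates(rows, key=by_id)

# Sorted input: equal keys are adjacent, so one record per key survives.
sorted_result = remove_adjacent_duplicates(sorted(rows, key=by_id), key=by_id)

print(unsorted_result)
print(sorted_result)
```

With the unsorted input all four rows come back unchanged, while the sorted input is reduced to one row per key.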
|Category/Property||Values|
|Keys that Define Duplicates/Key||Input Column|
|Keys that Define Duplicates/Sort as EBCDIC||True/False|
|Keys that Define Duplicates/Case Sensitive||True/False|
|Options/Duplicate to retain||First/Last|
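The effect of the Key, Case Sensitive and Duplicate to retain properties can be sketched in plain Python. This is only an illustration under stated assumptions, not how DataStage implements the stage: the function `remove_duplicates`, its parameters and the sample rows are hypothetical, the input is assumed to arrive with matching keys already adjacent, and the Sort as EBCDIC property is not modelled.

```python
from itertools import groupby


def remove_duplicates(records, key_column, retain="First", case_sensitive=True):
    """Drop adjacent duplicates on key_column, keeping the first or last of each run."""
    if retain not in ("First", "Last"):
        raise ValueError("retain must be 'First' or 'Last'")

    def key_of(record):
        value = record[key_column]
        # With case sensitivity off, "ann" and "Ann" count as the same key.
        if isinstance(value, str) and not case_sensitive:
            return value.lower()
        return value

    result = []
    for _, group in groupby(records, key=key_of):
        run = list(group)
        result.append(run[0] if retain == "First" else run[-1])
    return result


# Hypothetical rows, arranged so that matching names (ignoring case) are adjacent.
rows = [
    {"name": "ann", "city": "Pune"},
    {"name": "Ann", "city": "Delhi"},
    {"name": "Bob", "city": "Agra"},
]

print(remove_duplicates(rows, "name"))
print(remove_duplicates(rows, "name", retain="First", case_sensitive=False))
print(remove_duplicates(rows, "name", retain="Last", case_sensitive=False))
```

With case sensitivity on, "ann" and "Ann" are different keys and all three rows are kept. With it off, the two rows form one group, and the retain option decides whether the Pune row (First) or the Delhi row (Last) survives.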
Consider the records of the first data set as follows:
Output file: