Sort Transformation can be used to sort incoming data stream. In addition, the Sort Transformation distinct option can be used to remove duplicate values from the input.
Sort Transformation is a blocking transformation meaning that the input records are accumulated until the end of input. Blocking transformations affect the performance of overall dataflow because subsequent steps cannot execute until all the records have been received and processed by the blocking transformation.
Sort Transformation uses storage on the server for temporary data during sorting. The server must have enough capacity to store the entire data set and index.
How to Use Sort Transformation
To add a Sort transformation, drag the Sort object from the Transformations group in the Toolbox and drop it on the dataflow.
Map the fields from the source object to the Sort transformation object and configure the transformation properties.
To configure the properties of a Sort object after it was added to the dataflow, right-click on it and select Properties from the context menu. The following properties are available:
Layout Builder screen:
Layout builder screen allows you to add or remove fields in the field layout, as well as select their data type. The fields added in the Layout Builder will be shown in the Input node inside the object box.
Sort Transformation Properties screen:
Return Distinct Values Only – check this option to remove duplicate values from the output.
Note: Only the combination of key fields, which you select in the grid, is checked for duplicate values. If you specify a subset of key fields instead of the entire layout, you may lose non-duplicate records when the combination of key fields returns a duplicate.
Treat Null as the Lowest Value – turn this option on to have a Null value returned first in the Ascending sort order, and conversely, have the Null value returned last in the Descending sort order.
Case Sensitive – turn this option on if you require case sensitive comparison for strings.
Finally, select key fields in the grid and apply Ascending or Descending sort order to each field
General Options screen:
This screen shares the options common to most objects on the dataflow:
Clear Incoming Record Messages
When this option is on, any messages coming in from objects preceding the current object will be cleared. This is useful when you need to capture record messages in the log generated by the current object and filter out any record messages generated earlier in the dataflow.
Do Not Process Records with Errors
When this option is on, records with errors will not output by the object. When this option is off, records with errors will be output by the object, and a record message will be attached to the record. This message can then feed into downstream objects on the dataflow, for example, a destination file that will capture record messages, or a log that will capture the messages and as well as collect their statistics.
The Comments input allows you to enter comments associated with this object.