Source Diff Processor is one of the Database Write Strategies offered in Centerprise. It works like Database Diff processor, but unlike database diff processor, it is used for performing write actions (such as Insert, Update, Upsert, Delete) on file destinations.
To use Source diff processor in a dataflow, open a new a dataflow and drag-and-drop the source object that you want to work with. I have Employees data stored in an excel file, so I will drag-and-drop the excel source object from the sources section in the toolbox and configure the settings.
Go to the properties window of the excel workbook source object and give the file path. After configuring the settings for the Excel workbook source, I’ll click Ok and Centerprise will automatically build the source layout for me.
I will now drag-and-drop the Source Diff Processor object from the Database write strategies section in the Toolbox and map the fields from the excel source object.
Now that we have the layout ready, go to the properties section of the source diff processor object.
The first screen will be the layout builder screen where you can add or remove field and customize your layout.
Click Next and go to the next screen.
The next screen will be the Incremental Write Options screen.
Here, you have to specify the Record Matching Field. Record matching field is the field that Centerprise uses to match and compare incoming and existing records. I will select Employee ID as the Record Matching field.
- Now, if my incoming dataset has a new Employee ID i.e. that ID is not present in the existing file, which is being compared against the incoming file, Centerprise will perform the INSERT action.
- If the customer ID is already present in the existing file, Centerprise will compare the records against that ID and will perform UPDATE action where the information has updated.
- If the customer ID is there in the existing file, but not present in the incoming file, it means that the particular record is deleted. In such a case, Centerprise will then perform DELETE action.
In the Output Options section, you can either select the Single Output option or One Port for Each Action.
- If you select Single Port, the database action, such as Insert, Update, Skip or Error will be chosen by the database write strategy’s logic rather than specified by the user. Using a single output is recommended when a database write strategy is applied.
- If you select One Port for Each Action, you will get separate nodes for each Diff action in the Source diff processor object. So, you will have a separate node for Insert, Update, Delete, Skip, and Error.
In this example, I’ll select the Single Port option.
The third section on the Incremental Write Options screen is the Incremental Transfer Information File Path option. Specify the file path where you want to store information related to the last run.
Now if you have worked with Excel and Database table sources in Centerprise, you would have noticed that the database table source object gives you the option for reading incremental changes. However, no such option is available in excel or other file source objects.
This option here in Source Diff Processor enables you to read incrementally from file sources such as Excel, Delimited, and Fixed Length files.
Now I’ll click next and OK.
I’ve now successfully built the layout and configured the settings. Let’s preview the output.
When you preview the output, on the extreme right, you’ll see an added ‘Write Strategy Action’ column. Over here you can see the write strategy action that was performed on any specific record. This added column is not a part of the layout but think of it as a label.
You can write the output from the Source Diff to any destination that you want to or perform any transformation on the data set.
That concludes using the source diff processor write strategy.