Derivative Data Sources

Derivative data sources transform their input in some way in addition to reading it. We expect the list of derivative data sources to grow over time. The currently available sources are:

Data Source: Description
ComparisonDataSource: Takes two inputs of identical format and emits the records that were added, deleted, or changed. Typically, this is used to compare two extracts taken at different times from the same source. See ComparisonDataSource for additional information.
JoiningDataSource: Takes two inputs and joins them together based on the key fields provided. In relational database terms, the join type is "left outer". See JoiningDataSource for additional information.
ScreeningDataSource: Filters unwanted data out of a data source based on filter conditions you provide. See ScreeningDataSource for additional information.


ComparisonDataSource -- Processing only what's changed

ComparisonDataSource is most useful when comparing two sets of data in an identical format taken at different times. It compares the two and emits only the records that have changed. Each emitted record carries an additional field, DataSource_Difference, that describes whether the record is an "Add", "Change", or "Delete", so transformer developers can process each type of change accordingly. While we make an effort to be data type agnostic (e.g. to treat an Integer zero as equal to a Long zero), it's safest for each field to use an identical data type in both inputs.
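
To illustrate why identical data types are the safer choice: in plain Java, Integer.valueOf(0).equals(Long.valueOf(0L)) is false, so a type-agnostic comparison has to normalize numbers first. The following is a minimal sketch of one way to do that. The class LooseEquality and the method looselyEqual are hypothetical names used for illustration only; this is not Transform4J's actual comparison logic.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Hypothetical helper for illustration; not part of Transform4J.
class LooseEquality {

    // Returns true when both values are numbers with the same numeric value,
    // or when they are equal according to Objects.equals.
    public static boolean looselyEqual(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            // Compare numerically so that Integer 0 and Long 0 match.
            // compareTo, unlike BigDecimal.equals, ignores scale, so 1.0 matches 1.
            // Limitation of this sketch: NaN and infinite values are not handled.
            BigDecimal left = new BigDecimal(a.toString());
            BigDecimal right = new BigDecimal(b.toString());
            return left.compareTo(right) == 0;
        }
        // Non-numeric values (including nulls) fall back to ordinary equality.
        return Objects.equals(a, b);
    }
}
```

With this sketch, looselyEqual(0, 0L) is true, while a String "0" and an Integer 0 are still treated as different. Mismatches of that kind are why matching data types on both inputs remains the safest approach.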

An example of setting up a ComparisonDataSource follows. yesterdaysData is a data source that contains a data extract from a previous day (e.g. a JSON-formatted extract). todaysData is a data source that contains a current extract with the same fields, each of the same data type. You also need to provide the set of key fields used to match records in the old copy to records in the new copy.

ComparisonDataSource dataSource = new ComparisonDataSource(yesterdaysData, todaysData, new String[]{"customerId"});
				

Transformer developers can easily tell what type of change is being processed. An example follows:

@Override
protected void process(DataRecord row) {
	if (ComparisonDataSource.wasRowAdded(row)) {
		// Put "Add" processing logic here.
	}
	if (ComparisonDataSource.wasRowChanged(row)) {
		// Put "Change" processing logic here.
	}
	if (ComparisonDataSource.wasRowDeleted(row)) {
		// Put "Delete" processing logic here.
	}
}
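
For readers who want a mental model of the comparison itself, here is a minimal plain-Java sketch of a keyed snapshot diff. It is an assumption-laden illustration, not the Transform4J implementation: records are modeled as maps, only a single key field is supported, values are compared with strict equals, and the names SnapshotDiff, diff, and tag are hypothetical. Only the field name DataSource_Difference and the values "Add", "Change", and "Delete" come from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a keyed diff; not Transform4J code.
class SnapshotDiff {

    // Compares an old and a new snapshot and returns only the differences.
    // Each record is a field-name-to-value map; keyField identifies a record.
    public static List<Map<String, Object>> diff(List<Map<String, Object>> oldRows,
                                                 List<Map<String, Object>> newRows,
                                                 String keyField) {
        // Index the old snapshot by key so each new row can find its predecessor.
        Map<Object, Map<String, Object>> oldByKey = new LinkedHashMap<>();
        for (Map<String, Object> row : oldRows) {
            oldByKey.put(row.get(keyField), row);
        }

        List<Map<String, Object>> changes = new ArrayList<>();
        for (Map<String, Object> row : newRows) {
            Map<String, Object> previous = oldByKey.remove(row.get(keyField));
            if (previous == null) {
                changes.add(tag(row, "Add"));      // key exists only in the new snapshot
            } else if (!previous.equals(row)) {
                changes.add(tag(row, "Change"));   // same key, different field values
            }
            // Identical rows are skipped: nothing changed.
        }

        // Keys never matched by a new row exist only in the old snapshot.
        for (Map<String, Object> row : oldByKey.values()) {
            changes.add(tag(row, "Delete"));
        }
        return changes;
    }

    // Copies the row and records the kind of difference on the copy.
    private static Map<String, Object> tag(Map<String, Object> row, String difference) {
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.put("DataSource_Difference", difference);
        return copy;
    }
}
```

Indexing one snapshot by key keeps the comparison to a single pass over each input, at the cost of holding the old snapshot in memory. Whether the library takes the same approach is not stated here.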
				

See ComparisonDataSource javadoc for additional information.

JoiningDataSource -- Joining information from different sources

JoiningDataSource combines two data sources into one source with fields from both. Records are emitted for all rows in the "base" data source, regardless of whether matching rows are found in the second ("join") source. In other words, this is like a "left outer join" in SQL terms. While we make an effort to be data type agnostic (e.g. to treat an Integer zero as equal to a Long zero), it's safest for the key fields in both sources to use an identical data type. An example of establishing a JoiningDataSource follows:

JoiningDataSource source = new JoiningDataSource(baseSource, new String[]{"key"}, joinSource, new String[]{"customerId"});
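
As a rough illustration of left outer join semantics, the following plain-Java sketch joins two lists of map-based records. It is not the Transform4J implementation, and the names LeftOuterJoin and join are hypothetical. Two behaviors are assumptions of the sketch rather than documented library behavior: a base row with several matches yields one output row per match (as in SQL), and base fields win when both sides have a field with the same name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a left outer join; not Transform4J code.
class LeftOuterJoin {

    public static List<Map<String, Object>> join(List<Map<String, Object>> baseRows, String baseKey,
                                                 List<Map<String, Object>> joinRows, String joinKey) {
        // Index the join side by key value; a key may map to several rows.
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> row : joinRows) {
            index.computeIfAbsent(row.get(joinKey), k -> new ArrayList<>()).add(row);
        }

        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> base : baseRows) {
            List<Map<String, Object>> matches = index.get(base.get(baseKey));
            if (matches == null) {
                // "Outer" part: the base row is emitted even without a match.
                result.add(new LinkedHashMap<>(base));
                continue;
            }
            for (Map<String, Object> match : matches) {
                Map<String, Object> combined = new LinkedHashMap<>(match);
                combined.putAll(base); // assumption: base fields win on name clashes
                result.add(combined);
            }
        }
        return result;
    }
}
```

In this sketch, a base row without a match simply lacks the join-side fields. A real implementation might instead fill them with nulls, as SQL does.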
				

Note: If both data sources are from the same relational database, using a JdbcDataSource with a SQL query that does the join is advised.

See JoiningDataSource javadoc for additional information.

ScreeningDataSource -- Processing only information you need

ScreeningDataSource emits only the records that pass the filters you supply (classes that implement the Filter interface). Transform4J provides a generic filter for your use (see FilterCondition). An example of establishing a ScreeningDataSource follows. The example emits only rows that have an "age" strictly greater than 14 and a "middleInitial" value of "C".

Filter<DataRecord> filter = new FilterCondition("age", FilterCondition.Operator.GREATER_THAN, 14)
	.and(new FilterCondition("middleInitial", FilterCondition.Operator.EQUALS, "C"));
ScreeningDataSource screeningSource = new ScreeningDataSource(mySource, filter);
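
The same screening idea can be sketched with the standard java.util.function.Predicate, which may help when reasoning about how conditions combine with "and". This is a plain-Java analogy, not Transform4J code: the class ScreeningSketch and its methods are hypothetical, records are modeled as maps, and the Predicate type stands in for the library's Filter interface.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration of predicate-based screening; not Transform4J code.
class ScreeningSketch {

    // Keeps only the rows accepted by the filter, preserving their order.
    public static List<Map<String, Object>> screen(List<Map<String, Object>> rows,
                                                   Predicate<Map<String, Object>> filter) {
        return rows.stream().filter(filter).collect(Collectors.toList());
    }

    // Mirrors the conditions in the example above:
    // "age" strictly greater than 14 and "middleInitial" equal to "C".
    public static Predicate<Map<String, Object>> exampleFilter() {
        Predicate<Map<String, Object>> olderThan14 = row -> {
            Object age = row.get("age");
            // Rows with a missing or non-numeric age are rejected.
            return age instanceof Number && ((Number) age).doubleValue() > 14;
        };
        Predicate<Map<String, Object>> middleInitialC =
                row -> "C".equals(row.get("middleInitial"));
        // and() short-circuits: the second test runs only if the first passes.
        return olderThan14.and(middleInitialC);
    }
}
```

Note that a row with an "age" of exactly 14 is rejected, matching the "strictly greater than" wording of the example.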
				

See ScreeningDataSource javadoc for additional information.
