SCD3 – store something, if not everything!
To implement SCD3 using wizard, perform the following steps:
- In the designer, navigate to Tools | Mapping Designer | Mapping | Wizard | Slowly Changing Dimensions, as shown in the following screenshot:
- A new window will pop up, asking you the name (
m_SCD3
) of the new SCD mapping. Also, select the type of SCD you wish to implement. Select Type 3 Dimension - keep the current and previous value in the target, as we are implementing SCD3, and click on Next. - The next screen will ask you to select the source. Select a source from the dropdown. We are using
EMP_FILE.txt
as the source file for our reference. We will name the target asEMPLOYEE_SCD3
in this book for our reference. Then, click on Next. - In the next window, select
EMPLOYEE_ID
as Logical Key Field. Also, addLOCATION
under Fields to compare the changes and click on Next. - In the next window, select the target columns that you wish to compare in order to detect changes. In our case, the
LOCATION
column in the target will be compared againstPM_PREV_LOCATION
. You can select aPM_EFFECT_DATE
optional field to understand the loading of new or changed records, and click on Finish. - The wizard generates a complete mapping in your Mapping Designer Workspace. Make the necessary changes to the mapping if required.
- Change the target data type from the flat file to the
Oracle
table, as shown in the following figure:
When we create the mapping using this option, the wizard creates three additional columns in the target table:
PM_PRIMARY_KEY
: The wizard generates the primary key for each row to be inserted into target. Note thatEMPLOYEE_ID
will not be the primary key in the table.PM_PREV_columnName
: For every column for which we maintain the history, the wizard generates a previous column. In our case, we wish to maintain the history for theLOCATION
field, so the wizard creates another column, which isPM_PREV_LOCATION
.PM_EFFECT_DATE
: This is an optional field; the wizard loadsSYSTEMDATE
in this column to indicate insertions or updates to the record in the table.
The Informatica Power Center SCD2 mapping uses the LKP_GetData
lookup transformation to look up the data in the Target
table and uses the EXP_DetectChanges
expression transformation to compare the target data with the source data. Based on the comparison, the expression transformation marks a record as NewFlag
or ChangedFlag
. The mapping is divided into two flows:
- The
FIL_InsertNewRecord
filter transformation allows onlyNewFlag
record to pass further and filter theChangedFlag
record from the first flow. It passes new records toUPD_ForceInserts
, which inserts these records into the target. TheSEQ_GenerateKeys
sequence generator generates the primary key for eachNewFlag
record. If you select to create thePM_EFFECT_DATE
column option in the wizard, theEXP_EffectiveDate_InsertNew
expression transformation loadsSYSTEMDATE
into thePM_EFFECT_DATE
column to indicate the loading of new records. - The
FIL_UpdateChangedRecord
filter transformation allows only theChangedFlag
record to pass further. The current data is passed from theSQ_EMP_FILE
source qualifier, and the previous data is taken from the target by using a lookup transformation to load the data inPM_PREV_LOCATION
. It passes changed records toUPD_ChangedUpdates
, which updates changed records in the target. If you select to create thePM_EFFECT_DATE
column in the wizard, the expression transformationEXP_EffectiveDate_InsertChanged
updatesSYSTEMDATE
in thePM_EFFECT_DATE
column to indicate that new records have been updated.
Let's work through each transformation that is used in the SCD2 mapping:
- Source qualifier (
SQ_EMP_FILE
): This extracts the data from the file or table that you used as the source in the mapping. It passes data to the downstream transformations, that is, lookup, expression, and filter transformation. - Lookup (
LKP_GetData
): This is used to look up the target table. It caches the existing data fromEMPLOYEE_SCD3
. TheEMPLOYEE_ID=IN_EMPLOYEE_ID
condition will compare the data with the source table and target table. It passes the data based on the comparison with the expression transformation. - Expression (
EXP_DetectChanges
): This receives the data from the upstream transformation and based on that, it creates two flags, which areNewFlag
andChangedFlag
. The conditions for both the flags are as follows:NewFlag
:IIF(ISNULL(PM_PRIMARYKEY), TRUE, FALSE)
ChangedFlag
:IIF(NOT ISNULL(PM_PRIMARYKEY) AND (DECODE(LOCATION,PM_PREV_LOCATION,1,0)=0), TRUE, FALSE)
Based on the condition, it passes the data to downstream filter transformations.
- Filter (
FIL_InsertNewRecord
): This filters the records that come from the upstream expression transformation and are marked asChangedFlag
; it allows records withNewFlag
to get passed to theUPD_ForceInserts
update strategy. - Filter (
FIL_UpdateChangedRecord
): This filters the records that come from the upstream expression transformation and are marked asNewFlag
; it allows records withChangedFlag
to get passed to theUPD_ChangedUpdate
update strategy. It uses the value of theLOCATION
field that is returned fromLKP_GetData
to loadPM_PREV_LOCATION
. - Update strategy (
UPD_ForceInserts
): This uses theDD_INSERT
condition to insert data into theEMPLOYEE_SCD3
target instance. - Update strategy (
UPD_ChangedUpdate
): This uses theDD_UPDATE
condition to overwrite the existingLOCATION
field into theEMPLOYEE_SCD3
target instance. It passes data toEXP_EffectiveDate_insertChanged
in order to loadPM_PREV_LOCATION
in the target. - Sequence generator (
SEQ_GenerateKeys
): This generates a sequence of values for each new row marked asNewFlag
that comes into the target, getting incremented by1
. It passes the generated value toEXP_KeyProcessing_InsertNew
. - Expression (
EXP_EffectiveDate_InsertNew
): This transformation is created by the wizard only if you selected to load thePM_EFFECT_DATE
option in the wizard. It loads the generated value in thePM_PRIMARYKEY
column into the target, which isEMPLOYEE_SCD3
. It loadsSYSTEMDATE
into thePM_EFFECT_DATE
column in the target, marking the start of the record. - Expression (
EXP_EffectiveDate_InsertChanged
): This transformation is created by the wizard only if you selected to load thePM_EFFECT_DATE
option in wizard. It loads the generated value in thePM_PRIMARYKEY
column in theEMPLOYEE_SCD32
target instance. It loadsSYSTEMDATE
into thePM_EFFECT_DATE
column in the target in order to indicate that the record has been updated. - Target (
EMPLOYEE_SCD3
): This is theTarget
table instance that accepts new records into the target table instance. - Target (
EMPLOYEE_SCD31
): This is theTarget
table instance that accepts updates to the existing row in the target table instance.
With this, we saw in detail how to implement the different types of SCDs. Note that we have learned how to implement SCD using wizard. You can also manually create the mapping in order to get more practice and better hands-on experience.