SCD2 (flag) – flag the history
To implement SCD2 by maintaining the flag, perform the following steps:
- In the designer, navigate to Tools | Mapping Designer | Mapping | Wizard | Slowly Changing Dimensions, as shown in the following screenshot:
A new window will pop up, asking you the name (
m_SCD2_FLAG
) of the new SCD2 mapping. Select Type 2 Dimension - keep a full history of the changes in the target, as we are implementing SCD2, and click on Next. - The next screen will ask you to select the source. Select a source from the drop-down list. We are using
EMP_FILE.txt
as the source file for our reference. Also, specify the name of the target you wish to create. We will name the targetEMPLOYEE_SCD2_FLAG
in this book for our reference. Then, click on Next. - In the next window, select
EMPLOYEE_ID
as Logical Key Field. Also, addLOCATION
under Fields to compare the changes, and click on Next. - The next screen asks you to choose the option to maintain the history in the target. Select Mark the 'current' dimension record with a flag, and click on Finish.
- The wizard generates a complete mapping in your Mapping Designer Workspace. Make the necessary changes to the mapping, if required.
- Change the target data type from the flat file to
Oracle
table, as shown in the following figure:
When we create a mapping using the flag option, the wizard creates the following two additional columns in the target table:
PM_PRIMARY_KEY
: The wizard generates the primary key for each row to be inserted into the target. Please note thatEMPLOYEE_ID
will not be the primary key in the table.PM_CURRENT_FLAG
: The wizard loads1
for each new record inserted into the table and marks all history records as0
; this will allow us to differentiate between current and historical records.
The Informatica PowerCenter SCD2 mapping uses the LKP_GetData
lookup transformation to look up the data in the target table and uses the EXP_DetectChanges
expression transformation to compare the data with the source data. Based on the comparison, the expression transformation marks a record as NewFlag
or ChangedFlag
. The mapping is divided into three flows:
- The
FIL_InsertNewRecord
filter transformation allows only theNewFlag
record to pass further and filter theChangedFlag
record from the first flow. It passes new records toUPD_ForceInserts
, which inserts these records into the target. TheSEQ_GenerateKeys
sequence generator generates the primary key for eachNewFlag
record. TheEXP_KeyProcessing_InsertNew
expression transformation multiplies theNEXTVAL
value by1000
and loads1
as the current flag for each new row. - The
FIL_InsertChangedRecord
filter transformation allows only theChangedFlag
record to get passed toUPD_ChangedInserts
, which inserts changed records into the target, which isEMPLOYEE_SCD2_FLAG1
. TheEXP_KeyProcessing_InsertChanged
expression transformation increments the primary key by1
and loads the current flag as1
to indicate that the updated row contains the current data. - The
FIL_UpdateChangedRecord
filter transformation passes the primary key of the previous value for everyChangedFlag
record toUPD_ChangedUpdate
, which updates changed records in the target, which isEMPLOYEE_SCD2_FLAG2
. TheEXP_KeyProcessing_UpdateChanged
expression transformation changes the current flag to0
to indicate the row doesn't contain the current data anymore.
Let's work through each transformation that is used in the SCD2 mapping:
- Source qualifier (
SQ_EMP_FILE
): This extracts the data from the file or table that you used as the source in the mapping. It passes data to the downstream transformations, that is, lookup, expression, and filter transformation. - Lookup (
LKP_GetData
): This is used to look up the target table. It caches the existing data fromEMPLOYEE_SCD2_FLAG
. TheEMPLOYEE_ID=IN_EMPLOYEE_ID
condition will compare the data with the source and target table. It passes the data based on the comparison with the expression transformation. - Expression (
EXP_DetectChanges
): This receives the data from the upstream transformation and based on that, it creates two flags, which areNewFlag
andChangedFlag
:- Condition for NewFlag:
IIF(ISNULL(PM_PRIMARYKEY), TRUE, FALSE)
- Condition for ChangedFlag:
IIF(NOT ISNULL(PM_PRIMARYKEY) AND (DECODE(LOCATION,PM_PREV_LOCATION,1,0)=0), TRUE, FALSE)
Based on the condition, it passes the data to downstream filter transformations.
- Condition for NewFlag:
- Filter (
FIL_InsertNewRecord
): This filters the records that come from the upstream expression transformation and are marked asChangedFlag
; it only allows records asNewFlag
to get passed to theUPD_ForceInserts
update strategy. - Filter (
FIL_InsertChangedRecord
): This filters the records that come from the upstream expression transformation and are marked asNewFlag
; it only allows records asChangedFlag
to get passed to theUPD_ChangedInserts
update strategy. - Filter (
FIL_UpdateChangedRecord
): This filters the records that come from the upstream expression transformation and are marked asNewFlag
; it only allows records marked asChangedFlag
to pass. For every record marked asChangedFlag
, the filter passes the primary key of the previous version to theUPD_ChangedUpdate
update strategy. - Update strategy (
UPD_ForceInserts
): This uses theDD_INSERT
condition to insert the data into theEMPLOYEE_SCD2_FLAG
target instance. - Update strategy (
UPD_ChangedInserts
): This uses theDD_INSERT
condition to insert data into target instanceEMPLOYEE_SCD2_FLAG1
. - Update strategy (
UPD_ChangedUpdate
): This uses theDD_UPDATE
condition to overwrite the existingLOCATION
value into the target, which isEMPLOYEE_SCD2_FLAG2
. - Sequence generator (
SEQ_GenerateKeys
): This generates a sequence of values forPM_PRIMARYKEY
for each row marked asNewFlag
into the target, incrementing the value by1
. - Expression (
EXP_KeyProcessing_InsertNew
): This multipliesNEXTVAL
generated by the sequence generator by1000
using theNEXTVAL*1000
condition. Note that you can change this number as per your requirement. Using1000
here means that we can maintain a1000
history of a particular record. This creates a current flag of1
for eachNewFlag
record to load into thePM_CURRENT_FLAG
column in the target. - Expression (
EXP_KeyProcessing_InsertChanged
): This is used to increment the primary key by1
using thePM_PRIMARYKEY + 1
condition. It also creates a current flag of1
for eachNewFlag
record to load thePM_CURRENT_FLAG
column in the target. - Expression (
EXP_KeyProcessing_UpdateChanged
): This is used to setPM_CURRENT_FLAG
to0
for the record marked asChanged
, indicating that the record is no longer current. - Target (
EMPLOYEE_SCD2_FLAG
): This is the target table instance that accepts new records into the target table. - Target (
EMPLOYEE_SCD2_FLAG1
): This is the target table instance that accepts changed records into the target table. - Target (
EMPLOYEE_SCD2_FLAG2
): This is the target table instance that allows updates to existing records into the target table.