Learning Informatica PowerCenter 9.x
上QQ阅读APP看书,第一时间看更新

SCD2 (flag) – flag the history

To implement SCD2 by maintaining the flag, perform the following steps:

  1. In the designer, navigate to Tools | Mapping Designer | Mapping | Wizard | Slowly Changing Dimensions, as shown in the following screenshot:
    SCD2 (flag) – flag the history

    A new window will pop up, asking you the name (m_SCD2_FLAG) of the new SCD2 mapping. Select Type 2 Dimension - keep a full history of the changes in the target, as we are implementing SCD2, and click on Next.

    SCD2 (flag) – flag the history
  2. The next screen will ask you to select the source. Select a source from the drop-down list. We are using EMP_FILE.txt as the source file for our reference. Also, specify the name of the target you wish to create. We will name the target EMPLOYEE_SCD2_FLAG in this book for our reference. Then, click on Next.
    SCD2 (flag) – flag the history
  3. In the next window, select EMPLOYEE_ID as Logical Key Field. Also, add LOCATION under Fields to compare the changes, and click on Next.
    SCD2 (flag) – flag the history
  4. The next screen asks you to choose the option to maintain the history in the target. Select Mark the 'current' dimension record with a flag, and click on Finish.
    SCD2 (flag) – flag the history
  5. The wizard generates a complete mapping in your Mapping Designer Workspace. Make the necessary changes to the mapping, if required.
    SCD2 (flag) – flag the history
  6. Change the target data type from the flat file to Oracle table, as shown in the following figure:
    SCD2 (flag) – flag the history

When we create a mapping using the flag option, the wizard creates the following two additional columns in the target table:

  • PM_PRIMARY_KEY: The wizard generates the primary key for each row to be inserted into the target. Please note that EMPLOYEE_ID will not be the primary key in the table.
  • PM_CURRENT_FLAG: The wizard loads 1 for each new record inserted into the table and marks all history records as 0; this will allow us to differentiate between current and historical records.

The Informatica PowerCenter SCD2 mapping uses the LKP_GetData lookup transformation to look up the data in the target table and uses the EXP_DetectChanges expression transformation to compare the data with the source data. Based on the comparison, the expression transformation marks a record as NewFlag or ChangedFlag. The mapping is divided into three flows:

  • The FIL_InsertNewRecord filter transformation allows only the NewFlag record to pass further and filter the ChangedFlag record from the first flow. It passes new records to UPD_ForceInserts, which inserts these records into the target. The SEQ_GenerateKeys sequence generator generates the primary key for each NewFlag record. The EXP_KeyProcessing_InsertNew expression transformation multiplies the NEXTVAL value by 1000 and loads 1 as the current flag for each new row.
  • The FIL_InsertChangedRecord filter transformation allows only the ChangedFlag record to get passed to UPD_ChangedInserts, which inserts changed records into the target, which is EMPLOYEE_SCD2_FLAG1. The EXP_KeyProcessing_InsertChanged expression transformation increments the primary key by 1 and loads the current flag as 1 to indicate that the updated row contains the current data.
  • The FIL_UpdateChangedRecord filter transformation passes the primary key of the previous value for every ChangedFlag record to UPD_ChangedUpdate, which updates changed records in the target, which is EMPLOYEE_SCD2_FLAG2. The EXP_KeyProcessing_UpdateChanged expression transformation changes the current flag to 0 to indicate the row doesn't contain the current data anymore.

Let's work through each transformation that is used in the SCD2 mapping:

  • Source qualifier (SQ_EMP_FILE): This extracts the data from the file or table that you used as the source in the mapping. It passes data to the downstream transformations, that is, lookup, expression, and filter transformation.
  • Lookup (LKP_GetData): This is used to look up the target table. It caches the existing data from EMPLOYEE_SCD2_FLAG. The EMPLOYEE_ID=IN_EMPLOYEE_ID condition will compare the data with the source and target table. It passes the data based on the comparison with the expression transformation.
  • Expression (EXP_DetectChanges): This receives the data from the upstream transformation and based on that, it creates two flags, which are NewFlag and ChangedFlag:
    • Condition for NewFlag: IIF(ISNULL(PM_PRIMARYKEY), TRUE, FALSE)
    • Condition for ChangedFlag: IIF(NOT ISNULL(PM_PRIMARYKEY) AND (DECODE(LOCATION,PM_PREV_LOCATION,1,0)=0), TRUE, FALSE)

    Based on the condition, it passes the data to downstream filter transformations.

  • Filter (FIL_InsertNewRecord): This filters the records that come from the upstream expression transformation and are marked as ChangedFlag; it only allows records as NewFlag to get passed to the UPD_ForceInserts update strategy.
  • Filter (FIL_InsertChangedRecord): This filters the records that come from the upstream expression transformation and are marked as NewFlag; it only allows records as ChangedFlag to get passed to the UPD_ChangedInserts update strategy.
  • Filter (FIL_UpdateChangedRecord): This filters the records that come from the upstream expression transformation and are marked as NewFlag; it only allows records marked as ChangedFlag to pass. For every record marked as ChangedFlag, the filter passes the primary key of the previous version to the UPD_ChangedUpdate update strategy.
  • Update strategy (UPD_ForceInserts): This uses the DD_INSERT condition to insert the data into the EMPLOYEE_SCD2_FLAG target instance.
  • Update strategy (UPD_ChangedInserts): This uses the DD_INSERT condition to insert data into target instance EMPLOYEE_SCD2_FLAG1.
  • Update strategy (UPD_ChangedUpdate): This uses the DD_UPDATE condition to overwrite the existing LOCATION value into the target, which is EMPLOYEE_SCD2_FLAG2.
  • Sequence generator (SEQ_GenerateKeys): This generates a sequence of values for PM_PRIMARYKEY for each row marked as NewFlag into the target, incrementing the value by 1.
  • Expression (EXP_KeyProcessing_InsertNew): This multiplies NEXTVAL generated by the sequence generator by 1000 using the NEXTVAL*1000 condition. Note that you can change this number as per your requirement. Using 1000 here means that we can maintain a 1000 history of a particular record. This creates a current flag of 1 for each NewFlag record to load into the PM_CURRENT_FLAG column in the target.
  • Expression (EXP_KeyProcessing_InsertChanged): This is used to increment the primary key by 1 using the PM_PRIMARYKEY + 1 condition. It also creates a current flag of 1 for each NewFlag record to load the PM_CURRENT_FLAG column in the target.
  • Expression (EXP_KeyProcessing_UpdateChanged): This is used to set PM_CURRENT_FLAG to 0 for the record marked as Changed, indicating that the record is no longer current.
  • Target (EMPLOYEE_SCD2_FLAG): This is the target table instance that accepts new records into the target table.
  • Target (EMPLOYEE_SCD2_FLAG1): This is the target table instance that accepts changed records into the target table.
  • Target (EMPLOYEE_SCD2_FLAG2): This is the target table instance that allows updates to existing records into the target table.