Learning Informatica PowerCenter 9.x
上QQ阅读APP看书,第一时间看更新

SCD3 – store something, if not everything!

To implement SCD3 using wizard, perform the following steps:

  1. In the designer, navigate to Tools | Mapping Designer | Mapping | Wizard | Slowly Changing Dimensions, as shown in the following screenshot:
    SCD3 – store something, if not everything!
  2. A new window will pop up, asking you the name (m_SCD3) of the new SCD mapping. Also, select the type of SCD you wish to implement. Select Type 3 Dimension - keep the current and previous value in the target, as we are implementing SCD3, and click on Next.
    SCD3 – store something, if not everything!
  3. The next screen will ask you to select the source. Select a source from the dropdown. We are using EMP_FILE.txt as the source file for our reference. We will name the target as EMPLOYEE_SCD3 in this book for our reference. Then, click on Next.
    SCD3 – store something, if not everything!
  4. In the next window, select EMPLOYEE_ID as Logical Key Field. Also, add LOCATION under Fields to compare the changes and click on Next.
    SCD3 – store something, if not everything!
  5. In the next window, select the target columns that you wish to compare in order to detect changes. In our case, the LOCATION column in the target will be compared against PM_PREV_LOCATION. You can select a PM_EFFECT_DATE optional field to understand the loading of new or changed records, and click on Finish.
    SCD3 – store something, if not everything!
  6. The wizard generates a complete mapping in your Mapping Designer Workspace. Make the necessary changes to the mapping if required.
    SCD3 – store something, if not everything!
  7. Change the target data type from the flat file to the Oracle table, as shown in the following figure:
    SCD3 – store something, if not everything!

When we create the mapping using this option, the wizard creates three additional columns in the target table:

  • PM_PRIMARY_KEY: The wizard generates the primary key for each row to be inserted into target. Note that EMPLOYEE_ID will not be the primary key in the table.
  • PM_PREV_columnName: For every column for which we maintain the history, the wizard generates a previous column. In our case, we wish to maintain the history for the LOCATION field, so the wizard creates another column, which is PM_PREV_LOCATION.
  • PM_EFFECT_DATE: This is an optional field; the wizard loads SYSTEMDATE in this column to indicate insertions or updates to the record in the table.

The Informatica Power Center SCD2 mapping uses the LKP_GetData lookup transformation to look up the data in the Target table and uses the EXP_DetectChanges expression transformation to compare the target data with the source data. Based on the comparison, the expression transformation marks a record as NewFlag or ChangedFlag. The mapping is divided into two flows:

  • The FIL_InsertNewRecord filter transformation allows only NewFlag record to pass further and filter the ChangedFlag record from the first flow. It passes new records to UPD_ForceInserts, which inserts these records into the target. The SEQ_GenerateKeys sequence generator generates the primary key for each NewFlag record. If you select to create the PM_EFFECT_DATE column option in the wizard, the EXP_EffectiveDate_InsertNew expression transformation loads SYSTEMDATE into the PM_EFFECT_DATE column to indicate the loading of new records.
  • The FIL_UpdateChangedRecord filter transformation allows only the ChangedFlag record to pass further. The current data is passed from the SQ_EMP_FILE source qualifier, and the previous data is taken from the target by using a lookup transformation to load the data in PM_PREV_LOCATION. It passes changed records to UPD_ChangedUpdates, which updates changed records in the target. If you select to create the PM_EFFECT_DATE column in the wizard, the expression transformation EXP_EffectiveDate_InsertChanged updates SYSTEMDATE in the PM_EFFECT_DATE column to indicate that new records have been updated.

Let's work through each transformation that is used in the SCD2 mapping:

  • Source qualifier (SQ_EMP_FILE): This extracts the data from the file or table that you used as the source in the mapping. It passes data to the downstream transformations, that is, lookup, expression, and filter transformation.
  • Lookup (LKP_GetData): This is used to look up the target table. It caches the existing data from EMPLOYEE_SCD3. The EMPLOYEE_ID=IN_EMPLOYEE_ID condition will compare the data with the source table and target table. It passes the data based on the comparison with the expression transformation.
  • Expression (EXP_DetectChanges): This receives the data from the upstream transformation and based on that, it creates two flags, which are NewFlag and ChangedFlag. The conditions for both the flags are as follows:
    • NewFlag: IIF(ISNULL(PM_PRIMARYKEY), TRUE, FALSE)
    • ChangedFlag: IIF(NOT ISNULL(PM_PRIMARYKEY) AND (DECODE(LOCATION,PM_PREV_LOCATION,1,0)=0), TRUE, FALSE)

    Based on the condition, it passes the data to downstream filter transformations.

  • Filter (FIL_InsertNewRecord): This filters the records that come from the upstream expression transformation and are marked as ChangedFlag; it allows records with NewFlag to get passed to the UPD_ForceInserts update strategy.
  • Filter (FIL_UpdateChangedRecord): This filters the records that come from the upstream expression transformation and are marked as NewFlag; it allows records with ChangedFlag to get passed to the UPD_ChangedUpdate update strategy. It uses the value of the LOCATION field that is returned from LKP_GetData to load PM_PREV_LOCATION.
  • Update strategy (UPD_ForceInserts): This uses the DD_INSERT condition to insert data into the EMPLOYEE_SCD3 target instance.
  • Update strategy (UPD_ChangedUpdate): This uses the DD_UPDATE condition to overwrite the existing LOCATION field into the EMPLOYEE_SCD3 target instance. It passes data to EXP_EffectiveDate_insertChanged in order to load PM_PREV_LOCATION in the target.
  • Sequence generator (SEQ_GenerateKeys): This generates a sequence of values for each new row marked as NewFlag that comes into the target, getting incremented by 1. It passes the generated value to EXP_KeyProcessing_InsertNew.
  • Expression (EXP_EffectiveDate_InsertNew): This transformation is created by the wizard only if you selected to load the PM_EFFECT_DATE option in the wizard. It loads the generated value in the PM_PRIMARYKEY column into the target, which is EMPLOYEE_SCD3. It loads SYSTEMDATE into the PM_EFFECT_DATE column in the target, marking the start of the record.
  • Expression (EXP_EffectiveDate_InsertChanged): This transformation is created by the wizard only if you selected to load the PM_EFFECT_DATE option in wizard. It loads the generated value in the PM_PRIMARYKEY column in the EMPLOYEE_SCD32 target instance. It loads SYSTEMDATE into the PM_EFFECT_DATE column in the target in order to indicate that the record has been updated.
  • Target (EMPLOYEE_SCD3): This is the Target table instance that accepts new records into the target table instance.
  • Target (EMPLOYEE_SCD31): This is the Target table instance that accepts updates to the existing row in the target table instance.

With this, we saw in detail how to implement the different types of SCDs. Note that we have learned how to implement SCD using wizard. You can also manually create the mapping in order to get more practice and better hands-on experience.