OBIA 11g: Dimension Ch-Ch-Changes
A little primer on Slowly Changing Dimensions (SCD). Slowly Changing Dimensions are a star schema design principle and design pattern used to reflect how source data field changes are handled in dimension tables. There are basically 5 types that I know of. Of which 2 are the most prevalent and used in OBIA. These 2 types are named Type 1 and Type2. (The other 3 are Time-stamp, Type 3 and Hybrid).
- Type 1: This type is used where no historical change tracking is necessary. Whenever a column value changes the dimension record is updated. The column now reflects the current value and there is no trace of the old value. Always AS-IS view on the dimensional attributes and history of dimension not maintained.
- Type 2: This type is used when historical change tracking is important. When the source data changes for a historical significant attribute, it will trigger for a new record to be created with these changes. The old value is preserved. In this type of SCD, columns can be either historically significant (SCD2 column) or not (SCD1 column). The SCD2 columns trigger for new records, SCD1 columns will not and will just update the current record, or if it is setup to update all history. Update all history, means these SCD1 columns will trigger an update for this column for all records for this particular unique dimension set.
Using Type 2 SCDs, one would or could have it that they can view dimensional records in two different views: point-in-time or AS-IS. Point-in-time is where the dimensional record at that events date is used. So given a factual event at say June 1st 2010, we would see the snapshot or version of that dimension’s values on June 1st 2010. With AS-IS, we always see the most current values of the dimension. So for that same example above, we would see the most current values of the dimension instead of the historical snapshot.
In OBIA versions 7.9.X and older, it always used point-in-time. For someone that wished for the AS-IS view, there would be significant customizations and changes in the ETL and RPD. Oracle recognized this deficiency and changed its methodology starting in 11g. In 11g, we now have point-in-time still OOTB; but, the switch to AS-IS is very simple and just a simple RPD change. Welcome the SCD1_WID column!
Now for each Dimension table (_D), that can be setup for SCD2, there is a corresponding Type 1 table (_T1_D). So for example, W_EMPLOYEE_D has a corresponding W_EMPLOYEE_T1_D table, as this dimension table can be setup for SCD2. The type 1 table holds only one record for each unique dimension record, uniquely identified by datasource_num_id and integration_id. It is here where the SCD1_WID value is born. The type 1 table has a SCD1_WID column which is the surrogate key for the type 1 table and populated by a sequence number. So for our example, W_EMPLOYEE_D which can have N records for say employee ‘John Smith’ there will only be one record for ‘John Smith’ in the T1 table. This record will be uniquely identified by the SCD1_WID. These T1 tables are very thin and typically only have the following 6 columns:
The T1 SCD1_WID values are used to populate the related dimension’s SCD1_WID column and also the conformed Facts to the dimension. So the T1 tables are first populated, then the dimensions perform a lookup to their T1 table to get the SCD1_WID to populate its SCD1_WID column. The facts then lookup the dimensions and use their SCD1_WID column for the WID foreign key fields, if the dimensions are SCD else it uses the ROW_WID value.
In OBIA 7.9.x and older, the dimensions ROW_WID was always used, as there were no SCD1_WID columns. So facts always joined to dimensions via point-in-time, with no flexibility. Now they use the SCD1_WID and offer the flexibility to change between the two views.
All of this is managed in the ETL layer within the ‘IKM BIAPPS Slowly Changing Dimension’. This KM manages the T1 table as well as the SCD logic for the dimension. It inserts records in the T1 table within the ‘Insert New Type 1 values’ step as shown below.
The dimension’s SCD1_WID column is also populated by a lookup to the T1 table later on in another step within the same KM. In essence this SCD IKM handles everything in terms of the SCD functionality and this T! table.
Within the facts’ SIL interfaces, they use this SCD1_WID column from the dimensions to populate the _WID fields. It will use the ROW_WID for those dimensions that are not SCD, aka Type 1 dimensions.
The RPD physical join expressions used to join the facts to the dimensions via the ROW_WIDs, but now they utilize the SCD1_WID as seen below.
In the 11g out of the box RPD, it is setup to use the point-in-time view. Just as it did so in prior releases. However, if one would want to use instead the AS_IS view it is just a simple RPD change. Only a change in the physical join expression is needed. I will use the W_WRKFC_EVT_MONTH_F -> W_EMPLOYEE_D join as the example:
W_WRKFC_EVT_MONTH_F.employee_wid = W_EMPLOYEE_D.scd1_wid and W_WRKFC_EVT_MONTH_F.effective_from_dt >= W_EMPLOYEE_D.effective_from_dt and W_WRKFC_EVT_MONTH_F.effecitve_from_dt < W_EMPLOYEE_D.effective_to_dt
W_WRKFC_EVT_MONTH_F.employee_wid = W_EMPLOYEE_D.scd1_wid and W_EMPLOYEE_D.currrent_flg = ‘Y’
Just changing the expression pieces in bold, italicized dictates the dimensional view you wish to accomplish, point-in-time vs AS-IS.