What’s Your (Dimension) Type?
In the data warehousing world, much attention has been paid to detailing the different dimension types and the design selection criteria for each. A related data warehousing design conundrum has garnered little fanfare, however—a post go-live transition from one dimension type to another.
In today’s post, we will focus on the transition from a Type 1 to a Type 2 dimension—highlighting not the technical how-tos of Type 2 implementation but the steps to identify and modify the downstream items that could be adversely affected by your Type 2 implementation. We won’t go into the nuances of Type 1 and Type 2 dimensions within this post; rather, the esteemed Ralph Kimball has completed thorough write-ups that can be referenced to gain a greater understanding of each type:
While this post will be skewed to an Oracle infrastructure, these techniques can be tweaked for use across all of the major database and BI platforms.
Step 1: Identify Your Type 1 to Type 2 Dimension Transition Candidate
The first step in any Type 1 to Type 2 Dimension transition is to identify your dimension transition candidate, or the dimension you want to transition from a Type 1 to a Type 2, and what changes you will be tracking within the dimension.
Step 2: Identify All Database Objects Referencing This Dimension Transition Candidate
Once you’ve identified your dimension transition candidate, you’ll need to identify all of your database objects that reference this dimension. This can be done by querying the database system tables for all objects that contain either the primary key and/or the natural key of the dimension.
Within the Oracle Database:
SELECT OWNER, TABLE_NAME, COLUMN_NAME
WHERE OWNER IN (‘YOUR SCHEMA NAME 1’, ‘YOUR SCHEMA NAME 2’) AND COLUMN_NAME IN (‘THE PRIMARY KEY OF YOUR DIMENSION’, ‘THE NATURAL KEY OF YOUR DIMENSION’)
Step 3: Identify All ETL Scripts/Jobs That Populate Referential Objects & Modify Where Required
Once you’ve identified your population of database objects that reference your dimension transition candidate, you will need to identify the ETL script(s)/job(s) that populate each object. For each ETL script/job, you will need to determine whether modifications are required to accommodate the new Type 2 structure of your dimension transition candidate.
In most instances, the modifications that will be required will be on the joins to your dimension transition candidate. With most Type 1 dimensions, joins are made purely on the natural key of the dimension. These joins are valid, as in a Type 1 dimension you will find only one record per natural key within a dimension. However, in a Type 2 environment where a single natural key may have several records that reflect changes over time, joins will need to incorporate one of the following clauses, depending upon your business requirements, to maintain data integrity:
- The Transaction Date falls between the Effective and Expiration Dates of your Type 2 record — pulls the dimension record effective at the time of your transaction
- The Current Flag of your Type 2 record is True—pulls the dimension record effective at the time of the load
Step 4: Validate BI Metadata Object Joins & Reimport Dimension Metadata
In addition to the join validations, the dimension table metadata should be reimported into the OBIEE Physical Layer to bring in the additional columns you added to the database with your Type 2 implementation. Depending upon business requirements and whether your users want to see the effective dates of various dimension records or determine which dimension record is current at any given time, these columns should also be promoted to your Business Model and Mapping and Presentation Layers.
With the four sequential steps described above, you will ensure a Type 1 to Type 2 Dimension transition that maintains data integrity both in the transitioned dimension as well as all downstream items while providing your users the change tracking functionality they require.