A conformed dimension is a dimension
that has exactly the same meaning and content when referenced from
different fact tables. A conformed dimension can refer to multiple tables in
multiple data marts within the same organization. For two dimension tables to
be considered conformed, they must either be identical or one must be a
subset of the other. There cannot be any other type of difference between the two
tables. For example, two dimension tables that are exactly the same except for
the primary key are not considered conformed dimensions.
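
The rule above can be sketched as a small check. This is a minimal
illustration, not a standard routine: the function name is_conformed and the
sample column names are hypothetical, each table is assumed to be a list of
row dictionaries with the same columns, and "subset" is read here as a subset
of rows. Keys are compared along with every other column, so tables that
differ only in their primary keys fail the check.

```python
def is_conformed(dim_a, dim_b):
    """Return True if two dimension tables are identical or one is a
    row-wise subset of the other. Each table is a list of row dicts."""
    def as_row_set(rows):
        # Turn each row into a hashable, order-independent tuple of
        # (column, value) pairs so rows can be compared as set members.
        return {tuple(sorted(row.items())) for row in rows}

    rows_a, rows_b = as_row_set(dim_a), as_row_set(dim_b)
    # Identical tables satisfy both subset tests; a true subset satisfies one.
    return rows_a <= rows_b or rows_b <= rows_a


# Hypothetical sample data for illustration only.
full = [{"date_key": 1, "month": "Jan"}, {"date_key": 2, "month": "Feb"}]
subset = [{"date_key": 1, "month": "Jan"}]
rekeyed = [{"date_key": 101, "month": "Jan"}, {"date_key": 102, "month": "Feb"}]

print(is_conformed(full, subset))   # one table is a subset of the other
print(is_conformed(full, rekeyed))  # same content, different primary keys
```

The first call reports the tables as conformed; the second does not, because
the primary keys differ even though the descriptive content matches.
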
Why are conformed dimensions important?
This goes back to the definition of a data warehouse as being "integrated." Integrated means that even if
a particular entity had different meanings and different attributes in the
source systems, there must be a single version of this entity once the data
flows into the data warehouse.
The time dimension is a common
conformed dimension in an organization. Usually the only rules to consider with
the time dimension are whether there is a fiscal year in addition to the
calendar year, and how a week is defined. Fortunately, both are relatively
easy to resolve. In the case of fiscal vs. calendar year, one may go with
either fiscal or calendar, or an alternative is to have two separate conformed
dimensions, one for fiscal year and one for calendar year. The definition of a
week can also differ between departments in large organizations: Finance
may use Saturday to Friday, while Marketing may use Sunday to Saturday. In this
case, we should decide on a definition and move on. The nice thing about the
time dimension is once these rules are set, the values in the dimension table
will never change. For example, October 16th will never become the 15th day in
October.
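
As a rough sketch of how these two rules might be fixed once and then applied
to every date, the following generates time dimension rows. Everything named
here is an assumption made for illustration: the function
build_date_dimension, the column names, a fiscal year that starts in July and
is labelled by the calendar year in which it ends, and a week that starts on
Sunday. For brevity it keeps the calendar and fiscal year as two columns of
one table; splitting them into two tables would give the two-dimension
alternative described above.

```python
from datetime import date, timedelta


def build_date_dimension(start, end, fiscal_year_start_month=7, week_start=6):
    """Build one row per day from start to end, inclusive.

    week_start uses date.weekday() numbering: Monday=0 ... Sunday=6.
    """
    rows = []
    day = start
    while day <= end:
        # Assumed rule: a fiscal year is named after the calendar year in
        # which it ends, so months on or after the start month roll forward.
        if fiscal_year_start_month == 1 or day.month < fiscal_year_start_month:
            fiscal_year = day.year
        else:
            fiscal_year = day.year + 1

        # Number of days elapsed since the agreed first day of the week.
        days_into_week = (day.weekday() - week_start) % 7

        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),
            "full_date": day,
            "calendar_year": day.year,
            "fiscal_year": fiscal_year,
            "day_of_month": day.day,
            "week_start_date": day - timedelta(days=days_into_week),
        })
        day += timedelta(days=1)
    return rows


october = build_date_dimension(date(2023, 10, 1), date(2023, 10, 31))
for row in october[14:17]:
    print(row["date_key"], row["fiscal_year"], row["week_start_date"])
```

Because the rules are passed in as parameters and never depend on the data
being loaded, rerunning the function for the same date range always yields the
same rows.
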
Not all conformed dimensions are as
easy to produce as the time dimension. An example is the customer dimension. In
any organization with some history, there is a high likelihood that different
customer databases exist in different parts of the organization. Achieving a
conformed customer dimension means those data must be compared against each
other, rules must be set, and data must be cleansed. In addition, when we are
doing incremental data loads into the data warehouse, we'll need to apply the
same rules to the new values to make sure we are only adding truly new
customers to the customer dimension.
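
The incremental step might look something like the sketch below. The matching
rule is purely an assumption for illustration: two records are treated as the
same customer when their name and email agree after trimming whitespace and
ignoring case. Real matching rules are usually far more involved. The
function names normalize_customer and load_new_customers and the column names
are likewise hypothetical.

```python
def normalize_customer(record):
    """Hypothetical matching rule: name and email compared without regard
    to case or extra whitespace."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


def load_new_customers(dimension, incoming):
    """Append to dimension only those incoming records that do not match an
    existing customer under the same rule used for the initial load.
    Returns the list of rows that were added."""
    known = {normalize_customer(row) for row in dimension}
    next_key = max((row["customer_key"] for row in dimension), default=0) + 1

    added = []
    for record in incoming:
        match_key = normalize_customer(record)
        if match_key in known:
            continue  # already in the dimension, or repeated in this batch
        known.add(match_key)
        new_row = {
            "customer_key": next_key,
            "name": record["name"],
            "email": record["email"],
        }
        dimension.append(new_row)
        added.append(new_row)
        next_key += 1
    return added


# Hypothetical sample data for illustration only.
customer_dim = [
    {"customer_key": 1, "name": "Ann Lee", "email": "ann@example.com"},
]
batch = [
    {"name": "ann  lee", "email": "ANN@example.com "},  # same customer as key 1
    {"name": "Bob Ray", "email": "bob@example.com"},    # truly new
]
added = load_new_customers(customer_dim, batch)
print(len(added), len(customer_dim))
```

The point of the design is that normalize_customer is the single place where
the matching rule lives, so the initial load and every later incremental load
apply exactly the same test.
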
Building a conformed dimension is also
part of the process in master data management, or MDM. In MDM, one must not only make sure the master
data dimensions are conformed, but also bring that conformity back to
the source systems.