Saturday 5 July 2014

Slowly Changing Dimensions (SCD)

Slowly Changing Dimensions (SCD) : With Slowly Changing Dimensions (SCDs) , data changes slowly, rather than changing on a time-based, regular schedule. In Data Warehouse there is a need to track changes in dimension attributes in order to report historical data.
Example of SCD dimensions are : Employee marital Status , City
Varios Types of SCD:
·         Type 0 - The passive method
·         Type 1 - Overwriting the old value
·         Type 2 - Creating a new additional record
·         Type 3 - Adding a new column
·         Type 4 - Using historical table
·         Type 6 - Combine approaches of types 1,2,3 (1+2+3=6)

SCD Type 1: Overwriting the old value. In this method no history of dimension changes is kept in the database. The old dimension value is simply overwritten be the new one. 
Example, we have this table:
Emp ID
Empname
City
123
Naveen
Delhi

After Naveen moved from Delhi to bangalore, the new information replaces the new record.
Emp ID
Empname
City
123
Naveen
Bangalore

The disadvantage of the Type I method is that there is no history in the data warehouse. It has the advantage however that it's easy to maintain
Note: This type should be only used when it is not  necessary  to track  historical changes.

SCD Type 2: This method tracks historical data by creating multiple records means a new record is added to give us the new information hence both records will be present

In our example,we have this table 
Emp ID
Empname
City
Start_Date
End_Date
Active_Flg
123
Naveen
Delhi
14-10-2010
31-12-9999
Y

after naveen moved from Delhi to bangalore ,a new record will be added in the table:
Emp ID
Empname
City
Start_Date
End_Date
Active_Flg
123
Naveen
Delhi
14-10-2010
22-5-2014
N
124
Naveen
Bangalore
23-5-2014
31-12-9999
Y

The advantage of the Type 3 method is that it helps us to keep all historical information  .it has the disadvantage however that this will cause the size of the table to grow fast

Note: This type should be only used when it is necessary  to track  historical changes.

SCD Type 3:  This method tracks changes using separate columns and preserves limited history.
In this type usually only the current and previous value of dimension is kept in the datawarehouse. there will be two columns one represents previous information  and another represents current information .there will also be a column that indicates when the current value becomes active.

Emp ID
Empname
Original City
Current_City
Effective_start_date
123
Naveen
Delhi
Bangalore
23-5-2014

The advantage of the Type 3 method is that it helps us to keep previous change only also it does not increase the size of table.it has the disadvantage however that it won't be able to keep all history where an attribute is changed more than twice.

No comments:

Post a Comment

A fatal error occurred while processing the request. The server responded with: FOProcessor cannot run without a valid TMP directory. Please check if TMP directory exist and write enabled.

When attempting to export a Dashboard to Excel the following error message in sawlog is found: A fatal error occurred while processi...