http://reddymsbitools.blogspot.com

Thursday, 18 November 2010

Explain degenerated dimension in detail.

 Degenerate dimension: A column of the key section of the fact table that does not have the associated dimension table but used for reporting and analysis such column is called degenerate dimension or line item dimension.For ex we have a fact table with customer_id product_id branch_id employee_id bill_no date in key section and price quantity amount in measure section. In this fact table bill_no from key section is a single value it has no associated dimension table. Instead of cteating a seperate dimension table for that single value we can include it in fact table to improve performance.SO here the column bill_no is a degenerate dimension or line item dimension
-----Scenario 2
Degenerated Dimension is achieved through a gradual modeling approach following Dimensional Modeling standards. Let's take example of a Star Schema representing Sales Invoices. The FACT would have the "Invoiced Amount" as primary measure. Now when we look at the source of the Invoice, it is the body if the Paper Invoice that gives us the following particulars about each Invoice:
Invoice Date
Customer ID
Products within the Invoice
Reference to Order Number(s)
Invoice Number
Invoice Line Numbers (which are multiple lines in single Invoice)
Invoice Line Amount
Invoice Total Amount
When we model the above following Dimensional Modeling standards, we get following distinct Dimensions:
Calendar Dimension - representing the Invoice Date
Customer Dimension - representing Customer ID
Product Dimension - representing Products within the Invoice
Order Dimension - representing Orders
Invoice Dimension – representing Invoice Number & Invoice Line Numbers
Question comes - what attributes would be left to be part of the INVOICE DIMENSION, if at all we decide to have one! Only candidate attributes are Invoice Number and Invoice Line Numbers. But, this is at the granularity of the FACT, which stores references to all above said Dimensions as well as the measures i.e. Invoice Line Amount, Invoice Total Amount (Derived by aggregation).
It is at this situation, we may decide to degenerate the attributes Invoice Number & Invoice Line Number into the Fact and avoid having a distinct entity to represent Invoice Number / Line Numbers as a Dimension. What we achieve by this:
1. avoiding a huge join as both Fact and this Dimension would have the same granularity,
2. still able to query with Invoice Number as the entry point
So, when such a scenario appears, we make the left out attributes (i.e. Invoice Number & Invoice Line Number in our case) part of the Fact and part of the Primary Key in the Fact. This is why and how we model Degenerated Dimensions.

No comments:

Post a Comment