Data Warehousing - Date Dimension

In this example I am going to show you how to create a date dimension in SQL Server. You can take the code and apply it to whatever other RDBMS you are working with.

In Data Warehousing - What is a Dimension?

A dimension is a data entity used to categorize a resultant dataset. If this explanation proves to be a bit obtuse you can work through the slideshare presentation I have embedded.

Warehousing dimension star-snowflake_schemas from Eric Matthews

Date Dimension Misconceptions

When discussing data dimensions I have noted that every explanation I have encountered focusses solely on the schema alone. Frankly this is not enough explanation, even for database professionals that are entering the field of data warehousing for the first time.

It may not be intuitive what a date dimension entails. The best way to understand it is for us to build one.

Building a Date Dimension in SQL Server

We are going to build a date dimension table with the hierarchy Day, Date, Month, Year and Quarter.

Date Dimension Table Schema

CREATE TABLE dbo.DateDimension
 id            INT              NOT NULL  IDENTITY(1,1)  , 
 Day           INT              NOT NULL  , 
 Month         INT              NOT NULL  , 
 Year          INT              NOT NULL  , 
 quarter       INT              NOT NULL  , 
 date          DATETIME         NOT NULL  , 
 datevarchar   VARCHAR(30)      NOT NULL  , 


T-SQL Script to Populate the Data Dimension Table

DECLARE @startdate datetime
   DECLARE @enddate datetime
SET @startdate = '01/01/2000'
   SET @enddate = '12/31/2013'
DECLARE @loopdate datetime
   SET @loopdate = @startdate
WHILE @loopdate <= @enddate
   CASE WHEN Month(@loopdate) IN (1, 2, 3) THEN 1
   WHEN Month(@loopdate) IN (4, 5, 6) THEN 2
   WHEN Month(@loopdate) IN (7, 8, 9) THEN 3
   WHEN Month(@loopdate) IN (10, 11, 12) THEN 4
   SET @loopdate = DateAdd(d, 1, @loopdate)

Looking at the data

id    Day  Month  Year  quarter  date                   datevarchar  
1     1    1      2000  1        2000-01-01 00:00:00.0  2000/01/01   
2     2    1      2000  1        2000-01-02 00:00:00.0  2000/01/02   
3     3    1      2000  1        2000-01-03 00:00:00.0  2000/01/03   
4     4    1      2000  1        2000-01-04 00:00:00.0  2000/01/04   
5     5    1      2000  1        2000-01-05 00:00:00.0  2000/01/05   
6     6    1      2000  1        2000-01-06 00:00:00.0  2000/01/06   

Debrief on Date Dimension Code

First, looking at the data you can see that our data dimension is a fixed data set that represents the dates we might want to associate with a Fact table. In the example presented I loaded date data from 2000 through 2013. Dimensions provide structured label information to otherwise unordered numeric measures. Put in other terms, dimensions provide key resolution that allows us to better contextualize the returned dataset.

The T-SQL code provided to load the data is for the most part self-explanatory (assuming you understand T-SQL Code). I did add a varchar field to provide just the date portion of the datetime column. There are other ways to do this, but you should get the general idea.