Data Vault Core Components

This is a quick overview of the Data Vault core components.

The Data Vault consists of three core components: the Hub, the Link, and the Satellite.  While we are all discussing certain exceptions and details around Data Vault deployments, I thought it would be useful to “get back to basics” on the main building blocks.  In the end, committing to these constructs in their purest form should be our main goal.

The Hub represents a business key and is established the first time a new instance of that business key is introduced to the EDW (enterprise data warehouse).  It may require a multi-part key to ensure an enterprise-wide unique key; however, the cardinality of the Hub must be 1:1 with the true business key.  The Hub contains no descriptive information and no foreign keys (FKs).  It consists of the business key only, together with a warehouse machine sequence id, a load date/time stamp, and a record source.
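To make the Hub structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `hub_customer`, its column names, and the `load_hub` helper are hypothetical, chosen only for illustration; the four columns follow the description above, and the "insert only the first time the key is seen" rule is enforced with a unique constraint on the business key.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table and column names; this post does not prescribe a schema.
HUB_DDL = """
CREATE TABLE hub_customer (
    customer_sid  INTEGER PRIMARY KEY,   -- warehouse machine sequence id
    customer_bk   TEXT NOT NULL UNIQUE,  -- business key, 1:1 with the Hub row
    load_dts      TEXT NOT NULL,         -- load date/time stamp
    record_source TEXT NOT NULL          -- record source
)
"""

def load_hub(conn, business_key, record_source):
    """Insert the business key only if it is new; return its sequence id."""
    conn.execute(
        "INSERT OR IGNORE INTO hub_customer (customer_bk, load_dts, record_source) "
        "VALUES (?, ?, ?)",
        (business_key, datetime.now(timezone.utc).isoformat(), record_source),
    )
    row = conn.execute(
        "SELECT customer_sid FROM hub_customer WHERE customer_bk = ?",
        (business_key,),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute(HUB_DDL)
first = load_hub(conn, "CUST-001", "crm")
again = load_hub(conn, "CUST-001", "billing")  # same key arriving from another source
hub_rows = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
first_source = conn.execute("SELECT record_source FROM hub_customer").fetchone()[0]
```

Loading the same business key a second time returns the existing sequence id and leaves the original load stamp and record source untouched, which is one way to keep the Hub at 1:1 with the business key.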

A Link represents an association between business keys and is established the first time a new unique association is presented to the EDW.  It can represent an association between several Hubs and other Links.  It maintains a 1:1 relationship with the business-defined association between that set of keys.  Just like the Hub, it contains no descriptive information.  The Link consists only of the sequence ids from the Hubs and Links that it relates, together with a warehouse machine sequence id, a load date/time stamp, and a record source.
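A Link can be sketched the same way. The example below again uses Python's built-in sqlite3 module; the table `link_customer_product`, its columns, and the `load_link` helper are hypothetical, and the two Hub sequence ids are assumed to have been assigned already by the Hub loads. A unique constraint over the combination of parent sequence ids keeps the Link at 1:1 with the association.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical names; the parent sequence ids are assumed to come from existing Hubs.
LINK_DDL = """
CREATE TABLE link_customer_product (
    link_sid      INTEGER PRIMARY KEY,   -- warehouse machine sequence id
    customer_sid  INTEGER NOT NULL,      -- sequence id from the customer Hub
    product_sid   INTEGER NOT NULL,      -- sequence id from the product Hub
    load_dts      TEXT NOT NULL,         -- load date/time stamp
    record_source TEXT NOT NULL,         -- record source
    UNIQUE (customer_sid, product_sid)   -- one row per unique association
)
"""

def load_link(conn, customer_sid, product_sid, record_source):
    """Insert the association only if it is new; return the Link sequence id."""
    conn.execute(
        "INSERT OR IGNORE INTO link_customer_product "
        "(customer_sid, product_sid, load_dts, record_source) VALUES (?, ?, ?, ?)",
        (customer_sid, product_sid,
         datetime.now(timezone.utc).isoformat(), record_source),
    )
    row = conn.execute(
        "SELECT link_sid FROM link_customer_product "
        "WHERE customer_sid = ? AND product_sid = ?",
        (customer_sid, product_sid),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute(LINK_DDL)
a = load_link(conn, 1, 10, "orders")
b = load_link(conn, 1, 10, "orders")  # same association presented again
c = load_link(conn, 1, 11, "orders")  # a new association
link_rows = conn.execute("SELECT COUNT(*) FROM link_customer_product").fetchone()[0]
```

Note that the table carries no descriptive columns at all: anything that describes the association would belong in a Satellite attached to the Link.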

The Satellite contains the descriptive information (context) for a business key.  There can be several Satellites describing a single business key (or association of keys); however, each Satellite can describe only one key (a Hub or a Link).  Modelers have a good amount of flexibility in how they design and build Satellites.  Common approaches include splitting context by subject area, rate of change, source system, or type of data.  The Satellite is keyed by the sequence id of the Hub or Link to which it is attached plus the load date/time stamp, forming a two-part key.

Note that the Satellite, then, is the only construct that manages time slice data (data warehouse historical tracking of values over time).
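The two-part key and the time slicing can be illustrated with a small sketch, once more using Python's built-in sqlite3 module. The table `sat_customer_details`, its `name` and `city` attributes, and the `load_satellite` helper are hypothetical. Writing a new row only when the descriptive values differ from the latest row is an assumption made here for illustration; it is a common loading pattern, not something this post specifies.

```python
import sqlite3

# Hypothetical names; customer_sid is assumed to come from an existing Hub.
SAT_DDL = """
CREATE TABLE sat_customer_details (
    customer_sid  INTEGER NOT NULL,      -- sequence id of the parent Hub
    load_dts      TEXT NOT NULL,         -- load date/time stamp
    name          TEXT,                  -- descriptive attribute
    city          TEXT,                  -- descriptive attribute
    record_source TEXT NOT NULL,         -- record source
    PRIMARY KEY (customer_sid, load_dts) -- the two-part key
)
"""

def load_satellite(conn, customer_sid, load_dts, name, city, record_source):
    """Add a new time slice only when the context changed (assumed rule)."""
    latest = conn.execute(
        "SELECT name, city FROM sat_customer_details "
        "WHERE customer_sid = ? ORDER BY load_dts DESC LIMIT 1",
        (customer_sid,),
    ).fetchone()
    if latest == (name, city):
        return False  # nothing changed, so no new row
    conn.execute(
        "INSERT INTO sat_customer_details "
        "(customer_sid, load_dts, name, city, record_source) VALUES (?, ?, ?, ?, ?)",
        (customer_sid, load_dts, name, city, record_source),
    )
    return True

conn = sqlite3.connect(":memory:")
conn.execute(SAT_DDL)
r1 = load_satellite(conn, 1, "2024-01-01T00:00:00", "Acme", "Boston", "crm")
r2 = load_satellite(conn, 1, "2024-01-02T00:00:00", "Acme", "Boston", "crm")
r3 = load_satellite(conn, 1, "2024-01-03T00:00:00", "Acme", "Chicago", "crm")
history = conn.execute(
    "SELECT load_dts, city FROM sat_customer_details "
    "WHERE customer_sid = 1 ORDER BY load_dts"
).fetchall()
```

After these three loads, `history` holds two rows, one for Boston and one for Chicago, so the change over time lives entirely in the Satellite while the Hub row for the customer never changes.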

These three constructs are the building blocks for the Data Vault (DV) EDW.  Together they can be used to represent all integrated data from the organization.  The Hubs are the business keys, the Links represent all relationships, and the Satellites provide all the context and changes over time.
