Ensemble Modeling Forms:  Modeling the Agile Data Warehouse

Anchor Modeling, Data Vault Modeling, Focal Point Modeling, to name a few.  In fact, dozens of data warehouse data modeling patterns have been introduced over the past decade.  The leading ones share a set of defining characteristics, and these characteristics are combined in the definition of Ensemble modeling forms (AKA Data Warehouse Modeling).  See coverage notes on the Next Generation DWH Modeling conference here (and summary here).

The differences between them define the flavors of Ensemble Modeling.  These flavors have vastly more in common than they have differences.  Compared with 3NF or Dimensional modeling, the Ensemble forms follow something like an 80/20 rule: most of their defining characteristics are shared, and only a minority distinguish one flavor from another.

  • All these forms practice Unified Decomposition (breaking concepts into component parts) with a central unique instance as a centerstone (Anchor, Hub, Focal Point, etc.). 
  • Each separates context attributes into dedicated table forms that are directly attached to the centerstone. 
  • Each uncouples relationships from the concepts they seek to relate. 
  • Each purposefully manages historical time-slice data with varying degrees of sophistication concerning temporal variations. 
  • Each recognizes the differences between static and dynamic data.
  • Each recognizes the reality of working with constantly changing sources, transformations, and rules. 
  • Each recognizes the dichotomy of the enterprise-wide natural business key.
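
As a rough illustration of the first three characteristics, here is a minimal sketch using Python's built-in sqlite3 module.  All table and column names (customer, customer_contact, customer_product and so on) are hypothetical and chosen only for this example; they are not taken from any one Ensemble flavor, and a real design would follow the conventions of the flavor you choose.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not a prescribed naming standard.
DDL = """
CREATE TABLE customer (                -- centerstone: one row per unique instance
    customer_id   INTEGER PRIMARY KEY,
    business_key  TEXT NOT NULL UNIQUE
);
CREATE TABLE product (                 -- a second centerstone
    product_id    INTEGER PRIMARY KEY,
    business_key  TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_contact (        -- context, attached directly to the centerstone
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    valid_from    TEXT NOT NULL,       -- start of the time slice
    email         TEXT,
    phone         TEXT,
    PRIMARY KEY (customer_id, valid_from)
);
CREATE TABLE customer_product (        -- relationship, uncoupled from both concepts
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id    INTEGER NOT NULL REFERENCES product(product_id),
    valid_from    TEXT NOT NULL,
    PRIMARY KEY (customer_id, product_id, valid_from)
);
"""


def build_ensemble_schema():
    """Create the sketch schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def table_names(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(table_names(build_ensemble_schema()))
```

Note that neither centerstone table carries descriptive attributes or foreign keys to other concepts.  Context and relationships live in their own tables and point back to the centerstone.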

From that foundation of commonality, the various forms of Ensembles begin to take on their own flavors. 

  • While Data Vault builds the centerstone (Hub) on the natural business key, both Anchor and Focal Point center on a unique instance of a concept, where the business key is strongly coupled to the centerstone (Anchor, Focal Point) but kept separate from it. 
  • Both Data Vault and Anchor aim to model Ensembles at the Core Business Concept level while Focal Point tends to deploy slightly more abstracted or generic forms of concepts. 
  • Data Vault and Focal Point utilize forms of attribute clusters (logical combinations) for context while Anchor relies on single attribute context tables. 

     And there are other differentiating factors as well.
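
The difference in context granularity (attribute clusters versus single attribute context tables) can be sketched in a few lines of Python.  This is only an illustration of the grouping choice, under the assumption of a flat source record; the function split_context and the sample record are hypothetical and say nothing about how any of these forms is physically implemented.

```python
def split_context(record, key, clusters=None):
    """Split a flat record into context tables that all carry the key.

    clusters maps a context table name to the attributes it holds.
    When clusters is None, every attribute gets its own context table.
    """
    attributes = [name for name in record if name != key]
    if clusters is None:
        clusters = {name: [name] for name in attributes}
    return {
        table: {key: record[key], **{attr: record[attr] for attr in attrs}}
        for table, attrs in clusters.items()
    }


# Hypothetical sample record.
customer = {"customer_id": 1, "name": "Ada", "email": "ada@example.com", "city": "Oslo"}

# Attribute clusters: logical combinations of attributes per context table.
clustered = split_context(
    customer, "customer_id", {"identity": ["name"], "contact": ["email", "city"]}
)

# Single attribute context: one context table per attribute.
single = split_context(customer, "customer_id")

if __name__ == "__main__":
    print(sorted(clustered))
    print(sorted(single))
```

Both results describe the same customer.  The only thing that changes is how many context tables surround the centerstone and how many attributes each one holds.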

There is one thing that we can all agree on: modeling the agile data warehouse means applying some form of Ensemble modeling approach.  The specific flavor choice (Data Vault, Anchor, Focal Point, etc.) should be based on the specific characteristics of your data warehouse program. 

* Learn more on Anchor Modeling with Lars Rönnbäck here: Anchor Modeling

Unified Decomposition sounds a bit like an oxymoron.  And sure enough, combining the idea of unifying with the idea of breaking things into parts does seem innately contradictory.  But on closer inspection the idea makes a good deal of sense, especially for the field of data warehousing. 

With an enterprise data warehouse (EDW), we want to break things out into component parts for reasons of flexibility, adaptability, agility, and generally to facilitate the capture of things that are either interpreted in different ways or changing independently of each other.  At the same time a core premise of data warehousing is integration and moving to a common standard view of unified concepts.  So we want to tie things together at the same time as we are breaking them out into parts.

Unifying

If you have ever worked with object-oriented design, you are probably accustomed to the idea of encapsulation.  The idea of encapsulation is to bring together methods and data into the same object so that everything that deals with that object is contained within it.  One of the advantages of this kind of design is the ability to take an object class from one area and place it in another area knowing that everything it needs to exist (keys and descriptive context) and to perform (behaviors) moved along with it.  The object is self-contained. 

Another way to look at this is to think about a “self-contained underwater breathing apparatus” or “SCUBA” for short.  The idea is that everything you need to breathe underwater is contained in the same thing (apparatus).  You don’t need hoses to feed you air from a boat above, because the air, the tanks, the hoses, the mask, the regulator, and so on are all contained in the same apparatus. 

These concepts both deal with bringing component parts together to form a whole.  This is the idea of unifying: we encircle everything we need to define a concept and somehow keep all of the component parts together within that circle.

Breaking into Parts

The other part of unified decomposition is the idea of breaking things into component parts.  The decomposition is in some ways the opposite of unifying.  If we strive to keep things together, why then would we want to break them apart?  One major reason in data warehousing is that things change.  In fact things change all of the time.  If there is one constant it is that things change.  But not everything about a concept changes at the same time. 

Unified Decomposition

If the concept parts are all kept together (in the same table for example) then any change to any one component part would have an impact on the whole.  If we want to limit the impact of the changes we need to isolate the part that is changing.  In data modeling (especially for data warehousing) this theory is being deployed in many different forms.  If we are designing a database that needs to integrate data and also needs to maintain history then the benefits of decomposing the core concepts are very compelling.  This happens in Dimensional modeling with mini-dimensions and factless facts, it happens in Data Vault with hubs, links and satellites, and it also happens with other approaches such as Anchor Modeling, 2G and Focal Point.  The common theme is data warehousing and the common thread is decomposition.
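
To make the isolation argument concrete, here is a small sketch in plain Python.  It assumes a concept that has already been split into parts, each keeping its own list of time slices; the names parts and apply_change and the sample values are hypothetical.

```python
from datetime import date


def apply_change(parts, part_name, new_values, as_of):
    """Append a new time slice to one part only.

    Returns True if a slice was added, or False if the values are
    unchanged from the latest slice.  Other parts are never touched.
    """
    history = parts[part_name]
    if history and history[-1]["values"] == new_values:
        return False
    history.append({"valid_from": as_of, "values": new_values})
    return True


# Hypothetical customer concept, decomposed into two independent parts.
parts = {
    "name": [{"valid_from": date(2012, 1, 1), "values": {"name": "Ada"}}],
    "address": [{"valid_from": date(2012, 1, 1), "values": {"city": "Oslo"}}],
}

# The customer moves: only the address part receives a new time slice.
changed = apply_change(parts, "address", {"city": "Bergen"}, date(2012, 6, 1))

if __name__ == "__main__":
    print(changed, len(parts["name"]), len(parts["address"]))
```

Had name and address been kept in one table, the move would have produced a new version of the whole row, including the name that did not change.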

Putting it all Together

If all we did was break things apart, we would be missing half the story.  A modem translates a digital signal into an analog signal (modulation), but that is not of much use unless the analog signal is translated back to digital at the other end (demodulation). 

Taking a core concept that is represented as an entity (physically a table) and breaking it into component parts (lower-level tables, held together by a common key) is the “mo” (modulator) part of the modem.  While this is great for data warehousing agility, it does not do much for the business users.  People who are considering some form of table decomposition (Data Vault, Anchor, 2G, Focal Point) are often stumped when they think about how to get their business intelligence team to access this data for reporting and analytics.  Really the answer is simple: don’t access it directly.  We need the “dem” (demodulator) part of the modem.  Just as the modem translates the signal back to digital, we move the data back to a combined form (entity), and this combined form is typically a Dimension in a Data Mart. 

Think of it this way.  In your data warehouse architecture, you take a concept which is represented as a combined form (entity) and you break it into parts (hub, link, satellite).  You then put it back together into a combined form (dimension) and deliver it to your downstream users. 
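
The “dem” step can be sketched as a view that joins the parts back together.  The example below uses Python's built-in sqlite3 module with hypothetical table names (customer_hub, customer_name_sat, customer_address_sat, customer_dim).  It assumes the dimension should show only the latest time slice of each satellite; a data mart that needs history would be built differently.

```python
import sqlite3

# Hypothetical tables and sample rows, assumed for illustration only.
SCRIPT = """
CREATE TABLE customer_hub (
    customer_id  INTEGER PRIMARY KEY,
    business_key TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_name_sat (customer_id INTEGER, valid_from TEXT, name TEXT);
CREATE TABLE customer_address_sat (customer_id INTEGER, valid_from TEXT, city TEXT);

INSERT INTO customer_hub VALUES (1, 'C-001');
INSERT INTO customer_name_sat VALUES (1, '2012-01-01', 'Ada');
INSERT INTO customer_address_sat VALUES (1, '2012-01-01', 'Oslo');
INSERT INTO customer_address_sat VALUES (1, '2012-06-01', 'Bergen');

-- Recombine the parts: one row per hub key, latest slice of each satellite.
CREATE VIEW customer_dim AS
SELECT h.customer_id, h.business_key, n.name, a.city
FROM customer_hub AS h
LEFT JOIN customer_name_sat AS n
       ON n.customer_id = h.customer_id
      AND n.valid_from = (SELECT MAX(valid_from) FROM customer_name_sat
                          WHERE customer_id = h.customer_id)
LEFT JOIN customer_address_sat AS a
       ON a.customer_id = h.customer_id
      AND a.valid_from = (SELECT MAX(valid_from) FROM customer_address_sat
                          WHERE customer_id = h.customer_id);
"""


def build_customer_dimension():
    """Load the sketch data and return the rows of the combined dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCRIPT)
    return conn.execute("SELECT * FROM customer_dim").fetchall()


if __name__ == "__main__":
    print(build_customer_dimension())
```

The business intelligence team queries customer_dim and never needs to know how many satellites sit behind it.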

Unified Decomposition Data Vault

Because we need to put it back together before we deliver to the data marts, unifying is a critical feature of unified decomposition.  That is to say, the modeling pattern that we deploy must somehow address the unifying at the same time as it breaks things into parts.  With Data Vault modeling, the decomposition is the breaking out into Hubs, Links and Satellites, and the unification is accomplished through the direct connection between the Hub and its surrounding Satellites and Links. 

 © Copyright Hans Hultgren, 2012. All rights reserved. Unified Decomposition™