OLAP Cube Design Tips – PART 1
Of all the topics likely to spoil the mood at a dinner party, OLAP cube design theory is one of the most effective. Or at least I would imagine so. For some of us, tasked with building Business Intelligence on a daily basis, it is a necessary discipline to master and an important topic to discuss.
So, without trying to cover all the fundamentals of cube design (there is plenty of material out there on that), I would like to offer some opinions that remind cube builders of a few basic things to focus on to better meet the functional and performance needs of their users.
Cube Usability and Performance
There are many contributing factors to a cube’s usability and performance:
- Is it nicely organized?
- Is it targeted with the required data?
- Is it lean with no duplications?
Less is more
We can all get bogged down in the requirements phase, and it’s extremely easy to let the seemingly endless lists of requirements persuade you to agree to things that you know you shouldn’t. In my opinion, it is better to challenge items you don’t agree with up front than to accept and include them initially, only to start removing them later when things don’t go so well.
Resist the “That is the way we have always done it” argument; there is usually a better way of doing it!
Have you ever had to work with a cube where the names of all the dimensions and their members are the names of the columns of the underlying database? It’s horrible!
To illustrate, let’s play a quick game (answers at the bottom of this post):
- Can you decipher the table and column name for f0911.GLAA?
- What is the difference between:
  - Customer ID Name
  - Customer Name ID
When you are under pressure to deliver, clear naming is usually the first thing to slip. Of course, if you persist with using a cube in this state, you will soon get used to it. But you’re experienced with BI, so spare a thought for the poor end users.
Paying attention to naming your cube components as clearly as possible will go a long way to maximizing user adoption and the overall success of your project.
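One way to keep naming from slipping under deadline pressure is a small automated check on captions. The following is only a rough sketch in Python: the column names in `RAW_COLUMNS`, the sample captions, and the heuristic itself are all hypothetical and would need to be adapted to your own source system and naming standards.

```python
import re

# Hypothetical raw column names from the underlying database.
RAW_COLUMNS = {"CUST_NM", "CUST_ID", "DIV_CD"}

# Heuristic patterns: a name containing an underscore and no spaces,
# or a short all-caps code of two to six characters.
RAW_LOOKING = re.compile(r"[A-Za-z0-9]*_[A-Za-z0-9_]*|[A-Z0-9]{2,6}")


def looks_like_raw_column(caption, raw_columns):
    """Return True when a caption is, or resembles, a database column name."""
    if caption.upper() in raw_columns:
        return True
    return bool(RAW_LOOKING.fullmatch(caption))


def audit_captions(captions, raw_columns=RAW_COLUMNS):
    """Return the captions that probably need a friendlier name."""
    return [c for c in captions if looks_like_raw_column(c, raw_columns)]


flagged = audit_captions(["Customer Name", "CUST_ID", "GLAA", "Division"])
print(flagged)
```

The heuristic will produce false positives (a legitimate short acronym would be flagged, for example), so treat the result as a review list rather than a verdict.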
So, along with crazy names, what about items that are duplicated at the same grain or, worse, just plain redundant, like an attribute named Division in one dimension when there is already a whole dimension for Division? Duplications and redundancies also confuse users and can have detrimental effects on cube build times and overall query performance. Why would you process and present the same data more than once in your cube? Isn’t it getting too big already? You can always use member properties as alternate display captions on members. Again, when the pressure is on, these are the things that will slip.
Weeding out any duplications in your cube will contribute to good cube performance and happy users. I recommend you make this a priority.
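A simple name-based scan can surface candidates for this kind of clean-up. The sketch below is illustrative only: the `cube` dictionary is a made-up layout (dimension name mapped to attribute names), not the metadata format of any particular OLAP product, and matching names do not prove matching grain, so each hit still needs a human review.

```python
from collections import defaultdict

# Hypothetical cube layout: dimension name -> attribute names.
cube = {
    "Customer": ["Customer Name", "Region", "Division"],
    "Product": ["Product Name", "Category", "Region"],
    "Division": ["Division Name"],
}


def find_redundancies(dimensions):
    """Return (shadows, repeated) for a dimension -> attributes mapping.

    shadows:  (dimension, attribute) pairs where the attribute has the
              same name as another whole dimension.
    repeated: attribute name -> dimensions, for attributes that appear
              in more than one dimension.
    """
    dimension_names = {name.lower() for name in dimensions}
    owners = defaultdict(list)
    shadows = []
    for dim, attributes in dimensions.items():
        for attr in attributes:
            key = attr.lower()
            owners[key].append(dim)
            if key in dimension_names and key != dim.lower():
                shadows.append((dim, attr))
    repeated = {attr: dims for attr, dims in owners.items() if len(dims) > 1}
    return shadows, repeated


shadows, repeated = find_redundancies(cube)
print(shadows)
print(repeated)
```

Here the Division attribute inside Customer is reported because a Division dimension already exists, and Region is reported because it appears in two dimensions.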