Tag Archive for: performance

There are a ton of reasons to convert a Planning load file to an Essbase load file.  Maybe you are migrating a file from one environment to another, or simply want to load the file faster; either way, there are good reasons to use the Essbase format. Read more

 

Say what? An application where no calculations are performed with the Essbase calculation engine?

Business logic, currency, and every other calculation can be performed within Groovy, outside of Essbase.  I don’t know how I feel about this yet, but the thought of building all of the calculations this way on an ASO cube is intriguing.  The elimination of consolidating BSO cubes and of data movements with Smart Pushes is also interesting.  Some of my coworkers have built entire applications this way with great success and speed.  I wanted to replicate this on a small scale to identify the pros and cons of this methodology, and here is what I found. Read more

 

There are a lot of reasons one might loop through children in a Groovy Calculation.  On my journeys with Groovy, I have run into a few roadblocks where this has been helpful.  Some of these were related to limits in PBCS.  Looping through sets of members allowed us to get around some of the limitations. Read more

 

No, But Can It Solve Yours?

I received a lot of positive feedback on the Groovy Series and have been asked many great questions.  People are excited about the improvements but are still a little hesitant to buy in to the hype.  They question, and rightfully so,  Read more

 

Introduction

With the introduction of Groovy Calculations this summer, one of the things I use most, especially for applications with data forms that include a large sparse dimension in the rows with suppression on, is the option to loop through cells and identify only the POV on the cells that have changed.  Read more

 

The generic rule in Essbase is that calculations FIX on sparse members because sparse members are what define the number of blocks.  When you want to limit the members of the block on which the calculation is executed, an IF statement is appropriate. Read more
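
As a quick illustration of the pattern, a rule might FIX on the sparse dimensions to limit the blocks and then use an IF inside a member block to limit the dense cells calculated within each block.  This is only a minimal sketch, and the dimension and member names are hypothetical:

FIX(@RELATIVE("Organization", 0), @RELATIVE("Product", 0), @RELATIVE("Region", 0))
    /* The FIX controls which blocks are touched */
    "Sales"(
        /* The IF controls which dense cells inside each block are calculated */
        IF(@ISMBR("Jan":"Jun"))
            "Sales" = "Sales" * 1.1;
        ENDIF
    )
ENDFIX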

 

The format of the data that is loaded to Essbase is often an afterthought.  But, should it be?  When requesting the data file from a source system, it is more important than you may think to have it sorted to mirror your outline.

Assume an outline has the following dimensions.

  • Period [DENSE]
  • Account [DENSE]
  • Region [SPARSE]
  • Category [SPARSE]
  • Product [SPARSE]
  • Organization [SPARSE]

The most efficient way to receive a data file would be to have it sorted by Organization, Product, Category, Region, and then Account.  Data files load faster when the columns that hold the sparse members are sorted in reverse order of the sparse dimensions that exist in the outline.
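
As a simple illustration (the member names below are hypothetical), a file sorted this way keeps all of the records for a single sparse combination, in other words a single block, next to each other:

Org100  Prod10  Cat1  East  Headcount   Jan  12
Org100  Prod10  Cat1  East  Net Sales   Jan  500
Org100  Prod10  Cat1  East  Net Sales   Feb  525
Org100  Prod10  Cat1  West  Headcount   Jan  9
Org100  Prod10  Cat1  West  Net Sales   Jan  450
Org200  Prod10  Cat1  East  Net Sales   Jan  610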

The reason sorted data loads faster is that each block of data is opened only one time.  If the data were sorted by the dense members first, every block would have to be opened multiple times.  If the same sparse member combination has 3,000 dense members with data, the block would be opened up to 3,000 times.

There are some more important benefits of doing this, however.  When the block is opened multiple times, the database becomes far more fragmented than it needs to be.   Fragmentation causes calculations to be slower and retrieving data can also be impacted, which can lead to frustrated customers.

When the data is not sorted before it is loaded, every data load exacerbates any performance issues that may already exist.  So, anytime possible, sort the data load files by the last sparse dimension in the outline, then the second to last sparse dimension in the outline, and so on.  You may be pleasantly surprised at the benefits.

 

Everybody knows the quickest way from point A to point B is a straight line.  Everybody assumes that the path is traveled only one time – not back and forth, over and over again.  I see a lot of Essbase calculations and business rules, from experienced and novice developers, that go from point A to point B taking a straight line.  But, the calculation travels that line multiple times and is terribly inefficient.

Here is a simple example of a calculation.  Assume the Account dimension is dense, and the following members are all members in the Account dimension.  We will also assume there is a reason to store these values rather than making them dynamic calc member formulas.  Normally, these would be embedded in a FIX statement so the calculation only executes on the appropriate blocks; to minimize confusion, the FIX is not included in the example.

"Average Balance" = ("Beginning Balance" + "Ending Balance") / 2;
"Average Headcount" = ("Beginning Headcount" + "Ending Headcount") / 2;
"Salaries" = "Average Headcount" * "Average Salaries";
"Taxes" = "Gross Income" * "Tax Rate";

One of the staples of writing an effective calculation is to minimize the number of times a single block is opened, updated, and closed.  Think of a block as a spreadsheet, with accounts in the rows, and the periods in the columns.  If 100 spreadsheets had to be updated, the most efficient way to update them would be to open one, update the four accounts above, then save and close the spreadsheet (rather than opening/editing/closing each spreadsheet 4 different times for each account).

I will preface by stating that the following can behave differently in different versions.  The 11.1.x admin guide specifically states the following is not accurate.  Due to the inconsistencies I have experienced, I always play it safe and assume the following regardless of the version.

You might be surprised to know that the example above passes through every block four times.  First, it will pass through all the blocks and calculate Average Balance.  It will then go back and pass through the same blocks again, calculating Average Headcount.   This will occur two more times for Salaries and Taxes.  This is, theoretically, almost 4 times slower than passing through the blocks once.

The solution is very simple.  Simply place parentheses around the calculations.

(
"Average Balance" = ("Beginning Balance" + "Ending Balance") / 2;
"Average Headcount" = ("Beginning Headcount" + "Ending Headcount") / 2;
"Salaries" = "Average Headcount" * "Average Salaries";
"Taxes" = "Gross Income" * "Tax Rate";
)

This forces all four accounts to be calculated at the same time.  The block will be opened, all four accounts will be calculated, and the block will be saved.

If you are new to this concept, you probably have done this without even knowing you were doing it.  When an IF statement is written, what follows the anchor?  An open parenthesis.  And, the ENDIF is followed by a close parenthesis.  There is your block!

"East"
(IF(@ISMBR("East"))
    "East" = "East" * 1.1;
ENDIF)

I have seen this very simple change drastically improve calculations.  Go back to a calculation that can use blocks and test it.  I bet you will be very pleased with the improvement.

 

As an Essbase user, you have more power to improve performance than you think.  How many times do you lock and send data through Excel, SmartView, or web forms, that include zeros?  How many times do you allocate data to a finite level out of convenience?  Understanding what this does to Essbase is critical to understanding how a user can negatively impact performance without adding any value to the analysis or the results the database produces.

I analyzed a planning database used in one of the largest financial institutions in the world.  Over 60% of the values entered were zero.  Another 20% of the values were less than 1 dollar.  By eliminating the zeros, the total calculation time of the planning application was under 20 minutes.  With the zeros, it was nearly 2 hours.

There are two reasons for this.  First, there is a difference between empty and zero.  Empty consumes no space to store, whereas a zero consumes the same space as a value of 1 billion.  Think of this as a grocery bag.  If you fill a grocery bag with nothing, it takes up no space.  If you fill it with empty cans (a zero), it consumes the same amount of space as if those cans were full (1 billion).

The example below is very common.  Assume that a forecast needs to be done for the last 3 months of the year.  Frequently, a spreadsheet would hold zeros for the first 9 months.  In a two-account example, 18 cells hold a zero and only 6 cells hold a real value.  That means that 75% of the data could be eliminated by not loading zeros.

The same load with #Missing is more effective.
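
Here is what that might look like with two hypothetical accounts (the member names and values are made up, and #Mi is shorthand for #Missing).  Submitted with zeros, all 24 cells are stored:

             Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Units          0    0    0    0    0    0    0    0    0   50   55   60
Net Sales      0    0    0    0    0    0    0    0    0  500  525  550

Submitted with #Missing for the first nine months, only the 6 cells that actually contain data are stored:

             Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Units        #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi   50   55   60
Net Sales    #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  #Mi  500  525  550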

I highly recommend reading the article explaining dense and sparse to understand what a block is and what it represents before you continue this article.

There is also another very significant factor in loading zeros.  Loading a zero that creates a block just to hold a value of zero can explode the size of the database, as well as the time it takes to consolidate and execute business rules.  The more blocks that have to be loaded and consolidated, the longer it takes to finish.

If each block was a spreadsheet and you had to do this manually, you would have to open each spreadsheet and enter the number into a calculator to consolidate.  If 75% of the spreadsheets you opened held only zeros, it wouldn’t change your total, but it would drastically increase the time it takes because you still have to open each one.  Likewise, if an Essbase database has 1,000 blocks and 75% of them hold only zeros, the calculation will likely take 2 or 3 times longer than it would without the zero blocks, because Essbase still has to open each block and add the zero.  Remember, a zero acts no differently than a value of 100.

As an example of the above, the following submission would create a block for South and West just to hold zeros, inflating the database size.
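
With hypothetical values, the submitted grid might look like this:

          Jan   Feb   Mar
East      1000  1100  1200
South        0     0     0
West         0     0     0

Assuming the dimension that holds East, South, and West is sparse, sending this grid creates blocks for South and West even though they contain nothing but zeros.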

 

Users can significantly reduce this unnecessary explosion in size by loading a blank as opposed to a zero.  If zeros are already in the database, leaving the cell blank will NOT overwrite the zeros.  If zeros were loaded inadvertently, a #Missing has to be used to remove them.
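
If the zeros are already stored, an administrator can also clean them up with a business rule or calculation.  This is only a minimal sketch, and the dimension and member names are hypothetical:

FIX(@RELATIVE("Organization", 0), @RELATIVE("Product", 0), @RELATIVE("Region", 0))
    /* Replace stored zeros with #Missing so the cells no longer consume space */
    "Net Sales"(
        IF("Net Sales" == 0)
            "Net Sales" = #Missing;
        ENDIF
    )
ENDFIX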

For all of you loading data, removing the zeros can be a hassle, but being responsible about it can significantly improve your experience with Essbase.  To make it easier, take a look at the function in the In2Hyperion Excel Ribbon that replaces all zeros with #Missing.

 

Fragmentation occurs naturally when a database is used frequently by adding, deleting, and modifying the data within it.  The more changes occur, the more fragmented the database gets as data becomes scattered through the pag files, and the size of the database becomes inflated.  The index files have to compensate for this, and what starts as a simple map becomes a spaghetti mess.

If you are unfamiliar with Essbase’s storage method, here is a brief overview.  Essbase has two sets of files related to the data stored in a database.  The numeric data is stored in files with an extension of pag.  Essbase also has files with an ind extension.  These index files are used to store the pointers to the data in the pag files.  As data is requested, Essbase must read the index files to know where the data is located in the pag files.

A fragmented database can suffer drastic effects on both size and performance.  If you have a database where performance continues to decrease, fragmentation might be the source of the problem.  Performance degradation can occur over weeks or months, but it can also happen much more quickly.  Databases with frequent data loads or updates can be impacted within a day.

A great way to identify the impact fragmentation is having on a database is to export your data (level 0 in most cases), reload it, and execute the process in question.  Exporting and reloading the data completely eliminates fragmentation.
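
A level 0 export can be produced several ways; one option is the DATAEXPORT calculation command.  This is only a sketch, and the file path is hypothetical:

SET DATAEXPORTOPTIONS
    {
    DataExportLevel "LEVEL0";
    DataExportOverwriteFile ON;
    };
/* Write all level 0 data to a delimited file; #MI marks missing values */
DATAEXPORT "File" "," "/export/level0_export.txt" "#MI";

After clearing the database and reloading the exported file, the same calculation or retrieval can be timed again to see how much of the slowdown was caused by fragmentation.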

For more information about pag or ind files, please refer to the database administrator’s guide provided by Oracle.