OLAP is one technology that has provided this ability, especially in cases where the data is structured and well-defined with not many changes in data schema are expected over a period of time. OLAP is an acronym for ‘Online Analytical Processing, a technique of analyzing data to look for insights. The term ‘Cube’ here refers to a multi-dimensional dataset, which is also sometimes called a hypercube if the number of dimensions is greater than 3.
Here the data are represented in non-normalized form in a table known as fact table and is pre-populated as needed for reporting.
Below is a logical representation of OLAP cube with 3 dimensions,
OLAP Cube with 3 dimensions.
Once data are populated for this cube, we can easily get the number of page views to the Catalogue page 1 in the year 2011 by just querying on the table. We do not need to join tables to retrieve this data.
This saves a lot of time, especially when the tables are huge and lots of joins are required to get the required result.
Moreover, various operations on this cube help us further in the analysis.
Slice is the act of picking a rectangular subset of a cube by choosing a single value for one of its dimensions, creating a new cube with one fewer dimension. This helps to focus on the remaining data when we are not concerned with the sliced dimension. For e.g. we are just concerned with like then we can slice other action dimensions.
The dice operation produces a sub-cube by allowing the analyst to pick specific values for multiple dimensions. This allows the analyzer to focus only on a particular subset of the dimensions for e.g. if he wants to analyze only on data for the years 2011 and 2012 for like and sign up on the Home page.
It allows the user to navigate among levels of data ranging from the most summarized (up) to the most detailed (down)
A roll-up involves summarizing the data along a dimension. The summarization rule might be computing totals along a hierarchy or applying a set of formulas such as “profit = sales – expenses”.
The OLAP cube is implemented in the POC project in Planes database.
Here for the sake of simplicity only two dimensions are selected via, Destination and Origin airport.
The basic architecture for implementing the OLAP cube in a separate OLAP database is different from the usual application database.
Following are the stages for implementing OLAP database design:
OLAP schema design: Currently in this project we are using the Star Shaped with Fact Table as Fact_Plane and Dimension Table as Airport
OLAP Initialization: It is the process of creating the OLAP database with required tables. In the project it is done through Stored Procedure
OLAP Hydration/Loading: This involves loading the OLAP tables with required data. This is done through Stored Procedure
Querying on OLAP: This is the process of querying the OLAP database to get the required data for reporting and analysis purposes.
This enables us to query the Fact table with no Joins and only into comparisons.