Friday, January 4, 2013

Optimizing the Performance of the Oracle BI Applications using Oracle Data Warehousing Features and Oracle DAC 10.1.3.4.1




The Oracle BI Applications 7.9.6 consists of a pre‐defined dimensional data warehouse, ETL routines, an
Oracle BI Enterprise Edition repository and example dashboards and reports. The ETL routines are built
using Informatica PowerCenter and are scheduled and orchestrated using the Oracle Data Warehouse
Administration Console.

The data warehouse provided with the Oracle BI Applications is designed to be deployed on either
Oracle Database Enterprise Edition, Microsoft SQL Server or IBM DB2. As such, whilst it uses common
data warehousing features such as bitmap indexes, it does not make use of any Oracle‐specific features
such as segment compression, partitioning or materialized views. It is possible however to make use of
these features, and this paper sets out a methodology for their use with Oracle BI Applications 7.9.6,
along with the Oracle Data Warehouse Administration Console 10.1.3.4 or higher.

An Overview of the Oracle Business Analytics Warehouse
The Oracle Business Analytics Warehouse consists of a number of staging and presentation tables that
together support the loading and querying of enterprise data via a conformed dimensional model.
Tables are created as regular heap tables, with a minimal number of NOT NULL constraints but no
primary keys, foreign keys, partitions or other additional items of metadata. Tables are loaded via
Informatica PowerCenter 8.6, using the PowerCenter Integration Service and row‐by‐row data loading.
Aggregate tables are created and populated to support key fact tables, using separate ETL processes,
run after the main fact table loads, that truncate and then rebuild the aggregates.

It is however possible to customize the Oracle Business Analytics Warehouse to take advantage of
features such as segment compression, partitioning, materialized views and other Oracle data
warehouse features. It is also possible to add additional metadata such as primary key and foreign key
constraints, dimensions and other features to support more efficient querying of detail‐level and
summarized data. To illustrate how these Oracle features can be used to optimize the loading and
querying of the Oracle Business Analytics Warehouse, this paper will take one of the fact tables within the
data warehouse and apply these techniques to it.

Performance Optimization Scenario
The Oracle Business Analytics Warehouse contains a table called W_SALES_INVOICE_LINE_F that
contains fact data on the sales invoices generated by the business. It is supported by an aggregate table,
W_SALES_INVOICE_LINE_A that takes data from the original table and summarizes it to improve
query performance. In the sample data set used in this paper, these two tables had the following row
count and size.

select count(*)
from w_sales_invoice_line_a;

  COUNT(*)
----------
     36264

select count(*)
from w_sales_invoice_line_f;

  COUNT(*)
----------
    478571

select segment_name
,      bytes/1024/1024 "Size in MB"
from   user_segments
where  segment_name in ('W_SALES_INVOICE_LINE_F',
                        'W_SALES_INVOICE_LINE_A');

SEGMENT_NAME                   Size in MB
------------------------------ ----------
W_SALES_INVOICE_LINE_A                  9
W_SALES_INVOICE_LINE_F            189.125

These tables are loaded by two DAC tasks, and a DAC Task Group:

1. TASK_GROUP_Load_SalesFact calls the following SIL and PLP tasks and, when they have
completed, recreates any indexes required to support queries.
2. SIL_SalesInvoiceLinesFact, which initially drops the indexes on the fact table, then calls
either the SIL_SalesInvoiceLinesFact or SIL_SalesInvoiceLinesFact_Full
Informatica workflow for incremental and full loads respectively, and then recreates just those
indexes required for the rest of the ETL process.
3. PLP_SalesInvoiceLineAggregate, which again drops indexes, this time on the aggregate
table, then calls either the PLP_SalesInvoiceLinesAggregate_Load or
PLP_SalesInvoiceLinesAggregate_Load_Full Informatica workflow for incremental
and full loads of the aggregate table, then recreates the indexes required for the rest of the ETL
process.

Initial Benchmarks
To create baseline figures to compare your optimizations to, start the Oracle Data Warehouse
Administration Console 10.1.3.4 or higher, and create a new subject area within the DAC Repository that
uses the two tasks and one task group listed above.



Now switch to the Execute view in the DAC Console and create a new execution plan that deploys this
new subject area. Create the parameter list for the execution plan and build the set of ordered tasks, so
that you are ready to run the execution plan and generate some baseline ETL timings.



Now run the execution plan twice, first in FULL mode and then in INCREMENTAL mode, so that
you can compare these timings against subsequent runs to establish the benefit that each feature provides.

Adding Compression to the Fact Table
The first optimization task is to add the COMPRESS clause to the W_SALES_INVOICE_LINE_F fact table,
so that rows that are inserted using direct-path operations are compressed. Compression is an Oracle
Database Enterprise Edition feature that stores more rows of data in each individual data block,
providing two main benefits:
• less space is taken up by data warehouse data, and
• full table scans can be performed faster, as fewer blocks are required to retrieve all the table’s
data.
To test the benefits of compressing the W_SALES_INVOICE_LINE_F fact table, first truncate it and then
alter the table to add compression.

SQL> truncate table w_sales_invoice_line_f;
Table truncated.
SQL> alter table w_sales_invoice_line_f compress;
Table altered.

You can then restore the ETL source tables to their original state and run the full, and then
incremental, loads into the fact and aggregate tables in order to test that compression is working as
expected. Note that rows will only be compressed when they are inserted using direct-path operations;
Informatica PowerCenter by default uses bulk load functionality to perform table inserts, so these loads
will benefit. Updates, or mixed insert/update loads, will not benefit from compression, as Informatica
reverts to conventional-path inserts for these, and of course updates remove compression from Oracle
tables unless you are using the Advanced Compression Option to the database.
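
To verify that compression has taken effect once the full load completes, you can check the table’s
compression attribute in the data dictionary and re-run the earlier segment size query. The following is a
minimal check using the standard USER_TABLES and USER_SEGMENTS views:

select table_name
,      compression
from   user_tables
where  table_name = 'W_SALES_INVOICE_LINE_F';

select segment_name
,      bytes/1024/1024 "Size in MB"
from   user_segments
where  segment_name = 'W_SALES_INVOICE_LINE_F';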

Partitioning the Fact Table
Partitioning is an option to the Enterprise Edition of the Oracle Database that allows you to split one
large physical table into several smaller physical ones, with users still seeing it as one big table but giving
you far more flexibility in how you can store and manage data within it. Partitioning is typically used
with large fact tables and allows the DBA to assign each partition to separate tablespaces, which can
then be stored on different physical disk units and backed up independently. As with table compression,
though, the Data Warehouse Administration Console unfortunately has no concept of partitioning, and
you will therefore have to carry out some additional steps to use this feature.

Tables such as the W_SALES_INVOICE_LINE_F table are normally created by the DAC administrator
when initially installing the Oracle Business Analytics Warehouse, by selecting Tools > ETL Management
> Configure from the DAC menu. However, there is no provision to create tables using the PARTITION
BY (or COMPRESS) clauses, and so we can either create the table outside of the DAC, as we did in the
previous step for the COMPRESS clause, or we can use the Actions feature in DAC 10.1.3.4.1 to create
the table for us, using the requisite clause, before we attempt a full load into it.
Actions are a new feature of the 10.1.3.4.1 version of the DAC, and come in three types:
• Table actions allow us to override the Truncate and Analyze steps carried out on tables during
an ETL process.
• Index actions allow us to override the creation and dropping actions associated with indexes.
• Task actions allow us to execute SQL and PL/SQL steps before or after a task executes.

Note that the 10.1.3.4 version of the DAC has a bug in it that corrupts the SQL text for an action. You will
need to apply a patch over this release to be able to carry out the actions in this paper (available from
Oracle Metalink, patch number TBA at the time of writing this paper), or install the 10.1.3.4.1 release
when it becomes available.

As DAC actions cannot override the creation step for a table, only the truncate and analyze steps, you
will add a new task action to the SIL_SalesInvoiceLinesFact task that runs when the task is
executed in FULL mode, and that drops the existing, non-partitioned version of the table and recreates it
using the required partitioning clause. To start this process, select Tools > Seed Data > Actions > Task
Actions from the DAC application menu.



At the Task Action dialog, press New to create a new action, call the action Create
W_SALES_INVOICE_LINE_F Partitioned, save the action and click in the Value text box to set the
table creation scripts.
Using the action Value dialog, create two steps, one to drop the table and the other to create it.



Make sure the drop step is listed above the create step and is set to Continue on Fail, and enter the
following SQL statement to drop the table:

drop table w_sales_invoice_line_f

For the create table step, do not check the Continue on Fail checkbox, then enter the appropriate table
creation command into the SQL Statement text box, remembering to add the PARTITION BY clause to
the script, and the COMPRESS clause if you would like the table to be compressed as well.

create table w_sales_invoice_line_f
( sales_ordln_id varchar2(80 char),
  ...
  x_custom varchar2(80 char))
compress
partition by range (cost_center_wid)
( partition cc_1 values less than (4000),
  ... )
Do not place any semi‐colons at the end of the SQL script as this will cause it to fail when run.
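
As an aside, this paper partitions on COST_CENTER_WID purely for illustration; a production system
would more likely partition on a date-based column, as discussed later. A sketch of such a scheme,
assuming a hypothetical INVOICED_ON_DT date column and again eliding the full column list, might
look like this:

create table w_sales_invoice_line_f
( sales_ordln_id varchar2(80 char),
  ...
  x_custom varchar2(80 char))
compress
partition by range (invoiced_on_dt)
( partition p_2008_q1 values less than (to_date('01-APR-2008','DD-MON-YYYY')),
  partition p_2008_q2 values less than (to_date('01-JUL-2008','DD-MON-YYYY')),
  partition p_max values less than (maxvalue))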

Even though this new task action will drop and then recreate the W_SALES_INVOICE_LINE_F table
using partitioning, the DAC still holds details of the table in its repository, together with details of
the indexes that are associated with it. As things stand, the DAC would drop these indexes as part of the
SIL task and recreate them using the task group task; however, it does not normally understand the
concept of local indexes, and will try to create them without any LOCAL or GLOBAL clause, which has the
effect of creating them as global indexes. To instruct the DAC to create these indexes as local indexes, you
now need to create an Index Action to override the normal index creation process for them.

The first step in this process is to define the index action, then you will associate it with the relevant
indexes. To create the index action, select Tools > Seed Data > Actions > Index Action, name the index
action, press Save and then enter the Value editor.



This index action will execute for every index that we associate it with. It consists of an SQL statement
that uses a built‐in DAC function to return the name of the index in question, and the list of columns
that it indexes.



The SQL statement to be used is shown below; note the three built-in DAC functions, getIndexName(), getTableName() and getIndexColumns():

create bitmap index getIndexName() on getTableName()
(getIndexColumns()) local nologging parallel

When the action is used, the DAC will substitute the index name, table name and index columns into the
SQL statement and thereby create the index in question as a local index. Next, save the action and
return to the main DAC console.
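
To illustrate, if this action were associated with a hypothetical bitmap index named
W_SLS_INV_LN_F_F1 on the CHNL_TYPE_WID column, the DAC would expand the functions and
execute a statement along these lines:

create bitmap index W_SLS_INV_LN_F_F1 on W_SALES_INVOICE_LINE_F
(CHNL_TYPE_WID) local nologging parallel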

Now you need to associate the index action with the bitmap indexes that need to be created as local
indexes. To do this, navigate to the Indices tab in the Design view of the DAC, and query the repository
to return just the bitmap indexes associated with the W_SALES_INVOICE_LINE_F table. When the list of
indexes are displayed, right‐click anywhere on the list and select Add Actions…




Using the Add Actions dialog, select Create Index as the Action Type, Both as the Load Type, and then
select the index action that you created in the previous step as the Action.




Now you can associate the task action with the SIL DAC task that populates the
W_SALES_INVOICE_LINE_F table, so that it drops and recreates the table using partitioning when it runs
in full load mode. To do this, locate the SIL_SalesInvoiceLinesFact task using the Task tab in the
Design view, and select the Actions tab when the task is displayed. Then, select Preceding Action as the
Action Type, Full as the Load Type and then select the task action that you created earlier.




Now you can re-run your execution plan, which will include these index and task actions in the
steps that are carried out. After the execution plan completes, you can check the list of steps performed
by the SIL_SalesInvoiceLinesFact task to see your task action being executed.





Using Materialized Views for Table Aggregation
The Oracle BI Applications use aggregate tables to improve the performance of queries that request
aggregated data. These aggregate tables are then mapped into the Oracle Business Intelligence
Enterprise Edition metadata layer, so that the BI Server can internally rewrite queries to use these
aggregates. Post‐Load Processing (PLP) Informatica workflows load data into these aggregate tables,
either as a complete refresh or incrementally, as part of the subject area load.

The Enterprise Edition of the Oracle Database has similar functionality in the form of Materialized Views.
These allow database administrators to define aggregates in the database, which are then used at query
runtime to satisfy queries that require aggregated data. Materialized Views can be fast refreshable and
can be designed to satisfy either a single aggregation or a range of aggregations, or can even be created
using an OLAP Analytic Workspace to meet the aggregation needs of an entire star schema.

As with partitioned tables and local indexes, the DAC does not contain out-of-the-box functionality to
create and maintain materialized views. To add this functionality to your ETL process, you therefore
need to add two new actions to the DAC repository:

1. An action to create the W_SALES_INVOICE_LINE_A object as a materialized view rather than a
regular table, and to create the required Materialized View Logs to support fast refresh of this
object.
2. An action to perform the refresh of the materialized view, which you will use in place of the
regular PLP_SalesInvoiceLinesAggregate_Load and
PLP_SalesInvoiceLinesAggregate_Load_Full Informatica workflows.

To see how the existing aggregated table is populated, and to extract the base SQL statement that you
will need to create the materialized view, open the Informatica Designer application and locate the
PLP_SalesInvoiceLinesAggregate_Load_Full mapping. When you view the mapping logic, you
will see that the W_SALES_INVOICE_LINE_F table is joined to the W_DAY_D table through a Source
Qualifier mapping, which is then supplemented with a sequence number that is used to populate the
ROW_WID column.




Whilst we cannot reproduce the sequence functionality with a materialized view, you will be able to
take the data from these two tables and use it to initially populate, and then refresh, the materialized
view.

As in the previous example, where you created a partitioned table, creation of the materialized view will be
performed by a new task action that you will associate with the
PLP_SalesInvoiceLinesAggregate_Load DAC task when run in Full mode. To create the action,
select Tools > Seed Data > Actions > Task Action, and create and save a new task action. At the Value
dialog, create individual steps to drop and recreate the required materialized view logs, then drop and
recreate the materialized view, using the following SQL statements:

1. Drop Materialized View Log on W_SALES_INVOICE_LINE_F

drop materialized view log on W_SALES_INVOICE_LINE_F

2. Drop Materialized View Log on W_DAY_D

drop materialized view log on W_DAY_D

3. Create Materialized View Log on W_SALES_INVOICE_LINE_F

create materialized view log on W_SALES_INVOICE_LINE_F
with sequence, rowid
( sales_ordln_id,
  sales_pckln_id,
  ...
  discount_line_flg)
including new values

4. Create Materialized View Log on W_DAY_D

create materialized view log on W_DAY_D
with sequence, rowid
( row_wid,
  calendar_date,
  ...
  x_custom)
including new values

5. Drop Materialized View

drop materialized view W_SALES_INVOICE_LINE_A

6. Create Materialized View

create materialized view W_SALES_INVOICE_LINE_A
pctfree 0
build immediate
refresh fast
as
select 1 as row_wid,
       w_sales_invoice_line_f.chnl_type_wid,
       ...
from   w_sales_invoice_line_f,
       w_day_d
where  ...
group by ...

Note that with the materialized view definition script, you will need to populate the ROW_WID column
with a constant, as it is not possible to create a materialized view that uses an Oracle sequence to
populate a column. You will also need to include COUNT(*) and COUNT(column) columns for all
aggregated columns in order for the materialized view to be fast‐refreshable. See the Oracle Data
Warehousing Guide, available at http://docs.oracle.com, for full details on creating materialized views.
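
If you are unsure whether your definition qualifies for fast refresh, the DBMS_MVIEW.EXPLAIN_MVIEW
procedure will report which refresh capabilities are possible and why any are not. This assumes the
MV_CAPABILITIES_TABLE has first been created in your schema by running the utlxmv.sql script
shipped with the database:

-- MV_CAPABILITIES_TABLE is created by $ORACLE_HOME/rdbms/admin/utlxmv.sql
exec dbms_mview.explain_mview('W_SALES_INVOICE_LINE_A');

select capability_name
,      possible
,      msgtxt
from   mv_capabilities_table
where  capability_name like 'REFRESH_FAST%';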

Make sure that you mark all “drop” steps as Continue on Fail, so that the whole ETL process doesn’t
stop because the object did not exist in the first place, something that will happen when you first make
use of the action.




Now that you have the task action in place to create the materialized view and its associated logs, you
can create another task action to refresh the materialized view. To do this, select Tools > Seed Data
> Actions > Task Action again, and this time create a new action to perform the refresh. Enter the
following anonymous PL/SQL block into the SQL Statement text box to refresh the materialized view:

begin
  dbms_mview.refresh('OBAW.W_SALES_INVOICE_LINE_A');
end;

Remember to replace the name of the schema with the one appropriate for your database.
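
Note that DBMS_MVIEW.REFRESH also accepts an optional method parameter; if you would rather the
load failed than silently fall back to a complete refresh, you can request a fast refresh explicitly, as in
this variant:

begin
  -- 'F' requests a fast (incremental) refresh; 'C' would force a complete rebuild
  dbms_mview.refresh('OBAW.W_SALES_INVOICE_LINE_A', method => 'F');
end;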

In addition, create another task action, called “Dummy Refresh” or similar, that you will associate with
the PLP task when run in full mode; create a new step within it, but do not enter any SQL text. This is
required because running the task in full mode will create and refresh the materialized view automatically,
but an action must still be associated with the task to make it valid.

Once all of your task actions are created, including the ones used in the previous example, your list of
task actions should look like this:




Next, locate the PLP_SalesInvoiceLinesAggregate_Load task in the DAC Design view and change
the Execution Type to SQL File, then replace the Command for Incremental Load command with a call
to the Fast Refresh task action created previously, and the Command for Full Load command with a call
to the dummy action you created at the same time.




Then, switch to the Target Tables tab and un-check the Truncate for Full Load checkbox; otherwise the
DAC will automatically truncate the materialized view just after you have created and refreshed it, and
subsequent fast refreshes will fail with an ORA-32320 error, because the truncation counts as a
partition maintenance operation (PMOP).
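
Should a fast refresh fail with this error anyway, a one-off complete refresh will rebuild the materialized
view and should allow subsequent fast refreshes to succeed again, for example:

exec dbms_mview.refresh('OBAW.W_SALES_INVOICE_LINE_A', method => 'C');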




Finally, to get the DAC to create the materialized view for you when the task is first run, add a new
Preceding Task action to the task to run the Create Materialized View task action you created
previously. Be sure to drop the existing W_SALES_INVOICE_LINE_A aggregate table using SQL*Plus
before you re‐run your ETL, as trying to drop it whilst referring to it as a materialized view will cause an
error and fail to drop the table.
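
For example, from a SQL*Plus session connected as the warehouse schema owner:

SQL> drop table w_sales_invoice_line_a;

Table dropped.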




Delete and recreate the execution plan for your subject area, to pick up the changes to the PLP
task. Once you have done this, you are ready to re-run your DAC execution plan and assess
what improvement to processing time these changes have made.

Quantifying the Improvements
When run against the author’s installation of Oracle Business Intelligence Applications 7.9.5, using a
subset of the Oracle E‐Business Suite Vision dataset, the following timings were recorded using these
scenarios:

1. Baseline run of the standard out‐of‐the‐box ETL routines
2. Addition of the COMPRESS clause to the W_SALES_INVOICE_LINE_F table
3. Addition of partitioning to the W_SALES_INVOICE_LINE_F table, while keeping compression.
4. All of the above, and replacement of the W_SALES_INVOICE_LINE_A table with a fast‐refresh
materialized view.

The results of these scenarios are shown in the table below.




Overall, using table compression on the main fact table reduced its storage requirement by 77%, from
189MB to 44MB. Using a fast‐refreshable materialized view, along with partitioning and compression,
reduced the ETL time for the main fact table and associated aggregate by 51% for a full load and 33% for
an incremental load.

In addition, queries against the partitioned version of the fact table that can benefit from partition
elimination can experience significantly lower execution plan costs. The following query and execution
plan were executed against the original, non‐partitioned version of the W_SALES_INVOICE_LINE_F
table:




When the table is subsequently partitioned though, queries that benefit from partition elimination show
a significant drop in their cost. In the example below, the table is being partitioned on
COST_CENTER_WID, whereas in reality it is likely to be partitioned on a date column, so that queries
that only require data for a particular range of months or years can avoid scanning the entire table.
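
To reproduce this effect, you can run a query that filters on the partitioning key and examine its plan,
which should show a PARTITION RANGE operation scanning only the relevant partitions; the measure
column below is illustrative:

explain plan for
select sum(f.net_amt)
from   w_sales_invoice_line_f f
where  f.cost_center_wid < 4000;

select * from table(dbms_xplan.display);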




Further Opportunities for Optimization
As well as the actions outlined in this paper, there are further opportunities for optimizing the ETL and
query processing carried out by the Oracle BI Applications when working with the Enterprise Edition of
the Oracle Database.

For example, the DAC does not create primary key or foreign key constraints on the tables that it
creates, which, together with the default setting of the STAR_TRANSFORMATION_ENABLED parameter
for newly created databases, in most cases means that star transformations are not used when the
database handles queries against fact tables that involve filtering against two or more dimensions. You
could therefore add additional task actions to SIL and PLP tasks to create and drop these constraints,
possibly using the RELY NOVALIDATE clauses to avoid the overhead of validating existing rows, set the
STAR_TRANSFORMATION_ENABLED parameter appropriately, and take advantage of this key Oracle
data warehousing feature, as sketched below.
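
As a sketch of the constraint half of this, a task action could add keys in RELY DISABLE NOVALIDATE
mode after each load, so that the optimizer can trust the relationships without the database scanning
existing rows to validate them. The constraint names below, and the assumption that W_COST_CENTER_D
is the relevant dimension keyed on ROW_WID, are illustrative:

-- hypothetical constraint names; dimension key assumed to be ROW_WID
alter table w_cost_center_d
  add constraint w_cost_center_d_pk
  primary key (row_wid)
  rely disable novalidate;

alter table w_sales_invoice_line_f
  add constraint w_sls_inv_ln_f_cc_fk
  foreign key (cost_center_wid)
  references w_cost_center_d (row_wid)
  rely disable novalidate;

alter session set star_transformation_enabled = true;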

Another optimization possibility is to use the OLAP Option to Oracle Database 11g to replace the
materialized view outlined in this paper with a Cube Organized Materialized View, which could provide
aggregations for an entire star schema at multiple levels of aggregation. You would need to use Oracle
Analytic Workspace Manager (a free download from http://otn.oracle.com) to create the cube
organized materialized view, but once it is created it could be refreshed in the same manner as the
materialized view that this paper describes.

Conclusions
The ETL and query optimization techniques provided out-of-the-box with the Oracle BI Applications
are appropriate for generic databases, but can be improved upon if you make use of the
specific data warehouse optimizations available on your actual target database. The Enterprise Edition
of the Oracle Database provides many such features, including segment compression, materialized views
and partitioning, and this paper has set out how they can be used in conjunction with the new “actions”
feature available with the Oracle Data Warehouse Administration Console 10.1.3.4 and higher.