Thursday, January 3, 2013

An Introduction to the Oracle BI Applications



Whilst OBIEE is in itself a pretty well featured business intelligence platform, it’s still only a platform and you have to design and build the data warehouse, metadata repository yourself before you can do any reporting. An alternative to this “do it yourself” approach that Oracle have recently been promoting is the Oracle BI Applications, a packaged set of dimensional star schemas, ETL routines and prebuilt reports and dashboards that you can use with OBIEE. So what are the BI Applications, what technology do they use and how do you get started with them?
Oracle have sold packaged BI solutions for many years now, including the various “intelligence” products based around Discoverer that are sold along Oracle E-Business Suite, Daily Business Intelligence (DBI) that worked off materialized views created over the E-Business Suite database, tools like Oracle Financial Analyzer and Oracle Sales Analyzer that are based on the old Express Server OLAP technology, and the recent “Fusion Intelligence for E-Business Suite” that took the DBI materialized views and placed OBIEE on top of them, to “dashboard-enable” the existing DBI repository and reports. Oracle BI Applications takes this approach one step further though and provides a complete dimensional data warehouse (which is, I believe, based on the Informatica Warehouse that Siebel OEM’d from Informatica, together with a CRM data warehouse that Siebel developed on top of their on CRM product) together with ETL routines that take data out of ERP suites such as Oracle E-Business Suite 10 and 11i, SAP, Peoplesoft and Siebel and present to users a set of horizontal and vertical subject areas and dashboards. Prior to the Oracle acquisition Siebel had the most success in selling what they called at the time Siebel Business Analytics on top of their own Siebel CRM product, with limited success in the general ERP space. Now that Oracle have acquired Siebel the emphasis now is much more on the ERP packaged solutions, with their source-system agnostic data warehouse equally sold to SAP shops as well as Oracle/Peoplesoft/Siebel shops. About the only ERP suite that isn’t covered by the BI Applications is JD Edwards, but I’ve heard of partners supplying solutions in that area to fill in where Oracle have left off.
Oracle BI Applications is currently at version 7.9.5, which moved the underlying ETL infrastructure from Informatica 7.x to Informatica 8.x, whilst the previous 7.9.4 version extended source system support to Oracle E-Business Suite R12 (it now covers releases 11.5.8, 11.5.9 and 11.5.10 as well as R12). The data warehouse that it provides can run on most large-scale databases (DB2, Teradata, SQL Server, Oracle) and the applications themselves more or less all run on Windows and Unix hosts, except the Database Administration Console (DAC) and BI Administrator applications which currently only run on Windows. Installation itself is a little bit tricky; you have to install a Java SDK first of all, then install the BI Applications over OBIEE, then install Informatica (plus a patch), then add some Hibernate binaries to the DAC installation and then configure the DAC and Informatica Repositories. The install itself is easier on Windows than Linux and because you need a Windows install somewhere to administer it, and my laptop itself runs on Mac OS X, I installed it all in a single Windows Virtual Machine and run everything in the same environment.
An installation of the BI Applications consists of the following components:
  • The BI Applications installation, including the Data Warehouse Administration Console (DAC). This gets installed in the same location as your existing OBIEE installation, and adds files into the /OracleBI and /OracleBIData directories. Versions 10.1.3.3.2 through to 10.1.3.3.3 are supported with the latest 7.9.5 release of the BI Apps.
  • The DAC metadata repository files, which contain metadata that the DAC uses to schedule and orchestrate the Informatica ETL routines used to load the data warehouse
  • The Embedded Informatica ETL tool, which is actually a standalone installation that is licensed for the loading of the BI Apps data warehouse only
  • Prebuilt Informatica ETL routines used to load the data warehouse in steps that are orchestrated by the DAC tool
  • A prebuilt OBIEE repository (RPD) file used to hold the presentation, logical and physical models that OBIEE uses
  • Prebuilt reports and dashboard content, and
  • the Oracle Business Analytics Warehouse, a dimensional data warehouse that can sit on Oracle, SQL Server, DB2 or Teradata databases and holds the data that the Informatica ETL routines extract from the ERP suites and make available for reporting
So a key difference here then is that the BI Applications, unlike the standard EBI Discoverer reports that are usually shipped with E-Business Suite, report against a standalone data warehouse that takes its data, via ETL routines, from your ERP suite. A typical installation of the BI Applications might actually take some of its source data from Oracle E-Business Suite, some of it from Peoplesoft, some of it from SAP, and some of it perhaps from home-grown applications that the customer has built themselves. It’s this potential extensibility that sets the BI Applications apart from Daily Business Intelligence, which was quick to deploy and configure but very tricky to extend beyond it’s initial subject area coverage. Of course the true test of the value of the BI Applications is how much it actually covers what you want to report against; anecdotally I’ve heard that it often seems to cover about 40% or so of what you need, with the customer then needing to extend the data coverage using ETL routines they then need to build themselves using Informatica. Another comment I’ve heard is that the BI Applications work well if you’re an “out of the box” business that follows a standard business model and hasn’t made many customizations to their ERP suite. Indeed, if you’re planning to customize E-Business Suite, for example, you’re better off installing the BI Applications at the same time as you initially install E-Business Suite, and then applying the customizations to both E-Business Suite and the BI Applications as the same time, keeping both in sync, as if you try and do this later and customize the BI Applications some time after you’ve customized E-Business Suite, it’s more than likely that the original designers of the customizations have moved on and there is nobody around who can draw up the necessary list of changes.
Taken from the Oracle Business Intelligence Applications Installation and Configuration Guide, the Oracle Business Analytics Warehouse (OBAW, the official name for the dimensional data warehouse that comes with the BI Apps) has the following logical structure:
obaw_diagram.jpg
The ETL processes are orchestrated and co-ordinated using the DAC and are executed using an embedded version of Informatica (the stated direction for Oracle is to also provide this functionality, in time, using Oracle’s own Data Integrator product). Metadata is then provided through a prebuilt OBIEE repository that contains subject areas based on the analysis areas selected at install time (each one is separately licensed), whilst the data itself is loaded into the Oracle Business Analytics Warehouse (or, OBAW for short). The OBAW data model can be extended, along with the ETL mappings used to load it, by following a development guide and set of standards set out in the documentation. Some clients I have spoken to have mentioned this as a benefit of buying the product (it adds structure and a methodology to their data warehouse design process), others have mentioned that they see it as too complex and rigid. Some consultants believe that whilst this structure is good, the cost of the BI applications (a typical license transaction, including several subject areas and OBIEE itself is often north of a million dollars) outweighs this benefit and given that it typically only covers 40% or so of the customers requirements, they’d rather build the warehouse from scratch themselves. It’ll be interesting to see based on our future experiences which of these observations are right – I noticed a blog posting by Jeff McQuigg a few months ago where he mentioned a five-fold increase in productivity when he started with the OBAW as a base rather than building it all from scratch, I guess in the end it’ll be down to how well structured the client was in the first place and the degree to which the packaged subject areas actually match what the customer wants to analyze.
Going back to the install and configuration guide, a typical BI Apps installation is arranged in four logical tiers, as shown in the diagram below:
obiee_arch.jpg
The bottom tier holds the source databases used to load the OBAW. These can include the databases that hold your ERP data, and databases that you use to hold data used by custom applications. The next tier up from that contains the Informatica server components together with the DAC server that orchestrates the ETL routines. On a typical simple BI Apps installation, the DAC server gets installed into the /OracleBI directory used for OBIEE, but you can copy the directory structure onto a separate server if you want to host this elsewhere. Most customers install DAC on the same server as the Informatica binaries though they can be held separately.
The next tier up contains the databases that hold your Informatica and DAC repositories. These can be stored in the same database (the same schema, even) and can be on their own physical server, or placed on the server that holds the OBAW or the source system databases. The final tier is the client tier that holds the administration tools used to define and then execute the Informatica mappings.
Assuming you don’t change or extend the Informatica mappings, the main client tool you use when administering the BI Applications is the Data Warehouse Administration Console, or DAC for short. This is a Siebel/Oracle-developed application that currently only runs on Windows, I’ve seen ports on to Linux but these aren’t supported yet. The DAC is used to initially create the OBAW tables and then schedule and co-ordinate the ETl routines that load it. It uses its own repository tables and groups ETL executions into “Execution Plans” that package up the various stages and subject areas in the load process.
The DAC itself is a Java application that lists out all the subject areas in your OBAW, all the execution plans used to load the warehouse, all the connection details and the history of all jobs that have previously run.
dac_screenshot.jpg
As such it provides quite a bit of functionality that’s missing from Warehouse Builder; with Warehouse Builder, you can create mappings, schedule mappings and so on, but there’s no infrastructure around the running of logical sets of jobs, controlling the response to errors, generally administering the ETL process. In the past I’ve seen this missing functionality been provided by custom applications built using tools such as ApEx, but the DAC brings this all into one place and is a real added-value beyond just the ETL routines and pre-built dashboards that you normally think of when evaluating the BI Apps. I’m often asked to run courses of OWB adminstrators (as opposed to developers) and a tool like DAC would make this tasks a whole lot simpler.
So now you know how the BI Applications work, the question that’s probably on your lips if you class yourself as a BI developer is “how can I extend and customize the OBAW, together with the Informatica mappings and the BI Apps RPD? In a way this is the $64,000 question as if most customers need customizations, and Oracle are successful in selling the BI Applications, this is where a lot of development work could come from in future.
If you go down this route, two key tools that you are going to use to do this are the Informatica Designer application, used to define sources, targets, connections and mappings (like ODI, and like OWB without the data modeling component), and Informatica Repository Manager, used for administering and examining the contents of the Informatica repository. Taking a look at the repository project tree in the Designer application, it’s not too dissimilar to a project in Warehouse Builder, with each master folder corresponding to a different adapter (one for E-Business Suite 11.5.10, one for Siebel 7, and so on).
infa_design_1.jpg
Looking through the list of components, sources, targets, cubes and dimensions are fairly familiar, as are mappings which correspond to the mappings in Warehouse Builder. Mapplets correspond to Warehouse Builder pluggable mappings, and are used extensively in the OBAW load routines. The mappings themselves look pretty familiar to a Warehouse Builder developer, one difference I did notice when looking through the seeded mappings is that they all make use of this mapplet feature, with one mapplet sending data through a transformation, joining to something else then feeding into a mapplet. It’s worth remembering that Informatica, unlike OWB, does its data transformation “row-by-row” rather than set-based using SQL, again I’ve heard differing feedback on this – sometimes set-based transformations are quicker, but sometimes the amount of memory they require and the complexity of the SQL means that a row-based mapping runs faster. Of course in OWB you can choose either one or the other, I wonder how this will affect the migration to ODI though that exclusively uses set-based transformations?
Like OWB, Informatica has a workflow element that allows you to package up individual mappings into a single process. Workflows are administered using the Informatica Workflow Manager (like ODI, you access all of the individual Informatica components using a single logon and a set of buttons in the application toolbar). Like mappings and maplets, workflows have worklets (the same as sub-processes in Warehouse Builder) and a typical workflow process doesn’t look too different to a process flow in OWB or a package in ODI.
workflow.jpg
So what about customizing the provided ETL routines to bring in additional data, or indeed bring in a whole new subject area that’s not currently catered for? Oracle in the documentation categorize customizations into three types, confusingly (if you’re a dimensional data modeler) named Type 1, Type II and Type III customizations.
customization_grid.jpg
Type I customizations are the most common and involve adding additional columns to existing OBAW tables either from existing sources or new sources, Type II is a bit more complex and involves adding whole new tables, whilst Type III is concerned with adding rows to existing tables from a new data source. Whichever way you go, the first step is use Informatica Repository Manager to add a new folder to your repository which you’d typically call something like “Custom_ETL”. You then open the folder, switch over to the Designer application and specify your source and target tables. Within the designer application, source tables are coloured green whilst targets are purple.
source_target.jpg
Then, you create your new mapping, add the source objects to the mapping and, as you would do in Warehouse Builder, add joiners as necessary to join individual tables together.
joiner.jpg
Once your data set is prepared, you can then start copying the source data into the target table, placing all new columns after the X_CUSTOM column that acts as a separator between the seeded columns and any new ones you add. New columns should be given an X_ prefix to show that they are custom content.
custom_fact.jpg
In this way new columns (Type I) and indeed new tables (Type II) changes can be added, either by creating a new mapping from scratch as in this example, or taking an existing mapping, altering it and saving it under a new name. Once you’ve designed all your mappings they then need to be added to a workflow, and then DAC needs to be configured in order to include the new or amended workflow in an execution plan.
Configuring DAC to execute your new mapping is a three stage job. Firstly, you need to bring any new or amended tables into the DAC repository, marking them as source or target tables so that it can associate them with the mapping that your going to get it to execute (it uses this to report on impact and lineage, like Warehouse Builder does). Then, you create a new task in the DAC repository that corresponds to your Informatica workflow you created earlier, and finally you need to create a new execution plan that contains the task and allows the operator to schedule it from the DAC console.