Thursday, January 3, 2013

Performance of the DAC Load



Data Warehouse Administration Console: The DAC provides a framework for the entire life cycle of data warehouse implementations. It allows you to create, configure, execute, and monitor modular data warehouse applications in a parallel, high-performing environment. The DAC complements the Informatica ETL platform by providing application-specific capabilities that are not prebuilt into ETL platforms. For example, ETL platforms are aware neither of the semantics of the subject areas being populated in the data warehouse nor of the method in which they are populated. In a nutshell, all the tasks that are needed to bring the data from the transactional database into the analytical database are registered with the DAC. Hence, by analyzing all the tasks and the way they behave each day, it is possible to gauge the volume of data that moves periodically from the transactional database to the analytical database.

Analysis of DAC Tasks: Analyzing all the tasks as of a single day is possible with the following features available in the DAC:
• Gantt Task Chart
• Log File
• Gantt Phase Chart
• Task Graph
• Analyze Run

But for any ETL system it is also necessary to analyze the data volume and the time taken across intervals (days, weeks, or hours) to gauge the performance. So a reporting tool that reports the performance of the ETL tasks in the DAC across a time period is needed, so that the data transfer stays under control and the essential ETL routines can be culled from the hundreds of vanilla and custom routines.
Building the data model for reporting:
The tools used for building the routines are PL/SQL, Informatica, and OBI. The DAC tables cannot be used directly for reporting, as they do not contain the measures and attributes necessary for the analysis. So a data model could be built on top of the DAC tables and custom tables.
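As an illustration only, the sketch below shows the kind of denormalized task-run fact table such a data model could expose to OBI. The table and column names (w_dac_task_run_f and its columns) are hypothetical placeholders, not the actual DAC repository schema; the real structure would be loaded from the DAC tables through PL/SQL or Informatica routines.

-- Hypothetical reporting fact table for DAC task runs (placeholder names,
-- not the real DAC repository schema). One row per task execution, so that
-- measures can be aggregated by day, task, or execution plan in OBI.
CREATE TABLE w_dac_task_run_f (
  etl_run_name      VARCHAR2(200),   -- name of the DAC execution plan
  etl_run_start_ts  DATE,            -- start timestamp of the whole DAC run
  task_name         VARCHAR2(400),   -- ETL task (Informatica workflow) name
  task_phase        VARCHAR2(100),   -- e.g. Extract, Load, Post Load
  load_type         VARCHAR2(20),    -- 'FULL' or 'INCREMENTAL'
  task_start_ts     DATE,
  task_end_ts       DATE,
  task_duration_sec NUMBER,          -- derived from (task_end_ts - task_start_ts)
  success_rows      NUMBER,          -- rows successfully processed by the task
  failed_rows       NUMBER,
  run_status        VARCHAR2(30)     -- e.g. Completed, Failed
);

The sample reports below are sketched against this hypothetical table; the same attributes and measures would be exposed as a subject area for OBI Answers.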
Sample Reports:
Execution Plan on all days:
All the information related to the execution plan could be reported.
Basic information like this could be obtained from the DAC views, but getting these details from reports would simplify the analysis.
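For example, a run-level summary could be derived along these lines (a sketch against the hypothetical table above; the actual query depends on the data model that is built):

-- Sketch: one row per DAC execution plan run, with duration and rows loaded
-- (hypothetical schema).
SELECT etl_run_name,
       etl_run_start_ts,
       MIN(task_start_ts)                                        AS run_start,
       MAX(task_end_ts)                                          AS run_end,
       ROUND((MAX(task_end_ts) - MIN(task_start_ts)) * 24 * 60)  AS run_minutes,
       SUM(success_rows)                                         AS total_rows_loaded
FROM   w_dac_task_run_f
GROUP  BY etl_run_name, etl_run_start_ts
ORDER  BY etl_run_start_ts;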
[Report: execution plan runs]
ETL Tasks:
All the tasks that are executed on any day could be analyzed. The analysis could be widened by selecting other attributes and measures through OBI Answers.
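A day-level task report could be sketched as follows (hypothetical schema again; the date literals are only an example). In OBI Answers the same columns would simply be dragged in as attributes and measures:

-- Sketch: all tasks executed on a given day, longest running first
-- (hypothetical schema).
SELECT task_name,
       task_phase,
       load_type,
       task_start_ts,
       task_end_ts,
       task_duration_sec,
       success_rows
FROM   w_dac_task_run_f
WHERE  task_start_ts >= DATE '2013-01-02'
AND    task_start_ts <  DATE '2013-01-03'
ORDER  BY task_duration_sec DESC;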
[Report: ETL tasks]
ETL Task across Days:
An unwanted ETL routine might be consuming much of the DAC run time, or an ETL routine might be doing a full load instead of an incremental load. These kinds of issues could be rectified by analyzing the task across several days.
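One way to spot such issues (again only a sketch on the hypothetical schema; the task name is just an example) is to track a single task's load type and duration run by run:

-- Sketch: history of one task across DAC runs, to spot unexpected full loads
-- or a steadily growing duration (hypothetical schema; example task name).
SELECT etl_run_start_ts,
       load_type,
       task_duration_sec,
       success_rows
FROM   w_dac_task_run_f
WHERE  task_name = 'SDE_ORA_SalesInvoiceLinesFact'
ORDER  BY etl_run_start_ts;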
[Report: ETL task across days]
Task Duration as a percentage of DAC Duration:
If a single task keeps increasing its share of the DAC run time, then that ETL routine has to be pruned. Such analysis could be done with graphs as follows.
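The underlying percentage could be computed along these lines (a sketch on the hypothetical schema; note that summing task durations is only an approximation of the run's wall-clock time, since DAC tasks run in parallel):

-- Sketch: each task's duration as a share of its DAC run's total task time
-- (hypothetical schema; parallel execution makes this an approximation).
SELECT etl_run_start_ts,
       task_name,
       ROUND(100 * task_duration_sec /
             SUM(task_duration_sec) OVER (PARTITION BY etl_run_name, etl_run_start_ts),
             2) AS pct_of_run
FROM   w_dac_task_run_f
ORDER  BY etl_run_start_ts, pct_of_run DESC;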
[Graph: task duration as a percentage of DAC duration]
Successfully Processed Records:
It is possible that unwanted data (a Trojan horse) is entering the system. Any abrupt increase in any entity could be tracked with the following graph.
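A sudden spike could also be flagged directly in SQL (sketch on the hypothetical schema; the threshold of three times the previous run is arbitrary):

-- Sketch: flag runs where a task processed far more rows than in its previous run
-- (hypothetical schema; 3x threshold chosen only for illustration).
SELECT *
FROM  (SELECT task_name,
              etl_run_start_ts,
              success_rows,
              LAG(success_rows) OVER (PARTITION BY task_name
                                      ORDER BY etl_run_start_ts) AS prev_rows
       FROM   w_dac_task_run_f)
WHERE  prev_rows > 0
AND    success_rows > 3 * prev_rows
ORDER  BY task_name, etl_run_start_ts;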
[Graph: successfully processed records]
Note: The dates in the graphs contain the time stamp, since it is possible that more than one DAC run happens on the same day. The number of reports that could be taken from the DAC data model for the analysis is not limited to the reports present in this document.