We can use the Lookup transformation to perform following:
- Get a related value: EMP has DEPTNO but DNAME is not there. We use Lookup to get DNAME from DEPT table based on Lookup Condition.
- Perform a calculation: We want only those Employees who’s SAL > Average (SAL). We will write Lookup Override query.
- Update slowly changing dimension tables: Most important use. We can use a Lookup transformation to determine whether rows already exist in the target.
1. LOOKUP TYPES
We can configure the Lookup transformation to perform the following types of lookups:
- Connected or Unconnected
- Relational or Flat File
- Cached or Un cached
Relational Lookup:
When we create a Lookup transformation using a relational table as a lookup source, we can connect to the lookup source using ODBC and import the table definition as the structure for the Lookup transformation.
- We can override the default SQL statement if we want to add a WHERE clause or query multiple tables.
- We can use a dynamic lookup cache with relational lookups.
Flat File Lookup:
When we use a flat file for a lookup source, we can use any flat file definition in the repository, or we can import it. When we import a flat file lookup source, the Designer invokes the Flat File Wizard.
Cached or Un cached Lookup:
We can check the option in Properties Tab to Cache to lookup or not. By default, lookup is cached.
Connected and Unconnected Lookup
Connected Lookup
|
Unconnected Lookup
|
Receives input values directly from the pipeline.
|
Receives input values from the result of a :LKP expression in another transformation.
|
We can use a dynamic or static cache.
|
We can use a static cache.
|
Cache includes all lookup columns used in the mapping.
|
Cache includes all lookup/output ports in the lookup condition and the lookup/return port.
|
If there is no match for the lookup condition, the Power Center Server returns the default value for all output ports.
|
If there is no match for the lookup condition, the Power Center Server returns NULL.
|
If there is a match for the lookup condition, the Power Center Server returns the result of the lookup condition for all lookup/output ports.
|
If there is a match for the lookup condition,the Power Center Server returns the result of the lookup condition into the return port.
|
Pass multiple output values to another transformation.
|
Pass one output value to another transformation.
|
Supports user-defined default values
|
Does not support user-defined default values.
|
2 .LOOKUP T/F COMPONENTS
Define the following components when we configure a Lookup transformation in a mapping:
- Lookup source
- Ports
- Properties
- Condition
1. Lookup Source:
We can use a flat file or a relational table for a lookup source. When we create a Lookup t/f, we can import the lookup source from the following locations:
- Any relational source or target definition in the repository
- Any flat file source or target definition in the repository
- Any table or file that both the Power Center Server and Client machine can connect to The lookup table can be a single table, or we can join multiple tables in the same database using a lookup SQL override in Properties Tab.
2. Ports:
Ports
|
Lookup
Type
|
Number
Needed
|
Description
|
I
|
Connected
Unconnected
|
Minimum 1
|
Input port to Lookup. Usually ports used for Join condition are Input ports.
|
O
|
Connected
Unconnected
|
Minimum 1
|
Ports going to another transformation from Lookup.
|
L
|
Connected
Unconnected
|
Minimum 1
|
Lookup port. The Designer automatically Designates each column in the lookup source as a lookup (L) and output port (O).
|
R
|
Unconnected
|
1 Only
|
Return port. Use only in unconnected Lookup t/f only.
|
3. Properties Tab
Options
|
Lookup Type
|
Description
|
Lookup SQL Override
|
Relational
|
Overrides the default SQL statement to query the lookup table.
|
Lookup Table Name
|
Relational
|
Specifies the name of the table from which the transformation looks up and caches values.
|
Lookup Caching Enabled
|
Flat File, Relational
|
Indicates whether the Power Center Server caches lookup values during the session.
|
Lookup Policy on Multiple Match
|
Flat File, Relational
|
Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. Options: Use First Value or Use Last Value or Use Any Value or Report Error
|
Lookup Condition
|
Flat File, Relational
|
Displays the lookup condition you set in the Condition tab.
|
Connection Information
|
Relational
|
Specifies the database containing the lookup table.
|
Source Type
|
Flat File, Relational
|
Lookup is from a database or flat file.
|
Lookup Cache Directory Name
|
Flat File, Relational
|
Location where cache is build.
|
Lookup Cache Persistent
|
Flat File, Relational
|
Whether to use Persistent Cache or not.
|
Dynamic Lookup Cache
|
Flat File, Relational
|
Whether to use Dynamic Cache or not.
|
Recache From Lookup Source
|
Flat File, Relational
|
To rebuild cache if cache source changes and we are using Persistent Cache.
|
Insert Else Update
|
Relational
|
Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert.
|
Lookup Data Cache Size
|
Flat File, Relational
|
Data Cache Size
|
Lookup Index Cache Size
|
Flat File, Relational
|
Index Cache Size
|
Cache File Name Prefix
|
Flat File, Relational
|
Use only with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files.
|
Some other properties for Flat Files are:
- Date time Format
- Thousand Separator
- Decimal Separator
- Case-Sensitive String Comparison
- Null Ordering
- Sorted Input
4: Condition Tab
We enter the Lookup Condition. The Power Center Server uses the lookup condition to test incoming values. We compare transformation input values with values in the lookup source or cache, represented by lookup ports.
- The data types in a condition must match.
- When we enter multiple conditions, the Power Center Server evaluates each condition as an AND, not an OR.
- The Power Center Server matches null values.
- The input value must meet all conditions for the lookup to return a value.
- =, >, <, >=, <=, != Operators can be used.
- Example: IN_DEPTNO = DEPTNO
In_DNAME = 'DELHI'
Tip: If we include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance.
Note:
1. We can use = operator in case of Dynamic Cache.
2. The Power Center Server fails the session when it encounters multiple keys for a Lookup transformation configured to use a dynamic cache.
3. Connected Lookup Transformation
Example: To create a connected Lookup Transformation
- EMP will be source table. DEPT will be LOOKUP table.
- Create a target table CONN_Lookup_EXAMPLE in target designer. Table should contain all ports of EMP table plus DNAME and LOC as shown below.
- Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping-> Create-> Give name. Ex: m_CONN_LOOKUP_EXAMPLE
4. Drag EMP and Target table.
5. Connect all fields from SQ_EMP to target except DNAME and LOC.
6. Transformation-> Create -> Select LOOKUP from list. Give name and click
Create.
7. The Following screen is displayed.
8. As DEPT is the Source definition, click Source and then Select DEPT.
9. Click Ok.
10. Now Pass DEPTNO from SQ_EMP to this Lookup. DEPTNO from SQ_EMP will be named as DEPTNO1. Edit Lookup and rename it to IN_DEPTNO in ports tab.
11. Now go to CONDITION tab and add CONDITION.
DEPTNO = IN_DEPTNO and Click Apply and then OK.
Link the mapping as shown below:
12. We are not passing IN_DEPTNO and DEPTNO to any other transformation from LOOKUP; we can edit the lookup transformation and remove the OUTPUT check from them.
13. Mapping -> Validate
14. Repository -> Save
- Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
- Make sure to give connection information for all tables.
- Make sure to give connection for LOOKUP Table also.
We use Connected Lookup when we need to return more than one column from Lookup table.There is no use of Return Port in Connected Lookup.
SEE PROPERTY TAB FOR ADVANCED SETTINGS
4. Unconnected Lookup Transformation
An unconnected Lookup transformation is separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.
Steps to configure Unconnected Lookup:
1. Add input ports.
2. Add the lookup condition.
3. Designate a return value.
4. Call the lookup from another transformation.
Example: To create a unconnected Lookup Transformation
- EMP will be source table. DEPT will be LOOKUP table.
- Create a target table UNCONN_Lookup_EXAMPLE in target designer. Table should contain all ports of EMP table plus DNAME as shown below.
- Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping-> Create-> Give name.
Ex: m_UNCONN_LOOKUP_EXAMPLE
4. Drag EMP and Target table.
5. Now Transformation-> Create -> Select EXPRESSION from list. Give name
and click Create. Then Click Done.
6. Pass all ports from SQ_EMP to EXPRESSION transformation.
7. Connect all fields from EXPRESSION to target except DNAME.
8. Transformation-> Create -> Select LOOKUP from list. Give name and click
Create.
9. Follow the steps as in Connected above to create Lookup on DEPT table.
10. Click Ok.
11. Now Edit the Lookup Transformation. Go to Ports tab.
12. As DEPTNO is common in source and Lookup, create a port IN_DEPTNO
ports tab. Make it Input port only and Give Datatype same as DEPTNO.
13. Designate DNAME as Return Port. Check on R to make it.
14. Now add a condition in Condition Tab.
DEPTNO = IN_DEPTNO and Click Apply and then OK.
15. Now we need to call this Lookup from Expression Transformation.
16. Edit Expression t/f and create a new output port out_DNAME of data type as DNAME. Open the Expression editor and call Lookup as given below:
We double click Unconn in bottom of Functions tab and as we need only
DEPTNO, we pass only DEPTNO as input.
17. Validate the call in Expression editor and Click OK.
18. Mapping -> Validate
19. Repository Save.
- Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
- Make sure to give connection information for all tables.
- Make sure to give connection for LOOKUP Table also.
5. Lookup Caches
We can configure a Lookup transformation to cache the lookup table. The Integration Service (IS) builds a cache in memory when it processes the first row of data in a cached Lookup transformation.
The Integration Service also creates cache files by default in the $PMCacheDir. If the data does not fit in the memory cache, the IS stores the overflow values in the cache files. When session completes, IS releases cache memory and deletes the cache files.
- If we use a flat file lookup, the IS always caches the lookup source.
- We set the Cache type in Lookup Properties.
Lookup Cache Files
1. Lookup Index Cache:
- Stores data for the columns used in the lookup condition.
2. Lookup Data Cache:
- For a connected Lookup transformation, stores data for the connected output ports, not including ports used in the lookup condition.
- For an unconnected Lookup transformation, stores data from the return port.
Types of Lookup Caches:
1. Static Cache
By default, the IS creates a static cache. It caches the lookup file or table and Looks up values in the cache for each row that comes into the transformation.The IS does not update the cache while it processes the Lookup transformation.
2. Dynamic Cache
To cache a target table or flat file source and insert new rows or update existing rows in the cache, use a Lookup transformation with a dynamic cache.
The IS dynamically inserts or updates data in the lookup cache and passes data to the target. Target table is also our lookup table. No good for performance if table is huge.
3. Persistent Cache
If the lookup table does not change between sessions, we can configure the Lookup transformation to use a persistent lookup cache.
The IS saves and reuses cache files from session to session, eliminating the time Required to read the lookup table.
4. Recache from Source
If the persistent cache is not synchronized with the lookup table, we can Configure the Lookup transformation to rebuild the lookup cache.If Lookup table has changed, we can use this to rebuild the lookup cache.
5. Shared Cache
- Unnamed cache: When Lookup transformations in a mapping have compatible caching structures, the IS shares the cache by default. You can only share static unnamed caches.
- Named cache: Use a persistent named cache when we want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.
Building Connected Lookup Caches
We can configure the session to build caches sequentially or concurrently.
- When we build sequential caches, the IS creates caches as the source rows enter the Lookup transformation.
- When we configure the session to build concurrent caches, the IS does not wait for the first row to enter the Lookup transformation before it creates caches. Instead, it builds multiple caches concurrently.
1. Building Lookup Caches Sequentially:
2. Building Lookup Caches Concurrently:
- To configure the session to create concurrent caches
Edit Session -> In Config Object Tab-> Additional Concurrent Pipelines for
Lookup Cache Creation -> Give a value here (Auto By Default)
Note: The IS builds caches for unconnected Lookups sequentially only.