OTN Appreciation Day: Oracle Data Integrator 12c Flexibility
Well… it looks like Oracle ACE Director and former colleague Michael Rainey beat me to the topic. But as I read his story… I realized I could approach this in a slightly different way, so I decided to press on with my own take on ODI 12c Flexibility for Oracle Technology Network (OTN) Appreciation Day.
Just a quick thank you to the Oracle Technology Network for everything they do. They are the main reason that Oracle ACE Directors (including myself) are able to become experts in Oracle Technology, and why software from other vendors just can’t stack up. And as Oracle conquers the cloud, OTN will be no small reason for why that’s successful.
ODI 12c and Flexibility
I’m a huge fan of Oracle Data Integrator, and especially the work they’ve done in ODI 12c, from the flow-based mapping paradigm (while keeping declarative, optionally, in whole or in small bits), the component KMs, and the logical-to-physical abstraction. It’s everything you want in a data integration tool. But what I want to talk about on OTN Appreciation Day is functionality that’s been there all along… from version 10g to 12c, and all the versions in between. I know… just keep reading.
Michael did a great job in describing the flexibility… and that’s the ability I come back to time-and-time again with ODI 12c. I’m going to approach this in a holistic fashion… in the anatomy of building a mapping. To produce a final, functioning data flow, we basically go through four stages:
- Configure the Topology: at it’s core, choosing a Technology to serve the Logical and Physical schemas, and configure those connections with Context.
- Build the Model: reverse engineering the Physical Schema into a model, accepting some of the defaults that it gives us (constraints, joins, etc.), and tweaking wherever we want. We can of course create the model manually, but as you’ll see below, there’s no reason to do that.
- Build the Logical Mapping: connecting source and target data stores exposed in the Model, relying on all the abstractions along the way: Physical Schema, Logical Schema, Context, etc. You know… draw the pretty pictures. Click-and-drag. Produce functional art.
- Build the Physical Mapping: not deciding what to load, but how to load it. This can be as abstracted or specific as we want: basic JDBC SQL-to-SQL loading, platform specific syntax, file loaders, big data processing frameworks, etc.
The good news about the anatomy of the mapping is that three of these four steps can use 100% custom functionality. The only exception is Build the Logical Mapping, which is already completely abstracted because of the flexibility provided in the other three.
Configure the Topology
There’s too much to cover in the Topology for such as short article, so I want to focus on the Technology, which is an XML configuration that tells us everything we need to know about a specific source or target platform. How are joins done, and what is the correct syntax for defining such joins? How are datatypes mapped to the datatypes in other Technologies? Where other ETL tools pack all this important detail into closed, proprietary code, ODI defines the Technology as an open, configurable standard. You can even create your own… usually by copying an existing Technology and modifying only the necessary bits.
Build the Model
Again… there’s a lot to cover here, but specifically, I want to talk about Reverse Knowledge Modules (RKMs), and their part in extending ODI into the great beyond. My colleague Pete Scott has written recently on Modules of Knowledge, where he briefly covers the RKM. For many defined Technologies, custom RKMs are not necessary, as the JDBC driver can handle the process of reverse-engineering a source or target data store into the Model. However… not everything is covered with a JDBC driver: the big data revelation has at least taught us that. Or more specifically… there are times when the JDBC driver simply won’t manage this process. ODI provides a set of system tables in the Repository schema that we can populate to provide a bridge in the reverse-engineering process. Think of the RKM as a lightweight mapping where we load metadata instead of data, providing it in a format that can produce a Model. The best part… we can create our own. It’s absolutely brilliant.
Build the Physical Mapping
The final step in the customization process is the step most people point to when talking about customization: the Knowledge Module (KM). We have the ability to write custom Loading Knowledge Modules (LKMs) and Integration Knowledge Modules (IKMs) to define exactly how we want to extract from sources and load targets, and many of the transformations along the way. Of course, the functional transformations still belong in the Logical Mapping: that’s where our business logic lives. But backend transformations — or “plumbing” — that we often repeat for particular data platforms can be wrapped up in reusable packages to be selected from a drop-down list later in the Designer. I can’t tell you how much time I’ve wasted with other ETL tools (including Oracle Warehouse Builder and Informatica) copying and pasting boilerplate transformations from one place to another because the tool lacked the capabilities of encapulating all of that plumbing. The creation and management of staging tables, behind-the-scenes data-management for incremental loading, enabling powerful, platform-specific load strategies such as partition-exchange loading: the KM manifests my Logical Mapping any way I want it, as many different ways as I want it.
Customization and flexibility… that’s the name of the game. It’s the reason ODI is so quick to support (officially or unofficially) so many new sources and targets. They didn’t shoehorn the open frameworks in along the way… ODI (originally Sunopsis) was designed from the very beginning to be a way to join one anything with another anything to load one more anything.