Tuesday, June 29, 2010

Odi Load Multiple XML Files

Loading multiple XML files into Oracle using ODI is illustrated in this post.

The standard reverse engineering driver will parse the dtd or if the dtd is not available, it will parse the XML and generate a DTD on the fly in the same location where the XML file exists.

To load multiple XML files we will have to employ ODI variables. An ODI variable is needed to store the name of the file that needs to be loaded to Oracle.
Declare the variable in your project V_MYXMLFILE with the following query:
select xmlfilename from my_metadata_table.

This my_metadata_table should exist in an Oracle schema and should contain a column xmlfilename which lists all the XML filenames (alongwith path).
Each time the variable is refreshed, it will populate the first value from the table. You can follow the instructions found at http://odiexperts.com/?p=524 to loop through each row in the table.

Now create 2 ODI procedures:
  1. Truncate XML Schema
  2. Load XML Schema
1.) Truncate XML Schema will contain 1 command "Truncate"

Technology = XML and Schema = Logical XML schema that you want to work on.
The "Command on Target" will be "TRUNCATE SCHEMA MYSCHEMA".
Here MYSCHEMA is the name of the XML schema that is defined in Topology Manager using the s tag as in
jdbc:snps:xml?d=E:/XMLDIR/myXML.dtd&s=MYSCHEMA

Truncate XML Schema will wipe out all the records in the data structure that holds the XML data. It can be in memory or it can be set to a physical database schema by using &db_props=mydbparams parameter in the xml server definition. Here mydbparams is a reference to a physical file mydbparams.properties created in the $ODI_HOME/oracledi/drivers.

2.) Load XML Schema will contain 2 commands:

Show variable value - This will be created using Jython Technology with the following commands :
a = 'Filename = '+ '#V_MYXMLFILE'
raise(a)
Also, mark the Ignore Errors = Yes, for this step.

This is cheat to display the value of the ODI variable during execution in the Operator.
Also, this step is somehow needed so that the value of the variable can be accessed by the next step.
If you remove this step, the next step (which actually loads the XML schema with the contents of the file) errors out.

Load XML Schema
- This will be created using XML Technology and XML logical schema.
Command on Target = LOAD FILE "#MYPROJECT_NAME.V_MYXMLFILE" ON SCHEMA MYSCHEMA INSERT_ONLY


Now you can create a package using these procedures and refresh the variable V_MYXMLFILE with location/name of the file and execute Load XML Schema.
The procedure "Load XML Schema" can also be executed in a loop to load multiple XML files (that adhere to the same DTD) into the database tables.