In our data feeds, we have a stack of XML files as well as numerous flat files to put into an Oracle11g database. We flatten the XML into delimited files, and load the whole set using SQLLoader.
I want to try a proof-of-concept loading via TABLE ORGANIZED EXTERNALLY but I need to make a convincing case to DBA’s that it won’t do something evil to the server. The only plausible test files I have are 400-600 MB, in production we’d add some multi-gigabyte files.
What are the risks and how should I approach them, any ideas?
UPDATE: Thanks for all of the helpful comments, folks. More discussions with have yielded “we can’t give you shell access on the database server to load files, we can’t mount remote files via NFS” – these are basically security concerns. We handle PII so DBA’s are touchy. Also, some concerns about who’s providing the storage.
Any further suggested ‘slam dunk’ arguments on why 1) external tables are so much better than sqlloader or 2) why a testbed for external tables is low-risk?
IMO, an external table is much easier to manage and flexible than a SQLLoader script. So if you’re already doing a recurring SQLLoader load, I see no possibilities of evil in switching to external tables.
By doing the external table, even though the base of the table resides in a text file outside the database, the data can be accessed from within the database, no need for a separate SQLLoader tool/scripts. So you can write procedures/packages to manipulate the data straight from the external table. No need to load it anywhere…as long as it’s in the directory you have set, it’s already “loaded”.
As a DBA, I’d much prefer managing a DML script using an external table than having to manage a SQLLoader script. But SQLLoader has been around a while….if your DBA has too, it may be a tough sell. 🙂
If your DBA’s won’t allow you to even explore the benefits in a proof of concept, time to move companies.
External tables go back to version 10.1 at least, so they’re not cutting edge technology. They’re so widely (see Tom Kyte et al) held to be superior for many reasons (more flexible data transformations on import, MERGEing into other tables, multitable inserts, better parallelism) that making this case ought to be trivial.
Unfortunately there’s a breed of DBA who sees it as their job to say “no”, and maybe you’re stuck with them.
Bring on the downvotes, DBA’s ….