First Look: XML databases
1 April 2005
Without a doubt, XML is fast becoming the lingua franca of B2B data exchange. As the use of XML increases, executives and IT managers must begin factoring in the growing number and differing types of XML products now coming to market before they can determine the most cost-effective XML strategy to implement.
Recently, major relational database vendors, such as Oracle and Microsoft, have introduced XML-enabling technologies in their products: Oracle’s XDB and Microsoft’s SQLXML. Rival IBM has offered an XML Extender for its DB2 database for some time.
Another promising, more manageable approach to XML in the enterprise is the emerging native XML database (NXDB). An NXDB does not replace your existing enterprise data sources; rather, it acts as an intermediate cache that sits between back-end data sources and middle-tier application components.
Using an NXDB provides two principal benefits. First, it’s likely your enterprise has multiple back-end data sources and various types of middle-tier applications. Rather than liberally sprinkling XML capabilities across the middle tier and back end, which may significantly increase technology expenditures, you could add the XML support you need by implementing an NXDB. An NXDB supplies the programmatic interfaces and data access methods necessary to support multiple applications and data sources.
Second, you might use an NXDB to augment the processing power of your primary enterprise databases. Rather than devote primary database processing cycles to XML translation, storage and retrieval during peak hours, moving these operations to an NXDB can free primary databases for more important tasks, such as transaction processing. Interaction between the NXDB and your back-end data sources can then be performed at times of the day or night that allow you to optimise processing performance and to reduce the load on back-end databases that must also serve other applications and end-users.
Many of the XML handling capabilities recently added to relational database management systems (RDBMSs) provide functionality similar to that of an NXDB. This has caused some confusion and raises the question: what constitutes an NXDB?
An NXDB differs from an RDBMS in three key ways. First, an NXDB uses XML documents as the primary method of storage, whereas the primary storage method in an RDBMS is rows. An RDBMS may store an XML document, but often the XML document is stored in a database row or as an object in the database. Storing an XML document natively makes it easier to retrieve during exchanges with business partners.
Second, an NXDB uses a variety of underlying physical storage models. Some NXDBs are constructed using relational technology for physical storage, whereas others use object-oriented, hierarchical, or even proprietary storage facilities. The underlying facilities in an NXDB are transparent to the end-user, which makes managing groups of XML documents (called collections) much easier.
Finally, XML documents in an NXDB are accessed by applications using XML technologies, such as XPath. By contrast, an RDBMS may require application access to XML-based data via other technologies, such as open database connectivity (ODBC) or Java database connectivity (JDBC), which may slow access times when exchanging B2B documents.
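To make the XPath-style access described above concrete, here is a minimal sketch using Python's standard-library ElementTree, which supports a limited subset of XPath. The purchase order, its element names, and the `select_texts` helper are invented for illustration; a real NXDB would expose its own API for submitting such path expressions.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order, invented for illustration.
ORDER_XML = """
<order id="PO-1001">
  <buyer>Acme Ltd</buyer>
  <items>
    <item sku="A1" qty="2">Widget</item>
    <item sku="B7" qty="10">Gasket</item>
  </items>
</order>
"""


def select_texts(xml_text, path):
    """Return the text of every element matched by an XPath-style path."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path)]


# Select every item by its position in the document hierarchy.
item_names = select_texts(ORDER_XML, "./items/item")

# Select by attribute value using a predicate.
root = ET.fromstring(ORDER_XML)
bulk_skus = [el.get("sku") for el in root.findall("./items/item[@qty='10']")]
```

The application addresses data by where it sits in the document rather than by table and column, which is the contrast with ODBC or JDBC access that the paragraph above draws.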
Anatomy of an NXDB
When an XML document is stored in an NXDB, the NXDB creates an XML-based model that includes multiple levels of nesting and support for semi-structured data. The model is then mapped into the underlying physical storage mechanism that is supported by the NXDB.
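As a rough illustration of mapping a nested document model onto a flat physical store, the sketch below flattens a document into (path, value) rows of the kind a relational back end could hold. This is only one possible mapping, assumed here for illustration; it is not the scheme of any particular NXDB product, and the `shred` function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Flatten a document into (path, value) rows, one per attribute or text node."""
    rows = []

    def walk(element, path):
        for name, value in element.attrib.items():
            rows.append((f"{path}/@{name}", value))
        text = (element.text or "").strip()
        if text:
            rows.append((path, text))
        # Number siblings per tag so repeated elements keep distinct paths.
        seen = {}
        for child in element:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    root = ET.fromstring(xml_text)
    walk(root, f"/{root.tag}")
    return rows


rows = shred("<order id='7'><item>Widget</item><item>Gasket</item><note/></order>")
```

Because each row carries its full path, documents with different shapes can share the same storage, which is how semi-structured data can be accommodated without a fixed schema.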
NXDBs manage XML documents via a construct known as a collection. Collections allow you to query and modify multiple XML documents. You might compare an NXDB collection to the construct of a table within an RDBMS.
However, NXDB collections are different from RDBMS tables in that many NXDBs do not require a schema to be associated with a collection, whereas RDBMS tables do. NXDBs that do not require schemas can offer increased flexibility, but they also can reduce data integrity.
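The trade-off between flexibility and integrity can be sketched with a toy in-memory collection. The `Collection` class and the order documents are hypothetical stand-ins, not a real NXDB interface; the only check performed on insert is that the document is well-formed XML.

```python
import xml.etree.ElementTree as ET


class Collection:
    """Hypothetical in-memory stand-in for an NXDB collection."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, xml_text):
        # Only well-formedness is checked; no schema is enforced.
        self._docs[doc_id] = ET.fromstring(xml_text)

    def query(self, path):
        """Yield (doc_id, element) for every match across all documents."""
        for doc_id, root in self._docs.items():
            for element in root.findall(path):
                yield doc_id, element


orders = Collection()
orders.put("a", "<order><total>120</total></order>")
orders.put("b", "<order><total>80</total><rush/></order>")
orders.put("c", "<order><amount>55</amount></order>")  # differently shaped, still accepted

totals = [(doc_id, el.text) for doc_id, el in orders.query("./total")]
```

Document "c" is stored without complaint, yet it silently drops out of the query for `total`. That is the flexibility, and the integrity risk, of a schema-less collection in miniature.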
Currently, XPath is the primary query language used to access XML documents in an NXDB. XPath can be used to query multiple XML document collections, but it has some shortcomings. Chief among these is the lack of sorting, grouping, and data types. Moreover, XPath lacks support for cross-document joins. Developers can overcome some of these obstacles by using Extensible Stylesheet Language Transformations (XSLT).
XQuery, a new query language for NXDBs, addresses many of the shortcomings in XPath. Several NXDB providers, such as Software AG, are beginning to support XQuery. As the XQuery standard reaches final specification with the World Wide Web Consortium (W3C), expect to see more NXDB providers adopt the standard.
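The gaps listed above can be illustrated with a short sketch. The article points to XSLT as one workaround; the example below instead does the equivalent work in Python application code, simply to show which steps fall outside the path expression. The customer and order documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that would live in separate collections.
CUSTOMERS = ET.fromstring(
    "<customers>"
    "<customer id='c1'><name>Acme</name></customer>"
    "<customer id='c2'><name>Bolt</name></customer>"
    "</customers>"
)
ORDERS = ET.fromstring(
    "<orders>"
    "<order customer='c2'><total>80</total></order>"
    "<order customer='c1'><total>120</total></order>"
    "<order customer='c2'><total>15</total></order>"
    "</orders>"
)

# Path expressions locate the nodes...
names = {c.get("id"): c.findtext("name") for c in CUSTOMERS.findall("./customer")}
orders = ORDERS.findall("./order")

# ...but the join, the numeric typing, and the sort happen in application code.
joined = [(names[o.get("customer")], int(o.findtext("total"))) for o in orders]
by_total = sorted(joined, key=lambda row: row[1], reverse=True)

# Grouping is also done by hand.
totals_by_customer = {}
for name, total in joined:
    totals_by_customer[name] = totals_by_customer.get(name, 0) + total
```

Everything after the two `findall` calls is work that a richer query language could express in the query itself.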
At the moment, updating data in an NXDB can also be tricky. Most NXDB solutions require that the entire XML document be retrieved and that the data within the document be updated via an XML-based application programming interface.
Some NXDBs address these shortcomings by providing a proprietary update facility or by suggesting that developers use the Document Object Model (DOM) to update data in XML documents. The final specification for the XQuery language will likely address the issue of NXDB data updating.
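The retrieve, modify, and store cycle described above might look like the following sketch, which uses Python's standard-library `xml.dom.minidom` as the DOM implementation. The `store` dictionary stands in for the NXDB, and the document ID, element names, and `update_status` function are all hypothetical.

```python
from xml.dom import minidom

# Hypothetical document store standing in for the NXDB.
store = {"po-1001": "<order><status>open</status><total>120</total></order>"}


def update_status(doc_id, new_status):
    # 1. Retrieve and parse the entire document.
    dom = minidom.parseString(store[doc_id])
    # 2. Change one text node through the DOM API.
    status = dom.getElementsByTagName("status")[0]
    status.firstChild.data = new_status
    # 3. Serialise the whole document and write it back.
    store[doc_id] = dom.documentElement.toxml()
    dom.unlink()


update_status("po-1001", "shipped")
```

Changing a single value means moving the whole document in both directions, which is why updates to large documents can be costly without a dedicated update facility.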
As stated earlier, applications access NXDBs using XML technologies. Most NXDBs supply one or more APIs — some based on Java, others on C++ — to make life easier for developers. On the back end, NXDBs supply services that allow heterogeneous database access.
Are NXDBs the best choice for all enterprises? Certainly not. For some enterprises a different approach to managing XML may be warranted. For example, if most of your XML documents are stored in and retrieved from a single data source, it may make more sense to leverage the XML capabilities of that single source. Likewise, for easy application access, you might choose to move XML documents closer to the middle tier by implementing an XML server or an XML-enabled application server.
An NXDB is a promising XML strategy for enterprises that have multiple, heterogeneous data sources and multiple, mixed applications executing on the middle tier. Implementing an NXDB can provide good return on investment by centralising XML activity for multiple data sources in a single solution. Moreover, NXDBs can help you better manage the processing of XML documents.
Executives and IT managers should consider NXDBs when formulating an XML strategy. However, NXDBs are an emerging technology; querying and update capabilities are still maturing. Weigh the options carefully, start slowly, and proceed with caution.
Native XML databases emerge
XML has been around for approximately six years, but native XML databases are a relatively new phenomenon. Listed below are a number of new and maturing native XML database solutions.
THE BOTTOM LINE
Native XML databases
Executive Summary: A native XML database makes good economic sense for enterprises that must support XML document handling and interaction with multiple back-end data sources. In addition, native XML databases can simplify the management of enterprise data processing performance.
Test Centre Perspective: An emerging technology, native XML databases are currently best suited to early adopters willing to experiment. When existing shortcomings, such as query and update handling, are resolved, these databases promise to make XML handling much more manageable for most IT shops.