Managing XML data with evolving DTD's,

 

M.S. thesis, by B V N Prashanth, July 2008

 

Change is a fundamental aspect of any database system. A database schema must evolve over time to suit changing user needs and to correct errors in the design. A schema can also be changed to improve overall system performance; for example, users may want to restructure (normalize) the database to reduce redundancy in the data. We have considered this problem in the context of XML databases described by a DTD (Document Type Definition). Our goal is to provide the user with a set of DTD change operators, through an easy-to-use interface, for performing the necessary changes on a DTD. This interface automates the process of changing the DTD and hides unnecessary details from the user. We have proposed a set of four operators, named Dual-Transform, Subtree-Moveup, Ref-Ref Inverse, and Hie-Hie Inverse, which manipulate hierarchical and reference relationships in a DTD. We have discussed the situations in which these operators can be applied to reduce redundancy in the data. We have designed and implemented two approaches to handling changes in the data along with the DTD. The first approach transforms the data with every change performed on the DTD, so that the data stays consistent with the DTD. The second approach creates a new version of the DTD with every change performed, and existing data is not transformed into the new form. As part of this approach, we have also designed query translation algorithms that translate user queries written against the latest version of the DTD into queries over the older versions.