Algoritmos de Pré-Processamento para Uniformização de Instâncias XML Heterogêneas (Preprocessing Algorithms for the Uniformization of Heterogeneous XML Instances)

The following document contains my graduate dissertation (in Portuguese), which addresses the following problem:

The increasing availability of data on the Web creates the need for more practical and efficient systems for collecting and integrating these data so that they can be queried. One of the most widely used formats for representing information on the Web is XML. Given its dynamic nature, XML allows data from different domains to be represented completely and adequately. At the same time, that dynamic nature makes integration activities complex. This work focuses on reducing that complexity by providing a set of preprocessing techniques that make the structures of XML instances compatible with one another. This harmonization, which seeks to respect data semantics, is intended to facilitate the comparison and subsequent integration of these data by existing approaches to XML data comparison and integration. Through case studies and experiments, we demonstrate that the proposed preprocessing steps improve the results obtained by existing related work.