Project Leader
Alex Poulovassilis
Project staff
Lucas Zamboulis
Sandeep Mittal
Dean Williams
Other Project Partners
Imperial College
Project Details
2001 to date
Funded by EPSRC, MoD, BBSRC
Project Web Site
For further details of the project, downloadable software and articles, see www.doc.ic.ac.uk/automed
Keywords
Data Transformation
Data Integration
Metadata Management
Download
automed (doc)
automed (pdf)
AutoMed: AutoMatic Generation of Mediator Tools for Heterogeneous Data Integration
The AutoMed project is developing tools to assist users and developers in the transformation and integration of data from different data sources. The AutoMed toolkit can access structured (e.g. relational), semi-structured (e.g. XML, RDF) and text data sources. Data transformation/integration is achieved by defining transformation pathways between schemas, for example between a set of data source schemas and a ‘virtual’ Integrated Schema, as illustrated on the right. The AutoMed toolkit comprises a number of components, and at the LKL we are currently working on the following tools:
The AutoMed Query Processor answers queries expressed over an Integrated Schema. It uses the transformation pathways between the Integrated Schema and the data source schemas to reformulate the query, optimizes the reformulated query, submits sub-queries to the data source wrappers for evaluation, and finally merges the sub-query results into an overall result for the original query.
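The reformulate, evaluate and merge steps described above can be illustrated with a minimal Python sketch. It does not use the AutoMed API: the data sources, the attribute mappings and every function name below are hypothetical, and simple attribute renaming stands in for the much richer transformation pathways. The optimization step is omitted.

```python
# Hypothetical in-memory "data sources", each with its own attribute names.
SOURCES = {
    "A": [{"prot_id": "P1", "prot_name": "kinase"}],
    "B": [{"accession": "P2", "label": "ligase"}],
}

# Hypothetical mappings from integrated-schema attributes to source attributes.
# These only stand in for transformation pathways (assumption for illustration).
MAPPINGS = {
    "A": {"id": "prot_id", "name": "prot_name"},
    "B": {"id": "accession", "name": "label"},
}


def reformulate(attrs, source):
    """Rewrite integrated-schema attribute names into one source's names."""
    return [MAPPINGS[source][a] for a in attrs]


def evaluate(source, source_attrs):
    """Stand-in for a data source wrapper: project the requested attributes."""
    return [tuple(row[a] for a in source_attrs) for row in SOURCES[source]]


def answer(attrs):
    """Answer a projection query posed over the integrated schema."""
    merged = []
    for source in SOURCES:
        sub_query = reformulate(attrs, source)      # one sub-query per source
        merged.extend(evaluate(source, sub_query))  # merge sub-query results
    return merged


result = answer(["id", "name"])
```

Here `answer` plays the role of the mediator: the caller only sees the integrated attribute names `id` and `name`, never the source-specific ones.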
This tool provides facilities for transforming/integrating XML data. It can extract a schema from an XML file, use schema-matching techniques to reconcile different XML schemas, restructure a set of reconciled schemas into a single integrated schema, and then use this integrated schema for querying the original XML files or creating one integrated file.
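As a rough illustration of schema extraction from an XML file, the following Python sketch collects the distinct element paths of a document using the standard library. This is an assumed, simplified notion of "schema" chosen for the example; it is not the representation or algorithm used by the AutoMed tool, and the sample document is invented.

```python
import xml.etree.ElementTree as ET


def extract_paths(xml_text):
    """Return the sorted list of distinct element paths in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return sorted(paths)


# Invented sample document: two records with slightly different structure.
DOC = (
    "<lab>"
    "<protein><name>kinase</name></protein>"
    "<protein><name>ligase</name><mass>42</mass></protein>"
    "</lab>"
)

summary = extract_paths(DOC)
```

Two documents whose path summaries differ (for example in element names or nesting) are the kind of input that schema matching and restructuring would then need to reconcile.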
The P2P ECA tool provides reactive functionality in peer-to-peer data exchange or data integration scenarios. This tool detects data update events occurring at peers, translates such updates using the transformation pathways between peers' schemas, and propagates updates to other remote peers as specified by the ECA rules.
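The detect, translate and propagate cycle can be sketched with a small event-condition-action (ECA) example in Python. The `Peer` and `Rule` classes, the field names and the `translate` function are all hypothetical and run in a single process; a real peer-to-peer deployment would involve networking and the actual transformation pathways between peers' schemas.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    event: str                         # event type the rule reacts to
    condition: Callable[[dict], bool]  # test applied to the updated row
    action: Callable[[dict], None]     # what to do when the test passes


class Peer:
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.rules = []

    def insert(self, row):
        """Apply an insert locally, then fire any matching ECA rules."""
        self.rows.append(row)
        for rule in self.rules:
            if rule.event == "insert" and rule.condition(row):
                rule.action(row)


def translate(row):
    """Hypothetical schema translation from peer p1 to peer p2."""
    return {"accession": row["prot_id"]}


p1, p2 = Peer("p1"), Peer("p2")

# On insert at p1, if the identifier starts with "P", propagate to p2.
p1.rules.append(
    Rule(
        event="insert",
        condition=lambda row: row["prot_id"].startswith("P"),
        action=lambda row: p2.insert(translate(row)),
    )
)

p1.insert({"prot_id": "P7"})  # satisfies the condition, so it is propagated
p1.insert({"prot_id": "X1"})  # fails the condition, so it stays local
```

After these two inserts, `p1` holds both rows while `p2` holds only the translated copy of the first.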
This tool generates annotations on text sources using grammar rules. New data and metadata are then derived from these annotations and can be integrated with other data sources via AutoMed transformation pathways.
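To give a flavour of rule-based annotation, the Python sketch below tags spans of text with labels. Regular expressions are used here as a simple stand-in for grammar rules; the rule set, labels and sample sentence are invented for illustration and do not reflect the rules or formalism used by the tool.

```python
import re

# Hypothetical annotation rules: label -> pattern (assumption for illustration).
RULES = {
    "Protein": re.compile(r"\bP\d+\b"),
    "Mass": re.compile(r"\b\d+(?:\.\d+)? kDa\b"),
}


def annotate(text):
    """Return (start, end, label, matched_text) tuples in document order."""
    annotations = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            annotations.append((match.start(), match.end(), label, match.group()))
    return sorted(annotations)


annotations = annotate("P53 weighs 43.7 kDa")

# Derive a structured record from the annotations, keyed by label.
record = {label: value for _, _, label, value in annotations}
```

The derived `record` is the kind of structured data that could then be mapped into an integrated schema alongside other sources.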
The AutoMed toolkit is currently being used to support heterogeneous biological data integration in the ISPIDER and BioMap projects.