Project Outline and Aims
Semi-structured data is usually modelled as a graph structure, with labelled edges. The original aim of this project was to investigate algorithms for answering path queries on such graph structures. A crucial assumption was that the queries asked about nodes connected by simple paths, that is, ones in which no node is repeated. This assumption caused query answering to become intractable, with the result that the project focussed on discovering various classes of queries and graphs for which query answering could be guaranteed to be performed in time polynomial in the size of the graph being queried. Our early work on finding regular simple paths has received over 200 citations, according to Google scholar.
More recently, the project has been concerned with flexible querying of semi-structured data. Flexible querying allows users to request that conditions in queries be relaxed to allow more general answers to be returned, ranked in terms of how closely they match the original query. This is useful in areas where users are not familiar with the structure of the data or where they want to browse the data in an exploratory manner. The project initially focussed on RDF data, but now more general solutions for semi-structured data are being investigated.
Funding and Staffing Details
The initial project ran until 1999 and was undertaken by Peter Wood, Alberto Mendelzon and Zhivko Nedev. Early work on the project, not listed in the publications below, evolved into the Hy+ project at the University of Toronto, with many additional contributors. Funding for Peter Wood's work was provided by the Foundation for Research Development (FRD), now the National Research Foundation, of South Africa.
The aspect of flexible querying is being investigated by Peter Wood, Alex Poulovassilis, Carlos Hurtado and Pablo Barcelo. The project was funded by the Royal Society between 2007 and 2009. Petra Selmer is working in the area for her PhD.
Project publications
- A. Poulovassilis and P.T. Wood, ``Combining Approximation and Relaxation in Semantic Web Path Queries,'' in Proc. 9th International Semantic Web Conference (Shanghai, China, Nov. 7-11), 2010, pp. 631-646.
- P. Barcelo, C.A. Hurtado, L. Libkin and P.T. Wood, ``Expressive Languages for Path Queries over Graph-Structured Data,'' in 29th ACM Symposium on Principles of Database Systems (Indianapolis, Indiana, June 6-10), 2010, pp. 3-14.
- P.T. Wood, ``Graph Database,'' in Encyclopedia of Database Systems, Editors-in-chief: M. Tamer Ozsu and Ling Liu, Springer, 2009, pp. 1263-1266.
- C.A. Hurtado, A. Poulovassilis and P.T. Wood, ``Finding Top-k Approximate Answers to Path Queries,'' to appear in Proc. 8th Int. Conf. on Flexible Query Answering Systems (Roskilde, Denmark, Oct. 26-28), 2009.
- C.A. Hurtado, A. Poulovassilis and P.T. Wood, ``Ranking Approximate Answers to Semantic Web Queries,'' in Proc. 6th European Semantic Web Conference (Heraklion, Crete, May 31-June 4), LNCS 5554, Springer-Verlag, 2009, pp. 263-277.
- C.A. Hurtado, A. Poulovassilis and P.T. Wood, ``Query Relaxation in RDF,'' in Journal on Data Semantics X, LNCS 4900, Springer-Verlag, 2008, pp. 31-61.
- C.A. Hurtado, A. Poulovassilis and P.T. Wood, ``A Relaxed Approach to RDF Querying,'' in Proc. 5th Int. Semantic Web Conference, (Athens, Georgia, Nov. 5-9), LNCS 4273, Springer-Verlag, 2006, pp. 314-328.
- Z.P. Nedev and P.T. Wood, ``A Polynomial-Time Algorithm for Finding Regular Simple Paths in Outerplanar Graphs,'' Journal of Algorithms, 35, May 2000, pp. 235-259.
- A.O. Mendelzon and P.T. Wood, ``Finding Regular Simple Paths in Graph Databases,'' SIAM Journal on Computing 24, 6 (Dec. 1995), pp. 1235-1258.