MOSIG Master 2ND YEAR Research
YEAR 2010/2011

MASTER TOPIC PROPOSAL

ADVISOR: Jérôme Euzenat and Cássia Trojahn dos Santos

TEL: 476 61 53 66 and 476 61 53 52 66

EMAIL: Jerome:Euzenat#inrialpes:fr and Cassia:Trojahn#inrialpes:fr

TEAM AND LAB: Exmo team, INRIA & LIG

MASTER PROFILE: Artificial intelligence and the web

TITLE:

Multidimensional analysis of RDF data

Reference number: Proposal n°780

Multidimensional processing aims at analysing large quantities of data in real time, through a set of queries based on a multidimensional view of a data set. This technique is known as OLAP (On-Line Analytical Processing) [1,2]. Usually, the data source is a relational database, and a multidimensional view (or cube) defines the dimensions and measures on this data. Dimensions are attributes that correspond to columns in the database tables, such as a product name, or to foreign keys, such as a department id. Measures are the quantities to be analysed, such as the unit sales of a product. Hierarchies and levels can be specified within dimensions in order to allow complex queries to be built on the cube.
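To make the notions of dimension and measure concrete, here is a minimal sketch in Python. It is not taken from any OLAP library: the fact rows, the `roll_up` function and the column names are all hypothetical, chosen to mirror the product and department example above. It only shows the core operation of a cube query, namely grouping facts by the chosen dimensions and aggregating a measure.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions (product, department)
# and one measure (unit_sales). The values are invented.
FACTS = [
    {"product": "tea", "department": "food", "unit_sales": 10},
    {"product": "tea", "department": "food", "unit_sales": 5},
    {"product": "coffee", "department": "food", "unit_sales": 7},
    {"product": "pen", "department": "office", "unit_sales": 3},
]


def roll_up(facts, dimensions, measure):
    """Sum `measure` over all facts that share the same values for `dimensions`."""
    totals = defaultdict(int)
    for fact in facts:
        # The key is the tuple of dimension values identifying one cell of the cube.
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


by_product = roll_up(FACTS, ["product"], "unit_sales")
by_department = roll_up(FACTS, ["department"], "unit_sales")
by_both = roll_up(FACTS, ["product", "department"], "unit_sales")
```

Grouping by fewer dimensions gives a coarser view of the same data; a real OLAP engine additionally handles hierarchies, levels and other aggregation functions, which this sketch leaves out.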

One potential application of OLAP is the analysis of the large amount of data available on the Semantic Web. For instance, the Web of Data aims at interconnecting distributed and unstructured data sources by using well-defined Semantic Web formats, such as RDF (Resource Description Framework) and OWL (Web Ontology Language), together with ways of linking this data. More recently, the Data.gov initiative has worked on making public sector data available, which has created a huge database of public data. Using automatic and powerful technologies for analysing and visualizing this data is an open issue, and OLAP seems to be a suitable technique for such a task.

However, a potential limitation of current OLAP libraries is that they are designed to manipulate multidimensional views on relational databases, while data on the Semantic Web is mostly represented in formats like RDF.

The objective of this project is to propose a way of manipulating RDF-based data using OLAP techniques.
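The gap between the two data models can be illustrated with a small sketch. It is only one naive reading of the problem, not the approach this project is expected to propose. The triples, the `ex:` names and the `triples_to_facts` function are hypothetical, and RDF is simplified to plain Python tuples rather than parsed with an RDF library. The sketch pivots (subject, predicate, object) triples into the tabular fact rows that a relational OLAP view expects.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); all ex: names are invented.
TRIPLES = [
    ("ex:sale1", "ex:product", "tea"),
    ("ex:sale1", "ex:department", "food"),
    ("ex:sale1", "ex:unitSales", 10),
    ("ex:sale2", "ex:product", "coffee"),
    ("ex:sale2", "ex:department", "food"),
    ("ex:sale2", "ex:unitSales", 7),
    ("ex:sale3", "ex:product", "tea"),
    ("ex:sale3", "ex:unitSales", 5),  # no department: RDF data may be incomplete
]


def triples_to_facts(triples, dimension_predicates, measure_predicate):
    """Pivot triples into one row per subject that carries the measure."""
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        # Simplification: a repeated predicate overwrites the earlier value,
        # although RDF allows multi-valued properties.
        by_subject[subject][predicate] = obj

    facts = []
    for subject in sorted(by_subject):
        properties = by_subject[subject]
        if measure_predicate not in properties:
            continue  # a subject without the measure is not a fact
        # Missing dimensions become None instead of being silently dropped.
        row = {p: properties.get(p) for p in dimension_predicates}
        row[measure_predicate] = properties[measure_predicate]
        facts.append(row)
    return facts


facts = triples_to_facts(TRIPLES, ["ex:product", "ex:department"], "ex:unitSales")
```

Even this toy version surfaces questions that a relational schema answers in advance: which predicates play the role of dimensions, what to do with missing or multi-valued properties, and how hierarchies could be derived from an ontology.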

Few works have combined Semantic Web technologies with OLAP-based data analysis, and most of the proposed approaches differ from the main aim of this project. Niinimäki and Niemi [3] propose an ontology that serves as a basis ("meta-model") for defining and creating the OLAP view and its corresponding database tables, and for populating the database. Based on this ontology, heterogeneous and distributed databases can be manipulated using OLAP. However, the data sources are relational databases. Priebe and Pernul [4] have proposed combining OLAP and ontologies to provide adaptive search in information retrieval. Users navigate in OLAP reports, and the query context is used to search for related documents. Ontologies provide the metadata describing the different sources of information to be searched. The closest proposal is that of Näppilä et al. [5]. These authors propose a tool for constructing multidimensional views from XML data sources in the context of data integration. The RDF format is not supported.

Expected results:

This project will include the following main tasks:

  1. study the state of the art on the Semantic Web and OLAP, as well as OLAP libraries;
  2. propose an approach for exploiting RDF data in OLAP libraries;
  3. implement the designed approach;
  4. evaluate the proposed approach in the context of an application for manipulating RDF data that describes the results of evaluating ontology matching systems.

Tasks (3) and (4) can be carried out in a possible extension of the project.

References

[1] S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Rec., 26(1):65-74, 1997.
[2] E. Codd, S. Codd, and C. Salley. Providing OLAP to user-analysts: An IT mandate. Technical Report, Hyperion, 1993.
[3] M. Niinimäki and T. Niemi. An ETL process for OLAP using RDF/OWL ontologies. Journal of Data Semantics, 13:97-119, 2009.
[4] T. Priebe and G. Pernul. Ontology-based integration of OLAP and information retrieval. In Proceedings of the 14th International Workshop on Database and Expert Systems Applications, 2003.
[5] T. Näppilä, K. Järvelin, and T. Niemi. A tool for data cube construction from structurally heterogeneous XML documents. J. Am. Soc. Inf. Sci. Technol., 59(3):435-449, 2008.


http://exmo.inria.fr/training/M2R-2010-olap-rdf.html

$Id: M2R-2010-olap-rdf.html,v 1.5 2017/01/13 19:59:25 euzenat Exp $