Methods for Unification of Non-traditional Data Models

Author(s): Stupnikov S. A., Skvortsov N. A., Budzko V. I., Zakharov V. N., Kalinichenko L. A.
Published: Systems of High Availability. -- Moscow: Radiotechnika, 2014. -- Iss. 1. -- P. 18-39. (In Russian)
At the current stage of IT development, creating facilities for manipulating and analyzing Web, social media, machine, and sensor data is regarded as being of paramount importance. Data of this scale (frequently measured in petabytes) fall into the category of Big Data. To represent and manipulate such data collections, new data models have been created that differ from the traditional (relational) data models. One of the unsolved problems of Big Data manipulation is the integration of various non-traditional data models. Solving this problem first requires a unified representation of the various kinds of non-traditional data models in a canonical information model (a generalized language unifying the languages of the various data models). Such a representation requires constructing a mapping of each data model into the canonical one that preserves the semantics of its data description and data manipulation languages. This mapping is needed both for materialized integration (creation of a data warehouse) and for virtual integration (by means of subject mediators) of the respective data collections. The paper considers the principles of mapping four kinds of non-traditional data models into the canonical model, for which the SYNTHESIS language, representing an object-frame data model, is used: data models based on multidimensional arrays; graph-based data models; NoSQL data models; and the triple-based RDF data model. The method of data model mapping verification, used to prove that information and operations are preserved under the mapping, is illustrated by examples. The objective of this research is to define well-founded unifying mappings of the non-traditional data models, demonstrating that such different data models can be represented uniformly for the materialized or virtual integration of the respective data collections.
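As an informal illustration of what an information-preserving mapping means, the sketch below maps a small set of RDF-style triples into frame-like records and back, then checks that nothing is lost in the round trip. It is only a toy under stated assumptions: the functions `triples_to_frames` and `frames_to_triples`, the `ex:` identifiers, and the dictionary-based frame layout are all hypothetical. It does not use the SYNTHESIS language and is not the verification method described in the paper.

```python
from collections import defaultdict


def triples_to_frames(triples):
    """Group (subject, predicate, object) triples into frames keyed by subject.

    Each frame is a dict mapping a slot name (the predicate) to a set of values.
    This layout is an assumption made for illustration only.
    """
    frames = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        frames[subject][predicate].add(obj)
    # Convert nested defaultdicts to plain dicts so lookups cannot create keys.
    return {s: {p: set(vals) for p, vals in slots.items()}
            for s, slots in frames.items()}


def frames_to_triples(frames):
    """Inverse mapping: flatten frames back into a set of triples."""
    return {(s, p, o)
            for s, slots in frames.items()
            for p, values in slots.items()
            for o in values}


# Hypothetical sample data.
triples = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
}

frames = triples_to_frames(triples)

# The mapping is invertible on this input if the round trip returns the
# original triple set unchanged.
round_trip_ok = frames_to_triples(frames) == triples
```

A round-trip check on sample data is far weaker than a proof: it shows invertibility for one input only, whereas the verification discussed in the abstract concerns preservation of information and operations for the data model as a whole.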
