Razširitev metode za iskanje ujemanj med podatkovnimi shemami s podporo za dodatne podatkovne tipe in upoštevanjem strukture podatkovne sheme

master's thesis

Benjamin Kastelic (Author), Matjaž B. Jurič (Mentor)

Abstract

Schema matching aims to discover as many semantically equivalent elements as possible between two schemas and to determine the correspondences between them. This process is one of the main activities in data integration. Most existing schema matching methods still struggle with complex mappings. For this reason, we decided to improve one existing schema matching method that primarily addresses this problem. The chosen method is based on an evolutionary algorithm that progressively generates better individuals (mappings) solely from data instances. Because data instances may be unavailable in some scenarios, we developed an improved method that also takes schema information into account. We further extended the method with support for additional data types. We evaluated the improved method on the same test data as the original method and found that it is on average 20% more accurate than the original at finding both simple and complex mappings.

Keywords

data integration;schema matching and mapping;evolutionary algorithms;computer science;computer and information science;master's theses;

Data

Language:	Slovenian
Year of publishing:	2015
Typology:	2.09 - Master's Thesis
Organization:	UL FRI - Faculty of Computer and Information Science
Publisher:	[B. Kastelic]
UDC:	004.6.021(043.2)
COBISS:	1536299203
Views:	874
Downloads:	175

Other data

Secondary language:	English
Secondary title:	Extending the method for schema matching by adding support for additional data types and taking the structure of the schema into account
Secondary abstract:	Schema matching aims to identify semantically similar elements of two schemas and to determine the mappings between them. This process is one of the main activities in data integration. Most existing methods for finding mappings between schemas still have difficulties with complex mappings. We therefore decided to improve one of the existing methods, one that deals primarily with finding complex mappings. The chosen method is based on an evolutionary algorithm that generates progressively better individuals (mappings) solely on the basis of data instances. Because data instances may not be available in some scenarios, we developed an improved method that also takes the schema data into consideration. We further extended the chosen method by adding support for additional data types. Our improved method was evaluated on the same test data as the original method. We found that our method is on average 20% more accurate than the original one for both simple and complex mappings.
Secondary keywords:	data integration;schema mapping and matching;evolutionary algorithms;computer science;computer and information science;master's degree;
File type:	application/pdf
Type (COBISS):	Master's thesis/paper
Study programme:	1000471
Embargo end date (OpenAIRE):	1970-01-01
Thesis comment:	Univ. v Ljubljani, Fak. za računalništvo in informatiko
Pages:	74 pp.
ID:	8751859