Schema Matching

Schema matching is the process of identifying semantically related objects between two schemas, while mapping refers to the transformations between matched objects. Common challenges to automating matching and mapping include syntactic, structural, representational, and semantic heterogeneities between schemas. Approaches to schema matching can exploit schema information alone or combine schema and instance level information using techniques like linguistic analysis and constraint matching.


The terms schema matching and mapping are often used interchangeably for a database process. This
article differentiates the two as follows: schema matching is the process of identifying that two objects
are semantically related (the scope of this article), while mapping refers to the transformations between
the objects. For example, given the two schemas DB1.Student (Name, SSN, Level, Major, Marks) and
DB2.Grad-Student (Name, ID, Major, Grades), possible matches would be DB1.Student ≈ DB2.Grad-Student,
DB1.SSN = DB2.ID, etc., and a possible transformation or mapping would be DB1.Marks to DB2.Grades
(100-90 A; 90-80 B; etc.).
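
The Marks-to-Grades transformation can be sketched in a few lines of Python. This is only an illustration: the function name and the band table are hypothetical, only the A and B bands are given in the text, and treating the shared boundary value 90 as an A is an assumption.

```python
from typing import Optional

# (inclusive lower bound, grade), checked from the highest band down.
# Only the A and B bands appear in the text; a mark of exactly 90 is
# assumed to map to A.
GRADE_BANDS = [(90, "A"), (80, "B")]


def marks_to_grade(marks: float) -> Optional[str]:
    """Map a DB1.Marks value (0-100) to a DB2.Grades letter.

    Returns None for marks below the lowest band listed in the text,
    because the remaining bands are not specified there.
    """
    if not 0 <= marks <= 100:
        raise ValueError(f"marks out of range: {marks}")
    for lower_bound, grade in GRADE_BANDS:
        if marks >= lower_bound:
            return grade
    return None
```

A mapping of this kind is applied only after matching has established that Marks and Grades correspond; finding that correspondence is the matching problem itself.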

Automating these two tasks has been one of the fundamental problems of data integration. In general, the
correspondences between two schemas cannot be determined fully automatically, primarily because the
semantics of the two schemas differ and are often not made explicit or documented.

Impediments
Common challenges to automating matching and mapping have been classified in [1], especially for
relational DB schemas, and in [2], which gives a fairly comprehensive list of heterogeneities that is not
limited to the relational model and distinguishes schematic from semantic differences. Most of these
heterogeneities exist because schemas use different representations or definitions to represent the same
information (schema conflicts), or because different expressions, units, and precision result in conflicting
representations of the same data (data conflicts).[1] Research in schema matching seeks to provide
automated support for the process of finding semantic matches between two schemas. This process is made
harder by heterogeneities at the following levels:[3]

Syntactic heterogeneity – differences in the language used for representing the elements
Structural heterogeneity – differences in the types and structures of the elements
Model / Representational heterogeneity – differences in the underlying models (database,
ontologies) or their representations (key-value pairs, relational, document, XML, JSON,
triples, graph, RDF, OWL)
Semantic heterogeneity – where the same real world entity is represented using different
terms or vice versa

Schema matching
[4][5][6][7][8]

Methodology

Batini, Lenzerini, and Navathe discuss a generic methodology for the task of schema integration and the
activities involved.[5] According to the authors, integration can be viewed as the following activities:

Preintegration – An analysis of schemas is carried out before integration to decide upon
some integration policy. This governs the choice of schemas to be integrated, the order of
integration, and a possible assignment of preferences to entire schemas or portions of
schemas.
Comparison of the Schemas – Schemas are analyzed and compared to determine the
correspondences among concepts and detect possible conflicts. Interschema properties
may be discovered while comparing schemas.
Conforming the Schemas – Once conflicts are detected, an effort is made to resolve them
so that the merging of various schemas is possible.
Merging and Restructuring – Now the schemas are ready to be superimposed, giving rise
to some intermediate integrated schema(s). The intermediate results are analyzed and, if
necessary, restructured in order to achieve several desirable qualities.

Approaches

Approaches to schema integration can be broadly classified as ones that exploit either just schema
information or schema and instance level information.[4][5]

Schema-level matchers only consider schema information, not instance data. The available information
includes the usual properties of schema elements, such as name, description, data type, relationship types
(part-of, is-a, etc.), constraints, and schema structure. Working at the element (atomic elements like
attributes of objects) or structure level (matching combinations of elements that appear together in a
structure), these properties are used to identify matching elements in two schemas. Language-based or
linguistic matchers use names and text (i.e., words or sentences) to find semantically similar schema
elements. Constraint-based matchers exploit constraints often contained in schemas. Such constraints are
used to define data types and value ranges, uniqueness, optionality, relationship types and cardinalities, etc.
Constraints in two input schemas are matched to determine the similarity of the schema elements.
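
As a rough illustration, a linguistic matcher and a constraint-based matcher can be combined at the element level as follows. This is a minimal sketch rather than any published algorithm: the attribute names come from the DB1.Student and DB2.Grad-Student example, while the data types, the function names, and the 0.8 threshold are assumptions. String similarity is taken from `difflib.SequenceMatcher` in the Python standard library.

```python
from difflib import SequenceMatcher

# Attribute name -> declared data type. The names come from the example
# schemas in this article; the data types are assumptions.
DB1_STUDENT = {"Name": "str", "SSN": "str", "Level": "str",
               "Major": "str", "Marks": "int"}
DB2_GRAD_STUDENT = {"Name": "str", "ID": "str", "Major": "str",
                    "Grades": "str"}


def name_similarity(a: str, b: str) -> float:
    """Linguistic matcher: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def types_compatible(t1: str, t2: str) -> bool:
    """Constraint-based matcher: here, simply require equal data types."""
    return t1 == t2


def match_candidates(s1: dict, s2: dict, threshold: float = 0.8) -> list:
    """Return (attr1, attr2, score) triples that pass both matchers."""
    candidates = []
    for a, type_a in s1.items():
        for b, type_b in s2.items():
            score = name_similarity(a, b)
            if score >= threshold and types_compatible(type_a, type_b):
                candidates.append((a, b, score))
    return candidates


for attr1, attr2, score in match_candidates(DB1_STUDENT, DB2_GRAD_STUDENT):
    print(f"DB1.{attr1} ~ DB2.{attr2} (score {score:.2f})")
```

On these inputs the sketch proposes only the Name and Major pairs. It misses SSN = ID and Marks to Grades, whose names share little text, which is one reason purely name-based matching is usually supplemented with other information.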

Instance-level matchers use instance-level data to gather important insight into the contents and meaning
of the schema elements. These are typically used in addition to schema-level matchers in order to boost the
confidence in match results, especially when the information available at the schema level is insufficient.
Matchers at this level use linguistic and constraint-based characterization of instances. For example, using
linguistic techniques, it might be possible to look at the Dept, DeptName and EmpName instances to
conclude that DeptName is a better match candidate for Dept than EmpName. Constraints, such as the rule
that zip codes must be 5 digits long or the format of phone numbers, may allow matching of such types of
instance data.[9]
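
A constraint-based characterization of instances might be sketched as below. The regular expressions are assumed formats chosen for illustration (5-digit zip codes and phone numbers written as NNN-NNN-NNNN), and the function names and the 0.9 cut-off are hypothetical.

```python
import re

# Assumed value formats, for illustration only.
PATTERNS = {
    "zipcode": re.compile(r"\d{5}"),
    "phone": re.compile(r"\d{3}-\d{3}-\d{4}"),
}


def profile_column(values) -> dict:
    """Return, per pattern, the fraction of values that fully match it."""
    values = list(values)
    if not values:
        return {name: 0.0 for name in PATTERNS}
    return {
        name: sum(1 for v in values if pattern.fullmatch(v)) / len(values)
        for name, pattern in PATTERNS.items()
    }


def shared_value_types(col_a, col_b, min_fraction: float = 0.9) -> list:
    """Return the pattern names that both columns satisfy often enough."""
    profile_a = profile_column(col_a)
    profile_b = profile_column(col_b)
    return [
        name for name in PATTERNS
        if profile_a[name] >= min_fraction and profile_b[name] >= min_fraction
    ]
```

Two columns whose values both look like zip codes become match candidates even if their names are unrelated, while a zip code column and a phone number column share no value type and are not proposed.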

Hybrid matchers directly combine several matching approaches to determine match candidates based on
multiple criteria or information sources. Most of these techniques also employ additional information such
as dictionaries, thesauri, and user-provided match or mismatch information.[10]
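
One simple way to combine criteria is a weighted sum, sketched here under stated assumptions: the tiny thesaurus, the user-provided mismatch list, the equal weights, and the function name are all hypothetical, and real hybrid matchers combine their evidence in more elaborate ways.

```python
from difflib import SequenceMatcher

# Hypothetical auxiliary information (lower-cased attribute names).
SYNONYMS = {frozenset({"ssn", "id"}), frozenset({"marks", "grades"})}
USER_MISMATCHES = {frozenset({"level", "major"})}


def hybrid_score(a: str, b: str,
                 w_name: float = 0.5, w_thesaurus: float = 0.5) -> float:
    """Combine name similarity with thesaurus evidence and user feedback."""
    pair = frozenset({a.lower(), b.lower()})
    if pair in USER_MISMATCHES:
        return 0.0  # an explicit user-provided mismatch overrides everything
    name_score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    # Identical names collapse to a one-element set and count as synonyms.
    thesaurus_score = 1.0 if len(pair) == 1 or pair in SYNONYMS else 0.0
    return w_name * name_score + w_thesaurus * thesaurus_score
```

With these inputs, SSN and ID score 0.5 from the thesaurus alone even though their names share no characters, whereas Level and Major are forced to 0.0 by the user-provided mismatch.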

Reusing matching information: Another initiative has been to reuse previous matching information as
auxiliary information for future matching tasks. The motivation for this work is that structures or
substructures often repeat, for example in schemas in the e-commerce domain. Such reuse of previous
matches, however, needs to be applied with care. Reuse may make sense only for some part of a new
schema or only in some domains. For example, Salary and Income may be considered
identical in a payroll application but not in a tax reporting application. There are several open
challenges in such reuse that deserve further work.

Sample prototypes: Typically, implementations of such matching techniques can be classified as
either rule-based or learner-based systems. The complementary nature of these approaches has
led to a number of applications that combine techniques depending on the nature of the
domain or application under consideration.[4][5]

Identified relationships
The relationship types between objects that are identified at the end of a matching process are typically
those with set semantics, such as overlap, disjointness, exclusion, equivalence, or subsumption. The logical
encodings of these relationships define what they mean. Among others, an early attempt to use description
logics for schema integration and for identifying such relationships was presented in [11]. Several
state-of-the-art matching tools today[4][7] and those benchmarked in the Ontology Alignment Evaluation
Initiative[12] are capable of identifying many such simple (1:1 / 1:n / n:1 element-level) and complex
(n:1 / n:m element- or structure-level) matches between objects.
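
The set semantics of these relationships can be illustrated by comparing the sets of instances that two schema objects denote. This is only a sketch: the function name is hypothetical, exclusion is not modelled separately from disjointness, and actual matchers usually infer such relationships from schema-level evidence rather than from complete instance sets.

```python
def classify_relationship(a: set, b: set) -> str:
    """Name the set relationship between two extensions (instance sets)."""
    if a == b:
        return "equivalence"
    if a < b or b < a:  # one set is a proper subset of the other
        return "subsumption"
    if a & b:           # some, but not all, instances are shared
        return "overlap"
    return "disjointness"


# Hypothetical identifiers standing in for student records.
students = {"s1", "s2", "s3", "s4"}
grad_students = {"s3", "s4"}
staff = {"s4", "e1"}
courses = {"c1", "c2"}

print(classify_relationship(grad_students, students))  # subsumption
print(classify_relationship(students, staff))          # overlap
print(classify_relationship(students, courses))        # disjointness
```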

Evaluation of quality

The quality of schema matching is commonly measured by precision and recall. Precision is the fraction
of the pairs that were matched that are correct, while recall is the fraction of the actual (true) pairs
that were matched.
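
Both measures follow directly from comparing the set of pairs a matcher returned with a reference set of correct pairs. The sketch below assumes such a reference set is available; the function name and the sample pairs are hypothetical.

```python
def precision_recall(found, reference):
    """Return (precision, recall) for matched pairs against a reference set."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Pairs proposed by some matcher (hypothetical) ...
found = {("SSN", "ID"), ("Name", "Name"), ("Level", "Major")}
# ... and the pairs a human expert considers correct (hypothetical).
reference = {("SSN", "ID"), ("Name", "Name"),
             ("Major", "Major"), ("Marks", "Grades")}

precision, recall = precision_recall(found, reference)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here two of the three proposed pairs are correct (precision 2/3) and two of the four true pairs were found (recall 1/2).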

See also
Data integration
Dataspaces
Federated database system
Minimal mappings
Ontology alignment
Schema crosswalk

References
1. Kim, W. & Seo, J. (Dec 1991). "Classifying Schematic and Data Heterogeneity in
Multidatabase Systems". Computer 24, 12.
2. Sheth, A. P. & Kashyap, V. (1993). "So Far (Schematically) yet So Near (Semantically)". In
Proceedings of the IFIP WG 2.6 Database Semantics Conference on interoperable
Database Systems.
3. Sheth, A. P. (1999). "Changing Focus on Interoperability in Information Systems: From
System, Syntax, Structure to Semantics". In Interoperating Geographic Information Systems.
M. F. Goodchild, M. J. Egenhofer, R. Fegeas, and C. A. Kottman (eds.), Kluwer, Academic
Publishers.
4. Rahm, E. & Bernstein, P (2001). "A survey of approaches to automatic schema matching".
The VLDB Journal 10, 4.
5. Batini, C., Lenzerini, M., and Navathe, S. B. (1986). "A comparative analysis of
methodologies for database schema integration". ACM Comput. Surv. 18, 4.
6. Doan, A. & Halevy, A. (2005). "Semantic-integration research in the database community". AI
Mag. 26, 1.
7. Kalfoglou, Y. & Schorlemmer, M. (2003). "Ontology mapping: the state of the art". Knowl.
Eng. Rev. 18, 1.
8. Choi, N., Song, I., and Han, H. (2006). "A survey on ontology mapping". SIGMOD Rec. 35, 3.
9. Pereira Nunes, Bernardo; Mera, Alexander; Casanova, Marco Antonio; P. Paes Leme, Luis
Andre; Dietze, Stefan (2013). "Complex Matching of RDF Datatype Properties"
(http://www.repo.uni-hannover.de/handle/123456789/1358). Database and Expert Systems Applications
- 24th International Conference. Lecture Notes in Computer Science. 8055: 195–208.
doi:10.1007/978-3-642-40285-2_18 (https://doi.org/10.1007%2F978-3-642-40285-2_18).
ISBN 978-3-642-40284-5.
10. Hamdaqa, Mohammad; Tahvildari, Ladan (2014). "Prison Break: A Generic Schema
Matching Solution to the Cloud Vendor Lock-in Problem". IEEE 8th International Symposium
on the Maintenance and Evolution of Service-Oriented and Cloud-Based Systems: 37–46.
doi:10.1109/MESOCA.2014.13 (https://doi.org/10.1109%2FMESOCA.2014.13).
ISBN 978-1-4799-6152-8. S2CID 14499875 (https://api.semanticscholar.org/CorpusID:14499875).
11. Ashoka Savasere; Amit P. Sheth; Sunit K. Gala; Shamkant B. Navathe; H. Markus (1993).
"On Applying Classification to Schema Integration". RIDE-IMS.
12. Ontology Alignment Evaluation Initiative::2006 (http://oaei.ontologymatching.org/2006/)

External links
Early work in schema matching (http://knoesis.wright.edu/library/download/S04-Dagstuhl-Early-Work.pdf)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Schema_matching&oldid=1000323062"
