Effectiveness Bounds for Non-Exhaustive Schema Matching Systems

Authors: 
Smiljanic, M.; van Keulen, M.; Jonker, W.
Year: 
2006
Venue: 
ICDEW 2006
URL: 
http://csdl.computer.org/dl/proceedings/icdew/2006/2571/00/25710083.pdf
Citations: 
2
Citations range: 
1 - 9
Attachment: 
Smiljanic2006EffectivenessBoundsforNonExh.pdf (367.69 KB)

Semantic validation of the effectiveness of a schema
matching system is traditionally performed by comparing
system-generated mappings with those of human evaluators.
The human effort required for validation grows rapidly
in large-scale environments. The performance
of a matching system, however, is determined not only by
the quality of the mappings, but also by the efficiency with
which it can produce them. Improving efficiency typically
introduces a trade-off between efficiency and effectiveness.
Establishing or obtaining a large test collection for measuring
this trade-off is often a severe obstacle. In this paper,
we present a technique for determining lower and upper
bounds for effectiveness measures for a certain class
of schema matching system improvements in order to lower
the required validation effort. Effectiveness bounds for a
matching system improvement are solely derived from a
comparison of answer sets of the improved and original
matching system. The technique was developed in the context
of improving efficiency in XML schema matching, but
we believe it to be more generally applicable to other retrieval
systems facing scalability problems.
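
The abstract does not spell out how the bounds are computed. As a rough illustration only, the sketch below assumes the simplest non-exhaustive case: the improved system returns a subset of the original system's answers, and the number of correct answers of the original system is already known from an earlier validation. The function name, its parameters, and the example numbers are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def effectiveness_bounds(n_orig, n_correct_orig, n_improved, n_relevant):
    """Bound precision and recall of an improved system whose answer set
    is assumed to be a subset of the original system's answer set.

    n_orig          size of the original answer set
    n_correct_orig  number of correct answers in the original answer set
    n_improved      size of the improved (smaller) answer set
    n_relevant      total number of correct mappings (human-validated)
    """
    if not 0 <= n_correct_orig <= n_orig:
        raise ValueError("correct answers must fit in the original answer set")
    if not 0 < n_improved <= n_orig:
        raise ValueError("improved answer set must be a non-empty subset")
    if n_relevant <= 0 or n_relevant < n_correct_orig:
        raise ValueError("n_relevant must cover all correct original answers")

    dropped = n_orig - n_improved
    # Worst case: every dropped answer was a correct one.
    worst = max(0, n_correct_orig - dropped)
    # Best case: every dropped answer was an incorrect one.
    best = min(n_improved, n_correct_orig)

    return {
        "precision": (Fraction(worst, n_improved), Fraction(best, n_improved)),
        "recall": (Fraction(worst, n_relevant), Fraction(best, n_relevant)),
    }


# Hypothetical numbers: 100 original answers, 40 of them correct,
# 70 answers kept by the improved system, 50 correct mappings overall.
bounds = effectiveness_bounds(100, 40, 70, 50)
print(bounds["precision"])  # lower and upper bound on precision
print(bounds["recall"])     # lower and upper bound on recall
```

Under these assumptions the bounds need only the sizes of the two answer sets and the earlier validation of the original system; no new human judgments of the improved system's answers are required.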