Booktitle:

Proposing Field Matching Similarity Methods

Implementation and Comparison of Field Similarity Metrics with Duplicate Entities Detection Purpose in Database

LAP LAMBERT Academic Publishing (2013-02-09)

ISBN-13:
978-3-659-34130-4

ISBN-10:
3659341304
EAN:
9783659341304
Book language:
English
Blurb/Shorttext:
Duplicate records do not share a common key but refer to a single entity. Databases containing such records often include errors that make matching duplicates a complex problem. These errors include typing errors, incomplete information such as abbreviations, disregard for standard formats, or a combination of these factors. This book uses a database in which typing errors are more common than other errors. The database contains real estate information with four fields: name, surname, property address and property area. The goals of this book are: to review existing algorithms for identifying duplicate data in fields, namely Edit-distance, Smith-Waterman, Jaro, Jaro-Winkler, LCS and N-gram; to describe the proposed algorithms, a token-based algorithm and an algorithm based on typing errors, which aim to improve the efficiency and precision of duplicate identification; and to compare the efficiency of these algorithms on a large Persian database.
Publishing house:
LAP LAMBERT Academic Publishing
Website:
https://www.lap-publishing.com/
By (author):
Solmaz Khatami
Number of pages:
92
Published on:
2013-02-09
Stock:
Available
Category:
Informatics
Price:
49.00 €
Keywords:
n-gram, Jaro, Edit-Distance, Damerau-Levenshtein, Jaro-Winkler, longest common string, Token-based-Jaro, Typological-error-based-Jaro
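
As a rough illustration of two of the field-similarity metrics named in the blurb, the following Python sketch computes a plain Levenshtein edit distance and a character-bigram (N-gram with N = 2) similarity between two field values. It is not taken from the book: the function names, the normalization by the longer string's length, and the use of the Dice coefficient over bigram sets are all assumptions made for this example, and the book's proposed token-based and typing-error-based algorithms are not reproduced here.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def bigrams(s):
    """Set of overlapping two-character substrings of s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def bigram_similarity(a, b):
    """Dice coefficient over bigram sets (an assumed choice of N-gram score)."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        # Strings shorter than two characters have no bigrams to compare.
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3. In a duplicate-detection setting, scores like these would typically be computed per field (name, surname, address) and compared against a threshold, which is left out of this sketch.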
