ICICS 2017

International Conference on Intelligent Computing and Systems 2017

Publication Meta: Value
Short Title: ICICS 2017
Publisher: ASDF, India
ISBN 13: 978-81-933235-5-7
ISBN 10: 81-933235-5-6
Language: English
Type: Hard Bound - Printed Book
Copyrights: ICICS Organizers / DCRC, London, UK
Editor-in-Chief: Dr M Sivaraja
Conference Dates: 13-14 February 2017
Venue Country: Karur, India
Submitted Papers: 456
Acceptance Rate: 8.08%
Website: www.icics.asia

Paper 045

Detecting Fuzzy Duplicates in XML Data Using Bayesian Network

J Boopala¹, S Suganya², P Gomathi³

¹,²Assistant Professor/CSE, ³Professor & Dean, NSN College of Engineering and Technology, India

Abstract

Electronic data play an important role in business applications and decision-making processes. Data quality depends on many factors, such as duplicates, errors, and missing values. This work focuses on finding fuzzy duplicates in complex hierarchical structures such as XML data, where duplicates are classified as exact duplicates, partial duplicates, and sets of duplicates. A novel method for XML duplicate detection, called XMLDUP, uses a Bayesian network to determine the probability that two XML elements are duplicates, considering two things: the information within the elements and the way that information is structured. The hierarchical data are first classified into parent nodes, child nodes, and their values. New conditional and prior probabilities are then applied, which make duplicates in XML data easier to identify. A node ordering technique, which orders the contents of the data according to the features of the data, is used to improve the efficiency of duplicate detection. An automatic pruning factor is then derived to improve the effectiveness of duplicate detection: the pruning factor is a threshold, and data that reach it are assumed to be duplicates. To improve efficiency further, a network pruning strategy is used, which is capable of significant gains over the unoptimized version. The experiments show that the approach can achieve high precision and recall scores on several data sets.
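The idea of scoring two XML elements from both their values and their structure, and of abandoning a comparison early once a threshold can no longer be reached, can be illustrated with a small sketch. This is not the XMLDUP implementation: it replaces the Bayesian network's prior and conditional probabilities with plain string similarity and averaging, and the names `text_similarity`, `duplicate_score`, and `threshold` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Similarity in [0, 1] between two strings.

    Assumption: a simple stand-in for the prior probability that two
    leaf values are duplicates.
    """
    a = (a or "").strip().lower()
    b = (b or "").strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def duplicate_score(e1, e2, threshold=0.0):
    """Score in [0, 1] that e1 and e2 describe the same object.

    Leaves are compared by their text. Inner nodes average the best
    pairwise score of their children, grouped by tag, so both values
    and structure contribute. Returns 0.0 as soon as even perfect
    scores on the remaining tags could not reach `threshold` (pruning).
    """
    kids1, kids2 = list(e1), list(e2)
    if not kids1 and not kids2:
        return text_similarity(e1.text, e2.text)

    tags = sorted({c.tag for c in kids1} | {c.tag for c in kids2})
    total = 0.0
    for i, tag in enumerate(tags):
        group1 = [c for c in kids1 if c.tag == tag]
        group2 = [c for c in kids2 if c.tag == tag]
        # Best match among same-tag children; 0.0 if the tag is missing on one side.
        best = max(
            (duplicate_score(a, b) for a in group1 for b in group2),
            default=0.0,
        )
        total += best
        remaining = len(tags) - i - 1
        # Optimistic bound: assume every remaining tag scores 1.0.
        if (total + remaining) / len(tags) < threshold:
            return 0.0  # pruned: the pair cannot reach the threshold
    return total / len(tags)


if __name__ == "__main__":
    a = ET.fromstring("<movie><title>The Matrix</title><year>1999</year></movie>")
    b = ET.fromstring("<movie><title>Matrix, The</title><year>1999</year></movie>")
    c = ET.fromstring("<movie><title>Casablanca</title><year>1942</year></movie>")
    print(duplicate_score(a, b), duplicate_score(a, c))
    print(duplicate_score(a, c, threshold=0.9))  # pruned pair
```

In this sketch the tags are simply processed in alphabetical order. A node ordering step in the spirit of the abstract would instead visit the most discriminating tags first, so that non-duplicate pairs are pruned after fewer comparisons.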

Keywords

e-AID

ICICS.2017.045

Cite this Article as Follows

J Boopala, S Suganya, P Gomathi. "Detecting Fuzzy Duplicates in XML Data Using Bayesian Network." International Conference on Intelligent Computing and Systems (2017): 23. Print.
