Keywords
automatic noise reduction, artificial neural networks
Abstract
During the data collection and labeling process, noise can be introduced into a data set. As a result, the quality of the data set degrades, and experiments and inferences derived from it become less reliable. In this paper we present an algorithm, called ANR (automatic noise reduction), that serves as a filtering mechanism to identify and remove noisy data items whose classes have been mislabeled. ANR is based on a framework of multi-layer artificial neural networks. It assigns each data item a soft class label in the form of a class probability vector, which is initialized to the original class label and can be modified during training. When the noise level is reasonably small (< 30%), the non-noisy data is dominant in determining the network architecture and its output, so mislabeled data can be corrected by aligning the class probability vector with the network output. With a learning procedure that updates the class probability vector based on its difference from the network output, the probability of a mislabeled class gradually becomes smaller while that of the correct class becomes larger, which eventually corrects the mislabeled data after sufficient training. After training, the data items whose classes have been relabeled are treated as noisy data and removed from the data set. We evaluate the performance of ANR on 12 data sets drawn from the UCI data repository. The results show that ANR is capable of identifying a significant portion of noisy data. At a noise level of 25%, using ANR as a training data filter for a nearest neighbor classifier yields an average increase in accuracy of 24.5% compared to the same classifier without ANR.
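The soft-label idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the update rule `p + rate * (net_out - p)`, the `rate` value, and the fixed `net_out` array (which stands in for the output of a trained network) are all assumptions made for illustration. The abstract only states that the class probability vector is adjusted based on its difference from the network output and that relabeled items are removed.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Initialize class probability vectors from the original hard labels."""
    p = np.zeros((len(labels), num_classes))
    p[np.arange(len(labels)), labels] = 1.0
    return p

def update_soft_labels(p, net_out, rate=0.1):
    """Move each class probability vector a small step toward the network output.

    Assumed update rule (hypothetical): a convex combination of the current
    vector and the network output, so each row keeps summing to 1.
    """
    return p + rate * (net_out - p)

def flag_relabeled(p, labels):
    """Mark items whose most probable class no longer matches the original label."""
    return np.argmax(p, axis=1) != np.asarray(labels)

# Toy data: three items, two classes. The third item is labeled 0,
# but the (assumed, fixed) network output strongly favors class 1.
labels = np.array([0, 1, 0])
net_out = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.1, 0.9]])

p = one_hot(labels, num_classes=2)
for _ in range(20):  # in real training, net_out would change every epoch
    p = update_soft_labels(p, net_out, rate=0.1)

noisy = flag_relabeled(p, labels)   # items treated as mislabeled
clean_labels = labels[~noisy]       # items kept after filtering
```

In this toy run only the third item is flagged, because its probability mass drifts from the original class to the class favored by the network. In a real setting the network would be trained jointly with the soft labels, so its output would evolve during training rather than stay fixed.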
Original Publication Citation
Zeng, X., and Martinez, T. R., "A Noise Filtering Method Using Neural Networks", Proceedings of the IEEE International Workshop on Soft-Computing Techniques in Instrumentation, Measurement, and Related Applications, pp. 26-31, 2003.
BYU ScholarsArchive Citation
Martinez, Tony R. and Zeng, Xinchuan, "A Noise Filtering Method Using Neural Networks" (2003). Faculty Publications. 1053.
https://scholarsarchive.byu.edu/facpub/1053
Document Type
Peer-Reviewed Article
Publication Date
2003-05-17
Permanent URL
http://hdl.lib.byu.edu/1877/2410
Publisher
IEEE
Language
English
College
Physical and Mathematical Sciences
Department
Computer Science
Copyright Status
© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Copyright Use Information
http://lib.byu.edu/about/copyright/