- Afyon Kocatepe Üniversitesi Fen Ve Mühendislik Bilimleri Dergisi
- Volume:23 Issue:5
Determination of the Classification Success of KNN Algorithm Distance Metric Methods on Wheat Seeds Dataset
Authors : Ahmet Çelik
Pages : 1142-1149
Doi : 10.35414/akufemubid.1263900
Publication Date : 2023-10-30
Article Type : Research Paper
Abstract : Machine learning algorithms are widely used for product sorting in the food industry. Classification relies on product attributes, which differ from one product to another. In this study, the k-nearest neighbor (KNN) algorithm was used to classify wheat seeds of the Kama, Rosa and Canadian varieties. The Seeds dataset from the UCI (University of California, Irvine) Machine Learning Repository was used; it contains 70 samples of each wheat class. The wheat samples were randomly selected, and a soft X-ray technique was used to obtain high-quality images of the internal kernel structure under experimental conditions. The effect of the distance metric and of the amount of training data on classification success was also measured. The KNN algorithm was tested with training ratios ranging from 50% to 90% of the dataset. Neighborhood values of k = 1, 3 and 5 were selected to examine their effect on classification success, and the Euclidean, Chebyshev, Manhattan and Mahalanobis distance metrics were tested for each value of k. With the Mahalanobis metric and k = 3, a classification success of 0.9924 was obtained according to the AUC (Area Under the Curve) metric. In the literature, there is no study that jointly compares the KNN algorithm, neighborhood values and distance metrics on food datasets using varying training and test splits. Therefore, it is thought that the study will make an important contribution to the literature.
Keywords : Machine learning, Classification, Seeds dataset, KNN algorithm, Distance metric methods, Random sampling
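The comparison described in the abstract (KNN evaluated for k = 1, 3 and 5 under the Euclidean, Chebyshev, Manhattan and Mahalanobis distances) can be illustrated with a minimal sketch. This is not the author's code. The toy 2-D points in `TRAIN` and `TEST` are made up for illustration and are not taken from the Seeds dataset, all function names are hypothetical, the Mahalanobis helper is restricted to two features so the covariance inverse can be written by hand, and plain accuracy is reported instead of the AUC metric used in the study.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def make_mahalanobis(points):
    """Return a Mahalanobis distance for 2-D points (assumption: two features only).

    The sample covariance is estimated from `points` and inverted in closed form.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # must be > 0 for the inverse to exist

    def distance(a, b):
        dx, dy = a[0] - b[0], a[1] - b[1]
        # d^T S^-1 d with S^-1 = (1/det) * [[syy, -sxy], [-sxy, sxx]]
        return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

    return distance

def knn_predict(train, query, k, distance):
    """Majority vote among the k training points closest to `query`."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k, distance):
    correct = sum(knn_predict(train, x, k, distance) == y for x, y in test)
    return correct / len(test)

def compare(train, test, ks=(1, 3, 5)):
    """Accuracy for every (metric name, k) pair."""
    metrics = {
        "euclidean": euclidean,
        "chebyshev": chebyshev,
        "manhattan": manhattan,
        "mahalanobis": make_mahalanobis([x for x, _ in train]),
    }
    return {(name, k): accuracy(train, test, k, fn)
            for name, fn in metrics.items() for k in ks}

# Made-up, well-separated toy points; NOT values from the Seeds dataset.
TRAIN = [
    ((1.0, 1.0), "Kama"), ((1.2, 0.8), "Kama"), ((0.8, 1.3), "Kama"),
    ((5.0, 5.0), "Rosa"), ((5.2, 4.7), "Rosa"), ((4.8, 5.4), "Rosa"),
    ((1.0, 5.0), "Canadian"), ((1.3, 5.2), "Canadian"), ((0.7, 4.6), "Canadian"),
]
TEST = [((1.1, 1.1), "Kama"), ((5.1, 5.1), "Rosa"), ((0.9, 4.9), "Canadian")]

if __name__ == "__main__":
    for (name, k), acc in compare(TRAIN, TEST).items():
        print(f"{name:12s} k={k}  accuracy={acc:.3f}")
```

To approximate the study's protocol, the same `compare` call would be repeated over random splits with training ratios from 50% to 90%, and accuracy would be replaced by a multi-class AUC computation.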