Implementing Vector Space Model Similarity with Euclidean Distance Using TF-IDF for Indonesian Text Classification
Abstract
Terms are weighted after stemming reduces each term to its base word form. This study applies an Indonesian-language text classification engine that uses the K-Nearest Neighbor algorithm and the Vector Space Model, with TF-IDF weighting of word frequencies and the Euclidean Distance function to compare the test documents against the sample collection. Using news documents as learning documents, a total of 10 (ten) documents in 3 (three) categories, the classifier achieves a Precision and Recall of 90.00% for k = 5 with word-frequency weighting and the Euclidean Distance function.
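The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy corpus, and the plain log(N / df) form of IDF are assumptions made for illustration, and stemming is omitted (a real system would stem each token before weighting).

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; stemming is omitted in this sketch.
    return text.lower().split()

def build_idf(tokenized_docs):
    # Assumed IDF variant: log(N / df), computed over the training documents.
    n = len(tokenized_docs)
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))
    return {term: math.log(n / count) for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse vector as a dict; terms unseen in training are ignored.
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf if term in idf}

def euclidean(u, v):
    # Euclidean distance between two sparse vectors.
    terms = set(u) | set(v)
    return math.sqrt(sum((u.get(t, 0.0) - v.get(t, 0.0)) ** 2 for t in terms))

def knn_classify(test_text, train_texts, train_labels, k):
    # Weight the training documents, then vote among the k nearest ones.
    tokenized = [tokenize(t) for t in train_texts]
    idf = build_idf(tokenized)
    train_vecs = [tfidf_vector(t, idf) for t in tokenized]
    query = tfidf_vector(tokenize(test_text), idf)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: euclidean(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus, not the news documents used in the study.
train_texts = [
    "harga saham naik",
    "saham turun tajam",
    "tim bola menang",
    "pemain bola cedera",
]
train_labels = ["ekonomi", "ekonomi", "olahraga", "olahraga"]

print(knn_classify("saham naik", train_texts, train_labels, k=3))
```

A smaller distance means a more similar document, so the k training documents with the smallest distances cast the votes. An odd k reduces ties when there are two categories; with three categories, as in the study, ties remain possible and this sketch resolves them by whichever tied label was encountered first.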
Copyright (c) 2022 Jutech: Jurnal Teknologi Informasi
This work is licensed under a Creative Commons Attribution 4.0 International License.