Mengiplementasikan Vector Space Model Similarity Euclidean Distance Menggunakan TFIDF Pada Klasifikasi Teks Bahasa Indonesia

  • Reza Fitriansyah Institut Teknologi dan Bisnis Ahmad Dahlan
  • Ellya Sestri Institut Teknologi dan Bisnis Ahmad Dahlan
  • Vany Terisia Institut Teknologi dan Bisnis Ahmad Dahlan
Keywords: K-Nearest Neighbor Algorithm, Vector Space Model, Euclidean Distance, Precision and Recall

Abstract

Term weighting is applied after stemming, which reduces each term to its base word form. This study applies an Indonesian-language text classifier that uses the K-Nearest Neighbor algorithm and the Vector Space Model, with TF-IDF weighting of word frequencies and the Euclidean Distance function to compare test documents with the sample collection. Using news documents as training documents, 10 documents in total across 3 categories, the classifier achieves a Precision and Recall of 90.00% for k = 5 with word-frequency weighting and the Euclidean Distance function.
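
The pipeline summarized in the abstract (TF-IDF weighting, Euclidean Distance, K-Nearest Neighbor voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact weighting formula, so the sketch assumes raw term frequency multiplied by ln(N/df), assumes the tokens are already stemmed, and uses a hypothetical toy corpus with made-up category labels. The function names `fit_idf`, `vectorize`, `euclidean`, and `knn_classify` are illustrative only.

```python
import math
from collections import Counter


def fit_idf(train_tokens):
    """Return the sorted vocabulary and ln(N/df) weights from the training documents."""
    n_docs = len(train_tokens)
    df = Counter(term for tokens in train_tokens for term in set(tokens))
    vocab = sorted(df)
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    return vocab, idf


def vectorize(tokens, vocab, idf):
    """Map a token list to a TF-IDF vector; terms outside the vocabulary are ignored."""
    tf = Counter(tokens)
    return [tf[term] * idf[term] for term in vocab]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_classify(train_vectors, train_labels, query_vector, k):
    """Return the majority label among the k training vectors nearest to the query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query_vector),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-stemmed toy corpus (assumption: not the paper's data).
train_tokens = [
    ["bola", "gol", "tim"],
    ["tim", "menang", "gol"],
    ["harga", "pasar", "saham"],
    ["saham", "naik", "pasar"],
    ["partai", "pilih", "suara"],
    ["suara", "partai", "menang"],
]
train_labels = ["olahraga", "olahraga", "ekonomi", "ekonomi", "politik", "politik"]

vocab, idf = fit_idf(train_tokens)
train_vectors = [vectorize(tokens, vocab, idf) for tokens in train_tokens]

query_vector = vectorize(["tim", "gol", "menang"], vocab, idf)
predicted = knn_classify(train_vectors, train_labels, query_vector, k=3)
print(predicted)
```

In this sketch the IDF weights come from the training documents only, and the test document is projected onto the same vocabulary, so every distance is computed in one shared vector space. A smaller Euclidean distance means a more similar document, which is why the neighbors are sorted in ascending order before voting.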

Published
2022-12-13
How to Cite
Fitriansyah, R., Sestri, E., & Terisia, V. (2022). Mengiplementasikan Vector Space Model Similarity Euclidean Distance Menggunakan TFIDF Pada Klasifikasi Teks Bahasa Indonesia. Jurnal Teknologi Informasi (JUTECH), 3(2), 158-163. https://doi.org/10.32546/jutech.v3i2.2034