Central Asian Problems of Modern Science and Education
Abstract
This article considers cosine similarity and its application to finding similarities in Uzbek-language texts. The cosine similarity algorithm is used to determine how similar two Uzbek texts are. We apply the program proposed by the authors to texts from the dataset of the educational portal ziyonet.uz.
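To make the measure concrete, here is a minimal Python sketch of cosine similarity between two texts. It is not the authors' program: the tokenizer, the use of raw term counts as vector weights, and the names `tokenize` and `cosine_similarity` are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, then take runs of letters, allowing
    # apostrophes inside or at the end of a word, since Uzbek Latin
    # orthography writes letters such as o' and g' with an apostrophe.
    return re.findall(r"[^\W\d_]+(?:['ʻ’][^\W\d_]*)*", text.lower())


def cosine_similarity(text_a, text_b):
    # Represent each text as a bag of words with raw term counts
    # (an assumption; other weightings such as TF-IDF are possible).
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))

    # Dot product over the terms the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # An empty text has no direction; treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("kitob kitob daftar", "kitob qalam"))
```

With raw counts the score ranges from 0 (no shared words) to 1 (identical word distributions), so a threshold on it can flag candidate similar documents.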
First Page
95
Last Page
104
DOI
https://doi.org/10.51348/campse0015
References
[1] Rahutomo, Faisal & Kitasuka, Teruaki & Aritsugi, Masayoshi. (2012). Semantic Cosine Similarity.
[2] Kaya Keleş, Mümine & Özel, Selma. (2017). Similarity detection between Turkish text documents with distance metrics. 316-321. 10.1109/UBMK.2017.8093399.
[3] Cakir, Ulas & Guldamlasioglu, Seren. (2016). Text Mining Analysis in Turkish Language Using Big Data Tools. 614-618. 10.1109/COMPSAC.2016.203.
[4] H. İ. Çelenli, S. T. Öztürk, G. Şahin, A. Gerek and M. C. Ganiz, "Document Embedding Based Supervised Methods for Turkish Text Classification," 2018 3rd International Conference on Computer Science and Engineering (UBMK), Sarajevo, 2018, pp. 477-482, doi: 10.1109/UBMK.2018.8566326.
[5] Savaş Yıldırım and Tuğba Yıldız “Learning Turkish Hypernymy Using Word Embeddings”, International Journal of Computational Intelligence Systems, 11/1, pp. 371-383, doi: https://doi.org/10.2991/ijcis.11.1.28
[6] ICEMIS'20: Proceedings of the 6th International Conference on Engineering & MIS 2020, September 2020, Article No.: 96, Pages 1–5. https://doi.org/10.1145/3410352.3410832
[7] Robertson, Stephen. (2004). Understanding Inverse Document Frequency: On Theoretical Arguments for IDF. Journal of Documentation. 60. 503-520. 10.1108/00220410410560582.
[8] Habibulla Madatov, San'atbek Matlatipov. "Plagiat va uni fosh qilish dasturlari haqida". UrDU ILM-SARCHASHMALARI.2014-yil. 78-80 betlar.
[9] Allan, J., Wade, C., & Bolivar, A. (2003). Retrieval and novelty detection at the sentence level. Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 314-321). ACM.
Recommended Citation
Matlatipov, Sanatbek Gayratovich (2020) "COSINE SIMILARITY AND ITS IMPLEMENTATION TO UZBEK LANGUAGE DATA," Central Asian Problems of Modern Science and Education: Vol. 2020: Iss. 4, Article 8.
DOI: https://doi.org/10.51348/campse0015
Available at: https://uzjournals.edu.uz/capmse/vol2020/iss4/8