Scientific reports of Bukhara State University


Background. Semantic markup has been studied thoroughly by experts. The first generation of language corpora consisted of collections of electronic texts; these later developed into full-fledged corpora with a query-responsive interface and linguistic and extralinguistic markup. Linguistically annotated corpora were at first only morphological, then morpho-syntactic, and in recent years the most complete form of linguistic annotation, a corpus with morphological, syntactic and semantic markup, has been under development. The introduction of semantic markup into corpora was initially grounded in theory, through studies of the problems of semantic annotation. The research of Yu.D. Apresyan, I.M. Boguslavskiy, B.L. Iomdin, E.V. Biryaltsev, A.M. Elizarov, N.G. Jiltsov, V.V. Ivanov, O.A. Nevzorova, V.D. Solovev, I.S. Kononenko, E.A. Sidorova, E.I. Yakovchuk, E.V. Rakhilina, G.I. Kustova, O.N. Lyashevskaya, T.I. Reznikova, O.Yu. Shemanaeva and A.A. Kretov belongs to such works.

Methods. The article describes in detail the tools required for semantic tagging of a corpus, additional software tools, and a filter that can distinguish polysemy from homonymy. It also shows ways to develop specific principles for handling morphological and lexical homonymy, universal vocabulary, words absent from dictionaries, fragmentation, and letter-symbol constructions in the process of semantic tagging. The methods of classification, description, comparison and modeling were used to cover the topic of the article.

Results. We did not come across any work on the principles of semantic markup for an Uzbek language corpus. The corpus interface includes a system of lexical-semantic annotation: a set of basic semantic categories that the user works with and that forms the basis of search. These categories are the most important element of the corpus because querying is carried out on their basis, and the corpus's response to a user's query is linked to these tags. Uzbek semantic markup can be used to create a set of tags and a semantic search interface for the corpus.

Conclusion. Existing systems of semantic markup are clearly based on the features of the Russian language, but we conclude that this experience makes it possible to create a system of semantic tags specific to the Uzbek language.




1. Zakharov V.P., Bogdanova S.Y. “Corpus linguistics”: a textbook for students of humanitarian universities.- Irkutsk: ISLU, 2011. -161p.

2. Reznikova T.I., Kopotev M.V. “Linguistically annotated corpuses of the Russian language” (overview of publicly available resources) // http://ruscorpora.ru/sbornik2005/04reznikova.pdf

3. Kopotev M.V., Mustajoki A. “Principles of creating the Helsinki annotated corpus of Russian texts (HANCO) on the Internet” // Scientific and technical information. Ser. 2. Information systems and processes. - No. 6.: Corpus linguistics in Russia. 2003. -P. 33-37.

4. Zaliznyak A.A. “Grammar dictionary of the Russian language”. M.: Russian language, 1980. -880 p.

5. Zagorulko M.Yu., Kononenko I.S., Sidorova E.A. “System of semantic markup of the text corpus in a limited subject area” // http://www.dialog-21.ru/media/1372/94.pdf

6. Lyashevskaya O.N., Sichinava D.V., Kobritsov B.P. “Automation of dictionary construction on the basis of an array of non-dictionary word forms” // Braslavsky P.I. (editor-in-chief), Internet mathematics - 2007: a collection of works by participants in the competition of scientific projects on information retrieval. - Yekaterinburg: Ural University Publishing House, 2007. - P.130.

7. Akhmedova D.B. “Semantic labeling of language units” // International Journal on Integrated Education. Indonesia. e-ISSN: 2620 3502 p-ISSN:2615 3785. Volume 3, Issue I, Jan 2020. – P.177-179. (SJIF 5.083)

8. Akhmedova D.B. “Set of semantic tags for Uzbek language units: constants and operator/classifier”// International Scientific Journal Theoretical & Applied Science. Philadelphia, USA. Impact Factor,2409-0085(online) Issue: 02 Volume: 82 Published 29.02.2020. – Р.177-179.

9. Bakhtiyor Rajabovich Mengliev, Nigmatova Lolakhon Hamidovna. Problems of language, culture and spirituality in general explanatory dictionaries of Uzbek language / International Journal of Psychosocial Rehabilitation. ISSN: 1475-7192.

10. Mekhrinigor Akhmedova, Bakhtiyor Mengliev. Spirituality in the soul of the language: about linguoma’naviyatshunoslik and its perspektives / American Journal of Research. – USA, Michigan, 2018. – № 9-10.– Р.187-198. (SJIF: 5,065. № 23).

11. Karimov Rustam Abdurasulovich, Mengliev Bakhtiyor Rajabovich. The Role of the Parallel Corpus in Linguistics, the Importance and the Possibilities of Interpretation / International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249 – 8958, Volume-8, Issue-5S3 July 2019. - Р. 388-391.

Included in

Linguistics Commons


