Bulletin of TUIT: Management and Communication Technologies

Abstract

The article describes the implementation of a real-time, voice-based speaker identification system for embedded and general-purpose computers. Existing speaker identification algorithms are reviewed and analyzed. The system records the speaker's input speech and passes it through a preprocessing stage, after which features and voice parameters are extracted for identification. Vector quantization (VQ) and hidden Markov model (HMM) algorithms are used to recognize the speaker from these voice parameters. The VQ and HMM algorithms achieved recognition accuracies of 96% and 98%, respectively.
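
As a rough illustration of the VQ stage only, the sketch below trains one codebook per enrolled speaker with plain k-means and identifies a test utterance by the lowest average distortion. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the feature type, codebook size, or training procedure, so `fake_features` (synthetic Gaussian vectors), the codebook size `k=8`, and all function names are hypothetical stand-ins.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, iters=20, seed=0):
    """Build a k-entry VQ codebook with plain k-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, codebook[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                dim = len(members[0])
                codebook[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Return the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(vectors, codebooks[name]))


def fake_features(center, n, rng, dim=4):
    """Assumption: synthetic stand-in for real voice features (Gaussian cloud)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
# Enrollment: one codebook per speaker, trained on that speaker's features.
codebooks = {
    "speaker_a": train_codebook(fake_features(0.0, 200, rng), k=8),
    "speaker_b": train_codebook(fake_features(5.0, 200, rng), k=8),
}
# Identification: score an unseen utterance against every codebook.
test_utterance = fake_features(5.0, 50, rng)
print(identify(test_utterance, codebooks))
```

In a real system the synthetic vectors would be replaced by features extracted from preprocessed speech frames; the enrollment and minimum-distortion decision steps would stay the same.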
