Tauzikh Ibragimov,

Kazan Federal University,

18 Kremlevskaya Str., Kazan, 420008, Russia


Mansur Saikhunov,

Institute of Computer Science, Tatarstan Academy of Sciences, 

36A Levobulachnaya Str., Kazan, 420111, Russia



Preserving a language as a way of seeing the world is becoming increasingly urgent under globalization. This paper presents the results of the authors’ attempt to gain insight into the spiritual world of the Tatar nation by using the features of the written corpus of the Tatar language (the Corpus of Written Tatar). In its current version, the Corpus is a general collection of word forms. It enables the researcher to determine how frequently each word form is used, as well as the restrictions on its compatibility with preceding and subsequent word forms. By choosing appropriate word forms and using the Corpus search engine, the researcher can obtain information about both the structure of the language and its speakers, and can also reveal the ethnic and cultural values of the Tatars.

Key words: the Corpus of Written Tatar, search engine, linguistic community, word form, right context, left context, frequency of use. 

