Form of presentation | Articles in international journals and collections |
Year of publication | 2017 |
Language | English |
|
Gataullin Ramil Raisovich, author
Gilmullin Rinat Abrekovich, author
Suleymanov Dzhavdet Shevketovich, author
Khakimov Bulat Ernstovich, author
|
Bibliographic description in the original language |
Gataullin R, Khakimov B, Suleymanov D, Context-Based Rules for Grammatical Disambiguation in the Tatar Language//Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 2017. - Vol.10449 LNAI, Is.. - P.529-537. |
Annotation |
The paper addresses grammatical ambiguity in the Tatar National Corpus and describes the methodology and software used to automate disambiguation. Grammatical ambiguity is widespread in agglutinative languages such as the Turkic and Finno-Ugric languages. Disambiguation in the corpus relies on a context-oriented classification of ambiguity types, carried out on Tatar corpus data for the first time. In this study the corpus serves both as the source of research data and as the target for applying the results. Grammatical ambiguity types are detected automatically with a finite-state morphological analyzer and then classified. To build a grammatically disambiguated subcorpus, a dedicated software module was developed. It searches for ambiguous tokens in the corpus, collects statistical information, and supports creating and applying formal context-based disambiguation rules. |
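The workflow named in the annotation (find ambiguous tokens, collect statistics on ambiguity types, apply context-based rules) can be illustrated with a minimal sketch. This is not the authors' software: the `Token` and `Rule` classes, the tag names `N`, `V`, `ADJ`, the placeholder word forms, and the single example rule are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Token:
    form: str
    analyses: List[str]  # candidate tags, e.g. from a morphological analyzer

    @property
    def ambiguous(self) -> bool:
        return len(self.analyses) > 1


@dataclass
class Rule:
    ambiguity_type: frozenset  # the set of competing tags this rule targets
    condition: Callable[[Optional[Token], Optional[Token]], bool]  # (left, right)
    keep: str  # the tag to keep when the condition holds


def ambiguity_stats(tokens: List[Token]) -> Counter:
    """Count how often each ambiguity type (set of competing tags) occurs."""
    return Counter(frozenset(t.analyses) for t in tokens if t.ambiguous)


def apply_rules(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Resolve ambiguous tokens whose type and neighbours match a rule."""
    resolved = []
    for i, tok in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else None
        right = tokens[i + 1] if i + 1 < len(tokens) else None
        chosen = tok
        if tok.ambiguous:
            for rule in rules:
                if frozenset(tok.analyses) == rule.ambiguity_type and rule.condition(left, right):
                    chosen = Token(tok.form, [rule.keep])
                    break  # first matching rule wins
        resolved.append(chosen)
    return resolved


# Hypothetical data: placeholder forms and invented tags, not real Tatar analyses.
tokens = [
    Token("w1", ["ADJ"]),
    Token("w2", ["N", "V"]),
    Token("w3", ["N", "V"]),
]
# Invented rule: an N/V ambiguity directly after an unambiguous ADJ is read as N.
rules = [
    Rule(
        ambiguity_type=frozenset({"N", "V"}),
        condition=lambda left, right: left is not None and left.analyses == ["ADJ"],
        keep="N",
    )
]

stats = ambiguity_stats(tokens)
resolved = apply_rules(tokens, rules)
```

In this sketch the context is read from the original, unresolved tokens, so `w3` stays ambiguous because its left neighbour `w2` was ambiguous in the input. Whether rules should instead see already-resolved neighbours is a design choice that the annotation does not specify.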
Keywords |
Disambiguation, Grammatical Homonymy, Context-based Rules, Linguistic Software, Turkic Languages, Corpus Linguistics |
The name of the journal |
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
|
URL |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030853927&doi=10.1007%2f978-3-319-67077-5_51&partnerID=40&md5=7b27e8905839820c051a4e511460c4a3 |
Please use this ID to quote from or refer to the card |
https://repository.kpfu.ru/eng/?p_id=166145&p_lang=2 |
Full metadata record |
Field DC | Value | Language |
dc.contributor.author | Gataullin Ramil Raisovich | ru_RU |
dc.contributor.author | Gilmullin Rinat Abrekovich | ru_RU |
dc.contributor.author | Suleymanov Dzhavdet Shevketovich | ru_RU |
dc.contributor.author | Khakimov Bulat Ernstovich | ru_RU |
dc.date.accessioned | 2017-01-01T00:00:00Z | ru_RU |
dc.date.available | 2017-01-01T00:00:00Z | ru_RU |
dc.date.issued | 2017 | ru_RU |
dc.identifier.citation | Gataullin R, Khakimov B, Suleymanov D, Context-Based Rules for Grammatical Disambiguation in the Tatar Language//Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 2017. - Vol.10449 LNAI, Is.. - P.529-537. | ru_RU |
dc.identifier.uri | https://repository.kpfu.ru/eng/?p_id=166145&p_lang=2 | ru_RU |
dc.description.abstract | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | ru_RU |
dc.description.abstract | The paper addresses grammatical ambiguity in the Tatar National Corpus and describes the methodology and software used to automate disambiguation. Grammatical ambiguity is widespread in agglutinative languages such as the Turkic and Finno-Ugric languages. Disambiguation in the corpus relies on a context-oriented classification of ambiguity types, carried out on Tatar corpus data for the first time. In this study the corpus serves both as the source of research data and as the target for applying the results. Grammatical ambiguity types are detected automatically with a finite-state morphological analyzer and then classified. To build a grammatically disambiguated subcorpus, a dedicated software module was developed. It searches for ambiguous tokens in the corpus, collects statistical information, and supports creating and applying formal context-based disambiguation rules. | ru_RU |
dc.language.iso | ru | ru_RU |
dc.subject | Disambiguation | ru_RU |
dc.subject | Grammatical Homonymy | ru_RU |
dc.subject | Context-based Rules | ru_RU |
dc.subject | Linguistic Software | ru_RU |
dc.subject | Turkic Languages | ru_RU |
dc.subject | Corpus Linguistics | ru_RU |
dc.title | Context-Based Rules for Grammatical Disambiguation in the Tatar Language | ru_RU |
dc.type | Articles in international journals and collections | ru_RU |
|