Ministry of Science and Higher Education of the Republic of Kazakhstan
L.N. Gumilyov Eurasian National University

Corpus of Academic Kazakh Language

The main idea of the project is to strengthen the scientific and practical potential of the Kazakh language in academic texts (monographs, articles, reports, theses, abstracts, and dissertations), to establish the concept of academic Kazakh, to identify the academic vocabulary in Kazakh texts, and to develop a list of academic Kazakh words.

Creation of an academic corpus

Compilation of a corpus of academic texts – a data source for research

Enhancement of scientific potential

Expanding the possibilities of the Kazakh language in science and practical use

Technological integration

Accelerating the use of digital technologies in the humanities

Expanding the scope of use

Increasing research and teaching of the Kazakh language in the international arena

Digital humanities: Developing the corpus of academic Kazakh language

This project is an important step toward enhancing the academic and scientific potential of the Kazakh language. Its main goal is to develop a corpus of written academic texts in Kazakh: a large dataset of at least 5 million words drawn from monographs, articles, reports, and dissertations. The corpus helps document how Kazakh is used in academic texts and supports its establishment as an academic language.

The goal of the scientific project

Within the project, an academic written corpus of the Kazakh language consisting of at least 5,000,000 words will be created, including 50,000 annotated words. This corpus will enable comprehensive research into the current scientific use of the Kazakh language, and the collected data will become an important source reflecting the scientific potential of Kazakh within the modern research infrastructure. The corpus will serve as a tool for deeper analysis of the use of Kazakh in academic texts, for improving linguistic resources, and for conducting scientific research in this area.

Directions of the Academic Kazakh Language Corpus

This project is aimed at enhancing the scientific potential of the Kazakh language and exploring the possibilities of academic Kazakh. Below are the sections of the corpus and their functions.

General information about the project

Developing the scientific potential of the Kazakh language and creating an important database for researchers and language learners

Corpus structure and content

A wide range of academic texts: monographs, articles, dissertations, textbooks, teaching materials

Usage and benefits

Use for study and research purposes by students, researchers, language teachers, and learners

Scientific and educational resources

Elevating research on and teaching of the Kazakh language to the international level and integrating it with other research fields

Our mission

The project to develop the academic written corpus of the Kazakh language is an initiative that opens new opportunities to study and analyze the natural use of Kazakh as a scientific language in academic texts. It aims to renew research on and teaching of the language at both national and international levels.

Teaching

Developing the linguodidactic and applied linguistic foundation needed to teach Kazakh as the state language and to teach it as a foreign language to representatives of other ethnic groups.

Research

Studying the scientific and applied potential of the Kazakh language with the help of information technology, and expanding interdisciplinary research.

Community

An open platform for the international community of researchers.

Management

Research projects and efforts to develop the Kazakh language

Principal Investigator

Gulnar Sarseke

Academic degree: Candidate of Philological Sciences (KazSU, 1998), Master’s in Educational Management (King’s College London, 2013).
Academic title: Associate Professor (2001).
Education: Master’s program at King’s College London (2012-2013); postgraduate studies at KazSU (1996-1998); Pavlodar Pedagogical Institute, Kazakh language and literature (1989-1994).
Achievements: recipient of the Bolashak Scholarship (2010); Honored Worker of Education of the Republic of Kazakhstan (2008); International Scholar Exchange Fellowship participant (Korea, 2017-2018); recipient of the ITEC scholarship (India, 2015); research linguist (University of Maryland, USA, 2020).
Project experience: participation in a six-month research project on developing English and Kazakh language corpora at the University of Maryland; co-investigator of the “Multimedia Corpus of Contemporary Kazakh Spoken Language” project at Nazarbayev University (2021-2023); experience teaching a Corpus Linguistics course.
