1) PhD in Philology, Associate Professor of the Department of Philology, National Research University “Higher School of Economics”, Russia, Saint Petersburg, tsherstinova@hse.ru 2) Master's Student, Saint Petersburg State University, Russia, Saint Petersburg, daveprintseva@edu.hse.ru
The paper examines three approaches to studying the topics of everyday conversations: expert thematic annotation, topic modeling, and clustering. The research explores transcripts of spoken Russian from the ORD corpus, derived from recordings of spontaneous conversations in natural communicative settings (e.g., at home, at work, in educational institutions, stores, and clinics). The study presents the results of three experiments, each employing a different method for identifying thematic groups: 1) expert thematic annotation of transcripts, which provides a detailed and dynamic picture of everyday communication topics; 2) topic modeling, which uncovers latent themes within the corpus of speech transcripts; and 3) clustering, which groups conversations by topic based on lexical similarity. Expert annotation yields preliminary statistical data on the distribution of topics in everyday speech, while the automated methods identify thematic classes for various types of communication, such as interactions with colleagues, family members, and friends, and communication in educational contexts. The study assesses the effectiveness of automated methods compared to expert annotation for analyzing a multi-thematic corpus of unstructured everyday speech.
Russian everyday speech; topics of everyday conversations; corpus linguistics; expert annotation; topic modeling; clustering.
For citing: Sherstinova T.Y., Veprintseva D.A. (2025) Thematic analysis of everyday conversations: expert approach and automated techniques. Human being: Image and essence. Humanitarian aspects. Moscow. INION RAN. Vol. 2 (62). pp. 89-108. DOI: 10.31249/chel/2025.02.05