PILOT EXPERIMENT ON AUTOMATIC DETECTION OF VIETNAMESE MEDIA NARRATIVES ABOUT THE RUSSIA-UKRAINE WAR UNDER LIMITED RESOURCES
DOI:
https://doi.org/10.24025/2707-0573.12.2025.345012
Abstract
Background. Narratives in media texts play a crucial role in shaping public opinion on international conflicts, including the Russia–Ukraine war. Systematic investigation of how Vietnamese media narratives are constructed is relevant both from a linguistic perspective and for developing effective strategies of international communication. Traditional narrative analysis is inefficient for large corpora due to its resource intensity, especially for low-resource languages such as Vietnamese (lack of annotated datasets, complex morpheme tokenization, limited access to multi-layered data). Existing studies are largely confined to qualitative discourse analysis of social media and do not employ scalable NLP-based automation.
Purpose. The aim of the article is to develop and test a hybrid methodology for the automated extraction of narratives from Vietnamese media texts under limited computational resources, combining classical narratology with digital humanities methods (NLP, clustering) in order to identify event-centric narrative axes (events, characters, frames) and provide their interpretation.
Methods. The study presents a pilot experiment on a corpus of 160 news items from Báo tin tức, collected via ParseHub and tokenized with Underthesea. An abductive approach (Burch, 2024) was implemented along two complementary strands: (1) an inductive strand using KeyBERT+PhoBERT/SimCSE-Vietnamese (text embeddings, keyphrase extraction) and GPT-4 (grouping into events/characters/themes); (2) a deductive strand using K-means/HDBSCAN+PhoBERT/SimCSE-Vietnamese (embeddings, clustering) and GPT-4 (cluster interpretation). All automatic outputs were subjected to manual verification.
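The keyphrase step of the inductive strand can be illustrated with a minimal sketch of KeyBERT-style ranking: candidate phrases are scored by cosine similarity between their vector and the vector of the whole document. This is not the study's code. The `embed` function below is a toy bag-of-words stand-in for PhoBERT/SimCSE-Vietnamese embeddings, the sample sentence is an invented English example, and all function names are hypothetical. In the setting described above, the text would first be word-segmented (the study uses Underthesea) and embedded with a real encoder.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: a real pipeline would return a dense vector from a
    pretrained model such as PhoBERT or SimCSE-Vietnamese instead.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_keyphrases(text: str, top_n: int = 3, max_len: int = 2):
    """Rank contiguous n-grams (up to max_len tokens) by similarity to the document."""
    tokens = text.lower().split()
    doc_vec = embed(text)
    # Candidate set: every distinct n-gram of 1..max_len tokens.
    candidates = {
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }
    # Sort by descending score, then alphabetically to break ties deterministically.
    scored = sorted(
        ((cosine(embed(c), doc_vec), c) for c in candidates),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [(phrase, round(score, 3)) for score, phrase in scored[:top_n]]


# Invented sample text, for illustration only.
sample = "sanctions hit energy prices as sanctions widen and energy prices rise"
top_phrases = rank_keyphrases(sample, top_n=3)
print(top_phrases)
```

With a real encoder, only `embed` and `cosine` would change (dense vectors instead of counts); the ranking logic stays the same. Grouping the ranked phrases into events, characters, and themes is the step the study delegates to GPT-4 and verifies manually.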
Results. In the first strand, KeyBERT was used to extract keyphrases, which, with the assistance of GPT-4, were mapped onto narrative labels (events, characters) and aggregated into thematic groups and narrative frames. In the second strand, clustering was performed in parallel with K-means and HDBSCAN. The resulting clusters were interpreted and associated with narrative frames and core lexical items. Comparison of the two strands revealed convergence in the extracted semantic axes and narrative frames. In both strands, Vietnamese news coverage was found to prioritise the war’s impact on local and global economies, politics, and the humanitarian sphere over detailed analysis of military operations.
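The clustering strand can be sketched in the same hedged way: document embeddings are grouped with K-means, and each cluster is summarised by its most frequent tokens as a rough proxy for "core lexical items". The snippet assumes scikit-learn and NumPy are installed. The embeddings are synthetic vectors standing in for PhoBERT/SimCSE-Vietnamese output, the four headlines are invented, and the helper names are hypothetical. HDBSCAN and the GPT-4 interpretation step are not shown.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

# Invented headlines, for illustration only.
docs = [
    "oil prices rise after sanctions",
    "gas prices rise in europe",
    "grain exports blocked at ports",
    "grain exports resume from ports",
]

# Synthetic 8-dimensional "embeddings": two well-separated groups.
# Assumption: a real pipeline would encode each document with a pretrained model.
rng = np.random.default_rng(0)
centers = np.array([[5.0] * 8, [-5.0] * 8])
embeddings = np.vstack(
    [centers[i // 2] + rng.normal(scale=0.1, size=8) for i in range(len(docs))]
)


def cluster_documents(vectors: np.ndarray, n_clusters: int, seed: int = 0) -> np.ndarray:
    """Assign each document vector to one of n_clusters K-means clusters."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(vectors)


def core_terms(texts, labels, top_n: int = 2) -> dict:
    """Return the top_n most frequent tokens per cluster label."""
    summary = {}
    for label in sorted(set(int(l) for l in labels)):
        counts = Counter(
            token
            for text, doc_label in zip(texts, labels)
            if int(doc_label) == label
            for token in text.split()
        )
        summary[label] = [token for token, _ in counts.most_common(top_n)]
    return summary


labels = cluster_documents(embeddings, n_clusters=2)
summary = core_terms(docs, labels)
print(labels, summary)
```

Cluster ids from K-means are arbitrary, so any comparison with the keyphrase strand should match clusters by their content (shared terms or frames) rather than by label number.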
Discussion. The study demonstrates the effectiveness of the proposed methodology for automated detection of Vietnamese media narratives about the Russia–Ukraine war under limited resources. The abductive design proved methodologically valid, as both strands produced consistent and mutually reinforcing results. The workflow is robust in resource-limited environments such as Google Colab and is scalable to larger corpora. Future research may include systematic comparison of different corpora, integration of NER for character extraction, and dynamic narrative tracking using LSTM-based models, thereby contributing to the analysis of propagandistic and geopolitical discourses.
License
Copyright (c) 2025 Вікторія Мусійчук

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors retain full copyright and transfer the publishing rights to the journal. The author of a published article has the right to distribute it, deposit it in the electronic repository of their institution, publish it as part of a monograph, etc., provided that a reference to the place of its first publication is included.
The authors confirm that the scientific article submitted for publication has not previously been published and has not been submitted to the editorial office of other journals.