
Regular paper


Journal of information and communication convergence engineering 2024; 22(4): 336-343

Published online December 31, 2024

https://doi.org/10.56977/jicce.2024.22.4.336

© Korea Institute of Information and Communication Engineering

Computer-Vision-Based Mobile Application for Translating Sundanese Scripts to Modern Indonesian Language With Gamification Strategies

Wanda Gusdya Purnama 1, Handoko Supeno 1, Anggoro Ari Nurcahyo 1, Ayi Purbasari 1, Aria Bisma Wahyutama 2, and Mintae Hwang2*

1Department of Informatics Engineering, Pasundan University, Bandung, 40153, Indonesia
2Department of Information and Communication Engineering, Changwon National University, Changwon, 51140, Republic of Korea

Correspondence to : Mintae Hwang (E-mail: mthwang@cwnu.ac.kr)
Department of Information and Communication Engineering, Changwon National University, Changwon 51140, Republic of Korea

Received: May 9, 2024; Revised: June 21, 2024; Accepted: June 21, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

In the digital era, teaching endangered local languages and scripts to children has become challenging owing to the scarcity of learning media and materials. The present study addresses this problem through the development of a mobile application that classifies and automatically converts Sundanese scripts into Latin using computer vision algorithms. The proposed method represents an innovative solution for capturing children's interest using gamification strategies. We discuss the development, implementation, and evaluation of YOLOv8, a deep learning technology for computer vision in mobile applications. A pilot study conducted on children aged 7-12 years revealed significant improvements in their interest and knowledge of Sundanese scripts, as the children were able to memorize, identify, and write 5-8 words in Sundanese characters out of 10 randomly selected words. Furthermore, the model achieved 80% accuracy for almost all Sundanese-scripted words, indicating satisfactory results. This study combines computer vision with gamification to facilitate the learning of Sundanese scripts, thereby paving the way for future innovation.

Keywords: Computer Vision, Mobile Application, Sundanese Scripts, Modern Indonesian Language, Gamification Strategy

I. INTRODUCTION

The nation of Indonesia comprises over 17,000 islands spanning more than 1.9 million square kilometers. Across this vast area, Indonesia is home to some of the most distinctive tribes and cultures in the world. Among the most prominent of these tribes is the Sunda tribe, which populates the western area of Java Island. One legacy of the Sundanese people is a language script that was widely used during Indonesia’s kingdom period and is very different from Latin script. The Sundanese script played a prominent role in the region during ancient times and has endured over an extensive period. Thus, it encapsulates historical narratives, scientific knowledge, and wisdom from bygone times. The script includes characters such as ngalagena aksara for consonants, swara aksara for vowels, angka aksara for numbers, and rarangkèn for punctuation. Fig. 1 shows that the remnants of these historical legacies persist even today, having been carefully preserved.

Fig. 1. Historical objects and documents inscribed with Sundanese script.

However, as Latin letters have been adopted in the Sundanese language, the traditional Sundanese script is no longer used owing to a lack of public awareness, hindering its preservation. This poses a challenge for future generations of potential learners, incurring a risk of extinction for the language. Although valuable documents written in Sundanese script still exist in historical places such as museums, few individuals can read them. Furthermore, although some elementary schools in West Java Province continue to offer courses in Sundanese script, these curricula face challenges stemming from the lack of appropriate facilities, classroom conditions, and optimized learning media. The declining use of Sundanese scripts in daily life also makes the preservation of these scripts increasingly difficult.

To address the aforementioned problems, we employed machine learning to detect Sundanese scripts and convert them to Latin-based Sundanese, which can then be translated into Indonesian. Moreover, we adopted gamification strategies to capture the interests of young learners, aiming to narrow the gap in the comprehension of Sundanese scripts. We used YOLOv8 object detection, a highly accurate single-shot detector algorithm [1,2], which was combined with the Google Translate API to translate Sundanese to Indonesian through a mobile application, as demonstrated experimentally.

The present study represents the first attempt to implement end-to-end Sundanese script word detection and translation to other languages in a mobile format, improving upon similar works conducted in [3], [4], and [5]. Furthermore, this study is the first to use gamification to teach the Sundanese script. The contributions of this study are as follows:

1. We propose the first end-to-end Sundanese script word detection and translation model designed for mobile devices, based upon computer vision technology.

2. We provide new learning media with a gamification approach for teaching ancient scripts to children.

II. SYSTEM METHOD AND DESIGN

The following subsections discuss the methodology and design of this study, including a top-down overview of the system architecture that illustrates all necessary components, the system workflow, and the initial user interface (UI) prior to implementation.

A. System Method

An overview of the system method used throughout this study is presented in Fig. 2.

Fig. 2. Overview of Sundanese detection and translation system.

1) Designing System Architecture and Workflow

A system architecture and workflow were designed to ensure that the modules and components necessary for the application can be efficiently built and effectively integrated.

2) Designing Gamification Concepts and UI for Mobile Application

A gamification process was designed to increase user engagement and enhance the educational utility of the Sundanese script detection and translation system. Furthermore, a well-designed UI can enhance user-friendliness, thereby improving the application’s usability.

3) Model Training

The model was trained to accurately identify, recognize, and interpret the distinctive characteristics of each word written in the Sundanese script.

4) Evaluating Model Performance

Appropriate metrics were utilized to evaluate model performance, assessing the success and effectiveness of the training process.

5) Integration of Model Into Mobile Application

The trained model was integrated into a mobile application, providing seamless operation for detecting and translating Sundanese scripts using a smartphone.

6) Application Testing

The application was deployed with child participants to gather feedback on its usability and features.

B. System Design

This subsection presents the system architecture and workflow, showing the necessary components and activities along with the initial UI design.

1) System Architecture

Fig. 3 shows the system architecture and Fig. 4 illustrates the system workflow. As shown in Fig. 3, the system comprises four major components: mobile application, YOLO algorithm, Sundanese script, and Google Translate API.

Fig. 3. System architecture.

Fig. 4. Workflow of the system.

YOLOv8 was selected for the detection of Sundanese characters because it is faster and more efficient than conventional convolutional neural network (CNN)-based detectors. YOLO employs an artificial neural network (ANN) to detect objects within a given image. The network divides the image into several regions, predicts the bounding boxes of objects, and calculates the class probabilities within each region. The predicted bounding boxes are then weighted by these probabilities.
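To make the pruning of overlapping predictions concrete, the following is a minimal, illustrative sketch (not the vectorized implementation inside YOLOv8 itself) of how candidate boxes are scored and filtered by intersection-over-union (IoU) during non-maximum suppression:

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(detections, iou_threshold=0.5):
    # detections: list of (box, confidence). Keep the highest-confidence
    # boxes and drop any lower-confidence box that overlaps a kept one
    # by more than the threshold.
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, conf))
    return kept
```

In practice this step runs inside the detector; the sketch only shows the geometry behind it.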

In the system workflow shown in Fig. 4, after opening the mobile application, the user must use a smartphone camera to obtain an image stream of the Sundanese script, initiating the frame-by-frame preprocessing of the stream. Subsequently, YOLOv8 uses OpenCV to detect the frame and starts looking for the Sundanese script within the preprocessed image stream. If words written in the script are detected, YOLOv8 obtains the corresponding word classes in Latin. The Google Translate API then translates these word classes into Bahasa Indonesia. Finally, users can see the Sundanese script bounding box for each word, along with the word class in Latin letters and translated word in Bahasa.
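The workflow above can be summarized in code as follows. This is a simplified sketch in which the YOLOv8 detector and the Google Translate API call are replaced by stubs (the word list is a subset of Table 1), so only the glue logic between the components is shown:

```python
# Per-frame flow of Fig. 4, with the detector and translator stubbed out.
# In the real application, detect_words() runs the trained YOLOv8 model
# on a camera frame and translate() calls the Google Translate API.

SUNDANESE_TO_BAHASA = {          # subset of the word classes in Table 1
    "naon": "apa",
    "saha": "siapa",
    "hatur nuhun": "terima kasih",
}

def detect_words(frame):
    # Stub: returns (bounding_box, latin_word_class) pairs.
    return [((40, 60, 180, 120), "naon")]

def translate(latin_word):
    # Stub for the Sundanese -> Indonesian translation step.
    return SUNDANESE_TO_BAHASA.get(latin_word, "?")

def process_frame(frame):
    # For each detected word, attach its Latin class and translation,
    # which the UI then draws next to the bounding box.
    return [{"box": box, "latin": latin, "bahasa": translate(latin)}
            for box, latin in detect_words(frame)]
```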

2) Gamification Concepts and UI

A foundational step for any student of a language is acquiring a strong grasp of vocabulary, which lays the groundwork for effective communication and comprehension. Recognizing the importance and challenges of this task, several research initiatives have focused on leveraging technology to facilitate and enhance the learning of vocabulary [6,7]. To this end, gamification has been employed as a strategy to enhance memorization [8,9]. The integration of game-like elements into learning activities has been shown to significantly improve information retention in learners. This approach is not new, as indicated by prior studies underscoring the effectiveness of gamification in educational settings. The underlying concept hinges on the natural human propensity for games and competitions, turning routine memorization tasks into engaging and enjoyable experiences. Without innovative methods such as gamification to capture their interests, it may be challenging to motivate young learners to use educational products.

We adopted gamification to transform the activity of capturing images and translating text into a series of enjoyable tasks. This method encompasses four main activities:

  • “Write it,” where learners transcribe given words into Sundanese script as requested.

  • “Detect it” involves the detection of words in Sundanese script using computer vision.

  • “Collect it” allows learners to gather and save their achievements by acting as a tangible reward system.

  • “Place it” includes activities to use the right word in the Sundanese script in specific situations.

Fig. 5 illustrates the proposed gamification approach designed to foster a positive and stimulating learning environment that encourages continuous engagement and improvement.

Fig. 5. Gamification concept of mobile application.
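As an illustration, the "Collect it" mechanic can be modeled as a small word-collection tracker; the point values below are hypothetical and not taken from the actual application:

```python
class WordCollection:
    # Hypothetical sketch of the "Collect it" reward mechanic: each
    # newly detected word is saved once and awards a fixed number of
    # points, turning repeated detection practice into a score.
    POINTS_PER_WORD = 10

    def __init__(self):
        self.collected = set()

    def collect(self, word):
        if word in self.collected:
            return 0                      # no points for duplicates
        self.collected.add(word)
        return self.POINTS_PER_WORD

    @property
    def score(self):
        return len(self.collected) * self.POINTS_PER_WORD
```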

The UI design approach for mobile applications differs slightly from that used in [10]. Because our objective is to simplify the translation process from Sundanese to Indonesian, activities from the detection to the translation of Sundanese scripts must occur on the same page.

The following subsections discuss the data acquisition and model training processes, as well as a description of each word class used to develop the detection model.

A. Model Training

The training process begins with dataset collection, wherein relevant data are compiled to serve as the foundation for model training. Following collection, the dataset undergoes labeling, where the data are annotated with accurate categorical tags to make them comprehensible to the training model. The dataset is then split into subsets (training, validation, and testing sets) to ensure a robust and unbiased training process. In the subsequent preprocessing stage, the data are cleaned and transformed into an appropriate format and quality for the training algorithm. During model training, the preprocessed data are fed into the model, enabling it to learn from the patterns and relationships within the data. Finally, testing is conducted using the unseen portion of the dataset to evaluate the model’s performance and generalizability to new data, thereby ensuring its practical applicability. Fig. 6 illustrates the model training process.

Fig. 6. Training process and steps.
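The dataset-splitting step above can be sketched as a seeded shuffle followed by partitioning; the 80/10/10 ratio shown here is the one used later in this study:

```python
import random

def split_dataset(items, train=0.8, val=0.1, seed=42):
    # Shuffle with a fixed seed for reproducibility, then partition
    # into training, validation, and testing subsets.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

For the 1,200-image dataset described below, this yields 960/120/120 images.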

Prior to model training, we compiled a sufficiently large dataset of words written in Sundanese script. Unlike in other studies conducted to detect words with Latin characters [11,12], finding appropriate data was challenging owing to a scarcity of resources related to the Sundanese script. Therefore, we created a personalized dataset from scratch by writing each word class individually. The overall dataset encompassed 1,200 images of 608 × 800 pixels in JPG format. Additionally, we determined 37 classes corresponding to the most frequently used words as labels for the images. These classes are listed in Table 1.

Table 1. Word classes

No. | Sundanese | Bahasa | English
1 | Naon | Apa | What
2 | Kunaon | Kenapa | Why
3 | Saha | Siapa | Who
4 | Kumaha | Bagaimana | How
5 | Punteun | Maaf | Sorry
6 | Iraha | Kapan | When
7 | Kamana | Ke mana | Where
8 | Timana | Dari mana | From where
9 | Sabaraha | Berapa | How much
10 | Aya Naon | Ada apa | What happened
11 | Nuju Naon | Sedang apa | What are you doing
12 | Sareng Saha | Bersama siapa | With whom
13 | Hatur Nuhun | Terima kasih | Thanks
14 | Abdi | Saya | Me
15 | Maneh | Anda | You
16 | Wilujeung énjing | Selamat pagi | Good morning
17 | Wilujeung wengi | Selamat malam | Good night
18 | Mirah | Murah | Cheap
19 | Awis | Mahal | Expensive
20 | Kamari | Kemarin | Yesterday
21 | Artos | Uang | Money
22 | Dangukeun | Dengarkan | Listen
23 | Gampil | Mudah | Easy
24 | Geulis | Cantik | Beautiful
25 | Hayu | Ayo | Let’s go
26 | Indung | Ibu | Mother
27 | Isin | Malu | Shy
28 | Kaduhung | Menyesal | Regret
29 | Kasep | Tampan | Handsome
30 | Kumaha Damang | Apa kabar | How are you
31 | Mangga | Silakan | Please
32 | Moal | Tidak akan | Will not
33 | Nu leres | Yang benar | Correct
35 | Raos | Enak | Delicious
36 | Tunduh | Mengantuk | Sleepy
37 | Meuli | Membeli | Buy


The image representations of these word classes were collected in different styles, such as written on a sheet of paper or typed digitally, and captured at different brightness levels. Following collection, the images were labeled using LabelImg, a graphical image annotation tool. Subsequently, the labeled dataset was divided into training, validation, and testing subsets. Fig. 7 presents sample data used in the set.

Fig. 7. Sample data.
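When exported in YOLO format, LabelImg stores one annotation per line as a class index followed by a normalized center-point bounding box; a minimal parser for this format might look as follows (assuming the standard five-field layout):

```python
def parse_yolo_label(line):
    # LabelImg's YOLO export format: "<class_id> <cx> <cy> <w> <h>",
    # where the box is given as a center point plus width and height,
    # all normalized to [0, 1] relative to the image dimensions.
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    class_id = int(parts[0])
    cx, cy, w, h = map(float, parts[1:])
    return class_id, (cx, cy, w, h)
```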

For classification, we adopted the YOLOv8 object detection algorithm, an anchor-free model that implements the latest developments in YOLO-based algorithms [2]. YOLOv8 performs classification directly at the center of a given object, instead of at the offset from a known anchor box. Anchor boxes are notoriously tricky components of earlier YOLO algorithms, as they may represent the distribution of boxes of the target benchmark but not that of the custom dataset. We used YOLOv8n, which comprises 255 layers and 3,157,200 parameters. The model was trained over 100 epochs using an NVIDIA T4 GPU, and the dataset split was 80% for training, 10% for validation, and 10% for testing. The training results are shown in Fig. 8.

Fig. 8. Performance metrics of model training.
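Because YOLOv8 is anchor-free, each prediction is simply a normalized (center, width, height) box; converting such a box to pixel corner coordinates, e.g., for a 608 × 800 image from our dataset, is straightforward:

```python
def to_pixel_box(cx, cy, w, h, img_w, img_h):
    # Convert a normalized (center x, center y, width, height) box,
    # as predicted by an anchor-free detector, into pixel corner
    # coordinates (x1, y1, x2, y2).
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return x1, y1, x2, y2
```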

B. Model Evaluation

Following model training, we tested the model with a subset of unseen images to generate bounding boxes for Sundanese script, produce corresponding labels, and then translate the words into Indonesian using the Google Translate API. To ensure that the model can run on a standard computer, testing was performed on a less powerful computer than that used for training. As in [13], tests were performed to determine the detection and translation accuracies.

Fig. 8 presents training results for the detection of words written on a single sheet of paper, with the model having achieved an accuracy exceeding 80% for almost all Sundanese word classes. To further evaluate model performance, we conducted an additional test using a smartphone camera. Here, the model accurately detected almost all Sundanese scripts with a confidence level of 90-100% as shown in Fig. 9.

Fig. 9. Model testing using smartphone camera.

This section discusses the proposed model’s integration into mobile applications, as well as a study with child learners.

A. Model Integration

We used a wide set of Android libraries to develop the mobile applications, including a native library and Android graphics. These libraries provide tools that handle standard graphic operations such as picture resizing. Such tools are required to deliver images from drawing or camera inputs in a specific format to YOLOv8 image classification, as the YOLO architecture accepts images in a three-channel RGB format for the first layer. Once the images were acquired, the trained model was loaded and fed the images as input. These input images were transformed and analyzed, and the output category was returned to the mobile application. Fig. 10 depicts a screenshot of the mobile application.

Fig. 10. Mobile application UI screenshots.
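Because the first layer expects a fixed-size three-channel RGB input, the resizing step must avoid distorting the script characters. A common approach in YOLO pipelines (the exact Android implementation may differ) is letterboxing, whose geometry can be computed as follows:

```python
def letterbox_geometry(src_w, src_h, dst=640):
    # Compute the scaled size and padding needed to fit a src_w x src_h
    # RGB image into a dst x dst input tensor without distortion: scale
    # by the limiting dimension, then center with padding.
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) // 2
    pad_y = (dst - new_h) // 2
    return new_w, new_h, pad_x, pad_y
```

For a 608 × 800 dataset image and a 640 × 640 input, this scales the image to 486 × 640 and pads 77 pixels on each side horizontally.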

B. Application Testing

The participants selected to test the proposed system’s effectiveness were elementary school students aged 7-12 years. The students participated in the following activities:

  • Write words in Sundanese script.

  • Try to obtain the right script by detecting their work.

  • Collect as many words as possible.

  • Play games.

The objective of this test was to determine whether the proposed gamification-based learning process improved the learners’ interest and knowledge. From a learning perspective, the test showed that the students could identify, memorize, and write 5-8 of 10 randomly selected words in Sundanese script. Notably, none of the children were familiar with the Sundanese script before the test was conducted. Furthermore, some children continued learning additional words in a mentor-guided session after the test period was over, demonstrating retained interest in learning the Sundanese script. These results demonstrate that the proposed approach improves students’ interest in learning the script.

To comprehensively interpret the effectiveness of our approach, we compared our results with those of previous studies on Sundanese script recognition [4,14,15]. The results show that the YOLOv8 algorithm achieves an accuracy of 80%, a competitive result compared to those of other methods. The gamification element not only contributes toward improving children’s interest and engagement, but also enhances their learning outcomes in recognizing and writing Sundanese scripts. Overall, our approach demonstrates that combining advanced computer vision techniques with engaging learning methodologies can significantly improve the efficacy of educational tools for endangered languages.

Owing to the continuous and strong flow of modern culture, the Sundanese language and script have become endangered, with the Sundanese script being especially at risk of extinction. In an effort to reduce this risk, we developed a YOLOv8 object detection model that can identify Sundanese scripts, convert them to Latin scripts, and then translate them into Indonesian.

The model was trained using a dataset of 1,200 images with various characteristics and labeled using LabelImg. During our experiments, the model achieved a detection accuracy exceeding 80% for almost all Sundanese word classes. The trained model was integrated into a mobile application designed to teach children to read and write in the Sundanese script.

The gamification concept was embedded in the application to motivate the learning process in students. The application was tested on participants aged 7-12, with results showing that the children could identify, memorize, and write 5-8 words in Sundanese script from 10 randomly selected words. Furthermore, several children showed increased interest in continuing to learn the script even after the testing period.

Because our training dataset was relatively small, we intend to conduct further studies with larger datasets encompassing a greater variety of camera angles, lighting, writing styles, and other features. In the process, the model will be retrained to enhance its performance in detecting and identifying Sundanese scripts.

REFERENCES

  1. D. Reis, J. Kupec, J. Hong, and A. Daoudi, “Real-time flying object detection with YOLOv8,” arXiv preprint arXiv:2305.09972, May 2023. DOI: 10.48550/ARXIV.2305.09972.
  2. J. Terven, D. Cordova-Esparza, and J. Romero-Gonzalez, “A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS,” Machine Learning & Knowledge Extraction, Nov. 2023. DOI: 10.3390/make5040083.
  3. I. Ikhsan and D. I. Mulyana, “Optimizing the implementation of the YOLO algorithm and data augmentation in Hanacaraka Javanese script language classification,” JUSIKOM PRIMA, vol. 7, no. 1, pp. 8-16, Aug. 2023. DOI: 10.34012/jurnalsisteminformasidanilmukomputer.v7i1.4062.
  4. D. Arifadilah, “Sunda script detection using You Only Look Once algorithm,” Journal of Artificial Intelligence and Engineering Applications, vol. 3, no. 2, pp. 606-613, Feb. 2024. DOI: 10.59934/jaiea.v3i2.443.
  5. A. Prasetiadi, J. Saputra, I. Kresna, and I. Ramadhanti, “YOLOv5 and U-Net-based character detection for Nusantara script,” JURNAL ONLINE INFORMATIKA, vol. 8, no. 2, pp. 232-241, Dec. 2023. DOI: 10.15575/join.v8i2.1180.
  6. F. Çakmak, E. Namaziandost, and T. Kumar, “CALL-enhanced L2 vocabulary learning: Using spaced exposure through CALL to enhance L2 vocabulary retention,” Education Research International, vol. 2021, pp. 1-8, Sep. 2021. DOI: 10.1155/2021/5848525.
  7. X. Yang, L.-J. Kuo, Z. R. Eslami, and S. M. Moody, “Theoretical trends of research on technology and L2 vocabulary learning: A systematic review,” Journal of Computers in Education, vol. 8, no. 4, pp. 465-483, May 2021. DOI: 10.1007/s40692-021-00187-8.
  8. K. Futami, D. Kawahigashi, and K. Murao, “Mindless memorization booster: A method to influence memorization power using attention induction phenomena caused by visual interface modulation and its application to memorization support for English vocabulary learning,” Electronics, vol. 11, no. 14, p. 2276, Jul. 2022. DOI: 10.3390/electronics11142276.
  9. W. Bancha and N. Tongtep, “Enhancing vocabulary memorization and retention through LMS and MultiEx game platforms among Thai tertiary students,” International Journal of Learning, Teaching and Educational Research, vol. 20, no. 10, pp. 17-192, Oct. 2021. DOI: 10.26803/ijlter.20.10.10.
  10. D. S. Saputra, D. A. Yonanda, and Y. Yuliati, “The development of android-based mobile learning in learning Sundanese script for elementary school students,” in Proceedings of the 3rd International Conference on Learning Innovation and Quality Education (ICLIQE 2019), Solo Baru, IN, 2020. DOI: 10.2991/assehr.k.200129.086.
  11. X. Wang, S. Zheng, C. Zhang, R. Li, and L. Gui, “R-YOLO: A Realtime text detector for natural scenes with arbitrary rotation,” Sensors, vol. 21, no. 3, p. 888, Jan. 2021. DOI: 10.3390/s21030888.
  12. R. Mondal, S. Malakar, E. H. Barney Smith, and R. Sarkar, “Handwritten English word recognition using a deep learning based object detection architecture,” Multimedia Tools and Applications, vol. 81, pp. 975-1000, Sep. 2021. DOI: 10.1007/s11042-021-11425-7.
  13. M. Safran, A. Alajmi, and S. Alfarhood, “Efficient multistage license plate detection and recognition using YOLOv8 and CNN for smart parking systems,” Journal of Sensors, vol. 2024, pp. 1-18, Feb. 2024. DOI: 10.1155/2024/4917097.
  14. M. A. Prameswari, M. Dwi Sulistiyo, and A. F. Ihsan, “Classification of handwritten Sundanese script via transfer learning on CNN-based architectures,” in 2023 3rd International Conference on Electronic and Electrical Engineering and Intelligent System (ICE3IS), Yogyakarta, IN, pp. 401-406, 2023. DOI: 10.1109/ICE3IS59323.2023.10335382.
  15. H. Salsabila, E. Rachmawati, and F. Sthevanie, “Sundanese aksara recognition using histogram of oriented gradients,” in 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), Yogyakarta, IN, pp. 253-258, 2019. DOI: 10.1109/ISRITI48646.2019.9034589.

Wanda Gusdya

He received a Bachelor’s degree in Informatics Engineering from Pasundan University and a Master’s degree in Electrical Engineering from the Bandung Institute of Technology, Bandung, Indonesia. His research interests include deep learning and mobile applications. Since 2012, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.


Handoko Supeno

He received a Bachelor’s degree in Informatics Engineering from Pasundan University and a Master’s degree in Electrical Engineering from the Bandung Institute of Technology, Bandung, Indonesia, where he is currently pursuing a PhD in Electrical Engineering and Informatics. His research interests include deep learning and computer vision. Since 2015, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.


Anggoro Ari Nurcahyo

He received a Bachelor’s degree in Informatics from Pasundan University and a Master’s degree at LIKMI, Bandung, Indonesia. His research interests include databases and programming. Since 2015, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.


Ayi Purbasari

She received her PhD in Electrical Engineering and Informatics from the School of Electrical Engineering and Informatics, Bandung Institute of Technology, Indonesia. Her research interests include artificial intelligence, artificial immune systems, optimization problems, machine learning, the Internet of Things, data science, and parallel computing. She has served as a reviewer at several IEEE conferences. Since 2004, she has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia, and currently heads the department.


Aria Bisma Wahyutama

He received his BE degree in Informatics Engineering from the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia, in 2020. He then received his MSE degree in Information and Communication Engineering from the Department of Information and Communication Engineering, Changwon National University, Changwon, Republic of Korea, in 2022 and is continuing his PhD studies at the same institution. His research interests include web and mobile programming, database design, digital game-based learning, Internet of Things applications, and related topics.


Mintae Hwang

He received his BS, MS, and PhD degrees in Computer Engineering from the Department of Computer Engineering, Pusan National University, Pusan, Republic of Korea in 1990, 1992, and 1996, respectively. From 1996 to 1999, he worked as a senior research member of the Protocol Engineering Center, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. Since 1999, he has been a professor at the Department of Information and Communication Engineering, Changwon National University, Changwon, Republic of Korea. His research interests include communication protocols, database design, Internet of Things applications, machine learning, and smart cities.



The trained model was integrated into a mobile application, providing seamless operation for detecting and translating Sundanese scripts using a smartphone.

6) Application Testing

The application was deployed with child participants to gather feedback on its usability and features.

B. System Design

This subsection presents the system architecture and workflow, showing the necessary components and activities along with the initial UI design.

1) System Architecture

Fig. 3 shows the system architecture and Fig. 4 illustrates the system workflow. As shown in Fig. 3, the system comprises four major components: mobile application, YOLO algorithm, Sundanese script, and Google Translate API.

Figure 3. System architecture.

Figure 4. Workflow of the system.

YOLOv8 was selected for the detection of Sundanese characters because it is faster and more efficient than conventional convolutional neural network (CNN) pipelines and competing detectors. YOLO employs an artificial neural network (ANN) to detect objects within a given image. The network divides the image into several regions, predicts the bounding boxes of objects, and calculates class probabilities within each region. The bounding boxes are then weighted by the predicted probabilities.
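The region-and-box mechanics described above can be illustrated with a short stdlib-only Python sketch; the grid size, function names, and box coordinates are illustrative assumptions, not the authors’ implementation.

```python
# Stdlib-only illustration (not the authors' code) of two core YOLO ideas:
# assigning a detection to the grid cell containing its center, and scoring
# box overlap with intersection-over-union (IoU). The grid size is assumed.

def center_cell(box, img_w, img_h, grid=7):
    """Grid cell (col, row) containing the center of an (x1, y1, x2, y2) box."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return int(cx * grid / img_w), int(cy * grid / img_h)

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

For example, a box centered at (400, 300) in a 608 × 800 image falls in cell (4, 2) of a 7 × 7 grid; predictions whose IoU with a ground-truth box is low would be discarded during post-processing.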

In the system workflow shown in Fig. 4, after opening the mobile application, the user points the smartphone camera at the Sundanese script to obtain an image stream, initiating frame-by-frame preprocessing. OpenCV captures and preprocesses each frame, and YOLOv8 then searches for Sundanese script within the preprocessed image. If words written in the script are detected, YOLOv8 returns the corresponding word classes in Latin letters. The Google Translate API then translates these word classes into Bahasa Indonesia. Finally, the user sees a bounding box around each Sundanese word, along with its word class in Latin letters and its translation in Bahasa Indonesia.
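The per-frame workflow can be sketched as follows. This is a hypothetical stand-in: detect() substitutes for YOLOv8 inference, and a small lookup table built from Table 1 substitutes for the Google Translate API call; all names and values are illustrative assumptions.

```python
# Hypothetical sketch of the per-frame pipeline in Fig. 4. detect() stands in
# for YOLOv8 inference, and SUNDA_TO_BAHASA stands in for the Google
# Translate API; names and values are illustrative assumptions.

SUNDA_TO_BAHASA = {  # subset of the word classes in Table 1
    "naon": "apa", "saha": "siapa", "kumaha": "bagaimana",
    "punteun": "maaf", "hatur nuhun": "terima kasih",
}

def detect(frame):
    """Stand-in for YOLOv8: returns (bounding_box, latin_class, confidence)."""
    return [((10, 10, 120, 60), "naon", 0.93)]

def process_frame(frame, conf_threshold=0.5):
    """Detect Sundanese-script words, then attach their Bahasa translations."""
    results = []
    for box, latin, conf in detect(frame):
        if conf < conf_threshold:
            continue  # drop low-confidence detections
        results.append({
            "box": box,
            "latin": latin,
            "bahasa": SUNDA_TO_BAHASA.get(latin, latin),
        })
    return results
```

The returned records carry everything the UI overlays on the camera view: the bounding box, the Latin word class, and the translated word.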

2) Gamification Concepts and UI

A foundational step for any student of a language is acquiring a strong grasp of vocabulary, which lays the groundwork for effective communication and comprehension. Recognizing the importance and challenges of this task, several research initiatives have focused on leveraging technology to facilitate and enhance the learning of vocabulary [6,7]. To this end, gamification has been employed as a strategy to enhance memorization [8,9]. The integration of game-like elements into learning activities has been shown to significantly improve information retention in learners. This approach is not new, as indicated by prior studies underscoring the effectiveness of gamification in educational settings. The underlying concept hinges on the natural human propensity for games and competitions, turning routine memorization tasks into engaging and enjoyable experiences. Without innovative methods such as gamification to capture their interests, it may be challenging to motivate young learners to use educational products.

We adopted gamification to transform the activity of capturing images and translating text into a series of enjoyable tasks. This method encompasses four main activities:

  • “Write it”: learners transcribe given words into Sundanese script.

  • “Detect it”: learners detect words written in Sundanese script using computer vision.

  • “Collect it”: learners gather and save their achievements, which serves as a tangible reward system.

  • “Place it”: learners use the correct Sundanese-script word in specific situations.
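As a rough illustration of the “Collect it” mechanic, the sketch below tracks a learner’s collected words and overall progress; the class, threshold, and method names are assumptions rather than the application’s actual code.

```python
# Illustrative sketch of the "Collect it" mechanic (class and method names
# are assumptions): each confidently detected word joins the learner's
# collection, which doubles as a tangible reward and progress tracker.

class WordCollection:
    def __init__(self):
        self.collected = set()

    def add(self, latin_class, confidence, threshold=0.8):
        """Collect a word only if the detector was sufficiently confident."""
        if confidence >= threshold:
            self.collected.add(latin_class)
            return True
        return False

    def progress(self, total_classes=37):
        """Fraction of the 37 word classes collected so far."""
        return len(self.collected) / total_classes

c = WordCollection()
c.add("naon", 0.93)  # collected
c.add("saha", 0.40)  # rejected: below the confidence threshold
```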

Fig. 5 illustrates the proposed gamification approach designed to foster a positive and stimulating learning environment that encourages continuous engagement and improvement.

Figure 5. Gamification concept of mobile application.

The UI design approach for mobile applications differs slightly from that used in [10]. Because our objective is to simplify the translation process from Sundanese to Indonesian, activities from the detection to the translation of Sundanese scripts must occur on the same page.

III. MODEL TRAINING AND EVALUATION

The following subsections discuss the data acquisition and model training processes, as well as a description of each word class used to develop the detection model.

A. Model Training

The training process begins with dataset collection, wherein relevant data are compiled to serve as the foundation for model training. Following collection, the dataset undergoes labeling, in which the data are annotated with accurate categorical tags to make them comprehensible to the training model. The dataset is then split into subsets, typically training, validation, and testing sets, to ensure a robust and unbiased training process. In the subsequent preprocessing stage, the data are cleaned and transformed into an appropriate format and quality for the training algorithm. During model training, the preprocessed data are fed into the model, enabling it to learn the patterns and relationships within the data. Finally, testing is conducted using the unseen portion of the dataset to evaluate the model’s performance and generalizability to new data, thereby ensuring its practical applicability. Fig. 6 illustrates the model training process.

Figure 6. Training process and steps.

Prior to model training, we compiled a sufficiently large dataset of words written in Sundanese script. Unlike in other studies conducted to detect words in Latin characters [11,12], finding appropriate data was challenging owing to the scarcity of resources related to the Sundanese script. Therefore, we created a personalized dataset from scratch by writing each word class individually. The overall dataset comprised 1,200 images of 608 × 800 pixels in JPG format. Additionally, we defined 37 classes corresponding to the most frequently used words as labels for the images. These classes are listed in Table 1.

Table 1. Word classes.

No | Sundanese | Bahasa | English
1 | Naon | Apa | What
2 | Kunaon | Kenapa | Why
3 | Saha | Siapa | Who
4 | Kumaha | Bagaimana | How
5 | Punteun | Maaf | Sorry
6 | Iraha | Kapan | When
7 | Kamana | Ke mana | Where
8 | Timana | Dari mana | From where
9 | Sabaraha | Berapa | How much
10 | Aya Naon | Ada apa | What happened
11 | Nuju Naon | Sedang apa | What are you doing
12 | Sareng Saha | Bersama siapa | With whom
13 | Hatur Nuhun | Terima kasih | Thanks
14 | Abdi | Saya | Me
15 | Maneh | Anda | You
16 | Wilujeung énjing | Selamat pagi | Good morning
17 | Wilujeung wengi | Selamat malam | Good night
18 | Mirah | Murah | Cheap
19 | Awis | Mahal | Expensive
20 | Kamari | Kemarin | Yesterday
21 | Artos | Uang | Money
22 | Dangukeun | Dengarkan | Listen
23 | Gampil | Mudah | Easy
24 | Geulis | Cantik | Beautiful
25 | Hayu | Ayo | Let’s go
26 | Indung | Ibu | Mother
27 | Isin | Malu | Shy
28 | Kaduhung | Menyesal | Regret
29 | Kasep | Tampan | Handsome
30 | Kumaha Damang | Apa kabar | How are you
31 | Mangga | Silakan | Please
32 | Moal | Tidak akan | Will not
33 | Nu leres | Yang benar | Correct
35 | Raos | Enak | Delicious
36 | Tunduh | Mengantuk | Sleepy
37 | Meuli | Membeli | Buy


The image representations of these word classes were collected in different styles, such as written on a sheet of paper or typed digitally, and captured at different brightness levels. Following collection, the images were labeled using LabelImg, a graphical image annotation tool. Subsequently, the labeled dataset was divided into training, validation, and testing subsets. Fig. 7 presents sample data from the dataset.

Figure 7. Sample data.
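The split described above can be sketched with the standard library; the file names and fixed seed are placeholders, while the 80/10/10 ratio over 1,200 images matches the setup reported below.

```python
# Minimal sketch (assumed, not the authors' script) of splitting the 1,200
# labeled images 80/10/10 into training, validation, and testing subsets.
# File names and the fixed seed are placeholders.
import random

def split_dataset(paths, seed=42):
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # reproducible shuffle
    n_train = int(len(paths) * 0.8)
    n_val = int(len(paths) * 0.1)
    return (paths[:n_train],                 # 80% training
            paths[n_train:n_train + n_val],  # 10% validation
            paths[n_train + n_val:])         # 10% testing

train, val, test = split_dataset(f"img_{i:04d}.jpg" for i in range(1200))
```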

For classification, we adopted the YOLOv8 object detection algorithm, an anchor-free model that implements the latest developments in YOLO-based algorithms [2]. YOLOv8 performs classification directly at the center of a given object instead of at an offset from a known anchor box. Anchor boxes are a notoriously tricky component of earlier YOLO algorithms, as they may represent the distribution of boxes in the target benchmark but not that of a custom dataset. We used YOLOv8n, which comprises 255 layers and 3,157,200 parameters. The model was trained over 100 epochs using an NVIDIA T4 GPU, and the dataset split was 80% for training, 10% for validation, and 10% for testing. The training results are shown in Fig. 8.

Figure 8. Performance metrics of model training.
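For reference, a training run matching the reported setup (YOLOv8n, 100 epochs) could be described with an Ultralytics-style dataset configuration along these lines; all paths and the truncated class list are assumptions, not the authors’ actual files.

```yaml
# data.yaml — assumed dataset configuration for Ultralytics YOLOv8.
# Paths are placeholders; the class list continues through all 37 labels.
path: sundanese_dataset
train: images/train   # 80% of the 1,200 images
val: images/val       # 10%
test: images/test     # 10%
names:
  0: naon
  1: kunaon
  2: saha
  # ... remaining word classes from Table 1

# Training command (Ultralytics CLI):
#   yolo detect train data=data.yaml model=yolov8n.pt epochs=100
```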

B. Model Evaluation

Following model training, we tested the model on a subset of unseen images to generate bounding boxes for the Sundanese script, produce the corresponding labels, and then translate the words into Indonesian using the Google Translate API. To verify that the model can run on a standard computer, testing was performed on a less powerful machine than that used for training. As in [13], tests were performed to determine the detection and translation accuracies.
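The detection-accuracy measurement can be sketched as a per-class comparison of predicted and ground-truth word classes; the helper below is an assumed illustration, not the authors’ evaluation code.

```python
# Assumed illustration of the accuracy measurement: the fraction of test
# samples of each word class whose predicted class matches the label.
# This is not the authors' evaluation code.
from collections import defaultdict

def per_class_accuracy(pairs):
    """pairs: iterable of (true_class, predicted_class) from the test set."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in pairs:
        total[truth] += 1
        correct[truth] += (pred == truth)
    return {cls: correct[cls] / total[cls] for cls in total}

acc = per_class_accuracy([
    ("naon", "naon"), ("naon", "naon"),    # both correct
    ("saha", "saha"), ("saha", "kumaha"),  # one misclassified
])
```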

Fig. 8 presents training results for the detection of words written on a single sheet of paper, with the model having achieved an accuracy exceeding 80% for almost all Sundanese word classes. To further evaluate model performance, we conducted an additional test using a smartphone camera. Here, the model accurately detected almost all Sundanese scripts with a confidence level of 90-100% as shown in Fig. 9.

Figure 9. Model testing using smartphone camera.

IV. MODEL INTEGRATION AND APPLICATION TESTING

This section discusses the proposed model’s integration into mobile applications, as well as a study with child learners.

A. Model Integration

We used a wide set of Android libraries to develop the mobile application, including native and Android graphics libraries. These libraries provide tools that handle standard graphic operations such as image resizing. Such tools are required to deliver images from drawing or camera inputs to the YOLOv8 classifier in a specific format, as the YOLO architecture accepts images in a three-channel RGB format at its first layer. Once the images were acquired, the trained model was loaded and the images were fed to it as input. The input images were transformed and analyzed, and the output category was returned to the mobile application. Fig. 10 depicts screenshots of the mobile application.

Figure 10. Mobile application UI screenshots.
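The resizing step mentioned above can be illustrated with a stdlib-only sketch that computes the scale and padding needed to letterbox a camera frame into a square detector input; the 640-pixel input size and all names are assumptions.

```python
# Stdlib-only sketch of letterbox resizing: scale the frame to fit a square
# detector input while preserving aspect ratio, then pad the remainder.
# The 640-pixel default and all names are assumptions for illustration.

def letterbox_params(src_w, src_h, dst=640):
    """Return (scale, (new_w, new_h), (pad_x, pad_y)) for a dst x dst input."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) // 2  # left/right padding
    pad_y = (dst - new_h) // 2  # top/bottom padding
    return scale, (new_w, new_h), (pad_x, pad_y)

# Example: a 608 x 800 dataset image scales by 0.8 to 486 x 640,
# leaving 77 pixels of horizontal padding on each side.
```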

B. Application Testing

The participants selected to test the proposed system’s effectiveness were elementary school students aged 7-12 years. The students participated in the following activities:

  • Write words in Sundanese script.

  • Try to obtain the right script by detecting their work.

  • Collect as many words as possible.

  • Play games.

The objective of this test was to determine whether the proposed gamification-based learning process improved the learners’ interest and knowledge. From a learning perspective, we tested the application with elementary school students and found that they could identify, memorize, and write 5-8 words in Sundanese script out of 10 randomly selected words. Notably, none of the children were familiar with the Sundanese script before the test was conducted. Furthermore, some children continued learning even more words in a mentor-guided session after the test period was over, demonstrating a retained interest in learning the Sundanese script. These results demonstrate that the proposed approach improves students’ interest in learning the script.

To comprehensively interpret the effectiveness of our approach, we compared our results with those of previous studies on Sundanese script recognition [4,14,15]. The results show that the YOLOv8 algorithm achieves an accuracy of 80%, which is competitive with other methods. The gamification element not only improved children’s interest and engagement but also enhanced their learning outcomes in recognizing and writing the Sundanese script. Overall, our approach demonstrates that combining advanced computer vision techniques with engaging learning methodologies can significantly improve the efficacy of educational tools for endangered languages.

V. CONCLUSIONS AND FUTURE STUDIES

Owing to the continuous and strong flow of modern culture, the Sundanese language and script have become endangered, with the Sundanese script being especially at risk of extinction. In an effort to reduce this risk, we developed a YOLOv8 object detection model that can identify Sundanese scripts, convert them to Latin scripts, and then translate them into Indonesian.

The model was trained using a dataset of 1,200 images with various characteristics and labeled using LabelImg. During our experiments, the model achieved a detection accuracy exceeding 80% for almost all Sundanese word classes. The trained model was integrated into a mobile application designed to teach children to read and write in the Sundanese script.

The gamification concept was embedded in the application to motivate the learning process in students. The application was tested on participants aged 7-12, with results showing that the children could identify, memorize, and write 5-8 words in Sundanese script from 10 randomly selected words. Furthermore, several children showed increased interest in continuing to learn the script even after the testing period.

Because our training dataset was relatively small, we intend to conduct further studies with larger datasets encompassing a greater variety of camera angles, lighting, writing styles, and other features. In the process, the model will be retrained to enhance its performance in detecting and identifying Sundanese scripts.

ACKNOWLEDGEMENTS

This paper was supported by the Changwon National University Research Fund in 2023.


References

  1. D. Reis, J. Kupec, J. Hong, and A. Daoudi, “Real-time flying object detection with YOLOv8,” arXiv preprint arXiv:2305.09972, May 2023. DOI: 10.48550/ARXIV.2305.09972.
  2. J. Terven, D. Cordova-Esparza, and J. Romero-Gonzalez, “A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS,” Machine Learning & Knowledge Extraction, Nov. 2023. DOI: 10.3390/make5040083.
  3. I. Ikhsan and D. I. Mulyana, “Optimizing the implementation of the YOLO algorithm and data augmentation in Hanacaraka Javanese script language classification,” JUSIKOM PRIMA, vol. 7, no. 1, pp. 8-16, Aug. 2023. DOI: 10.34012/jurnalsisteminformasidanilmukomputer.v7i1.4062.
  4. D. Arifadilah, “Sunda script detection using You Only Look Once algorithm,” Journal of Artificial Intelligence and Engineering Applications, vol. 3, no. 2, pp. 606-613, Feb. 2024. DOI: 10.59934/jaiea.v3i2.443.
  5. A. Prasetiadi, J. Saputra, I. Kresna, and I. Ramadhanti, “YOLOv5 and U-Net-based character detection for Nusantara script,” Jurnal Online Informatika, vol. 8, no. 2, pp. 232-241, Dec. 2023. DOI: 10.15575/join.v8i2.1180.
  6. F. Çakmak, E. Namaziandost, and T. Kumar, “CALL-enhanced L2 vocabulary learning: Using spaced exposure through CALL to enhance L2 vocabulary retention,” Education Research International, vol. 2021, pp. 1-8, Sep. 2021. DOI: 10.1155/2021/5848525.
  7. X. Yang, L.-J. Kuo, Z. R. Eslami, and S. M. Moody, “Theoretical trends of research on technology and L2 vocabulary learning: A systematic review,” Journal of Computers in Education, vol. 8, no. 4, pp. 465-483, May 2021. DOI: 10.1007/s40692-021-00187-8.
  8. K. Futami, D. Kawahigashi, and K. Murao, “Mindless memorization booster: A method to influence memorization power using attention induction phenomena caused by visual interface modulation and its application to memorization support for English vocabulary learning,” Electronics, vol. 11, no. 14, p. 2276, Jul. 2022. DOI: 10.3390/electronics11142276.
  9. W. Bancha and N. Tongtep, “Enhancing vocabulary memorization and retention through LMS and MultiEx game platforms among Thai tertiary students,” International Journal of Learning, Teaching and Educational Research, vol. 20, no. 10, pp. 17-192, Oct. 2021. DOI: 10.26803/ijlter.20.10.10.
  10. D. S. Saputra, D. A. Yonanda, and Y. Yuliati, “The development of android-based mobile learning in learning Sundanese script for elementary school students,” in Proceedings of the 3rd International Conference on Learning Innovation and Quality Education (ICLIQE 2019), Solo Baru, Indonesia, 2020. DOI: 10.2991/assehr.k.200129.086.
  11. X. Wang, S. Zheng, C. Zhang, R. Li, and L. Gui, “R-YOLO: A real-time text detector for natural scenes with arbitrary rotation,” Sensors, vol. 21, no. 3, p. 888, Jan. 2021. DOI: 10.3390/s21030888.
  12. R. Mondal, S. Malakar, E. H. Barney Smith, and R. Sarkar, “Handwritten English word recognition using a deep learning based object detection architecture,” Multimedia Tools and Applications, vol. 81, pp. 975-1000, Sep. 2021. DOI: 10.1007/s11042-021-11425-7.
  13. M. Safran, A. Alajmi, and S. Alfarhood, “Efficient multistage license plate detection and recognition using YOLOv8 and CNN for smart parking systems,” Journal of Sensors, vol. 2024, pp. 1-18, Feb. 2024. DOI: 10.1155/2024/4917097.
  14. M. A. Prameswari, M. Dwi Sulistiyo, and A. F. Ihsan, “Classification of handwritten Sundanese script via transfer learning on CNN-based architectures,” in 2023 3rd International Conference on Electronic and Electrical Engineering and Intelligent System (ICE3IS), Yogyakarta, Indonesia, pp. 401-406, 2023. DOI: 10.1109/ICE3IS59323.2023.10335382.
  15. H. Salsabila, E. Rachmawati, and F. Sthevanie, “Sundanese aksara recognition using histogram of oriented gradients,” in 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), Yogyakarta, Indonesia, pp. 253-258, 2019. DOI: 10.1109/ISRITI48646.2019.9034589.