Journal of information and communication convergence engineering 2024; 22(4): 336-343
Published online December 31, 2024
https://doi.org/10.56977/jicce.2024.22.4.336
© Korea Institute of Information and Communication Engineering
Correspondence to : Mintae Hwang (E-mail: mthwang@cwnu.ac.kr)
Department of Information and Communication Engineering, Changwon National University, Changwon 51140, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
In the digital era, teaching endangered local languages and scripts to children has become challenging owing to the scarcity of learning media and materials. The present study addresses this problem through the development of a mobile application that classifies and automatically converts Sundanese scripts into Latin using computer vision algorithms. The proposed method represents an innovative solution for capturing children's interest using gamification strategies. We discuss the development, implementation, and evaluation of YOLOv8, a deep learning technology for computer vision in mobile applications. A pilot study conducted on children aged 7-12 years revealed significant improvements in their interest and knowledge of Sundanese scripts, as the children were able to memorize, identify, and write 5-8 words in Sundanese characters out of 10 randomly selected words. Furthermore, the model achieved 80% accuracy for almost all Sundanese-scripted words, indicating satisfactory results. This study combines computer vision with gamification to facilitate the learning of Sundanese scripts, thereby paving the way for future innovation.
Keywords: Computer Vision, Mobile Application, Sundanese Scripts, Modern Indonesian Language, Gamification Strategy
The nation of Indonesia comprises over 17,000 islands spanning more than 1.9 million square kilometers. Across such a wide area, Indonesia is home to many distinctive ethnic groups and cultures. Among the most prominent of these are the Sundanese people, who populate the western area of Java Island. One legacy of the Sundanese people is a script, very different from Latin, that was widely used during Indonesia’s kingdom period. The Sundanese script played a prominent role in the region during ancient times and has endured over an extensive period. Thus, it encapsulates historical narratives, scientific knowledge, and wisdom from bygone times. The script includes characters such as ngalagena aksara for consonants, swara aksara for vowels, angka aksara for numbers, and rarangkèn for punctuation. Fig. 1 shows that the remnants of these historical legacies persist even today, having been carefully preserved.
However, as Latin letters have been adopted for writing the Sundanese language, the traditional Sundanese script has fallen out of everyday use, and a lack of public awareness hinders its preservation. This poses a challenge for future generations of potential learners and puts the script at risk of extinction. Although valuable documents written in Sundanese script still exist in historical places such as museums, few individuals can read them. Furthermore, although some elementary schools in West Java Province continue to offer courses in Sundanese script, these curricula face challenges stemming from a lack of appropriate facilities, classroom conditions, and optimized learning media. The declining use of the Sundanese script in daily life also makes its preservation increasingly difficult.
To address the aforementioned problems, we employed machine learning to detect Sundanese scripts and convert them to Latin-based Sundanese, which can then be translated into Indonesian. Moreover, we adopted gamification strategies to capture the interests of young learners, aiming to narrow the gap in the comprehension of Sundanese scripts. We used YOLOv8 object detection, a highly accurate single-shot detector algorithm [1,2], which was combined with the Google Translate API to translate Sundanese to Indonesian through a mobile application, as demonstrated experimentally.
The present study represents the first attempt to implement end-to-end Sundanese script word detection and translation to other languages in a mobile format, improving upon similar works conducted in [3], [4], and [5]. Furthermore, this study is the first to use gamification to teach the Sundanese script. The contributions of this study are as follows:
1. We propose the first end-to-end Sundanese script word detection and translation model designed for mobile devices, based upon computer vision technology.
2. We provide new learning media with a gamification approach for teaching ancient scripts to children.
The following subsections discuss the methodology and design of this study, including a top-down overview of the system architecture that illustrates all necessary components, system workflow, and initial user interface (UI) prior to implementation.
An overview of the system method used throughout this study is presented in Fig. 2.
A system architecture and workflow were designed to ensure that the modules and components necessary for the application can be efficiently built and effectively integrated.
A gamification process was designed to increase user engagement and enhance the educational utility of the Sundanese script detection and translation system. Furthermore, a well-designed UI can enhance user-friendliness, thereby improving the application’s usability.
The model was trained to accurately identify, recognize, and interpret the distinctive characteristics of each word written in the Sundanese script.
Appropriate metrics were utilized to evaluate model performance, assessing the success and effectiveness of the training process.
The trained model was integrated into a mobile application, providing seamless operation for detecting and translating Sundanese scripts using a smartphone.
The application was deployed with child participants to gather feedback on its usability and features.
This subsection presents the system architecture and workflow, showing the necessary components and activities along with the initial UI design.
Fig. 3 shows the system architecture and Fig. 4 illustrates the system workflow. As shown in Fig. 3, the system comprises four major components: mobile application, YOLO algorithm, Sundanese script, and Google Translate API.
YOLOv8 was selected for the detection of Sundanese characters because, as a single-stage detector, it is faster and more efficient than region-based convolutional neural network (CNN) detectors and comparable alternatives. YOLO applies a single neural network to the full image. The network divides the image into several regions and predicts bounding boxes and class probabilities for each region. These bounding boxes are then weighted by the predicted probabilities.
In the system workflow shown in Fig. 4, the user opens the mobile application and points the smartphone camera at the Sundanese script to obtain an image stream, which is preprocessed frame by frame. Subsequently, YOLOv8, using OpenCV to handle each frame, searches for the Sundanese script within the preprocessed image stream. If words written in the script are detected, YOLOv8 returns the corresponding word classes in Latin letters. The Google Translate API then translates these word classes into Bahasa Indonesia. Finally, the user sees a bounding box around each detected Sundanese word, along with its word class in Latin letters and its translation in Bahasa Indonesia.
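The per-frame flow described above (detect, map to a Latin word class, translate, display) can be sketched as follows. This is a minimal illustration, not the application's actual code: `Detection`, `annotate_frame`, and `LATIN_TO_INDONESIAN` are hypothetical names, the detector output is mocked, the 0.5 confidence cutoff is an assumed value, and the small dictionary (filled with a few Table 1 entries) merely stands in for the Google Translate API call.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical stand-in for the Google Translate API call (entries from Table 1).
LATIN_TO_INDONESIAN: Dict[str, str] = {
    "Naon": "Apa",
    "Saha": "Siapa",
    "Hatur Nuhun": "Terima kasih",
}


@dataclass
class Detection:
    """One detected word: Latin class label, confidence, and pixel box."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def translate(label: str) -> str:
    """Look up the Indonesian word; return the label unchanged if unknown."""
    return LATIN_TO_INDONESIAN.get(label, label)


def annotate_frame(detections: List[Detection],
                   min_confidence: float = 0.5) -> List[dict]:
    """Keep confident detections and attach a translation to each one."""
    results = []
    for det in detections:
        if det.confidence < min_confidence:  # assumed cutoff, not from the paper
            continue
        results.append({
            "box": det.box,
            "latin": det.label,
            "indonesian": translate(det.label),
        })
    return results


# Mocked detector output for a single camera frame.
frame_detections = [
    Detection("Naon", 0.93, (40, 60, 220, 140)),
    Detection("Saha", 0.31, (50, 200, 210, 280)),  # below the cutoff, dropped
]
annotated = annotate_frame(frame_detections)
```

Each entry in `annotated` carries the three items the user sees on screen: the bounding box, the Latin word class, and the Indonesian translation.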
A foundational step for any student of a language is acquiring a strong grasp of vocabulary, which lays the groundwork for effective communication and comprehension. Recognizing the importance and challenges of this task, several research initiatives have focused on leveraging technology to facilitate and enhance the learning of vocabulary [6,7]. To this end, gamification has been employed as a strategy to enhance memorization [8,9]. The integration of game-like elements into learning activities has been shown to significantly improve information retention in learners. This approach is not new, as indicated by prior studies underscoring the effectiveness of gamification in educational settings. The underlying concept hinges on the natural human propensity for games and competitions, turning routine memorization tasks into engaging and enjoyable experiences. Without innovative methods such as gamification to capture their interests, it may be challenging to motivate young learners to use educational products.
We adopted gamification to transform the activity of capturing images and translating text into a series of enjoyable tasks. This method encompasses four main activities:
“Write it,” in which learners write the requested words in Sundanese script.
“Detect it,” in which computer vision detects the words written in Sundanese script.
“Collect it,” in which learners gather and save their achievements, which serves as a tangible reward system.
“Place it,” in which learners use the right Sundanese-script word in a specific situation.
Fig. 5 illustrates the proposed gamification approach designed to foster a positive and stimulating learning environment that encourages continuous engagement and improvement.
The UI design approach for mobile applications differs slightly from that used in [10]. Because our objective is to simplify the translation process from Sundanese to Indonesian, activities from the detection to the translation of Sundanese scripts must occur on the same page.
The following subsections discuss the data acquisition and model training processes, as well as a description of each word class used to develop the detection model.
The training process begins with dataset collection, wherein relevant data are compiled to serve as the foundation for model training. Following collection, the dataset undergoes labeling, where the data are annotated with accurate categorical tags to make them comprehensible to the training model. The dataset is split into subsets (typically training, validation, and testing sets) to ensure a robust and unbiased training process. In the subsequent preprocessing stage, the data are cleaned and transformed to an appropriate format and quality for the training algorithm. During model training, the preprocessed data are fed into the model, enabling it to learn from the patterns and relationships within the data. Finally, testing is conducted using the unseen portion of the dataset to evaluate the model’s performance and generalizability to new data, thereby ensuring its practical applicability. Fig. 6 illustrates the model training process.
Prior to model training, we compiled a dataset of words written in Sundanese script. Unlike in other studies conducted to detect words with Latin characters [11,12], finding appropriate data was challenging owing to a scarcity of resources related to the Sundanese script. Therefore, we created a custom dataset from scratch by writing each word class individually. The overall dataset encompassed 1,200 images of 608 × 800 pixels in JPG format. Additionally, we selected 37 classes corresponding to the most-used words as labels for the images. These classes are listed in Table 1.
Table 1. Word classes
No | Sundanese | Bahasa | English |
---|---|---|---|
1 | Naon | Apa | What |
2 | Kunaon | Kenapa | Why |
3 | Saha | Siapa | Who |
4 | Kumaha | Bagaimana | How |
5 | Punteun | Maaf | Sorry |
6 | Iraha | Kapan | When |
7 | Kamana | Ke mana | Where |
8 | Timana | Dari mana | From Where |
9 | Sabaraha | Berapa | How Much |
10 | Aya Naon | Ada apa | What happened |
11 | Nuju Naon | Sedang apa | What are you doing |
12 | Sareng Saha | Bersama siapa | With Whom |
13 | Hatur Nuhun | Terima kasih | Thanks |
14 | Abdi | Saya | Me |
15 | Maneh | Anda | You |
16 | Wilujeung énjing | Selamat pagi | Good morning |
17 | Wilujeung wengi | Selamat malam | Good night |
18 | Mirah | Murah | Cheap |
19 | Awis | Mahal | Expensive |
20 | Kamari | Kemarin | Yesterday |
21 | Artos | Uang | Money |
22 | Dangukeun | Dengarkan | Listen |
23 | Gampil | Mudah | Easy |
24 | Geulis | Cantik | Beautiful |
25 | Hayu | Ayo | Let’s go |
26 | Indung | Ibu | Mother |
27 | Isin | Malu | Shy |
28 | Kaduhung | Menyesal | Regret |
29 | Kasep | Tampan | Handsome |
30 | Kumaha Damang | Apa kabar | How are you |
31 | Mangga | Silakan | Please |
32 | Moal | Tidak akan | Will not |
33 | Nu leres | Yang benar | Correct |
35 | Raos | Enak | Delicious |
36 | Tunduh | Mengantuk | Sleepy |
37 | Meuli | Membeli | Buy |
The image representations of these word classes were collected in different styles, such as written on a sheet of paper or typed digitally, and captured at different brightness levels. Following collection, the images were labeled using LabelImg, a graphical image annotation tool. Subsequently, the labeled dataset was divided into training, validation, and testing subsets. Fig. 7 presents sample data used in the set.
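To make the labeling step concrete, the sketch below converts one pixel-space bounding box into a YOLO-style text annotation line (`class x_center y_center width height`, all normalized to the image size). The source does not state which export format was used in LabelImg, so the YOLO text format is an assumption here; the function name `to_yolo_line`, the example box, and the zero-based class index are hypothetical. The 608 × 800 image size comes from the dataset description.

```python
from typing import Tuple


def to_yolo_line(class_id: int,
                 box: Tuple[int, int, int, int],
                 image_width: int = 608,
                 image_height: int = 800) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # YOLO labels store the box center and size as fractions of the image.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box around one handwritten word; class 0 is assumed to be "Naon".
line = to_yolo_line(0, (152, 200, 456, 400))
print(line)  # 0 0.500000 0.375000 0.500000 0.250000
```

Normalized coordinates keep the labels valid when images are resized before training.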
For detection, we adopted the YOLOv8 object detection algorithm, an anchor-free model that implements the latest developments in YOLO-based algorithms [2]. YOLOv8 predicts the center of a given object directly, instead of the offset from a known anchor box. Anchor boxes are notoriously tricky components of earlier YOLO algorithms, as they may represent the distribution of boxes of the target benchmark but not that of the custom dataset. We used YOLOv8n, which comprises 255 layers and 3,157,200 parameters. The model was trained over 100 epochs using an NVIDIA T4 GPU, and the dataset split was 80% for training, 10% for validation, and 10% for testing. The training results are shown in Fig. 8.
Following model training, we tested the model with a subset of unseen images to generate bounding boxes for Sundanese script, produce corresponding labels, and then translate the words into Indonesian using the Google Translate API. To ensure that the model can run on a standard computer, the test was performed using a less powerful computer than that used for training. As in [13], tests were performed to determine the detection and translation accuracies.
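As a worked example of the 80/10/10 split, the sketch below shuffles 1,200 file names and partitions them into 960 training, 120 validation, and 120 testing items. The file names, the fixed random seed, and the function name `split_dataset` are illustrative assumptions; the source does not describe how the split was implemented.

```python
import random
from typing import List, Tuple


def split_dataset(items: List[str],
                  train_pct: int = 80,
                  val_pct: int = 10,
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items and split them into train/validation/test lists."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    # Integer percentages avoid floating-point rounding in the subset sizes.
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is the test set
    return train, val, test


# Hypothetical file names standing in for the 1,200 dataset images.
image_names = [f"img_{i:04d}.jpg" for i in range(1200)]
train_set, val_set, test_set = split_dataset(image_names)
print(len(train_set), len(val_set), len(test_set))  # 960 120 120
```

Shuffling before slicing matters when images of the same word class were collected consecutively, since an unshuffled split could leave some classes out of the validation or testing subsets.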
Fig. 8 presents training results for the detection of words written on a single sheet of paper, with the model having achieved an accuracy exceeding 80% for almost all Sundanese word classes. To further evaluate model performance, we conducted an additional test using a smartphone camera. Here, the model accurately detected almost all Sundanese scripts with a confidence level of 90-100% as shown in Fig. 9.
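One way to check a statement such as "above 80% for almost all word classes" is to compute accuracy per class and list the classes that fall below the threshold. The sketch below does this for toy data; the label pairs are invented for illustration and are not the study's results, the paper does not define its accuracy metric in this section, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def per_class_accuracy(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of correct predictions for each true class."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for true_label, predicted_label in pairs:
        totals[true_label] += 1
        if true_label == predicted_label:
            correct[true_label] += 1
    return {label: correct[label] / totals[label] for label in totals}


def classes_below(accuracy: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """List the classes whose accuracy is under the threshold."""
    return sorted(label for label, value in accuracy.items() if value < threshold)


# Toy (true, predicted) pairs; made up for illustration only.
results = (
    [("Naon", "Naon")] * 9 + [("Naon", "Kunaon")]
    + [("Saha", "Saha")] * 3 + [("Saha", "Naon")] * 2
)
accuracy = per_class_accuracy(results)
weak_classes = classes_below(accuracy)
```

With this toy data, "Naon" scores 9 of 10 and "Saha" scores 3 of 5, so only "Saha" is reported as falling below the 80% threshold.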
This section discusses the proposed model’s integration into mobile applications, as well as a study with child learners.
We used a wide set of Android libraries to develop the mobile application, including a native library and Android graphics. These libraries provide tools that handle standard graphic operations such as picture resizing. Such tools are required to deliver images from drawing or camera inputs in a specific format to the YOLOv8 model, as the YOLO architecture accepts images in a three-channel RGB format for the first layer. Once the images were acquired, the trained model was loaded and fed the images as input. These input images were transformed and analyzed, and the output category was returned to the mobile application. Fig. 10 depicts a screenshot of the mobile application.
The participants selected to test the proposed system’s effectiveness were elementary school students aged 7-12 years. The students participated in the following activities:
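The resizing step can be illustrated with the geometry of a letterbox resize, a common way to fit an image into a square model input without distorting it. The sketch below only computes the scaled size and padding; it is an assumption that the application uses letterboxing and a 640 × 640 input (the source mentions only "picture resizing"), and `letterbox_geometry` is a hypothetical helper name.

```python
from typing import Tuple


def letterbox_geometry(width: int, height: int,
                       target: int = 640) -> Tuple[int, int, int, int]:
    """Return (new_width, new_height, pad_x, pad_y) for a square letterbox."""
    # Scale by the tighter dimension so the whole image fits inside the square.
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    # Padding added on each side to center the scaled image in the square.
    pad_x = (target - new_width) // 2
    pad_y = (target - new_height) // 2
    return new_width, new_height, pad_x, pad_y


# A 608 x 800 image, as in the dataset, fitted into an assumed 640 x 640 input.
geometry = letterbox_geometry(608, 800)
print(geometry)  # (486, 640, 77, 0)
```

The same scale and padding values are needed later to map predicted boxes back onto the original camera frame.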
Write words in Sundanese script.
Try to obtain the right script by detecting their work.
Collect as many words as possible.
Play games.
The objective of this test was to determine whether the proposed gamification-based learning process improved the learners’ interest and knowledge. From a learning perspective, we tested the application with elementary school students and found that they could identify, memorize, and write 5-8 words in Sundanese script from 10 randomly selected words. Notably, none of the children were familiar with the Sundanese script before the test was conducted. Furthermore, some children continued learning even more words in a mentor-guided session after the test period was over, demonstrating a retained interest in learning the Sundanese script. These results suggest that the proposed approach improves students’ interest in learning the script.
To comprehensively interpret the effectiveness of our approach, we compared our results with those of previous studies on Sundanese script recognition [4,14,15]. The results show that the YOLOv8 algorithm achieves an accuracy of 80%, a competitive result compared to those of other methods. The gamification element not only contributes toward improving children’s interest and engagement, but also enhances their learning outcomes in recognizing and writing Sundanese scripts. Overall, our approach demonstrates that combining advanced computer vision techniques with engaging learning methodologies can significantly improve the efficacy of educational tools for endangered languages.
Owing to the continuous and strong flow of modern culture, the Sundanese language and script have become endangered, with the Sundanese script being especially at risk of extinction. In an effort to reduce this risk, we developed a YOLOv8 object detection model that can identify Sundanese scripts, convert them to Latin scripts, and then translate them into Indonesian.
The model was trained using a dataset of 1,200 images with various characteristics and labeled using LabelImg. During our experiments, the model achieved a detection accuracy exceeding 80% for almost all Sundanese word classes. The trained model was integrated into a mobile application designed to teach children to read and write in the Sundanese script.
The gamification concept was embedded in the application to motivate the learning process in students. The application was tested on participants aged 7-12, with results showing that the children could identify, memorize, and write 5-8 words in Sundanese script from 10 randomly selected words. Furthermore, several children showed increased interest in continuing to learn the script even after the testing period.
Because our training dataset was relatively small, we intend to conduct further studies with larger datasets encompassing a greater variety of camera angles, lighting, writing styles, and other features. In the process, the model will be retrained to enhance its performance in detecting and identifying Sundanese scripts.
This paper was supported by the Changwon National University Research Fund in 2023.
Wanda Gusdya
He received a Bachelor’s degree in Informatics Engineering from Pasundan University and a Master’s degree in Electrical Engineering from the Bandung Institute of Technology, Bandung, Indonesia. His research interests include deep learning and mobile applications. Since 2012, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.
Handoko Supeno
He received a Bachelor’s degree in Informatics Engineering from Pasundan University and a Master’s degree in Electrical Engineering from the Bandung Institute of Technology, Bandung, Indonesia, where he is currently pursuing a PhD in Electrical Engineering and Informatics. His research interests include deep learning and computer vision. Since 2015, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.
Anggoro Ari Nurcahyo
He received a Bachelor’s degree in Informatics from Pasundan University and a Master’s degree at LIKMI, Bandung, Indonesia. His research interests include databases and programming. Since 2015, he has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia.
Ayi Purbasari
She received her PhD in Electrical Engineering and Informatics from the School of Electrical Engineering and Informatics, Bandung Institute of Technology, Indonesia. Her current research focus encompasses artificial intelligence, artificial immune systems, optimization problems, and parallel computing. She has served as a reviewer at several IEEE conferences. Her research interests include machine learning, the Internet of Things, and data science. Since 2004, she has been a lecturer at the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia, where she is currently the head of the department.
Aria Bisma Wahyutama
He received his BE degree in Informatics Engineering from the Department of Informatics Engineering, Pasundan University, Bandung, Indonesia, in 2020. He then received his MSE degree in Information and Communication Engineering from the Department of Information and Communication Engineering, Changwon National University, Changwon, Republic of Korea, in 2022 and is continuing his PhD studies at the same institution. His research interests include web and mobile programming, database design, digital game-based learning, Internet of Things applications, and related topics.
Mintae Hwang
He received his BS, MS, and PhD degrees in Computer Engineering from the Department of Computer Engineering, Pusan National University, Pusan, Republic of Korea in 1990, 1992, and 1996, respectively. From 1996 to 1999, he worked as a senior research member of the Protocol Engineering Center, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. Since 1999, he has been a professor at the Department of Information and Communication Engineering, Changwon National University, Changwon, Republic of Korea. His research interests include communication protocols, database design, Internet of Things applications, machine learning, and smart cities.
Wanda Gusdya Purnama1, Handoko Supeno1, Anggoro Ari Nurcahyo1, Ayi Purbasari1, Aria Bisma Wahyutama2, and Mintae Hwang2*
1Department of Informatics Engineering, Pasundan University, Bandung, 40153, Indonesia
2Department of Information and Communication Engineering, Changwon National University, Changwon, 51140, Republic of Korea
Correspondence to:Mintae Hwang (E-mail: mthwang@cwnu.ac.kr)
Department of Information and Communication Engineering, Changwon National University, Changwon 51140, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
In the digital era, teaching endangered local languages and scripts to children has become challenging owing to the scarcity of learning media and materials. The present study addresses this problem through the development of a mobile application that classifies and automatically converts Sundanese scripts into Latin using computer vision algorithms. The proposed method represents an innovative solution for capturing children's interest using gamification strategies. We discuss the development, implementation, and evaluation of YOLOv8, a deep learning technology for computer vision in mobile applications. A pilot study conducted on children aged 7-12 years revealed significant improvements in their interest and knowledge of Sundanese scripts, as the children were able to memorize, identify, and write 5-8 words in Sundanese characters out of 10 randomly selected words. Furthermore, the model achieved 80% accuracy for almost all Sundanese-scripted words, indicating satisfactory results. This study combines computer vision with gamification to facilitate the learning of Sundanese scripts, thereby paving the way for future innovation.
Keywords: Computer Vision, Mobile Application, Sundanese Scripts, Modern Indonesian Language, Gamification Strategy
The nation of Indonesia comprises over 17,000 islands spanning more than 1.9 million square kilometers. With such a wide area, Indonesia inhabits some of the most exotic tribes and cultures in the world. Among of the most popular of these tribes is the Sunda tribe, populating the western area of Java Island. One legacy of the Sundanese people is a language script that was widely used during Indonesia’s kingdom period, which is very different from Latin. The Sundanese script played a prominent role in the region during ancient times, and has endured over an extensive period. Thus, it encapsulates historical narratives, scientific knowledge, and wisdom from bygone times. The script includes characters such as ngalagena aksara for consonants, swara aksara for vocals, angka aksara for numbers, and rarangkèn for punctuation. Fig. 1 shows that the remnants of these historical legacies persist even today, having been carefully preserved.
However, as Latin letters have been adopted in the Sundanese language, traditional Sundanese scripts are no longer used due to a lack of public awareness, hindering their preservation. This poses a challenge for future generations of potential learners, incurring a risk of extinction for the language. Although valuable documents written in Sundanese script still exist in historical places such as museums, few individuals can read them. Furthermore, although some elementary schools in West Java Province continue to offer courses in Sundanese script, these curricula face challenges stemming from the lack of appropriate facilities, classroom conditions, and optimized media learning. The declining use of Sundanese scripts in daily life also makes the preservation of these scripts increasingly difficult.
To address the aforementioned problems, we employed machine learning to detect Sundanese scripts and convert them to Latin-based Sundanese, which can then be translated into Indonesian. Moreover, we adopted gamification strategies to capture the interests of young learners, aiming to narrow the gap in the comprehension of Sundanese scripts. We used YOLOv8 object detection, a highly accurate single-shot detector algorithm [1,2], which was combined with the Google Translate API to translate Sundanese to Indonesian through a mobile application, as demonstrated experimentally.
The present study represents the first attempt to implement end-to-end Sundanese script word detection and translation to other languages in a mobile format, improving upon similar works conducted in [3], [4], and [5]. Furthermore, this study is the first to use gamification to teach the Sundanese script. The contributions of this study are as follows:
1. We propose the first end-to-end Sundanese script word detection and translation model designed for mobile devices, based upon computer vision technology.
2. We provide new learning media with a gamification approach for teaching ancient scripts to children.
The following subsections discuss the methodology and design of this study, including a top-down overview of the system architecture that illustrates all necessary components, system workflow, and initial user interface (UI) prior to implementation.
An overview of the system method used throughout this study is presented in Fig. 2.
A system architecture and workflow were designed to ensure that the modules and components necessary for the application can be efficiently built and effectively integrated.
A gamification process was designed to increase user engagement and enhance the educational utility of the Sundanese script detection and translation system. Furthermore, a well-designed UI can enhance user-friendliness, thereby improving the application’s usability.
The model was trained to accurately identify, recognize, and interpret the distinctive characteristics of each word written in the Sundanese script.
Appropriate metrics were utilized to evaluate model performance, assessing the success and effectiveness of the training process.
The trained model was integrated into a mobile application, providing seamless operation for detecting and translating Sundanese scripts using a smartphone.
The application was deployed with child participants to gather feedback on its usability and features.
This subsection presents the system architecture and workflow, showing the necessary components and activities along with the initial UI design.
Fig. 3 shows the system architecture and Fig. 4 illustrates the system workflow. As shown in Fig. 3, the system comprises four major components: mobile application, YOLO algorithm, Sundanese script, and Google Translate API.
YOLOv8 was selected for the detection of Sundanese characters because it is faster and more efficient than convolutional neural networks (CNNs) and their competitors. YOLO employs an artificial neural network (ANN) to detect objects within a given image. The network divides the image into several regions, predicts the bounding boxes of objects, and calculates the probabilities within each region. These bounding boxes are then compared with the predicted probabilities.
In the system workflow shown in Fig. 4, after opening the mobile application, the user must use a smartphone camera to obtain an image stream of the Sundanese script, initiating the frame-by-frame preprocessing of the stream. Subsequently, YOLOv8 uses OpenCV to detect the frame and starts looking for the Sundanese script within the preprocessed image stream. If words written in the script are detected, YOLOv8 obtains the corresponding word classes in Latin. The Google Translate API then translates these word classes into Bahasa Indonesia. Finally, users can see the Sundanese script bounding box for each word, along with the word class in Latin letters and translated word in Bahasa.
A foundational step for any student of a language is acquiring a strong grasp of vocabulary, which lays the groundwork for effective communication and comprehension. Recognizing the importance and challenges of this task, several research initiatives have focused on leveraging technology to facilitate and enhance the learning of vocabulary [6,7]. To this end, gamification has been employed as a strategy to enhance memorization [8,9]. The integration of game-like elements into learning activities has been shown to significantly improve information retention in learners. This approach is not new, as indicated by prior studies underscoring the effectiveness of gamification in educational settings. The underlying concept hinges on the natural human propensity for games and competitions, turning routine memorization tasks into engaging and enjoyable experiences. Without innovative methods such as gamification to capture their interests, it may be challenging to motivate young learners to use educational products.
We adopted gamification to transform the activity of capturing images and translating text into a series of enjoyable tasks. This method encompasses four main activities:
“Write it,” where learners transcribe given words into Sundanese script as requested.
“Detect it” involves the detection of words in Sundanese script using computer vision.
“Collect it” allows learners to gather and save their achievements by acting as a tangible reward system.
“Place it” includes activities to use the right word in the Sundanese script in specific situations.
Fig. 5 illustrates the proposed gamification approach designed to foster a positive and stimulating learning environment that encourages continuous engagement and improvement.
The UI design approach for the mobile application differs slightly from that used in [10]. Because our objective is to simplify the translation process from Sundanese to Indonesian, all activities from the detection to the translation of Sundanese script must occur on the same page.
The following subsections discuss the data acquisition and model training processes, as well as a description of each word class used to develop the detection model.
The training process begins with dataset collection, wherein relevant data are compiled to serve as the foundation for model training. Following collection, the dataset undergoes labeling, where the data are annotated with accurate categorical tags to make them comprehensible to the training model. The dataset is split into subsets (typically training, validation, and testing sets) to ensure a robust and unbiased training process. In the subsequent preprocessing stage, the data are cleaned and transformed to an appropriate format and quality for the training algorithm. During model training, the preprocessed data are fed into the model, enabling it to learn from the patterns and relationships within the data. Finally, testing is conducted using the unseen portion of the dataset to evaluate the model’s performance and generalizability to new data, thereby ensuring its practical applicability. Fig. 6 illustrates the model training process.
Prior to model training, we compiled a sufficiently large dataset of words written in Sundanese script. Unlike in other studies conducted to detect words with Latin characters [11,12], finding appropriate data was challenging owing to a scarcity of resources related to the Sundanese script. Therefore, we created a custom dataset from scratch by writing each word class individually. The overall dataset encompassed 1,200 images of 608 × 800 pixels in JPG format. Additionally, we defined 37 classes corresponding to the most-used words as labels for the images. These classes are listed in Table 1.
Table 1. Word classes.
No | Sundanese | Bahasa | English |
---|---|---|---|
1 | Naon | Apa | What |
2 | Kunaon | Kenapa | Why |
3 | Saha | Siapa | Who |
4 | Kumaha | Bagaimana | How |
5 | Punteun | Maaf | Sorry |
6 | Iraha | Kapan | When |
7 | Kamana | Ke mana | Where |
8 | Timana | Dari mana | From Where |
9 | Sabaraha | Berapa | How Much |
10 | Aya Naon | Ada apa | What happened |
11 | Nuju Naon | Sedang apa | What are you doing |
12 | Sareng Saha | Bersama siapa | With Whom |
13 | Hatur Nuhun | Terima kasih | Thanks |
14 | Abdi | Saya | Me |
15 | Maneh | Anda | You |
16 | Wilujeung énjing | Selamat pagi | Good morning |
17 | Wilujeung wengi | Selamat malam | Good night |
18 | Mirah | Murah | Cheap |
19 | Awis | Mahal | Expensive |
20 | Kamari | Kemarin | Yesterday |
21 | Artos | Uang | Money |
22 | Dangukeun | Dengarkan | Listen |
23 | Gampil | Mudah | Easy |
24 | Geulis | Cantik | Beautiful |
25 | Hayu | Ayo | Let’s go |
26 | Indung | Ibu | Mother |
27 | Isin | Malu | Shy |
28 | Kaduhung | Menyesal | Regret |
29 | Kasep | Tampan | Handsome |
30 | Kumaha Damang | Apa kabar | How are you |
31 | Mangga | Silakan | Please |
32 | Moal | Tidak akan | Will not |
33 | Nu leres | Yang benar | Correct |
35 | Raos | Enak | Delicious |
36 | Tunduh | Mengantuk | Sleepy |
37 | Meuli | Membeli | Buy |
The image representations of these word classes were collected in different styles, such as written on a sheet of paper or typed digitally, and captured at different brightness levels. Following collection, the images were labeled using LabelImg, a graphical image annotation tool. Subsequently, the labeled dataset was divided into training, validation, and testing subsets. Fig. 7 presents sample data used in the set.
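To illustrate what a labeled sample looks like, the sketch below converts one annotation line into a pixel-space bounding box. It assumes that the annotations were exported from LabelImg in YOLO text format, where each line holds a class index followed by the box center, width, and height normalized to the image size; the paper does not state which export format was used. It also assumes that the 608 × 800 image size is given as width × height. The function name `yolo_line_to_pixel_box` is hypothetical.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixel_box(line: str, img_w: int = 608, img_h: int = 800) -> PixelBox:
    """Convert 'class x_center y_center width height' (normalized) to pixels.

    Assumption: img_w and img_h default to the 608 x 800 dataset images,
    read as width x height.
    """
    class_id, x_center, y_center, width, height = line.split()
    # Scale the normalized center and size back to pixel units.
    xc = float(x_center) * img_w
    yc = float(y_center) * img_h
    w = float(width) * img_w
    h = float(height) * img_h
    # Turn center/size into corner coordinates.
    return PixelBox(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# A made-up label line: class 0, centered, a quarter of the image in each dimension.
print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.25"))
```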
For classification, we adopted the YOLOv8 object detection algorithm, an anchor-free model that implements the latest developments in YOLO-based algorithms [2]. YOLOv8 predicts the center of a given object directly, instead of the offset from a known anchor box. Anchor boxes are notoriously tricky components of earlier YOLO algorithms, as they may represent the distribution of boxes of the target benchmark but not that of the custom dataset. We used YOLOv8n, which comprises 255 layers and 3,157,200 parameters. The model was trained for 100 epochs on an NVIDIA T4 GPU, and the dataset was split into 80% for training, 10% for validation, and 10% for testing. The training results are shown in Fig. 8.
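As a worked example of the 80/10/10 split, the following sketch partitions 1,200 items into 960 training, 120 validation, and 120 testing items. The shuffling, the fixed seed, the file names, and the `split_dataset` function are assumptions made for illustration; the paper does not describe how images were assigned to each subset.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str], seed: int = 0
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items and split them 80/10/10 into train/val/test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(shuffled) * 80 // 100
    n_val = len(shuffled) * 10 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to testing
    return train, val, test


# Hypothetical file names standing in for the 1,200 dataset images.
image_names = [f"img_{i:04d}.jpg" for i in range(1200)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 960 120 120
```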
Following model training, we tested the model with a subset of unseen images to generate bounding boxes for Sundanese script, produce corresponding labels, and then translate the words into Indonesian using the Google Translate API. To ensure that the model can run on a standard computer, the test was performed using a less powerful computer than that used for training. As in [13], tests were performed to determine the detection and translation accuracies.
Fig. 8 presents training results for the detection of words written on a single sheet of paper, with the model having achieved an accuracy exceeding 80% for almost all Sundanese word classes. To further evaluate model performance, we conducted an additional test using a smartphone camera. Here, the model accurately detected almost all Sundanese scripts with a confidence level of 90-100% as shown in Fig. 9.
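One way to check a claim of the form "accuracy exceeding 80% for almost all word classes" is to compute accuracy per class and list the classes that fall below the threshold. The sketch below does this for pairs of true and predicted labels. It is only one possible interpretation, since the paper does not specify how accuracy was computed, and the sample pairs are invented toy data rather than results from the study. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def per_class_accuracy(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of correct predictions for each true label."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for true_label, predicted_label in pairs:
        total[true_label] += 1
        if predicted_label == true_label:
            correct[true_label] += 1
    return {label: correct[label] / total[label] for label in total}


def classes_below(accuracy: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """List the classes whose accuracy is under the threshold, sorted by name."""
    return sorted(label for label, acc in accuracy.items() if acc < threshold)


# Toy (true, predicted) pairs, invented for illustration only.
results = [("Naon", "Naon"), ("Naon", "Naon"), ("Saha", "Saha"), ("Saha", "Naon")]
accuracy = per_class_accuracy(results)
print(accuracy)                 # {'Naon': 1.0, 'Saha': 0.5}
print(classes_below(accuracy))  # ['Saha']
```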
This section discusses the integration of the proposed model into a mobile application, as well as a study with child learners.
We used a wide set of Android libraries to develop the mobile application, including a native library and Android graphics. These libraries provide tools that handle standard graphic operations such as picture resizing. Such tools are required to deliver images from drawing or camera inputs in a specific format to the YOLOv8 model, as the YOLO architecture accepts images in a three-channel RGB format at the first layer. Once the images were acquired, the trained model was loaded and the images were fed to it as input. The input images were transformed and analyzed, and the output category was returned to the mobile application. Fig. 10 shows a screenshot of the mobile application.
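The conversion to a three-channel RGB input can be illustrated as follows. Android bitmaps in the common ARGB_8888 configuration expose each pixel as a packed 32-bit color integer, so the alpha channel must be dropped and the remaining channels separated. The sketch is written in Python for readability and is not the application's code; the function name `argb_to_rgb_planes`, the channel-first output layout, and the scaling to the range 0 to 1 are assumptions.

```python
from typing import List, Sequence


def argb_to_rgb_planes(pixels: Sequence[int]) -> List[List[float]]:
    """Split packed ARGB pixels into [red, green, blue] planes scaled to 0..1.

    The alpha byte is discarded. Masking with 0xFF also handles the negative
    values that arise when the packed color is stored as a signed 32-bit int.
    """
    red: List[float] = []
    green: List[float] = []
    blue: List[float] = []
    for pixel in pixels:
        red.append(((pixel >> 16) & 0xFF) / 255.0)
        green.append(((pixel >> 8) & 0xFF) / 255.0)
        blue.append((pixel & 0xFF) / 255.0)
    return [red, green, blue]


# Two pixels: opaque red and opaque blue.
planes = argb_to_rgb_planes([0xFFFF0000, 0xFF0000FF])
print(planes)  # [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
```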
The participants selected to test the proposed system’s effectiveness were elementary school students aged 7-12 years. The students participated in the following activities:
Write words in Sundanese script.
Try to obtain the right script by detecting their work.
Collect as many words as possible.
Play games.
The objective of this test was to determine whether the proposed gamification-based learning process improved the learners’ interest and knowledge. From a learning perspective, the test showed that the elementary school students could identify, memorize, and write 5-8 words in Sundanese script out of 10 randomly selected words. Notably, none of the children were familiar with the Sundanese script before the test was conducted. Furthermore, some children continued learning even more words in a mentor-guided session after the test period was over, demonstrating a retained interest in learning the Sundanese script. These results indicate that the proposed approach improves students’ interest in learning the script.
To comprehensively interpret the effectiveness of our approach, we compared our results with those of previous studies on Sundanese script recognition [4,14,15]. The results show that the YOLOv8 algorithm achieves an accuracy of 80%, a competitive result compared to those of other methods. The gamification element not only contributes toward improving children’s interest and engagement, but also enhances their learning outcomes in recognizing and writing Sundanese scripts. Overall, our approach demonstrates that combining advanced computer vision techniques with engaging learning methodologies can significantly improve the efficacy of educational tools for endangered languages.
Owing to the continuous and strong flow of modern culture, the Sundanese language and script have become endangered, with the Sundanese script being especially at risk of extinction. In an effort to reduce this risk, we developed a YOLOv8 object detection model that can identify Sundanese scripts, convert them to Latin scripts, and then translate them into Indonesian.
The model was trained using a dataset of 1,200 images with various characteristics and labeled using LabelImg. During our experiments, the model achieved a detection accuracy exceeding 80% for almost all Sundanese word classes. The trained model was integrated into a mobile application designed to teach children to read and write in the Sundanese script.
The gamification concept was embedded in the application to motivate students to learn. The application was tested on participants aged 7-12, with results showing that the children could identify, memorize, and write 5-8 words in Sundanese script out of 10 randomly selected words. Furthermore, several children showed increased interest in continuing to learn the script even after the testing period.
Because our training dataset was relatively small, we intend to conduct further studies with larger datasets encompassing a greater variety of camera angles, lighting, writing styles, and other features. In the process, the model will be retrained to enhance its performance in detecting and identifying Sundanese scripts.
This paper was supported by the Changwon National University Research Fund in 2023.