Journal of information and communication convergence engineering 2024; 22(4): 310-315
Published online December 31, 2024
https://doi.org/10.56977/jicce.2024.22.4.310
© Korea Institute of Information and Communication Engineering
Correspondence to : Seong-Yoon Shin (E-mail: s3397220@kunsan.ac.kr) Department of Computer Science and Engineering, Kunsan National University
Gwanghyun Jo (E-mail: gwanghyun@hanyang.ac.kr) Department of Mathematical Data Science, Hanyang University, ERICA
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
In this study, we conducted a data-driven analysis of lottery purchase behavior by using the XGBoost algorithm to predict future lottery purchase amounts based on purchase patterns of the previous four weeks. We began by judiciously defining key features including the weekly average purchase amount and variance in purchase amount. Subsequently, we evaluated the proposed method’s performance, finding the predicted future purchase amounts to match the actual purchase amounts. A key strength of this study was the interpretability of feature variables. Through the feature importance score from XGBoost, we found that features that capture impulsive patterns in purchases (e.g., variability in purchase amount) are strongly correlated with future spending, which agrees with conventional behavior analysis. Our study can be extended to the development of early warning systems designed to identify at-risk and potentially addicted purchasers on online lottery platforms.
Keywords: XGBoost, Lottery Purchase, Behavior Analysis, Feature Importance
The proliferation of online lottery platforms has significantly increased the accessibility of lottery tickets, raising concerns about the potential for gambling addiction [1-2]. For example, the relationship between online gambling and problematic behaviors was analyzed in [1]. Accordingly, efforts have been made to develop objective indicators in order to identify individuals at high risk of addictive behaviors [4-7]. Traditional approaches typically rely on statistical correlation analyses of carefully selected survey questions, or focus on identifying key risk factors associated with addictive behaviors. For example, factors such as gambling frequency, betting amounts, total money spent, and variability in purchase amounts are commonly linked to addictive tendencies. More recently, advancements in machine learning technologies have offered new opportunities to predict lottery purchasing patterns and address addiction-related issues using data-driven methods [8-11]. A key advantage of these algorithms is their ability to autonomously identify the factors (features) that influence addictive behaviors, even without direct input from predefined survey items. However, a major challenge associated with data-driven algorithms is the loss of explainability, as machine-learning models are often regarded as black-box methods.
This study proposes a prediction algorithm to analyze lottery purchase patterns using a dataset provided by the Dong-Hang Lottery Co., Ltd., which consists of user purchase histories. Given that overall purchase amounts are a critical factor related to lottery addiction, we predicted future purchase amounts based on purchase history. After accumulating historical user purchase data, the prediction model can be used to develop an alarm system for addictive behavior on lottery platforms. Another primary focus of this study was the explainability of the prediction model. We employed XGBoost [12], a highly efficient and accurate tree-based ensemble model. XGBoost-based methods have been successfully applied in various fields including purchasing behavior analysis [13], risk prediction [14-15], and clinical detection [16-17]. One key advantage of this algorithm is its ability to provide feature importance scores, which show correlations between the features and the target variable. Therefore, XGBoost can provide insights into purchase history patterns that correlate with addictive behavior.
Because the performance of a data-driven algorithm is affected by its feature variables, we judiciously defined these variables from the purchase history. For example, the weekly average purchase amount per event and the variability in purchase amounts play an important role in our analysis, and both are defined as feature variables. In light of well-established theories in behavior analysis, our features can be used to capture impulsive and addictive purchasing patterns in individuals. After the feature vectors are selected, the hyper-parameters in XGBoost are determined heuristically. In the results section, we report the R² and Pearson correlation scores for the proposed method. Furthermore, we discuss the features that correlate with addictive purchasing behaviors based on the feature importance scores obtained using XGBoost.
The remainder of this paper is organized as follows. In Section 2, we describe the overall algorithm workflow, including feature selection. Section 3 presents the experimental results. Finally, Section 4 concludes the paper.
We developed an XGBoost-based prediction model for lottery purchase amounts. In the following subsections, we describe the data collection and preprocessing methods, define the features used for the XGBoost-based algorithm, and finally present the algorithm itself.
Historical purchase data from users, obtained from January to April of 2024, were provided by Dong-Hang Lottery Co., Ltd. To ensure data quality and focus on active users, we restricted the dataset to users with more than 300 purchase transactions during this period. For each user, we calculated the daily number of purchases and total purchase amount. Thus, our dataset captures both the frequency and monetary volume of purchases, providing a comprehensive overview of each user’s purchasing behavior. To formalize the data, we introduce the following notation: for a given user u and day t, let $n_u(t)$ denote the number of purchase events and $a_u(t)$ the total purchase amount.
The objective of this study was to predict the future total purchase amount over the following week from the purchase patterns of the preceding τ days:
$$Y = \sum_{s=t+1}^{t+7} a_u(s). \qquad (1)$$
An appropriate value of τ was crucial for this analysis. If τ is too large, the feature vectors of each sample become inefficiently bulky. Conversely, if τ is too small, XGBoost cannot capture the user’s purchase pattern. To determine τ, we calculated the mutual information (MI) between the past daily purchase amounts and the future purchase amount Y.
Based on the resulting mutual information graph, we set τ = 28 days, corresponding to a four-week purchase history.
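To illustrate this step, MI between two series can be estimated with a simple histogram estimator; the equal-width binning below is an illustrative assumption, not necessarily the estimator used in the study:

```python
import math
from collections import Counter

def mutual_information(xs, ys, bins=4):
    """Histogram estimate of MI(X; Y) = sum p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    def digitize(vs):
        lo, hi = min(vs), max(vs)
        w = (hi - lo) / bins or 1.0          # equal-width bins
        return [min(int((v - lo) / w), bins - 1) for v in vs]
    bx, by = digitize(xs), digitize(ys)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

# MI of a series with itself is high; MI with a constant series is zero.
daily = [(7 * i) % 100 for i in range(100)]   # toy daily purchase amounts
print(mutual_information(daily, daily))       # ≈ log(4) for 4 balanced bins
```

In practice one would plot this quantity for increasing lags and pick τ where additional history stops adding information.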
The input data, representing the purchase history, are structured as
$$X_0 = [a_u(t-\tau+1), \ldots, a_u(t)], \qquad (2)$$
i.e., the daily purchase amounts over the preceding τ days.
Thus, the primary dataset consisting of pairs (X0, Y) can be directly used for supervised learning. However, we defined additional features from X0 to enhance interpretability.
In this subsection, we define the new features derived from a given user’s purchase history. The primary objective is to capture meaningful features that relate to future purchase amounts. The first feature is the weekly sum of purchase amounts, $F_{1k} = \sum_{s \in W_k} a_u(s)$, where $W_k$ denotes the k-th week of the τ-day history window ($k = 1, \ldots, 4$).
The second feature is the weekly number of purchase events, $F_{2k} = \sum_{s \in W_k} n_u(s)$.
Next, the average purchase amount per event is defined as $F_{3k} = F_{1k} / F_{2k}$.
We also consider the number of days with at least one purchase per week, denoted as $F_{4k} = \big|\{ s \in W_k : n_u(s) \geq 1 \}\big|$.
Together with the purchase history vector X0, the features defined above form the final input variables:
$$X = [X_0, F_1, F_2, F_3, F_4], \qquad (3)$$
where $F_j$ represents $[F_{j1}, \ldots, F_{j4}]$ for each $j = 1, \ldots, 4$. Because we have four months of data for each individual, we can generate 13 data points of the form (X, Y) by shifting the time t in Eq. (1) by a one-week interval. We may then construct the training dataset by aggregating each user’s data from Eqs. (1) and (3):
where Ns = 44,967 (3,459 users × 13 windows each) represents the total number of samples.
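The windowing and feature construction described above can be sketched as follows; variable names and the handling of weeks with zero purchase events are illustrative assumptions:

```python
def weekly_features(amounts, counts):
    """Build F1-F4 over a 28-day window of daily amounts a_u(t) and
    event counts n_u(t); week k spans days 7k .. 7k+6 of the window."""
    assert len(amounts) == len(counts) == 28
    F1, F2, F3, F4 = [], [], [], []
    for k in range(4):
        a = amounts[7 * k: 7 * (k + 1)]
        n = counts[7 * k: 7 * (k + 1)]
        F1.append(sum(a))                              # weekly purchase amount
        F2.append(sum(n))                              # weekly purchase events
        F3.append(sum(a) / sum(n) if sum(n) else 0.0)  # amount per event
        F4.append(sum(1 for c in n if c >= 1))         # active days in week
    return F1 + F2 + F3 + F4

def make_samples(amounts, counts, tau=28, horizon=7, step=7):
    """Slide a (tau + horizon)-day window in one-week steps to produce
    (X, Y) pairs: features from the past tau days, Y = next week's total."""
    samples = []
    for t in range(tau, len(amounts) - horizon + 1, step):
        X = amounts[t - tau: t] + weekly_features(amounts[t - tau: t],
                                                  counts[t - tau: t])
        Y = sum(amounts[t: t + horizon])
        samples.append((X, Y))
    return samples
```

With roughly four months of daily history per user, the one-week stride yields the 13 (X, Y) samples per user noted above.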
Before concluding this subsection, we discuss the role of each feature in comparison to those in prior studies on behavioral analysis, which were not necessarily data-driven. The frequency of and amount of money spent on gambling have long been recognized as key predictors of addictive behavior. According to Hodgins et al. [18], the higher an individual’s gambling frequency, the greater the risk of gambling addiction. Similarly, Marzar et al. [19] found that individuals who engage in gambling more frequently are more likely to experience addiction. The results of these studies support the idea that features F1, F2, and F4 are major indicators for predicting gambling behavior one week into the future. Naturally, we can expect the target variable Y to increase alongside F1, F2, and F4.
We now discuss F3, which represents the average amount spent per purchase. LaBrie et al. [20] reported that the act of placing larger bets is associated with a higher likelihood of gambling addiction. Features F5 and F6, which capture impulsive purchasing patterns, are also meaningful for the behavioral analysis. Blaszczynski and Nower [21] suggested that individuals who make impulsive purchases tend to gamble at shorter intervals, potentially leading to addiction. Similarly, LaBrie et al. [20] provided insights into the time periods associated with gambling purchases, as addicted gamblers exhibited patterns of concentrated high betting within short periods. Therefore, [20,21] justify our selection of the features F5 and F6, as it is likely that people exhibiting frequent purchases and high variability in purchase amounts tend to impulsively buy lottery items.
XGBoost is a modified tree-based algorithm that improves upon the traditional greedy tree algorithm by incorporating a regularized objective function [12]. We now briefly describe XGBoost for the sake of completeness. Suppose that in the training stage of the tree algorithm, k trees $f_1, \ldots, f_k$ have been constructed, so that the current prediction for an input $x_i$ is $\hat{y}_i^{(k)} = \sum_{j=1}^{k} f_j(x_i)$.
In a conventional greedy-tree-based algorithm [22], a new tree $f_{k+1}$ is added to reduce the residual $r_k = y - \hat{y}^{(k)}$ based on a loss function $L = \sum_i l\big(y_i,\, \hat{y}_i^{(k)} + f_{k+1}(x_i)\big)$.
In XGBoost, the new tree $f_{k+1}$ is instead added to optimize the regularized objective
$$\mathcal{L}^{(k+1)} = \sum_i l\big(y_i,\, \hat{y}_i^{(k)} + f_{k+1}(x_i)\big) + \Omega(f_{k+1}), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2, \qquad (5)$$
where $T$ is the number of leaves in the tree and $w$ is the vector of leaf weights.
Therefore, instead of simply reducing the residual error, XGBoost adds a new tree that maintains a balance between predictive accuracy and model complexity, thereby preventing overfitting on the training set.
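This balance is enforced at every split: XGBoost accepts a candidate split only when the regularized gain from [12] is positive. With $G_{L}$, $G_{R}$ and $H_{L}$, $H_{R}$ denoting the sums of the loss function’s first and second derivatives over the instances falling into the left and right child nodes, the gain is

```latex
\mathrm{Gain} \;=\; \frac{1}{2}\left[
  \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda}
  - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda}
\right] - \gamma .
```

Thus γ acts as a minimum required loss reduction per split, and λ damps large leaf weights.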
Once XGBoost is trained on the dataset defined by Eq. (4), it can be used to predict future purchase amounts. One primary goal of this study is to capture the correlation between future purchase amounts and past purchase patterns. To achieve this, we utilized gain-type feature importance, which evaluates the contribution of each feature toward improving the model’s performance. Whenever a feature is used to split a node, XGBoost calculates the extent to which the split reduces the loss function. This reduction accumulates across all splits involving a given feature, allowing us to identify the features that most directly influence the prediction of future purchase amounts.
We now examine the performance of XGBoost in predicting future lottery purchases based on a four-week purchase history. The dataset includes the purchase histories of 3,459 users from January to April of 2024. Of these, 70% were used as the training set, with the remaining 30% allocated to the test set. In Subsection A, we report the performance of the model across various hyperparameter configurations. In Subsection B, we present the feature importance scores, highlighting the most influential features in predicting future purchase amounts.
The evaluation of model performance was based on the coefficient of determination R² and Pearson’s correlation coefficient ρ, defined as
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}, \qquad \rho = \frac{\sum_i (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_i (y_i - \bar{y})^2}\,\sqrt{\sum_i (\hat{y}_i - \bar{\hat{y}})^2}},$$
where $y_i$ and $\hat{y}_i$ denote the actual and predicted purchase amounts, respectively.
For both metrics, values closer to 1 indicate better prediction performance (the Pearson coefficient ranges over [−1, 1], and R² can be negative for poor fits).
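As a reference implementation of these two metrics, a plain-Python sketch (not the authors’ evaluation code) is:

```python
import math

def r2_score(y, yhat):
    """R^2 = 1 - SS_res / SS_tot."""
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def pearson_rho(y, yhat):
    """Sample Pearson correlation between targets and predictions."""
    ybar, fbar = sum(y) / len(y), sum(yhat) / len(yhat)
    num = sum((a - ybar) * (b - fbar) for a, b in zip(y, yhat))
    den = math.sqrt(sum((a - ybar) ** 2 for a in y)
                    * sum((b - fbar) ** 2 for b in yhat))
    return num / den
```

Note that ρ is invariant to a constant offset in the predictions while R² is not, which is why the two scores can diverge.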
The XGBoost parameters were selected heuristically. In Table 1, we report the performance of XGBoost for different combinations of the learning rate α, tree depth d, and number of trees n. From these results, we selected (α, d, n) = (10⁻², 6, 1000), which yielded a Pearson correlation coefficient of 0.86 on the test set. Additionally, we set γ = 2 and λ = 1 for the regularization parameters in Eq. (5).
Table 1. Predictive accuracy results in terms of the R² and ρ scores

| (α, d, n) | R² (train) | R² (test) | ρ (train) | ρ (test) |
|---|---|---|---|---|
| (e-2, 6, 1000) | 0.831 | 0.743 | 0.912 | 0.862 |
| (e-2, 6, 1500) | 0.841 | 0.743 | 0.918 | 0.862 |
| (2e-2, 6, 800) | 0.864 | 0.743 | 0.930 | 0.862 |
| (2e-2, 6, 1000) | 0.867 | 0.742 | 0.932 | 0.862 |
| (2e-2, 6, 1500) | 0.882 | 0.742 | 0.940 | 0.861 |
| (e-2, 8, 800) | 0.911 | 0.740 | 0.956 | 0.861 |
| (e-2, 8, 1000) | 0.900 | 0.742 | 0.950 | 0.861 |
| (e-2, 8, 1500) | 0.913 | 0.741 | 0.957 | 0.861 |
Using these optimized parameters, we plotted the actual future purchase amounts against the predicted amounts, as shown in Fig. 2. These results demonstrate an overall match between the predicted and actual purchase amounts for both the training and test sets.
In this subsection, we report the feature importance scores generated by XGBoost following model training. The gain-type feature importance scores are presented in Fig. 3 in descending order. The weekly purchase amount is associated with the highest score, which is natural because the target variable is the total purchase amount for the following week. The average purchase amount per event had the second-highest score, indicating a strong correlation with future purchase amounts. This finding coincides with that of [20], wherein placing larger bets in a single game was associated with a higher likelihood of addiction. The third highest scoring feature is the variance in weekly purchase amounts. This finding is consistent with those of [20,21], as individuals with impulsive purchasing behaviors for lottery items are more likely to exhibit higher variance in their spending. Therefore, we suggest that monitoring both the average bet amount and variance in purchase amounts could be effective in identifying problematic gambling behavior.
In this study, we employed XGBoost to conduct a behavior analysis, using data from the previous four weeks of purchasing patterns to predict future lottery purchase amounts. A feature importance analysis revealed that the features that capture impulsive purchase patterns are strongly correlated with future purchase amounts, which agrees with the behavior analyses conducted in [20,21]. The present study relies on real purchasing behavior data rather than subjective data such as surveys, making the analysis robust and objective. Furthermore, by identifying impulsive purchasing behaviors in an explainable manner, this study offers practical insights into the early detection of and intervention in gambling problems. This approach paves the way for the development of warning systems for at-risk users of online lottery platforms, which could contribute significantly to preventing and managing gambling addiction through data-driven technological solutions.
This study was conducted using data provided by Dong-Hang Lottery Co., Ltd. for research purposes.
Esther Kim
received her M.A. and Ph.D. degrees from the Department of Counseling and Clinical Psychology at Korea Baptist Theological University in 2019 and 2024, respectively. She is currently a postdoctoral researcher at the same university. Her research interests include understanding addiction problems and mental health issues, as well as the application of AI in these areas.
She can be contacted via email at:
Yunjun Park
is an undergraduate student at the Department of Mathematical Data Science of Hanyang University ERICA. His research interests include computer vision, numerical analysis, machine learning, and generative AI (Diffusion, GAN).
He can be contacted at email:
Seong-Yoon Shin
received his M.S. and Ph.D. degrees from the Dept. of Computer Information Engineering at Kunsan National University, Gunsan, Republic of Korea, in 1997 and 2003, respectively. From 2006 to the present, he has been a professor at the School of Computer Science and Engineering. His research interests include image processing, computer vision, and virtual reality. He can be contacted at email:
Gwanghyun Jo
received his M.S. and Ph.D. degrees from the Department of Mathematical Science, KAIST, in 2013 and 2018, respectively. From 2019 to 2023, he was a faculty member of the Department of Mathematics at Kunsan National University, Republic of Korea. From 2023 to the present, he has been a faculty member of the Department of Mathematical Data Science, Hanyang University ERICA. His research interests include numerical analysis, computational fluid dynamics, and machine learning.
He can be contacted at email:
Esther Kim1, Yunjun Park2, Gwanghyun Jo2*, and Seong-Yoon Shin3*

1Department of Counselling Psychology, Korea Baptist Theological University/Seminary
2Department of Mathematical Data Science, Hanyang University ERICA, Ansan, Republic of Korea
3Department of Computer Science and Engineering, Kunsan National University, Gunsan-si, Republic of Korea