Journal of information and communication convergence engineering 2024; 22(4): 303-309
Published online December 31, 2024
https://doi.org/10.56977/jicce.2024.22.4.303
© Korea Institute of Information and Communication Engineering
Correspondence to : Daehee Kim (E-mail: Daeheekim@sch.ac.kr)
Department of Future Convergence Technology, Soonchunhyang University, Asan 31538, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Gait analysis plays a pivotal role in clinical diagnostics and aids in the detection and evaluation of various disorders and disabilities. Traditional methods often rely on intricate video systems or pressure mats to assess gait. Previous studies have demonstrated the potential of artificial intelligence (AI) in gait analysis using techniques such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks. However, these methods often encounter challenges related to high dimensionality, temporal dependencies, and variability in gait patterns, making accurate and efficient classification difficult. To address these challenges, this study introduces a simple one-dimensional (1D) CNN model designed to analyze ground reaction force (GRF) patterns and classify individuals as healthy or as suffering from a gait disorder. The model achieved a classification accuracy of 98.65% in distinguishing healthy individuals from those with gait disorders, a significant improvement over existing models. This performance is bolstered by an attention mechanism and standardization techniques that enhance robustness and accuracy.
Keywords: Attention mechanism, Convolutional neural networks (CNN), Gait analysis, Ground reaction force (GRF), GaitRec dataset
Detecting and quantifying ground reaction forces (GRF) is a fundamental method used by physicians to impartially assess human locomotion and conduct a thorough analysis of a patient's gait. Neurologists, orthopedists, and physiotherapists have extensively analyzed human locomotion to evaluate patient conditions and guide recovery and treatment [1]. Clinical gait analyses generate large amounts of data that are challenging to interpret and assess owing to their unpredictability, nonlinear linkages and correlations, high complexity, and temporal dependencies; therefore, experienced clinicians are needed to draw valid conclusions from these data. The primary goal of gait analysis is to identify issues affecting a patient's gait pattern [2], making it a valuable tool across several disciplines, such as influence analysis, kinesiology, wellness, and user recognition. Wearable sensors, such as accelerometers, gyroscopes, and force and pressure sensors, as well as vision-based gait recognition systems using image sensors, have been employed to record gait data remotely and without participant cooperation. Among these, floor sensors stand out as a notable option for detecting GRF measurements and pressure distributions across the foot surface [3]. GRF quantification is a common technique that enables physicians to objectively delineate human locomotion and meticulously evaluate the performance of patients with gait-related concerns [4]. Gait can also indicate various mental disorders, including dementia, depression, and intellectual disability, as well as locomotor issues such as joint deformities [5].

In recent years, artificial intelligence (AI) methods, such as convolutional neural networks (CNN), long short-term memory (LSTM) networks, support vector machines (SVM), nearest-neighbor classifiers, and other clustering algorithms, have been increasingly employed in this field. The efficacy of these approaches is significantly influenced by the format of the input data. Pataky et al. [6] employed dynamic plantar pressure data, image analysis, and feature extraction to achieve remarkable subject identification precision (99.6%). Gul et al. [7] developed a 3D CNN structure for gait recognition using gait energy images (GEI), which are simplified representations that capture the form and movement characteristics of human gait. Khokhlova et al. [8] proposed a gait model based on Kinect v2 sensor data, segmenting biomechanical patterns into movement sequences. To gain insight into the distinction between normal and pathological gaits based on kinematic properties, Lee et al. [9] proposed training an ensemble LSTM architecture using cohort data; by employing a DCNN, they extracted features from the gait pattern and achieved over 90% recognition accuracy across seven distinct gait types, including walking, running, and stair climbing. Farah et al. [10] employed thigh kinematics to detect gait patterns in 31 able-bodied participants while walking. Additionally, Di Nardo et al. [11] proposed a new approach utilizing deep learning assessment of transverse knee joint inclination for multimodal gait cycle classification and gait event prediction. Horsak et al. [12] utilized neural networks to discern imitated gait patterns, whereas Manap et al. [13] classified normal and pathological gaits using foot-ground reaction pressure data from pressure platforms.
Some methods, such as kernel-based principal component analysis (KPCA) with SVM classification [14], have limitations in categorizing data with multiple classes or variations in gait patterns; collectively, these findings underscore the need for highly effective automated methods for the early detection of gait abnormalities in real-time settings. Akter et al. [15] used a 2D CNN combined with an attention mechanism to recognize human activity with high accuracy on various datasets. However, the approaches above still leave room for improvement in terms of model simplicity and achievable accuracy, and relatively little work has addressed the identification of gait-related disorders.
Although considerable research has been carried out to produce viable models that identify gait patterns and classify them as healthy or unhealthy, this study utilizes the GaitRec dataset, a large-scale dataset comprising data from 2,085 individuals with different types of gait disorders (GDs) and 211 healthy controls (HCs). It covers both HCs and four specific GD classes: hip, knee, ankle, and calcaneus. The proposed method introduces a simple approach combining AI, a self-attention mechanism, and data standardization to comprehensively analyze GRF patterns and classify individuals as having a healthy gait or a gait disorder. Using a one-dimensional (1D) CNN model, this study differentiated patients with healthy gait patterns from those with gait disorders with 98.65% accuracy based on GRF data alone, suggesting that the model may also prove viable on other datasets. The contributions of this study are as follows:
• Creating a model that is lightweight, accurate, and requires little pre-processing, so that it can be tuned to operate on a variety of datasets.
• Comparing the results with existing designs to demonstrate the improved accuracy achieved on the data.
• Evaluating the robustness of the proposed model by examining its training and evaluation behavior.
The remainder of this paper is organized as follows. The proposed methodology is outlined in Section 2. The findings of the experiments are presented in Section 3, and the conclusion is presented in Section 4, with suggestions for future research.
The findings of this investigation were obtained from a distinctive pathological dataset known as GaitRec, built from already accessible clinical gait records [12]. The dataset comprises de-identified GRF measurements from 2,085 patients with GDs and 211 HCs, with data descriptors for walking pace, shoe condition, age, and sex. While healthy individuals walked barefoot or in regular shoes, patients walked barefoot, in orthopaedic or regular shoes, and with or without orthopaedic insoles at three speeds: slow (0.98±0.14 m/s), self-selected (1.27±0.13 m/s), and fast (1.55±0.15 m/s). The details of the dataset are listed in Table 1.
Table 1 . Total Datapoints in GaitRec Dataset
Class | Subjects | Male | Female | Age (yrs.) | Bi-lateral Trials |
---|---|---|---|---|---|
Healthy (HC) | 211 | 104 | 107 | 34.7 | 7,755 |
Hip (GD) | 450 | 373 | 77 | 42.6 | 12,748 |
Knee (GD) | 625 | 426 | 199 | 41.6 | 19,873 |
Ankle (GD) | 627 | 498 | 129 | 41.6 | 21,386 |
Calcaneus (GD) | 382 | 339 | 43 | 43.5 | 13,970 |
Total | 2295 | 1,740 | 555 | 41.5 | 75,732 |
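For readers who wish to reproduce this setup, the sketch below illustrates one way the GaitRec GRF export could be loaded and labeled with pandas. The file names (GRF_F_V_PRO_left.csv, GRF_metadata.csv) and column names (SUBJECT_ID, SESSION_ID, CLASS_LABEL, F_V_PRO_*) are assumptions about the public GaitRec release, not values specified in this paper, and should be verified against the dataset documentation.

```python
# Sketch of loading GaitRec vertical GRF trials and deriving binary labels (HC vs. GD).
# File and column names are assumptions about the public GaitRec release; verify them
# against the dataset documentation before use.
import pandas as pd

grf = pd.read_csv("GRF_F_V_PRO_left.csv")    # time-normalized vertical GRF, left side
meta = pd.read_csv("GRF_metadata.csv")       # subject/session metadata incl. class label

data = grf.merge(meta[["SUBJECT_ID", "SESSION_ID", "CLASS_LABEL"]],
                 on=["SUBJECT_ID", "SESSION_ID"], how="left")

# Binary target: 0 = healthy control (HC), 1 = any gait disorder (hip, knee, ankle, calcaneus).
y = (data["CLASS_LABEL"] != "HC").astype(int).to_numpy()
X = data.filter(regex="^F_V_PRO_").to_numpy()  # one row per trial, one column per time point

print(X.shape, y.shape)
```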
The convolutional block attention module (CBAM) was integrated after the second CNN block, and the deep learning model was reorganized to refine features and obtain better results. In deep networks, CBAM enhances convolutional blocks by transforming the input intermediate feature maps into refined feature maps through the channel attention module (CAM). The basic configuration is illustrated in Fig. 1. The CAM operates as a 1D attention mechanism, denoted as F_CAM ∈ ℝ^(C×1×1), and is responsible for identifying and emphasizing the most informative channels in the preliminary feature map A, thereby generating a channel-refined feature map B. Mathematically, this process is represented as B = F_CAM(A) ⊗ A.
CAM enhances CNNs by selectively emphasizing informative channels and suppressing less relevant ones, thereby improving the network’s ability to discern complex patterns. This selective attention leads to a more efficient use of the network’s representational capacity, reducing redundancy, and creating more compact and informative feature maps. CAM also improves the generalization performance by focusing on salient features and reducing overfitting. Additionally, its lightweight design allows seamless integration into existing architectures with minimal computational overhead, making it a valuable component for enhancing deep learning models in various applications.
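For concreteness, the following is a minimal sketch of a 1D channel-attention block of the kind CBAM uses, written for Keras/TensorFlow. The reduction ratio, tensor shapes, and layer arrangement are illustrative and are not the exact configuration used in the proposed model.

```python
# Minimal sketch of a CAM-style 1D channel-attention block (the channel part of CBAM).
from tensorflow.keras import layers, Model

def channel_attention_1d(x, reduction=8):
    """Weight each channel of a (batch, steps, channels) tensor by learned attention."""
    channels = x.shape[-1]
    avg_pool = layers.GlobalAveragePooling1D()(x)   # squeeze time axis -> (batch, channels)
    max_pool = layers.GlobalMaxPooling1D()(x)
    dense1 = layers.Dense(channels // reduction, activation="relu")  # shared MLP, layer 1
    dense2 = layers.Dense(channels)                                  # shared MLP, layer 2
    attention = layers.Activation("sigmoid")(
        layers.Add()([dense2(dense1(avg_pool)), dense2(dense1(max_pool))]))
    attention = layers.Reshape((1, channels))(attention)  # broadcast over time steps
    return layers.Multiply()([x, attention])              # refined map B = F_CAM(A) ⊗ A

# Quick shape check on a dummy GRF-like feature map (102 time points, 64 channels).
inputs = layers.Input(shape=(102, 64))
outputs = channel_attention_1d(inputs)
Model(inputs, outputs).summary()
```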
Standardization was applied to the feature matrices X_train and X_test using the StandardScaler from sklearn.preprocessing. This step involved fitting the scaler on the training data to compute the mean (μ) and standard deviation (σ) of each feature. The transformation to the standardized feature x' is given by

x' = (x − μ) / σ,
where x is the original feature value, μ is the mean, and σ is the standard deviation. This ensured that each feature had a mean of zero and a standard deviation of one, thereby improving the convergence rate of the model during training. To prepare the data for the CNN, the standardized feature matrices were reshaped to introduce an explicit channel dimension, changing their shapes from (n_train, d) and (n_test, d) to (n_train, d, 1) and (n_test, d, 1), respectively, where d is the number of features. This reshaping is required for compatibility with the convolutional layers, which expect three-dimensional input of the form (samples, features, channels).
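A minimal sketch of this standardization and reshaping step is given below, assuming the GRF features are available as NumPy arrays; the array sizes are illustrative placeholders.

```python
# Sketch of the standardization and reshaping step for Conv1D input.
import numpy as np
from sklearn.preprocessing import StandardScaler

def standardize_and_reshape(X_train, X_test):
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)  # fit mean/std on training data only
    X_test = scaler.transform(X_test)        # reuse training statistics on the test set
    # Add an explicit channel dimension: (n, d) -> (n, d, 1).
    return X_train[..., np.newaxis], X_test[..., np.newaxis]

# Random data standing in for 102 GRF features per sample.
X_train, X_test = np.random.rand(700, 102), np.random.rand(300, 102)
X_train, X_test = standardize_and_reshape(X_train, X_test)
print(X_train.shape, X_test.shape)  # (700, 102, 1) (300, 102, 1)
```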
In this study, a CNN was employed for the binary classification of gait patterns using the GaitRec dataset. The methodology involved several pre-processing and model training steps to ensure effective learning and evaluation, as shown in Fig. 2.
Initially, the dataset was split into training and testing subsets using the train_test_split function from sklearn.model_selection, with 70% and 30% of the data allocated to the training and testing sets, respectively. This split was performed with test_size = 0.3, random_state = 42, and shuffle = True to ensure reproducibility and randomness of the data distribution. Mathematically, the training and testing sets can be expressed as |X_train| = 0.7·n and |X_test| = 0.3·n, where n is the total number of samples. The CNN architecture was constructed using the Sequential class from tensorflow.keras.models. The model comprises several layers, which are described below and listed in Table 2. A convolutional layer (Conv1D) with 64 filters, a kernel size of 3, a Rectified Linear Unit (ReLU) activation function, and an input shape of (102, 1) was used first. This layer applies 64 convolutional filters of size three to the input data, which have 102 features and one channel, extracting local features from the gait sequences. The convolution operation is given by

y_i = ReLU(Σ_{k=1}^{K} w_k · x_{i+k−1} + b),
Table 2 . Model Summary
Layer No. | Layer (type) | Kernel Size | Activation | Output Shape | Parameters |
---|---|---|---|---|---|
0 | Conv1D | 3 | ReLU | (None, 102, 64) | 256 |
1 | Dropout | -- | -- | (None, 102, 64) | 0 |
2 | MaxPooling1D | -- | -- | (None, 51, 64) | 0 |
3 | Flatten | -- | -- | (None, 3264) | 0 |
4 | Dense | 3 | ReLU | (None, 100) | 326500 |
5 | Conv1D | 3 | -- | (None, 36) | 100 |
6 | Activation | -- | -- | (None, 18) | 0 |
7 | Dense | -- | -- | (None, 1) | 101 |
Total parameters | 326,857 |
Trainable parameters | 326,857 |
Non-trainable parameters | 0 |
where y_i is the output at position i, x is the input sequence, K is the kernel size, w_k are the filter weights, and b is the bias term.
Subsequently, a dropout layer with a dropout rate of 0.3 was used. This layer helps prevent overfitting by randomly setting 30% of the input units to zero at each update during training. A max pooling layer (MaxPooling1D) with a pool size of two was then used to downsample the feature maps by taking the maximum value over a window of size two, reducing their dimensionality while retaining the most significant information. The pooling operation is given by

y_i = max(x_{2i−1}, x_{2i}).
Subsequently, a flatten layer was used to transform the 2D feature maps into a 1D feature vector in preparation for the fully connected layers, followed by a fully connected (Dense) layer with 100 units and a ReLU activation function. This dense layer applies a linear transformation followed by the ReLU activation to the input feature vector to learn higher-level representations. The transformation is given by

z = ReLU(Wx + b),
where x is the input feature vector, W is the weight matrix, and b is the bias vector.
Another Conv1D layer with 36 filters and a kernel size of three was added together with the CBAM; this layer was included to reduce overfitting. The output layer (Dense) comprises a single unit followed by a separate activation layer applying the sigmoid function. For binary classification, the sigmoid activation produces an output probability between 0 and 1 and is defined as

σ(z) = 1 / (1 + e^(−z)).
Finally, the model was compiled using the Adam optimizer with a learning rate of 0.0001, binary cross-entropy as the loss function, and accuracy as the evaluation metric. The binary cross-entropy loss function is given by

L = −(1/N) Σ_{i=1}^{N} [y_i log(p_i) + (1 − y_i) log(1 − p_i)],
where N is the number of samples, y_i is the true label, and p_i is the predicted probability.
The model architecture was summarized, and the training process was initiated using the fit method, training on the reshaped and standardized training data (x_train, y_train) and validating on the testing data (x_test, y_test) over 100 epochs with a batch size of 16. The accuracy of the model was calculated as the ratio of correctly predicted samples to the total number of samples.
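The following condensed sketch ties the steps above together in Keras/TensorFlow. Random arrays stand in for the standardized GRF features, and the additional Conv1D and CBAM block listed in Table 2 is omitted for brevity, so this is an approximation of the reported architecture rather than an exact reproduction.

```python
# Condensed sketch of the split / build / compile / fit pipeline described above.
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import Sequential, layers, optimizers

X = np.random.rand(1000, 102, 1).astype("float32")  # placeholder for standardized GRF features
y = np.random.randint(0, 2, size=(1000,))            # placeholder binary labels (HC vs. GD)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, shuffle=True)

model = Sequential([
    layers.Conv1D(64, 3, activation="relu", padding="same", input_shape=(102, 1)),
    layers.Dropout(0.3),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(100, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),  # learning rate as in the methodology
              loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
history = model.fit(x_train, y_train, validation_data=(x_test, y_test),
                    epochs=100, batch_size=16)
```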
The model achieved an accuracy of 98.65% on the test data. This comprehensive methodology ensured robust training and evaluation of the CNN on the GaitRec dataset, leveraging convolutional layers to effectively capture and classify gait patterns.
The evaluation of the CNN model for the binary classification of gait patterns using the GaitRec dataset involved multiple performance metrics and visualizations to substantiate the efficacy of the model. Experiments and model training were conducted using the Keras deep learning framework, version 2.6.0, and Python version 3.8.5. The Adam optimizer was employed with a learning rate of 0.001, binary cross-entropy as the loss function, a batch size of 16, and 100 epochs. The computational setup included a machine running Windows 11 with 32 GB RAM, an Intel Core i9-7900X processor clocked at 4.30 GHz, and an NVIDIA GeForce RTX 2080 Ti graphics card. To address the supervised learning task, the dataset was randomly divided into training and testing subsets that accounted for 70% and 30% of the entire sample, respectively. This section details the experimental setup and the results produced by the designed classification model.
The accuracy and loss curves for both the training and validation datasets, shown in Figs. 3 and 4, respectively, provide insights into the performance of the model over 100 epochs. The model achieved high overall accuracy, reaching approximately 99% within the first few epochs and maintaining stability thereafter. This rapid convergence towards high accuracy indicates the robustness of the model in effectively learning the distinguishing features of the gait patterns. Despite achieving a high accuracy early, the model continued to improve, reaching approximately 98% accuracy by the 19th epoch.
The accuracy curve shows consistently high training accuracy of 98.5% with minor fluctuations across epochs, and similarly high validation accuracy, mirroring the training accuracy, which signifies minimal overfitting and good generalization performance.
Specifically, the proposed model achieved an accuracy of approximately 98% on the 25th epoch, demonstrating its stability and effectiveness. In addition, the model achieved a peak accuracy of approximately 98.85%, highlighting its superior performance.
The loss curve depicted in Fig. 4 shows the reduction in error as training progresses, with an initial steep decline in both the training and validation losses indicating effective learning. The curves then plateaued at minimal loss, further validating the efficiency of the proposed model. By the 20th epoch, the loss had decreased significantly and remained low and stable throughout the remaining epochs.
The confusion matrix depicted in Fig. 5 provides a comprehensive summary of the predicted results. The matrix illustrates the ability of the model to accurately classify gait patterns, with the diagonal elements representing the number of correct predictions and the off-diagonal elements indicating misclassifications. In this experiment, the confusion matrix indicates that the model correctly classified 23,150 samples of one class, but incorrectly classified 204,046 samples as belonging to the incorrect class.
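A brief sketch of how the confusion matrix and accuracy can be computed with scikit-learn is shown below, assuming the trained model and test split from the earlier sketch.

```python
# Sketch of the evaluation step, assuming `model`, `x_test`, and `y_test` from the
# previous snippet; sklearn supplies the accuracy and confusion-matrix computations.
from sklearn.metrics import accuracy_score, confusion_matrix

y_prob = model.predict(x_test)                  # sigmoid outputs in [0, 1]
y_pred = (y_prob >= 0.5).astype(int).ravel()    # threshold at 0.5 for binary labels

print(confusion_matrix(y_test, y_pred))         # rows: true class, columns: predicted class
print("accuracy:", accuracy_score(y_test, y_pred))
```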
The proposed method, evaluated on the GaitRec dataset, achieved an impressive accuracy of 98.65%, demonstrating its superior performance compared with other models documented in the literature. Table 3 presents a comparative analysis of the proposed method with various state-of-the-art techniques. For instance, the proposed method outperformed models such as the CNN applied to a private dataset by Fricke et al. [16], which achieved 91.90% accuracy, and the MARS model by Wang et al. [17], which achieved 88.30% accuracy.
Table 3 . Comparison Table
Reference | Dataset | Methodology | Subjects | Accuracy |
---|---|---|---|---|
[16] | Private dataset | CNN | 37 | 91.90% |
[17] | Private dataset | MARS | 8 | 88.30% |
[18] | MFC data | SVM | 58 | 83.30% |
[19] | Private dataset | SVM | 440 | 90.80% |
[20] | Private dataset | ANN | 239 | 90% |
[21] | GaitRec dataset | 1D CNN | 2295 | 91.62% |
Proposed Method | GaitRec dataset | 1D CNN | 2085 | 98.65% |
In addition, the SVM applied to the MFC data in Begg et al. [18] reached 83.30%, and the linear SVM in Slijepcevic et al. [19] achieved 90.80%. Furthermore, even advanced models such as ANN by Zhou et al. [20], which attained 90% accuracy, were outperformed by the proposed 1D CNN model on the GaitRec dataset. GaitRecNet [21] also applied a 1D CNN to the GaitRec dataset, achieving 91.62% accuracy, but still fell short of the proposed method’s accuracy of 98.65%.
The CNN model demonstrated exceptional performance in the classification of gait patterns, as evidenced by its high accuracy and low loss values across both the training and validation datasets. Although the confusion matrix shows a significant number of misclassifications in one class, it provides critical insight into areas requiring further investigation, such as potential class imbalance in the dataset. Overall, the model achieved an accuracy of 98.65%, demonstrating its ability to effectively discern and classify gait patterns. This high accuracy was achieved through the careful use of attention mechanisms and data standardization techniques. Because the input consisted of GRF data, the proposed model benefited from standard scaling, which stabilized the learning process and enabled the model to capture and distinguish the intricate features inherent in gait patterns. The CBAM module placed after the second CNN block enhanced the feature maps before they were fed into the final dense layer, transforming the input intermediate feature maps into refined feature maps through the CAM. Data standardization enhances convergence rates, stability, and the overall learning process, whereas the CBAM module ensures that the model focuses on the most relevant features of the GRF data.

Despite its high performance, the proposed model has certain limitations. The high accuracy and low loss values may not fully reflect the performance of the model on more diverse or unseen data, indicating the need for further testing and validation on more varied datasets. These shortcomings highlight areas for future improvement and model refinement. A comprehensive analysis of the confusion matrix, accuracy, and loss curves collectively underscores the capability of the model to effectively discern and classify gait patterns, making it a viable tool for further application and research in gait analysis and related fields. The comparative analysis further highlights the superiority of the proposed model over existing methods, establishing it as a significant advancement in the domain of gait pattern classification.
In this study, we developed and evaluated a CNN for binary classification of gait patterns using the GaitRec dataset. The experimental setup was designed to ensure robust and reproducible results. Several criteria were used to evaluate the effectiveness of the algorithm, particularly the accuracy, loss, and confusion matrix, providing a comprehensive evaluation of its efficacy. The proposed CNN model achieved an impressive accuracy of 98.65%, significantly outperforming other state-of-the-art models reported in the literature. The accuracy and loss curves demonstrated rapid convergence towards high accuracy and low loss values, highlighting the robustness and efficiency of the model in learning the distinguishing features of gait patterns. This underscores the effectiveness of the proposed 1D CNN model on the GaitRec dataset, thereby establishing a new benchmark for gait pattern classification. However, the model could benefit from further refinement, particularly when handling class imbalances. Additionally, the need for testing and validation on more diverse datasets was highlighted to ensure the generalizability and robustness of the model across different scenarios. The insights gained from this study provide a solid foundation for further improvements and extensions to address the identified limitations, and enhance the performance of the model in more diverse and complex settings. This study contributes to the expanding corpus of information on this topic, paving the way for more accurate and reliable gait analyses.
This study was supported by the 2023 Sabbatical Year of Soonchunhyang University.
Ansary Shafew
Born in Dhaka, Bangladesh, Ansary Shafew earned a Bachelor of Science degree in Electronic Engineering from the American International University-Bangladesh (AIUB), Dhaka, Bangladesh, in 2018. He later completed a Master of Science degree in Electronic Engineering at Dong-A University, Busan, South Korea, in 2024. His research interests include deep learning and IoT devices.
Dongwan Kim
Dongwan Kim (Member, IEEE) received the B.S. degree in Electronics Engineering from Korea University, Seoul, South Korea, in 2003, the M.S. degree in Information and Communication Engineering from POSTECH, Pohang, South Korea, in 2006, and the Ph.D. degree in Electronics and Computer Engineering from Korea University in 2015. He is currently an Assistant Professor with the Department of Electronics Engineering, Dong-A University, Busan, South Korea. From 2006 to 2016, he was a Senior Engineer with Samsung Electronics Company Ltd., Suwon, South Korea. His current research interests include the efficient design of communication systems and wireless power transmission.
Daehee Kim
Daehee Kim (Member, IEEE) received the B.S. degree in Electrical and Electronic Engineering from Yonsei University, Seoul, South Korea, in 2003, and the M.S. and Ph.D. degrees in Electrical and Electronic Engineering from Korea University, Seoul, in 2006 and 2016, respectively. He is currently an Associate Professor with the Department of Internet of Things, Soonchunhyang University, Asan, South Korea. From 2006 to 2016, he was a Senior Engineer with Samsung Electronics, Suwon, South Korea, where he conducted research on WiMAX and LTE systems. His research interests include the Internet of Things, energy management, blockchain, 5G/6G, and security for the Internet of Vehicles.
Ansary Shafew1, Dongwan Kim1, and Daehee Kim2*
1Department of Electronic Engineering, Dong-A University, Busan 602760, Republic of Korea
2Department of Future Convergence Technology, Soonchunhyang University, Asan 31538, Republic of Korea