Journal of information and communication convergence engineering 2021; 19(1): 22-28
Published online March 31, 2021
https://doi.org/10.6109/jicce.2021.19.1.22
© Korea Institute of Information and Communication Engineering
Convolutional neural networks (CNNs) are one of the most frequently used artificial intelligence techniques. Among CNN-based applications, small and timing-sensitive applications have emerged, which must be reliable to prevent severe accidents. However, because small and timing-sensitive systems do not have sufficient system resources, they lack proper error protection schemes. In this paper, we propose MATE, a low-cost CNN weight error correction technique. Based on the observation that not all mantissa bits are closely related to the accuracy, MATE replaces some mantissa bits in the weight with error correction codes. Therefore, MATE can provide strong data protection without requiring additional memory space or modifying the memory architecture. The experimental results demonstrate that MATE retains nearly the same accuracy as the ideal error-free case on erroneous DRAM and maintains approximately 60% accuracy even at extremely high bit error rates.
Keywords Convolutional neural network, Error correction codes, Main memory, Reliability, Weight data
The use of deep neural networks (DNNs) was first proposed in the early 21st century, and DNNs have become the most popular and practical artificial intelligence technology. Many DNN-related studies have been conducted, out of which many systems with real-world applications have emerged. Among various DNNs, the convolutional neural network (CNN) is a popular form of neural network for image processing. Supported by a feature extraction technique called convolution, it increases the overall image processing accuracy. The latest CNN winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [1] officially exceeds human accuracy for image processing. With this technological advancement, many companies have adopted CNN-based image processing as the core technique for their products or services.
In particular, several CNN-based small and timing-sensitive systems have also recently emerged, such as drones, robots, and Internet of Things (IoT) edge devices. These systems typically use CNN-based object recognition to accomplish their missions. Therefore, the accuracy of the CNN is a key factor in determining the success of a system, and thus maintaining high accuracy in various situations is very important. However, maintaining accuracy is difficult. Many studies have reported that memory data integrity weakens with memory scaling, and the bit error rate (BER) can be up to 10^{−4} [2, 3]. In addition, low-power DRAM architectures, which reduce memory power consumption by decreasing the data sensing voltage or time, considerably decrease data reliability [4]. Therefore, adequate memory data protection is required to maintain the inference accuracy, considering the resource constraints of small and timing-sensitive systems.
To provide cost-effective protection for such systems, we identified the following four design constraints that must be considered. First, the CNN accuracy must be maintained during the repetitive inference process. A small loss of accuracy may be acceptable in normal systems, but not in timing-sensitive systems. Second, hardware constraints should be considered. Our target systems have many physical limitations, such as area, power consumption, and weight; thus, their error protection components must be small. Third, the error correction process must be sufficiently short to satisfy real-time constraints. Fourth, memory errors should be detected and corrected as soon as possible, before the CNN inference completes; because weight data are repeatedly referenced, even a few bit errors may cause a significant loss of accuracy.
To address all these considerations, we propose MATE, a memory- and retraining-free error correction technique for CNN weights. MATE protects CNN weight data stored in the main memory (DRAM) during the repetitive inference process. Based on the observation that not all mantissa bits of the 32-bit floating-point (FP32) weight data affect the CNN accuracy, MATE utilizes a portion of the mantissa bits to store error correction codes (ECCs), such as triple modular redundancy (TMR) and single error correction, double error detection (SECDED) codes. Experimental results demonstrate that MATE achieves nearly the same accuracy as the ideal error-free case at 10^{−8} to 10^{−5} BER, without requiring additional memory space or modifications to the memory architecture. Furthermore, even in a harsh environment with a BER of 10^{−4}, the proposed scheme maintains 99.77% of the ideal accuracy. Compared with the previously proposed competitive scheme, St-DRC [5], MATE shows 31.35% higher normalized accuracy at 10^{−4} BER, with lower hardware overhead and no CNN transformation. We summarize our contributions as follows:
• For the FP32 datatype of CNN weights, we observe that the lower 19 bits of the 23 mantissa bits do not affect the inference accuracy.
• The proposed scheme preserves the ideal accuracy from 10^{−8} to 10^{−4} BER, and approximately 60% of the ideal accuracy even at 10^{−3} BER, where the competitive scheme completely fails at inference.
• Because the proposed scheme operates on the conventional 32-bit system, it can be applied to small, timing-sensitive systems in which it is difficult to modify the system architecture or utilize dedicated GPUs.
The remainder of this paper is organized as follows. In Section II, we present the background of the FP32 datatype (based on the IEEE 754 standard). We explain our key observations and the proposed MATE scheme in Section III. The experimental evaluation of the MATE is presented in Section IV. Related works on CNN reliability are briefly introduced in Section V. Finally, we present our conclusions in Section VI.
The floating-point datatype represents a wide dynamic range of numeric values using a floating radix point. Compared with a fixed-point value of the same bit width, the floating point can represent a wider range of numbers at the cost of value precision. Fig. 1 shows the schematic layout of the FP32 datatype in the IEEE 754 standard: the most significant bit (MSB) represents the sign, the next 8 bits are called the “exponent” bits and are collectively interpreted as the exponential value, and the remaining 23 bits are called the “mantissa” bits, which encode the fractional part of the significand.
Equation (1) shows how an FP32 bit pattern is translated to a real value based on the IEEE 754 standard (for normal numbers):

Value = (−1)^{S} × (1 + M/2^{23}) × 2^{E−127}, (1)

where S is the sign bit, E is the 8-bit exponent value, and M is the 23-bit mantissa value. As shown in this equation, changes in the exponent bits can result in a completely different value because the exponent bits determine the logical magnitude of the data value. Furthermore, the exponent bits near the MSB position have a significant effect on the real value. For example, a bit flip on the 30^{th} bit changes the value by up to 2^{128} times. In addition, if all the exponent bits become one (i.e., E = 255), the value is interpreted as infinity or NaN rather than a normal number.
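Equation (1) can be checked directly by decoding a bit pattern by hand. The following sketch evaluates the formula for a normal FP32 number and demonstrates the 2^{128}-times magnification caused by flipping the 30^{th} bit (the exponent MSB); the function name is our own illustration, not part of MATE:

```python
import struct

def decode_fp32(bits: int) -> float:
    """Evaluate Eq. (1) for a normal FP32 number:
    (-1)^S * (1 + M/2^23) * 2^(E-127)."""
    sign = (bits >> 31) & 0x1          # S: bit 31
    exponent = (bits >> 23) & 0xFF     # E: bits 30..23
    mantissa = bits & 0x7FFFFF         # M: bits 22..0
    return (-1.0) ** sign * (1.0 + mantissa / 2 ** 23) * 2.0 ** (exponent - 127)

bits = struct.unpack("<I", struct.pack("<f", 0.5))[0]   # 0x3F000000
assert decode_fp32(bits) == 0.5
# Flipping the 30th bit (the exponent MSB) scales the value by 2^128.
assert decode_fp32(bits ^ (1 << 30)) == 0.5 * 2.0 ** 128
```

The round trip through `struct` confirms that the hand-evaluated formula matches the machine's native FP32 interpretation for normal numbers.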
Fig. 2 shows the data distribution of the nine CNNs supported by Darknet [6]. Here, Inputs indicates the number of input image pixels.
In addition, small and timing-sensitive systems usually perform only inference, not training, because of latency and computing resource limitations. They employ pre-trained CNNs for inference, and training is usually performed on a large-scale remote server. A retrained CNN can be obtained through periodic communication with the server. Because communication incurs much less overhead than self-training, we consider only the inference process, assuming that well-trained CNN weights are already available. During the inference process, many repeated multiply-accumulate (MAC) operations are performed on an input image. This process is repeated for every image at every layer, and finally, the outputs of the last layer become the results of the inference. Hence, if there are errors in the weights, the faulty results can accumulate across multiple layers, resulting in a significant accuracy loss.
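This accumulation effect can be illustrated with a toy fully connected forward pass; the network shape and weight values below are hypothetical, chosen only to show how one corrupted weight in an early layer contaminates every later MAC result:

```python
def forward(x, layers):
    """Minimal fully connected forward pass: each layer is a list of
    per-neuron weight vectors combined by multiply-accumulate (MAC)."""
    for weights in layers:
        x = [sum(xi * wi for xi, wi in zip(x, w)) for w in weights]
    return x

# Hypothetical 3-layer toy network with 2x2 weight matrices.
clean = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(3)]
faulty = [[w[:] for w in layer] for layer in clean]
faulty[0][0][0] = 2.0 ** 20   # one exponent-bit-flip-like corruption in layer 1

print(forward([1.0, 1.0], clean))   # [1.0, 1.0]
print(forward([1.0, 1.0], faulty))  # the corrupted weight dominates every later layer
```

One inflated weight in the first layer drives all subsequent activations to huge values, which is why the paper targets weight errors before inference completes.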
Most previous schemes that focus on high-performance or low-power system architectures do not consider data vulnerability, relying instead on CNN error robustness. Although several studies have provided error protection for CNN inference, they are usually based on retraining, which is not suitable for small and timing-sensitive IoT/embedded systems because of the large resource requirements. Therefore, we propose a cost-effective CNN weight error correction scheme, MATE, which requires neither additional memory area nor retraining to retain CNN accuracy, based on the following two key observations.
id="s3a"First, all exponent bit patterns in the CNN weights were closely related to the CNN accuracy. Table 1 shows the number of weights in terms of the upper three-bit patterns of the exponent in Resnet50 (the 28^{th}, 29^{th}, and 30^{th} bits in Fig. 1). The results show that most weights (more than 99.97%) have the same bit pattern,
| Bit pattern | Weights | Weights (%) | Biases | Biases (%) |
|---|---|---|---|---|
| 000 | 404 | 0.00178% | 0 | 0% |
| 001 | 268 | 0.00118% | 0 | 0% |
| 010 | 6,913 | 0.03041% | 0 | 0% |
| 011 | 22,726,431 | 99.9664% | 20,483 | 86.35% |
| 100 | 0 | 0% | 3,237 | 13.64% |
| 101 | 0 | 0% | 0 | 0% |
| 110 | 0 | 0% | 0 | 0% |
| 111 | 0 | 0% | 0 | 0% |
| Except 011 | 7,585 | 0.03336% | 3,237 | 13.64% |
| Total | 22,734,016 | 100.0000% | 23,720 | 100.00% |
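The dominance of the 011 pattern follows directly from the FP32 encoding: every magnitude in [2^{−31}, 2.0) has an exponent value in [96, 127], whose upper three bits are 011. A short sketch (with a hypothetical weight sample) makes this checkable:

```python
import struct
from collections import Counter

def upper3_exponent(weight: float) -> str:
    """Return bits 30..28 of the FP32 encoding (the top three exponent bits)."""
    bits = struct.unpack("<I", struct.pack("<f", weight))[0]
    return format((bits >> 28) & 0b111, "03b")

# Hypothetical sample: magnitudes in [2^-31, 2.0) always yield pattern 011,
# which is why trained CNN weights cluster so heavily on it.
weights = [0.037, -0.5, 1.25, -0.004, 0.9]
print(Counter(upper3_exponent(w) for w in weights))  # Counter({'011': 5})
```

The sign bit does not participate in bits 30..28, so negative weights fall into the same bucket as their positive counterparts, matching the skew in Table 1.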
Second, not all mantissa bits in the CNN weights are closely related to the CNN accuracy. Fig. 4 shows the relationship between the accuracy loss and the mantissa bit reduction. As shown in this figure, the accuracy remains approximately the same as the ideal even when 18 of the 23 mantissa bits are removed. Even with only four mantissa bits, the average accuracy is 99.20% of the ideal. Extraction suffers the greatest accuracy loss, but is still within 5% of the ideal. Based on these two observations, our proposed scheme protects the sign bit, all the exponent bits, and only four or five of the 23 mantissa bits of the CNN weights.
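The reason truncation is so benign is that zeroing the lower 23 − k mantissa bits perturbs the significand by less than 2^{−k}, bounding the relative error of the weight by the same amount. A minimal sketch of this truncation (the function is our illustration, not the MATE encoder itself):

```python
import struct

def truncate_mantissa(weight: float, kept_bits: int) -> float:
    """Zero out the lower (23 - kept_bits) mantissa bits of an FP32 value."""
    bits = struct.unpack("<I", struct.pack("<f", weight))[0]
    bits &= 0xFFFFFFFF ^ ((1 << (23 - kept_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

w = 0.123456789
t = truncate_mantissa(w, 4)            # only 4 mantissa bits survive
assert abs(w - t) / abs(w) < 2 ** -4   # relative error below ~6.25%
```

Keeping four mantissa bits therefore caps the per-weight relative error at roughly 6.25%, which Fig. 4 shows the evaluated CNNs absorb almost without accuracy loss.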
By using the removed mantissa area as storage space for ECCs and error correction metadata, MATE effectively protects the weight data without any accuracy loss or modification to the conventional memory architecture. Fig. 5 shows a conceptual view of the proposed scheme. MATE requires three simple components in the memory controller: an encoder, an error checker, and a decoder. Note that the conventional DRAM architecture does not need to be modified. The MATE encoder encodes the raw weight data read from secondary storage to include error correction codes for a DRAM write operation (WR). ECCs are generated without considering the lower 19 mantissa bits and are then stored in the positions where those excluded 19 mantissa bits were located. Therefore, all the weights stored in the main memory have an encoded form. In contrast, the MATE error checker and decoder are used for a DRAM read operation (RD). The MATE error checker verifies data integrity and corrects any errors, and the MATE decoder then decodes the read weight data by reversing the encoding process.
To maximize the error protection efficiency, MATE divides the CNN weights into two groups, those whose upper exponent bits match the dominant 011 pattern and those that do not, and encodes each group differently.
First, when the FP32 datatype weight w[31:0] is loaded from secondary storage, the MATE encoder checks whether the upper three bits of the exponent (i.e., w[30:28]) are 011, the dominant pattern shown in Table 1.
Otherwise, if the upper three exponent bits are not 011, the weight is encoded with an alternative ECC arrangement, and flag bits are set so that the error checker can later identify which encoding was used.
Read operations require both error checking and data decoding. First, the MATE error checker reads the flag bits to determine which ECCs were used, and the corresponding ECCs then detect and correct errors in the protected bits.
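Of the two ECC families MATE draws on, TMR is the simplest to illustrate: a protected field is stored three times and recovered by a bitwise majority vote, so any single-copy error disappears. The field width and layout below are hypothetical, not MATE's exact bit assignment:

```python
def tmr_encode(bits: int, width: int) -> int:
    """Store three copies of a 'width'-bit field back to back (TMR)."""
    return bits | (bits << width) | (bits << 2 * width)

def tmr_decode(coded: int, width: int) -> int:
    """Recover the field by a bitwise majority vote over the three copies."""
    mask = (1 << width) - 1
    a, b, c = coded & mask, (coded >> width) & mask, (coded >> 2 * width) & mask
    return (a & b) | (b & c) | (a & c)

field = 0b1011                     # hypothetical 4-bit protected field
coded = tmr_encode(field, 4)
corrupted = coded ^ (1 << 5)       # flip one bit inside the second copy
assert tmr_decode(corrupted, 4) == field  # the single error is voted away
```

TMR triples the storage cost of whatever it protects, which is exactly why reclaiming 19 unused mantissa bits makes it affordable for the sign and exponent fields.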
After error correction, a decoding process is performed, which is the reverse of the encoding process. However, even after the error correction process completes, some erroneous bits may remain because the ECCs cannot correct all errors. In particular, if the defective bits are in the exponent part, the weight value differs significantly from the original value. Therefore, to enhance the weight data integrity, we perform post-processing on the exponent value. According to previous studies on weight quantization, more than 99.99% of weight values lie between -2.0 and +2.0 [7-9]; the MATE decoder therefore checks whether the weight value exceeds 512 and, if so, simply forces the 30^{th} bit to zero to yield a small weight value. Although this is not a true error correction process, it prevents significant CNN accuracy loss by avoiding suspiciously large weight values.
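A sketch of this safety net, assuming the check described above (magnitude over 512 triggers clearing bit 30); the function name and threshold handling are our illustration:

```python
import struct

def clamp_suspicious(weight: float) -> float:
    """Post-processing safety net: CNN weights overwhelmingly lie in
    [-2.0, +2.0], so a decoded magnitude above 512 is treated as a residual
    exponent error and bit 30 (the exponent MSB) is forced to zero,
    shrinking the value instead of letting it dominate the inference."""
    if abs(weight) > 512.0:
        bits = struct.unpack("<I", struct.pack("<f", weight))[0]
        bits &= ~(1 << 30) & 0xFFFFFFFF  # clear the exponent MSB
        weight = struct.unpack("<f", struct.pack("<I", bits))[0]
    return weight

print(clamp_suspicious(0.75))     # plausible weight: unchanged
print(clamp_suspicious(2.0**30))  # exponent MSB cleared -> tiny value (2^-98)
```

Clearing the exponent MSB maps a suspicious 2^{30} down to 2^{−98}: still wrong, but small enough that the MAC accumulation is barely disturbed.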
To verify the reliability of the CNNs, we injected bit errors into the weights on the Darknet framework [6] according to the given BERs. All the CNNs were pre-trained with the ImageNet 2012 [1] training dataset, and all experiments used the ImageNet 2012 validation dataset to measure the CNN accuracy. An Intel Core i7-6850K CPU and an NVIDIA GeForce TITAN X Pascal GPU were used for the simulation. The experimental results are shown in terms of the normalized accuracy, which is the ratio of the accuracy with erroneous weights to the accuracy with ideal error-free weights. For example, if the ideal error-free accuracy is 60% and the accuracy obtained with erroneous weights is 40%, the normalized accuracy is 66.7%.
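The fault model can be sketched as independent per-bit flips at the target BER; this is a simple uniform bit-error model of our own, and the paper's exact injection procedure inside Darknet may differ:

```python
import random
import struct

def inject_errors(weights, ber, seed=0):
    """Flip each of the 32 bits of every FP32 weight independently with
    probability 'ber' (uniform bit-error model, seeded for reproducibility)."""
    rng = random.Random(seed)
    corrupted = []
    for w in weights:
        bits = struct.unpack("<I", struct.pack("<f", w))[0]
        for pos in range(32):
            if rng.random() < ber:
                bits ^= 1 << pos
        corrupted.append(struct.unpack("<f", struct.pack("<I", bits))[0])
    return corrupted

clean = [0.125, -0.5, 1.5]                 # FP32-exact sample values
assert inject_errors(clean, 0.0) == clean  # BER 0 leaves weights untouched
```

At a BER of 10^{−4}, a network with ~23M FP32 weights (roughly 7.4 × 10^{8} bits) expects on the order of 70,000 flipped bits per load, which is why unprotected accuracy collapses.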
Before confirming the CNN reliability with MATE, we first conducted an experiment to verify the error robustness of the nine CNNs. Previous studies [10-13] reported that a CNN has some error robustness, and thus its accuracy does not decrease further under a tolerable number of faults. However, as shown in Fig. 7, which plots the normalized average accuracy for BERs from 10^{−10} to 10^{−5}, all CNNs lose accuracy without a proper error protection scheme. Every CNN gradually loses accuracy as the BER increases, and inference begins to fail at a BER of 10^{−7}, except for Darknet19 and VGG-16. Resnet152 was the most error-sensitive CNN and started to fail at a BER of 10^{−9}. Although the current JEDEC standard is a BER of 10^{−15} [14], we must consider the various environments in which safety- and/or timing-critical systems operate, as well as the trend of increasing memory vulnerability. Fig. 8 shows the accuracy losses of the nine CNNs after adopting the proposed MATE, for BERs starting from 10^{−8}. On average, MATE preserves almost the same accuracy up to a BER of 10^{−5} and shows only a 1% accuracy loss at a BER of 10^{−4}. Extraction has the highest accuracy loss, but it is still less than 2%. Even with an extremely high BER of 10^{−3}, the proposed error protection scheme maintains an inference accuracy of approximately 60%.
For state-of-the-art CNNs, having more layers does not mean having more weights. Current CNN designers increase the number of convolutional layers while using fewer weights for better training performance. However, this approach worsens the error robustness of the weights, and thus makes CNN weight protection more important. Hence, MATE can be an effective solution for the reliability of future CNN inference. In addition, low-power DRAM architectures, which reduce DRAM power consumption by scaling down the data sensing voltage or time (tRC), have a much higher BER than conventional DRAM. Because the increased BER is one of the major obstacles to further scaling [4], the high fault tolerance of MATE enables the use of low-power DRAM with significantly reduced timing or voltage parameters.
As shown in Fig. 3, the competitive scheme, St-DRC [5], uniformly replaces the exponent bits with a single predefined bit pattern, and therefore loses accuracy much faster than MATE as the BER increases.
Since the 1990s, reliability has been one of the main research areas in the field of CNNs. In the early years, most studies were conducted at the algorithm level. Bolt [10] summarized various algorithm-level studies and classified the factors potentially affecting the CNN accuracy. After the 2000s, CNN reliability became an important issue at the system and architecture levels. Pullum et al. [15] discussed the importance of risk analysis for neural networks and emphasized the importance of a reliable CNN for autonomous spacecraft flight systems. Mittal [16] summarized architecture-level studies conducted in the 2010s. Li et al. [17] classified several factors that affect CNN accuracy and proposed optimized error correction techniques for each factor.
Most previous studies have exploited the error robustness of CNNs to improve performance or energy savings through approximate computing, such as quantization. However, these approaches are impractical for upcoming timing-critical or resource-limited systems because of their accuracy losses and retraining overhead. In this paper, we proposed MATE, a low-cost error protection technique for CNN weights. Based on several bit-level observations, MATE utilizes insignificant mantissa bits to store ECC bits. The removed mantissa bits do not affect the CNN accuracy, and by using the freed space, conventional DRAM can be used without any modification. Experimental results show that MATE preserves the ideal inference accuracy up to a BER of 10^{−4} for nine state-of-the-art CNNs without retraining.
Myeungjae Jang received his master’s degree from the School of Computing, Korea Advanced Institute of Science and Technology (KAIST). He is currently pursuing his Ph.D. degree with KAIST. His current research interests include computer architecture for machine learning and neural processing unit.
Jeongkyu Hong received the B.S. degree from the College of Information and Communications, Korea University, in 2011, the M.S. degree from the Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), in 2013, and the Ph.D. degree from the School of Computing, KAIST, in 2017. He has been with the Department of Computer Engineering, Yeungnam University, since 2018, as an Associate Professor. His current research interests include the design of low-power, reliable, and high-performance processor and memory systems.