Journal of information and communication convergence engineering 2024; 22(1): 70-79
Published online March 31, 2024
https://doi.org/10.56977/jicce.2024.22.1.70
© Korea Institute of Information and Communication Engineering

Young-Su Chung¹ and Nam-Ho Kim²*, Member, KIICE
¹Department of Intelligent Robot Engineering, Pukyong National University, Busan 48513, Republic of Korea
²School of Electrical Engineering, Pukyong National University, Busan 48513, Republic of Korea

Correspondence to: Nam-Ho Kim (E-mail: nhk@pknu.ac.kr, Tel: +82-51-629-6328)
School of Electrical Engineering, Pukyong National University, Busan 48513, Republic of Korea
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Bad weather conditions such as haze lead to a significant lack of visibility in images, which can affect the functioning and reliability of image processing systems. Accordingly, various single-image dehazing (SID) methods have recently been proposed. Existing SID methods have introduced effective visibility-improvement algorithms, but they do not reflect the image's perspective and thus tend to distort the sky area and nearby objects. This study proposes a new SID method that reflects the sense of space by defining the correlation between image brightness and haze. The proposed method defines the haze intensity by calculating the airlight brightness deviation and sets the weight factor of the depth map by classifying images, based on the defined haze intensity, into images with a large sense of space, images with dense haze, and general images. Consequently, it emphasizes the contrast of nearby objects where haze is present and naturally smooths the sky region to preserve the image's perspective.
Keywords Haze concentration, Dehazing, Single-image dehazing, Depth map
Systems based on outdoor image processing are used in several fields, including intelligent transportation systems, outdoor object recognition systems, and remote sensing systems. However, unlike normal images, images obtained during harsh weather conditions containing atmospheric particles such as haze and smog exhibit a significant lack of visibility owing to light scattering and absorption. These hazy images may hinder the execution of image processing systems and reduce their reliability [1,2]. Therefore, single-image dehazing (SID) methods are required to restore outdoor images obtained during bad weather.
Early dehazing algorithms used traditional image processing methods based on histogram equalization (HE) [3] or algorithms that dehaze the same scene by capturing images with different degrees of depth and polarization [4]. However, HE-based methods do not reflect the physical characteristics of haze; moreover, methods that use multiple images have limitations that require additional research to overcome. Accordingly, SID has been actively studied in recent dehazing research as an alternative to HE [2].
As a classical SID method, He et al. [5] proposed the dark channel prior (DCP), based on the observation that haze-free regions generally contain color channels of very low intensity. Because the DCP can remove haze simply and effectively, several improved algorithms [6-8] based on it have been proposed. In particular, guided image filtering (GIF) [6] applies a guided filter to the transmittance map of the DCP to effectively preserve image quality in hazy images, whereas weighted GIF (WGIF) [7] incorporates edge-aware weighting into GIF to prevent the halo effect and preserve edge regions. Additionally, Tarel et al. [8] proposed fast visibility restoration (FVR) using a modified median filter to reduce the long soft-mapping time of the DCP. In recent years, SID has also been studied using machine-learning models. For example, a color attenuation prior (CAP)-based method [9] learns a depth map as a linear model of hue, saturation, and value (HSV) through training, thereby producing results without noise. However, the CAP method slightly lowers image visibility owing to soft edge processing; ICAP [10] compensates for this by proposing a new depth map through a secondary mode. Dana et al. [11] proposed an efficient depth-map estimation method by defining haze lines within color clusters, and Ngo et al. [12] proposed a machine-learning approach based on the quad decomposition method and the Batcher network to compensate for background noise and color distortion. The SID methods developed to date have demonstrated excellent performance in terms of visibility and edge preservation; however, dehazing according to the haze depth remains difficult, resulting in an excessive increase in saturation and contrast in light-haze regions. Furthermore, dehazing the entire image causes severe color distortion in the sky region and disturbs the perspective of natural images.
In this study, we set up a depth map based on pixel brightness to preserve the visibility and perspective in hazy images. We used the correlation between the value (V) and depth of the haze in the HSV color space to describe the spatial sense of the hazy image and calculated the deviation in V of the airlight to define the haze concentration (HC). Based on the defined HC, a weight map was designed to preserve the sky and light-haze regions; this map was applied to the depth map defined by the CAP for dehazing. The dehazing method proposed in this study provides highly visible and natural results by emphasizing the contrast of nearby objects in the image while softening the sky region.
The remainder of this paper is organized as follows. Section 2 describes related research, and Section 3 describes the proposed method. Section 4 presents the experimental results and analysis and Section 5 presents our conclusions.
Numerous studies have used the atmospheric scattering model proposed by Narasimhan and Nayar [13] for dehazing [5-7,9]. The atmospheric scattering model [13] defines the hazy image to be dehazed, I, as follows:

I(x) = J(x)t(x) + A(1 − t(x)),    (1)
where x is the spatial position of the pixel in the image, J is the scene brightness of the image without haze, A is the airlight, and t is the transmittance. t is generally defined as follows:

t(x) = e^(−βd(x)),    (2)
where β denotes the atmospheric scattering coefficient and d denotes the depth of the scene. Here, d(x) has a large value close to ∞ when the medium is extremely far away from the observer. In this case, as per Eq. (2), t has a small value close to zero; therefore, A has a value similar to I [5], which can be shown by the following equation:

I(x) ≈ A, when d(x) → ∞.    (3)
To estimate A using Eq. (3), we need to know the position of a sufficiently large d(x). Generally, as the haze region is located far away from the observer, it has a sufficiently large d(x) [2]; based on this, we can estimate the depth map of the scene to calculate A.
Furthermore, if d is known, t can be calculated using Eq. (2). Based on this, the SID methods [5-7,9] estimate A and t or d, substitute them into Eq. (1), and output a restored image of the original J.
The CAP [9] describes the relationship between brightness and saturation in a hazy image based on statistical and atmospheric scattering model equations. According to this algorithm, the denser the haze, the stronger the effect of the additional airlight, resulting in a decrease in saturation and an increase in brightness. Because the haze concentration is affected by the scene depth d(x) according to Eq. (2), the brightness and saturation of the hazy image have the following correlation:

d(x) ∝ c(x) ∝ v(x) − s(x),    (4)
where c(x) denotes the concentration of haze, and v(x) and s(x) represent the brightness and saturation of the scene in the HSV color space, respectively. Based on this correlation, the CAP [9] proposes a new linear model dc(x), learned through machine learning, that represents the depth map in terms of v(x) and s(x).
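A plausible reconstruction of this linear model, written here with a negative saturation term as in the original CAP [9] (the exact sign convention used by the authors may differ), is

dc(x) = a0 + a1·v(x) − a2·s(x) + ε(x).    (5)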
Here, a0, a1, and a2 are parameters, and ε(x) is a variable representing the model's random error. As the parameters can vary from image to image, Zhu et al. [9] built a training set of 500 hazy images with corresponding scene depths to generalize Eq. (5). The scene depth and airlight were measured by matching 500 clear images to the depth map of Eq. (5) with random brightness and saturation inputs, and the parameters a0, a1, and a2 were estimated from this training set using maximum-likelihood estimation (MLE) in a supervised learning framework, so that Eq. (5) becomes a generalized depth equation in terms of brightness and saturation. The parameters estimated through this process were a0 = 0.12178, a1 = 0.95971, and a2 = 0.780245.
Moreover, ε(x) can be modeled using a Gaussian distribution N(0, σ²) with a mean of 0 and a standard deviation of σ. Zhu et al. [9] expressed σ using the parameters a0, a1, and a2 as follows:
Here, using the values a0, a1, and a2, σ was calculated as 0.041337.
Fig. 1 illustrates the process of extracting A. According to Eq. (3), A has a value similar to I(x) in regions of extremely dense haze, where d(x) is large. Accordingly, many existing studies have estimated the airlight A from I(x) following the definition of the atmospheric scattering model in Eq. (3) [14]. A well-known and effective method is to detect the airlight A from a set of the brightest high-level pixels [1-2,5-7,9]. For example, He et al. [5] detected the brightest pixel in the local area of the DCP, and Zhu et al. [9] detected a set of the brightest pixels in the depth map defined in Eq. (5). The proposed method also detects the airlight A in the HSV space using the depth map of the CAP [9].
Fig. 1(a) shows the depth map dA used to extract A; A corresponds to the largest value (brightest pixel) in this depth map.
Because A is calculated by sorting pixels, as in Eq. (7), it can be determined from the variables V(x) and S(x) at each pixel position in Eq. (5). Therefore, the following equation was applied by approximating a1 and a2 in Eq. (5) as 0.95 and 0.7, respectively:
Moreover, the depth map dA may contain objects brighter than the haze region, such as the t-shirt shown in Fig. 1. Therefore, a 7×7 window is used to perform median filtering of the depth map dA [15, 16]. Fig. 1 (b) shows the median filtering results for dA. Because A is located at the brightest pixel of dA, we define A using the sorting matrix
where w and h are the width and height of the entire image I, respectively, and xA is the spatial coordinate where A is located. Specifically, xA is the position of the top 0.1% of pixels in dA, and A is taken from I(x) at that position, where the haze is densest. The I(x) corresponding to A is marked by the dot in Fig. 1(c).
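As an illustration only, the airlight-selection step described above can be sketched as follows; this is not the authors' code, and the helper name, the averaging of the candidate pixels, and the use of SciPy's median filter are assumptions:

```python
import numpy as np
from scipy.ndimage import median_filter

def estimate_airlight(I, dA, top_ratio=0.001):
    """I: HxWx3 hazy image in [0, 1]; dA: HxW depth map used for airlight detection."""
    dA_med = median_filter(dA, size=7)              # 7x7 median filter suppresses bright objects
    n_top = max(1, int(top_ratio * dA_med.size))    # top 0.1% brightest depth-map pixels
    flat = np.argpartition(dA_med.ravel(), -n_top)[-n_top:]
    ys, xs = np.unravel_index(flat, dA_med.shape)
    return I[ys, xs].mean(axis=0)                   # average colour of the candidate pixels
```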
Fig. 2 shows the effect of dehazing based on the haze deviation H and HC.
Generally, the haze concentration increases with distance, as described in Section 2 and shown in Fig. 2(a), where all regions are affected by haze. However, most conventional filters apply the same equation to the entire haze region [5-9], which leads to the problems shown in Fig. 2(d) and (e): either residual haze remains, as in (d), or the contrast in the short range becomes excessive, as in (e).
Therefore, we calculate the deviation of V and airlight to estimate the effect of the haze added to each pixel and define HC. As shown in Fig. 2 (f), the proposed method improves visibility while preserving close objects.
Fig. 2 (b) shows the brightness V of the HSV space shown in Fig. 2 (a). In (b), the bright region represents the added haze and the black dot at the top represents V(xA), the brightness value of A.
As mentioned above, the airlight A, which lies at a large distance, corresponds to the set of the brightest high-level pixels of the scene [1-2,5-7,9]. That is, the thicker the haze, the smaller the difference between the brightness V(x) of a scene point and the brightness V(xA) of A. Thus, we calculate the haze deviation H(x) from this difference.
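Consistent with the behavior described below, this deviation plausibly takes the form

H(x) = |V(xA) − V(x)|,    (9)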
where xA is the spatial coordinate of A, and H is as shown in Fig. 2(c). According to Eq. (9), H(x) has higher values in closer regions and lower values in farther regions. In Fig. 2(c), the road takes lower values the farther away it is, whereas the closest object, the bicycle, appears brightest (white).
However, because Eq. (9) is defined according to the general airlight assumption, it can generate errors in scenes containing very bright objects; for example, Fig. 2(c) shows a very low value for the yellow clothes. Therefore, this study combines the DCP assumption [5] that, in haze-free pixels, at least one of the three RGB channels has a value close to 0, with the airlight assumption [1-2,5-7,9] that the haze area comprises a set of the brightest pixels, to obtain the maximum and minimum values of H(x); the difference between them is defined as the haze concentration (HC).
HC is calculated using the maximum and minimum values of H as shown below:

HC = max(H) − min(H),    (10)
where max(H) and min(H) denote the H(x) values of the least and most dense haze, respectively. According to Eq. (10), HC represents the difference between the pixel with almost no haze and that with heavy haze, that is, the spread of the haze concentration across the scene.
Images convey a sense of space through their far and near fields; in hazy images, the far field shows denser haze than other objects, often resulting in images in which only the far field is affected by haze. Owing to this sense of space, the HC has a large value in distant-view and normal images, thus degrading the image reconstruction performance. As an exception, however, an image with extremely dense haze, in which even close objects are heavily affected by the haze, shows a low HC. In other words, HC tends to decrease gradually in the order of distant-view, general, and dense images.
Although an actual haze dataset that can accurately determine haze intensity is necessary to verify the accuracy of HC calculations, no study has yet suggested such a dataset. Therefore, this study compares the results of the Synthetic Objective Testing Set (SOTS) of RESIDE (REalistic Single Image DEHazing) [17] and Dense-HAZE [18].
SOTS provides indoor and outdoor composite haze image data. This study uses indoor SOTS data as images of general density because they provide 50 scene images of haze density intensifying in levels 1-10. Moreover, Dense-HAZE provides a dataset of 55 actual dense haze images, enabling the checking of the HC characteristics in thick haze images. The HC results for each dataset are presented in Fig. 3.
Fig. 3 (a) shows a visualization of 500 indoor datasets by averaging 50 images of levels 1-10. Indoor refers to a hazy image with a low density, whereas Dense-HAZE refers to a high-density hazy image. Fig. 3 shows the HC trends for these images.
Fig. 3(a) shows the tendency of HC to decrease as the haze density increases, matching the HC characteristics described above. Moreover, the HC calculation results for the indoor images show that all datasets exceeded 0.4; accordingly, the mean HC values in Fig. 3(a) are all 0.68 or higher. By contrast, for dense haze, all images except one show values of 0.6 or less. These results confirm that HC tends to be lower in images with higher haze concentration.
In this study, hazy images were classified into three categories according to the HC for dehazing in terms of preserving the sense of space, as shown in Fig. 4, where (a) is a scene with haze only in the far field, (b) is a scene with severe haze, and (c) is a normal scene. The haze deviation in Fig. 4 shows the H map, where the two dots represent min(H) and max(H), respectively. They are represented visually and numerically in terms of deviations for easy comparison.
When the deviations shown in Fig. 4 were substituted into Eq. (10), Cases 1, 2, and 3 showed HC = 0.8095, 0.4431, and 0.63918, respectively. Case 1 shows a high value of max(H) and the largest HC because the object has almost no haze. In contrast, Case 2, with dense haze in all regions, has the lowest HC, with max(H) showing a dark value of 0.5 or less. Case 3 shows a value between those of Cases 1 and 2.
We classified hazy images according to their HC and designed a weight map G that reflects this classification. G is initialized as a zero matrix, and weights are applied to it only through H(x).
First, data statistics were used to set a reference value for the HC. In Fig. 3, 54 of the 55 Dense-HAZE [18] scenes (all except one) exhibit HC values of 0.6 or less. Based on these results, the dense haze images from Dense-HAZE and LIVE [19] were used to set the reference value for actual dense haze images. LIVE is a dataset of actual hazy images of various densities and sizes, from which 45 dense hazy images were visually selected.
The HC calculation for the dense haze datasets of Dense-HAZE and LIVE showed that 10 of the 100 images had values of 0.8 or higher and 91 had values of 0.6 or lower. Analysis of the images with values of 0.8 or higher showed that nine of these exceptional photos contained close-range objects with very little haze, generating a large deviation between the dense-haze and nearly haze-free regions.
Second, clustering was performed to determine the criteria for images with large deviations. Similar to the above statistical method, the HC values of all 465 actual haze images from LIVE were calculated, and the k-means clustering algorithm [20,21] was implemented to classify them into two groups. The medians of the two classified clusters were 0.86362 and 0.5660, respectively, and the two clusters were classified into images with high deviation and normal images. These measured results were similar to the reference values obtained in the first step.
Therefore, based on the results of the two steps, the criterion was set to 0.8 or more for images with high haze deviation and 0.5 or less for images with dense haze. Here, the reference value of 0.5 tightens the dense-haze criterion to prevent excessive visibility enhancement in some general images.
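A minimal sketch of this three-way classification, using the thresholds stated above (the function and case names are placeholders, not the authors' implementation):

```python
def haze_concentration(H):
    """Haze concentration: spread between the least and most hazy pixels of the H map."""
    return float(H.max() - H.min())

def classify_by_hc(H, high_dev=0.8, dense=0.5):
    hc = haze_concentration(H)
    if hc >= high_dev:
        return "Case 1"   # large haze deviation: haze only in the far field
    if hc <= dense:
        return "Case 2"   # dense haze over the whole scene
    return "Case 3"       # normal hazy image
```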
For an image with a high haze deviation (Case 1), the distant regions close to the sky and the regions close to the camera must be preserved to retain the spatial information. This is achieved by defining G(x) as H(x) excluding the regions to be preserved, according to Eq. (11).
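Given the definitions of θ1 and θ2 below, Eq. (11) plausibly masks H as

G(x) = H(x) for x ∉ θ1 ∪ θ2, and G(x) = 0 otherwise,    (11)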
where θ1 and θ2 represent the near and far regions that need to be preserved.
In Case 2, the entire image, except for the airlight region, is dehazed because high reconstruction performance is required even for close-distance regions. G in Case 2 is calculated by specifying the airlight region to be preserved through min(H) as follows:
If extremely dense haze is observed in Case 2, G can cause noise in the image because a weight is assigned to the haze. Therefore, we compensate for this by multiplying Eq. (13) by HC.
In normal images (Case 3), the degradation of the close-distance region is the main concern; therefore, this region is specified as the region to be preserved. G in Case 3 is set as follows:
where β is the scattering coefficient in Eq. (2), which is set using μ(H), the mean value of H. Eq. (15) can be applied to any hazy image, and in Case 3, β has a value of 1. Therefore, it is used as the correction constant for G. The scene depth D(x) reflecting the weight G(x) is given by:
where k is the sensitivity constant. Fig. 5 shows the calculated G map, the depth map D, and the dehazing results for different values of k.
The results in Fig. 5 show that Eq. (16) is useful only when the haze is extremely dense, as in Case 2; in fact, Case 2 showed excellent visibility for most values of k, whereas Cases 1 and 3 exhibited darkening of the preserved regions, resulting in image losses, as shown by the magnified images. The cause of this problem can be identified by comparing Cases 1 and 2 in Fig. 5 (a) and (e), respectively. For images that require strong dehazing, such as those in Case 2, emphasizing the G region using Eq. (16) increases the visibility. However, for Case 1, the additional emphasis on G causes a loss of image quality, because it has already been dehazed by dc. Therefore, the depth maps in Cases 1 and 3 were estimated by calculating 1−G, which is the opposite of Case 2. Because 1−G represents the preserved region as opposed to G, we used it to calculate the depth maps for Cases 1 and 3 as follows:
where k1 and k2 denote the sensitivities of the weights. We set k1 as positive and k2 as negative to ensure that the preserved region is not overly affected even if V(x) increases.
Fig. 6 shows the dehazing results for Case 1.
The results in Fig. 6 show that, in Case 1, the image of the close-distance foreground is preserved, and the visibility increases slightly in the far field.
We have estimated the airlight A, the atmospheric scattering coefficient β, and the depth through the above-described process. To reconstruct J, we need the transmittance t, which is obtained by substituting β and the depth into Eq. (2). Eq. (18) expresses the atmospheric scattering model of Eq. (1) solved for J [13]:

J(x) = (I(x) − A) / t(x) + A,    (18)
where t(x) is the transmittance computed from Eq. (2) using the estimated β and scene depth.
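As a simplified illustration (not the authors' implementation), the final restoration step could look as follows; the lower bound on the transmittance is a common safeguard and an assumption here:

```python
import numpy as np

def restore(I, A, D, beta, t_min=0.1):
    """I: HxWx3 hazy image in [0, 1]; A: airlight (3,); D: HxW scene depth; beta: scattering coefficient."""
    t = np.exp(-beta * D)              # transmittance from Eq. (2)
    t = np.clip(t, t_min, 1.0)         # lower bound avoids division by near-zero transmittance
    J = (I - A) / t[..., None] + A     # invert the scattering model as in Eq. (18)
    return np.clip(J, 0.0, 1.0)
```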
To verify the dehazing performance of the proposed method, we compared it with the GIF [6], WGIF [7], FVR [8], and CAP [9] methods. The test platform comprised a desktop computer with a 7.2 GHz CPU (AMD Ryzen 7) and 16 GB of memory, and the experiments were performed on MATLAB R2023a. In the experiments, the parameter of the proposed method was set as k = 0.7 for Case 2. For Cases 1 and 3, respectively, k1 was set to 0.5 and 0.3, and k2 to −0.7 and −0.4. For comparison, the parameters in Eq. (5) of the CAP [9] used a0 = 0.12178, a1 = 0.95971, and a2 = 0.780245, which are the parameters generated by the author.
Both visual and objective evaluations were performed for the proposed method and conventional methods.
Fig. 7 shows the visual performance of each method with the LIVE [19] dataset by dividing it into Cases 1, 2, and 3 according to the haze concentration in the image. Cases 1, 2, and 3 show images with haze only in the far-field, images with dense haze concentrations, and normal images, respectively. Normal images are classified into three categories, shown in sub-figures (c), (d), and (e) in Fig. 7, to evaluate the dehazing performance for cases where there is a sky region and bright objects.
The results in Fig. 7 show that GIF produced dark results for all images, with halo effects in (b) and (e) and significant noise in the sky in (b) and (d). WGIF, an enhanced version of GIF, showed higher contrast than the other methods but lost close-object information in (c) and (d) owing to an excessive increase in saturation; noise also occurred in the sky regions, as shown in (b) and (d). Among the conventional methods, FVR showed the best results in Fig. 7(a), and the CAP exhibited the best performance in Fig. 7(b)-(d). However, FVR produced more unnatural results than the proposed method owing to excessive edge emphasis and loss of perspective. Moreover, CAP left haze in (a), (b), and (c) owing to weak dehazing and showed a strong halo effect, especially in (e).
Furthermore, the proposed method produced the most natural results for all images. Color and spatial senses were restored successfully in (a), and visibility was improved in the images in (b) and (c) while preserving close-distance objects. Furthermore, in cases (b) and (d), the sky region was best preserved, and the entire image was reconstructed without noise. Finally, in (e), which has a bright object, the proposed method yielded the most natural results with the lightest halo effect.
We used the O-HAZE [22] and I-HAZE [23] datasets as test images for quantitative evaluation. O-HAZE and I-HAZE provide real-world hazy images and the corresponding ground-truth (GT) images, captured using professional haze machines, for outdoor and indoor scenes, respectively. Therefore, they can be used to calculate the quantitative evaluation metrics of peak signal-to-noise ratio (PSNR) [24] and structural similarity index measure (SSIM) [25].
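For reference, these metrics can be computed with scikit-image as sketched below; the evaluation helper shown is an assumption, not the authors' code:

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(dehazed, ground_truth):
    """Both inputs are float RGB images in [0, 1] of identical size (scikit-image >= 0.19)."""
    psnr = peak_signal_noise_ratio(ground_truth, dehazed, data_range=1.0)
    ssim = structural_similarity(ground_truth, dehazed, data_range=1.0, channel_axis=-1)
    return psnr, ssim
```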
Fig. 8 shows the ground-truth and hazy images of the datasets along with the corresponding dehazing results. Unlike normal hazy images, the hazy images in Fig. 8 have a bluish tint owing to the artificial fog spraying. Consequently, the road, windows, and walls in images (a), (b), and (c) are recognized as bright objects, which demands high performance from the airlight extraction.
The proposed method yielded the best results when the resulting images were compared, as shown in Fig. 8. GIF and WGIF produced very dark close-distance objects owing to excessive dehazing, and all conventional methods except FVR showed poor visibility, especially in (c). The FVR showed relatively good results; however, the edges were over-enhanced, resulting in a more severe haze in (a).
For the resulting images in Fig. 8, we calculated the PSNR and SSIM values for quantitative evaluation and compared them, as shown in Table 1.
Table 1. PSNR and SSIM results of various dehazing methods

| Image  | Method   | PSNR [dB] | SSIM |
|--------|----------|-----------|------|
| Test 1 | GIF      | 36.95     | 0.76 |
|        | WGIF     | 39.24     | 0.76 |
|        | FVR      | 27.40     | 0.67 |
|        | CAP      | 37.34     | 0.75 |
|        | Proposed | 39.65     | 0.78 |
| Test 2 | GIF      | 28.98     | 0.50 |
|        | WGIF     | 27.28     | 0.50 |
|        | FVR      | 38.73     | 0.53 |
|        | CAP      | 36.12     | 0.63 |
|        | Proposed | 43.45     | 0.68 |
| Test 3 | GIF      | 19.68     | 0.11 |
|        | WGIF     | 20.55     | 0.14 |
|        | FVR      | 36.11     | 0.27 |
|        | CAP      | 26.59     | 0.29 |
|        | Proposed | 40.91     | 0.43 |
As shown in Table 1, the CAP, with its relatively weak dehazing, performed well in terms of SSIM, whereas FVR performed well in terms of PSNR, indicating high visibility. Moreover, for Test 1, WGIF showed a PSNR very similar to that of the proposed method, but the proposed method performed better in terms of SSIM and in the visual comparison of Fig. 8.
Based on natural dehazing, the proposed method achieved the best results in both visual and quantitative evaluations.
In this study, we proposed a visibility enhancement method that reflects the sense of space for the natural reconstruction of hazy images. We defined the haze concentration (HC) using the inverse relationship between object distance and haze concentration in hazy images. We also set a weight map that reflects the perspective by dividing images into images with a large sense of space, images with dense haze, and normal images according to the defined HC. The resulting weight map was applied to the depth map defined by the CAP to preserve the sky region and close objects. Consequently, in the output of the proposed dehazing method, the contrast of close hazy objects is emphasized and the sky region is softened, providing a natural and highly visible result.
As a result of evaluating the performance of the proposed method, both objective and visual evaluations showed that the proposed method produced the most natural results; in particular, it successfully preserved the sky region and close objects.
Young-Su Chung
received her B.S. degree in Control and Instrumentation Engineering at the School of Electrical Engineering from Pukyong National University, Republic of Korea, in 2022. She is currently pursuing an M.S. at Pukyong National University. Her major research interests include image and signal processing.
Nam-Ho Kim
received his B.S., M.S., and Ph.D. degrees in electronics engineering from Yeungnam University, Republic of Korea in 1984, 1986, and 1991, respectively. Since 1992, he has been with Pukyong National University (PKNU), Republic of Korea, where he is currently a professor in the School of Electrical Engineering. From 2004 to 2006, he was Vice Dean of the College of Engineering, PKNU. His research interests include circuits and systems, high-frequency measurement, sensor systems, image and signal processing with wavelet and adaptive filters, and communications theory.