Regular paper

Journal of information and communication convergence engineering 2024; 22(1): 56-63

Published online March 31, 2024

https://doi.org/10.56977/jicce.2024.22.1.56

© Korea Institute of Information and Communication Engineering

Similar Image Retrieval Technique based on Semantics through Automatic Labeling Extraction of Personalized Images

Jung-Hee Seo *, Member, KIICE

Department of Computer Engineering, Tongmyong University, Busan 48520, Republic of Korea

Correspondence to: Jung Hee Seo (E-mail: jhseo@tu.ac.kr)
Department of Computer Engineering, Tongmyong University, Busan 48520, Republic of Korea

Received: May 10, 2023; Revised: October 20, 2023; Accepted: November 14, 2023

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Despite the rapid strides in content-based image retrieval, a notable disparity persists between the visual features of images and the semantic features discerned by humans. Hence, image retrieval based on the association of semantic similarities recognized by humans with visual similarities is a difficult task for most image-retrieval systems. Our study endeavors to bridge this gap by refining image semantics, aligning them more closely with human perception. Deep learning techniques are used to semantically classify images and retrieve those that are semantically similar to personalized images. Moreover, we introduce a keyword-based image retrieval approach, enabling automatic labeling of images in mobile environments. The proposed approach can improve the performance of a mobile device with limited resources and bandwidth by performing retrieval based on the visual features and keywords of the image on the mobile device.

Keywords: CBIR, Image Retrieval, Deep Learning, CNN, Feature Detect, etc.

I. INTRODUCTION

With the rapid expansion of communication technology and the widespread use of digital and mobile devices, the generation of images has surged exponentially in daily life. Consequently, most of the memory of personal digital devices is consumed by images. In addition, personalized image data are increasingly shared through offline and online methods, such as websites and social media. This surge in image generation has led to increased interest among researchers in image retrieval methods.

Effective algorithms for content-based image retrieval (CBIR) have been developed through research over the past several years. Image-based query retrieval holds promise for efficient image retrieval and finds applications across various fields within computer vision and artificial intelligence. Studies have also been conducted on CBIR using big data and deep learning techniques.

Conventional image retrieval techniques are used in various fields, including facial recognition [1,2], iris recognition [3], person identification [4], searching for clothing or other products [5], searching for food and groceries [6], and fingerprint recognition [7,8].

Earlier studies on image-retrieval systems primarily relied on text-based frameworks, with image retrieval being conducted through these methods. CBIR emerged later and its focus shifted toward automatic image annotation. Many CBIR systems use these methods or a combination of them to reduce semantic differences between images [9].

Text-based image retrieval (TBIR) methods, however, pose challenges because users must manually input keywords, and there are limitations in how well retrieval results align semantically with these keywords. By contrast, CBIR can solve the problem of text retrieval at a more fundamental level by utilizing a computer’s visual processing capability [10].

CBIR employs feature extraction and matching to classify images based on human semantic perspectives. Feature extraction in CBIR is the first step of image retrieval. It is invariant to image scaling and rotation and is partially invariant to illumination changes. Additionally, feature extraction in CBIR is well-localized in both spatial and frequency domains, laying the foundation for accurate image retrieval, particularly for databases with numerous single features [11].

However, despite the rapid advancement of CBIR, there are significant differences between the visual features of images and semantic features recognized by humans. Therefore, image retrieval that associates semantic similarities recognized by humans with visual similarities is a difficult task for most image retrieval systems, and results in high memory usage and computational complexity owing to large-scale image processing.

CBIR systems excel at automatically extracting visual content from images using low-level features such as color or texture in image queries. However, users generally prefer to query images according to high-level concepts such as keywords [12].

For several years, CBIR has generated considerable interest among researchers in the imaging field, and many studies have focused on CBIR. However, most image-retrieval methods are susceptible to variations in color, texture, and shape, posing difficulties for content-based image retrieval.

The scale-invariant feature transform (SIFT) is a visual feature extraction method that transforms an image into a collection of local feature vectors. These features are invariant to translation, scaling, and rotation of the image, and are partially invariant to changes in illumination. Recently, researchers have proposed using SIFT to solve CBIR problems [13].
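
To make “a collection of local feature vectors” concrete, the short sketch below extracts SIFT keypoints and their 128-dimensional descriptors with OpenCV. This is only an illustration of the representation, not the implementation used in the cited studies, and the image file name is an assumption.

```python
# Illustration of SIFT local features with OpenCV (opencv-python >= 4.4).
# Not the implementation used in the cited works; the file name is assumed.
import cv2

img = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each keypoint yields one 128-dimensional descriptor that is largely
# invariant to translation, scaling, and rotation of the image.
print(len(keypoints), descriptors.shape)   # e.g., N keypoints, (N, 128)
```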

A notable disparity exists between the low-level visual features of an image and the semantic features recognized by humans. To overcome this issue, image classification using deep-learning technology has been investigated, and many achievements have been made through studies on deep-learning-based image classification. Deep learning has demonstrated exceptional performance in object detection and segmentation [14] and has been leveraged to identify similar images based on semantic similarities [9,15].

Convolutional neural network (CNN)-based image retrieval has developed rapidly in recent years owing to the limited expressive capability of existing feature representations and to innovations in image processing via deep neural networks. Given CNN’s substantial advancements in image classification, researchers have explored using pre-trained CNN models to conduct classification tasks in CBIR [5].

In this study, we subdivided images semantically using deep-learning techniques to retrieve images that were semantically similar to a personalized image. Additionally, we introduce a keyword-based image retrieval approach that automatically labels images in a mobile environment, enhancing semantics to align with human perception. Therefore, the gap between the content-based visual semantics of an image and the semantic features of an image recognized by humans can be reduced. Moreover, the memory consumption of digital devices can be reduced.

The paper is structured as follows: Section 2 discusses the studies related to conventional image retrieval. Section 3 reinforces the semantics of images and presents a technique for retrieving similar images based on keywords using automatic image labeling in a mobile environment. Section 4 provides implementation results and analysis, while Section 5 offers our conclusions.

II. RELATED STUDIES

The demand for cutting-edge technologies capable of efficiently processing vast amounts of data is ever-increasing, and CBIR stands out as a powerful method for retrieving diverse pictures and videos from extensive image databases [16]. CBIR enables the retrieval of relevant images from a database by utilizing the content of interest as the input image. This technique is used to search for similar products on e-commerce sites, such as Alibaba, Amazon, and eBay [5].

Image classification based on semantics provides a semantically classified hierarchical image database. Pandey et al. leveraged the benefits of such databases in their study and proposed a system that automatically assigns semantics to images through an adaptive combination of multiple visual features [15].

Existing image-retrieval methods include CBIR [5,16,17], symbol image representation [18], hash algorithms [19,20], CNNs for retrieving similar images [14,21], the SIFT [11,22], image semantics [5,9,15,23], geo-multimedia [13], and entropy-based retrieval [24]. These methods present various approaches for effective and efficient image retrieval.

Punitha et al. [18] proposed a method for representing symbolic images in a symbolic image database, ensuring invariance to image transformation and facilitating exact match retrieval.

Cheng et al. [19] primarily improved feature learning, loss functions, and learning methods to increase image-retrieval efficiency and to learn hash functions and hash codes more efficiently. They also proposed an adaptive asymmetric residual hash method based on a residual hash, an integrated network, and fast supervised discrete hashing.

Zhang et al. [14] utilized deep-learning techniques to construct a semantic database using a location estimation method based on semantic information.

Pandey et al. [9] developed content-based semantics and image retrieval systems tailored for semantically classified hierarchical image databases.

Munjal et al. [16] combined CBIR and TBIR support methods to structure a collection of photographs systematically. Their approach simplifies information collection and facilitates offline image retrieval through the automatic generation of text metadata.

Wang et al. [11] developed an effective content-based web image search engine using SIFT feature matching. SIFT descriptors capture the local features of an image, remaining invariant to scaling, translation, and rotation, while also exhibiting partial invariance to illumination changes and affine transformation. To reduce unusable feature matches, a dynamic probability function replaces the original fixed values when determining the similarity distances between the query and the database images built from the training images. Search performance is further improved by preprocessing the original images and saving their key points in XML format.

Wangming et al. [22] utilized Lowe’s SIFT properties, renowned for their unique local invariant characteristics in CBIR. The visual contents of the query and database images were extracted and described as a 128-dimensional SIFT feature vector using the CBIR system.

Weng et al. [21] introduced an effective framework leveraging convolutional neural architecture search (CNAS) to address diverse image classification tasks.

Li et al. [5] reviewed technological developments regarding image representation and database retrieval. They explained the practical applications of CBIR in fashion image retrieval, person reidentification, e-commerce product retrieval, remote-sensing image retrieval, and trademark image retrieval. Furthermore, they examined the challenges of big data and future research directions for deep learning.

III. Technique for Retrieving Similar Images Based on Semantics

CBIR is primarily performed using large-scale databases, and image-retrieval methods for small amounts of data are lacking. Because most CBIR methods require a large amount of data, collecting images for retrieval tasks is difficult and assigning labels is expensive, which places numerous limitations on the development of CBIR [5].

Hence, there is a need for a CBIR-based retrieval technique capable of identifying similar images using a small amount of data based on human semantics, similar to how humans can easily classify semantically similar images.

In this study, we visualized the results of retrieving similar images in a mobile environment and devised an effective search strategy. To this end, we added personalized images through transfer learning to subdivide semantics and perform image classification.

Using the trained model, we automatically extract labels from gallery images or images captured by the camera and save them as tag properties of the images. The goal is to concretize the semantics of the images by minimizing the gap between the semantics of the values stored in the tag properties of the images and the semantics of the images perceived by humans.

A. Learning Model for Classifying Images Based on Semantics

Semantics-based image classification plays a crucial role in retrieving similar images. To reinforce the visual semantics of an image, the semantics can be concretized in detail through transfer learning. Consequently, this study proposes an enhanced architecture based on the MobileNet CNN architecture.

High data accuracy and a low degree of overfitting must be maintained to improve the quality of the CNN model in deep learning. Hence, a large amount of training data is required. Nevertheless, as with the CBIR method, there are many practical limitations to collecting large amounts of training data. To address this issue, transfer learning is employed as a compensatory measure.

Fig. 1 shows the overall system procedure for retrieving similar images based on semantics.

Fig. 1. Process of the proposed similar image retrieval system

The Input Image Data of the Create Model Module in Fig. 1 consist of hierarchical nodes to which the semantic meanings of the images recognized by humans are assigned so that they can be used as inputs for the learning model. These nodes were then used to retrain a pretrained model with the newly added data.

The pretrained feature extractor employs MobileNet as the base model, comprising convolutional and pooling layers. It extracts visual features from a lower-level to a higher-level layer. MobileNet was already trained using the ImageNet dataset.

The classifier configuration incorporates a fully connected layer with dense and dropout layers, facilitating hierarchical classification. Fine-tuning involves retraining the added data using the pretrained model: training starts with a low learning rate, and the entire layer stack is fine-tuned while the learning rate is gradually increased, producing a new model for image classification. The TensorFlow Lite Model step converts the newly developed model into the TensorFlow Lite format, which includes metadata, enabling it to operate on a mobile device.
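
For concreteness, the following is a minimal sketch of how the Create Model Module could be assembled with TensorFlow/Keras. The class count, input size, layer widths, and fine-tuning learning rate follow Algorithm 1 below; the dataset objects, dropout rate, epoch counts, input scaling, and file name are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of the Create Model Module (TensorFlow/Keras).
# Class count, image size, layer widths, and learning rate follow Algorithm 1;
# everything else (dropout, epochs, file name) is an illustrative assumption.
import tensorflow as tf

NUM_CLASSES = 10          # number of hierarchical input-node classes
IMG_SIZE = (224, 224)     # input image size

# Pretrained feature extractor: MobileNet trained on ImageNet, top removed.
base = tf.keras.applications.MobileNet(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False    # frozen while the new classifier head is trained

# Classifier configuration: fully connected layers with dropout.
# Inputs are assumed to be scaled to [0, 1] beforehand.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(32, activation="relu"),               # input_layer
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),   # output_layer
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # train the head first

# Fine-tuning: unfreeze the base and retrain with the low rate from Algorithm 1.
# (The paper describes gradually adjusting the rate; a fixed 2e-5 is used here.)
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)

# Convert to TensorFlow Lite for the mobile app; label metadata can then be
# attached with the TensorFlow Lite Support metadata writer.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("semantic_classifier.tflite", "wb") as f:
    f.write(converter.convert())
```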

The embedded model within the mobile app module, as depicted in Fig. 1, runs the model that has been converted into the TensorFlow Lite format on the mobile device.

User Interface Design represents the design of the screen on the mobile device. Label Detection and Asset Semantic (Tag) takes the query image as input, extracts the label carrying the semantic meaning of the image, and assigns it to the tag property of the image. Similar Image Search based on Tag then retrieves similar images according to keywords through the tag properties.

B. Semantic-based Similar Image Retrieval

In this study, deep learning technology was employed to train a model for semantic image classification. The trained model was then converted into a TensorFlow Lite model to run on a mobile device. The model runs on the mobile device and automatically extracts labels from the personalized image. The extracted label is assigned to the tag property of the image, facilitating the retrieval of similar images based on keywords. Therefore, similar images can be retrieved according to keywords through tag properties that reinforce the visual features of the content image with semantic features perceived by human vision.

The proposed procedure for similar image retrieval is outlined in Algorithm 1.

In Step 1, the semantic system of the images and their visual hierarchy are manually established, categorizing images into hierarchical datasets. Each node within this hierarchy represents specific image semantics and functions as a label for subsequent extraction.

Step 2 constructs the transfer-learning model by training a new model on the recently added data, leveraging MobileNet as the pretrained model. MobileNet is used only as a feature extractor. The Classifier Configuration then classifies the categories contained in the images according to the features extracted by the pretrained network. The pretrained feature extractor consists of a convolutional layer and a pooling layer, and the Classifier Configuration classifies the newly added data into a hierarchical structure.

In Step 3, the newly added data undergoes relearning using the pre-trained model through fine-tuning, yielding a refined model tailored for image classification.

Step 4 transforms the generated model into a TensorFlow Lite model to add it to the mobile app and generates metadata that include image labels in the model.

Step 5 runs the TensorFlow Lite model, including its metadata, on the mobile device and uses it to automatically extract the labels for the query image.

Step 6 sorts the automatically extracted labels by accuracy. Among the extracted labels, the label with the highest accuracy is set as the tag for the query image. This tag is then used to perform keyword-based image retrieval on the mobile device.

Step 7 displays similar image retrieval results for the query image on the mobile device.

In Step 8, if the semantics of the image need to be modified, the image tag can be manually edited individually or collectively.

Algorithm 1. Keyword-based similar-image retrieval method
  • Construct hierarchical image nodes and use them as the input nodes for the learning model.

    • number of input node classes: 10

    • image size: 224×224

  • Construct a Transfer Learning model (user-defined classifier).

    • input_layer: activation function=ReLU, 32 nodes

    • hidden_layer

    • output_layer: activation function=softmax, 10 nodes

  • Re-train the input nodes through Fine-Tuning.

    • learning rate: 2e-5

  • Transform the generated model into a TensorFlow Lite model and add metadata to it.

  • Automatically extract the label for the query image on the mobile device.

  • Save the extracted label as the Tag property of the image to an SQLite DB.

  • Display the results of keyword-based similar-image retrieval on the mobile device.

  • If the semantics of the image need to be modified, the image Tag property can be edited manually.
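
The on-device side of Steps 5-7 is implemented as an Android app in the paper; the sketch below approximates the same flow in Python using the TensorFlow Lite interpreter and the built-in sqlite3 module. The label list and its order, the table name, the pixel scaling, and the file paths are illustrative assumptions, not details taken from the paper.

```python
# Approximate on-device flow of Steps 5-7 (Python stand-in for the Android app).
# Label set, table name, and file paths are illustrative assumptions.
import sqlite3
import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="semantic_classifier.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Assumed class order; in practice it would come from the model's metadata.
labels = ["sea", "mountain", "flower", "dog", "cat",
          "man", "woman", "baby", "food", "building"]

def extract_tag(image_path):
    """Run the TFLite model on a query image and return (best_label, confidence)."""
    img = Image.open(image_path).convert("RGB").resize((224, 224))
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    best = int(np.argmax(scores))        # Step 6: keep the highest-scoring label
    return labels[best], float(scores[best])

# Step 6 continued: save the extracted label as the image's Tag property.
db = sqlite3.connect("image_tags.db")
db.execute("CREATE TABLE IF NOT EXISTS image_tag "
           "(path TEXT PRIMARY KEY, tag TEXT, confidence REAL)")
tag, conf = extract_tag("gallery/IMG_0001.jpg")   # hypothetical gallery image
db.execute("INSERT OR REPLACE INTO image_tag VALUES (?, ?, ?)",
           ("gallery/IMG_0001.jpg", tag, conf))
db.commit()
```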

IV. Implementation Results and Analysis

The experimental environment of this study utilized Google TensorFlow and Android-based mobile programming, facilitating the construction of a hierarchical database grounded in visual semantics through a transfer learning model.

The dataset used to train the model in this experiment was drawn from open databases. The experiment was conducted using image data collected from hierarchical visual image databases, including ImageNet’s ILSVRC2012 dataset, the Oxford-IIIT Pet dataset, the Flickr dataset, and TensorFlow’s flower-photos dataset. The experiment evaluated the effectiveness of keyword-based similar-image retrieval on a mobile device.

A gallery image or an image captured by a camera was used as the query image, and the features of the image were extracted using a hierarchical database through image classification training. Labels similar to the visual features of the images were extracted. The model operates on Android-based mobile devices, serving image-labeling tasks by extracting semantic meanings from images. The extracted labels were saved as tag properties of the images. These properties are represented by a hierarchical search structure that visualizes images on a mobile device. The system’s efficiency in assigning semantics to images was validated through effective visual and intuitive retrieval, offering a compelling rationale for aligning human-perceived image features with visual features.

This approach can substantially increase the efficiency of keyword-based similar image retrieval on a mobile device using only tag information derived from the visual features of the images.

Fig. 2 and Fig. 3 present the loss rate and accuracy of the learning model, respectively. The model maintained a consistent loss rate and accuracy. Additionally, the learning model achieved a loss value of 0.31616 and an accuracy of 0.9342.

Fig. 2. Loss rate of training model

Fig. 3. Accuracy rate of the training model

Fig. 4 and Fig. 5 depict the inference results obtained from the proposed model. For example, in Fig. 4, “pred:sea” represents the label inferred from the image, and “label:sea” represents the actual label of the image. Notably, the recognition rate for males was relatively low. However, as shown by the 1×8 grid of results in the figure, the recognition rate was high, except when a man was incorrectly identified as a woman.

Fig. 4. Inference result of the proposed model-1

Fig. 5. Inference result of the proposed model-2

Fig. 6 illustrates the process of extracting an image tag on a mobile device. “Images Display” at the top (a) shows the result of saving the image using the Gallery or Camera application. Tapping on the image transitions the screen to “ImageView,” depicted in (b) at the top of the figure. If the “TAG SEARCH” button is tapped, the image’s label (mountain) and confidence value (0.9986299) are displayed. The extracted label is then saved as a tag property of the image. Finally, (c) represents a list of tag property values stored for an image.

Fig. 6. Extracting and retrieving the tag of an image on the mobile device

At the bottom of the figure, (d) and (e) display the outcomes of retrieving similar images based on keywords associated with sea and mountain by entering “sea” and “mou” as the Tag property values, respectively. (f) shows the result of incorrectly predicting a “baby” as a “woman” in the case where “wo” was entered as the Tag property to search for “woman.” These results indicate significant potential for enhancement in the proposed system, particularly concerning the semantic classification of people.
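
The prefix searches shown in (d)-(f) (for example, “mou” returning images tagged “mountain”) amount to a simple keyword query over the stored tag properties. A minimal sketch, reusing the illustrative image_tag table from the earlier code sketch, is:

```python
# Keyword-based similar-image retrieval by tag prefix (e.g., "mou" -> "mountain").
# Assumes the illustrative image_tag table created in the earlier sketch.
import sqlite3

def search_by_keyword(db_path, prefix):
    db = sqlite3.connect(db_path)
    rows = db.execute(
        "SELECT path, tag, confidence FROM image_tag "
        "WHERE tag LIKE ? ORDER BY confidence DESC",
        (prefix + "%",))
    return rows.fetchall()

# search_by_keyword("image_tags.db", "mou") would list images tagged "mountain".
```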

Therefore, the experimental results provide significant motivation to narrow the gap between the visual features of images and the semantics perceived by humans. However, further studies are needed, particularly on this disparity in the context of identifying individuals.

V. CONCLUSION

We propose an approach for efficient keyword-based similar-image retrieval on mobile devices through automatic image labeling. This approach semantically divides images using deep learning, making it possible to reduce the gap between the content-based visual semantics of images and the semantics perceived by humans. It also reduces storage consumption in digital devices, which can improve the performance of mobile devices with limited resources and bandwidth by searching according to the visual features and keywords of the image on the mobile device. The proposed approach achieved outstanding performance in the semantic classification of various images.

Moreover, our approach bridges the gap between the semantics of an image perceived by humans and the visual features of the image, and it allows CBIR to be implemented on devices with limited resources, such as mobile devices and embedded systems.

References

  1. M. Sajid, N. Ali, S. H. Dar, B. Zafar, and M. K. Iqbal, “Short search space and synthesized-reference re-ranking for face image retrieval,” Applied Soft Computing Journal, vol. 99, pp. 1-14, Feb. 2021. DOI: 10.1016/j.asoc.2020.106871.
  2. S. Khan, L. Chen, and H. Yan, “Co-clustering to reveal salient facial features for expression recognition,” IEEE Transactions on Affective Computing, vol. 11, no. 2, pp. 348-360, Apr. 2020. DOI: 10.1109/TAFFC.2017.2780838.
  3. U. Jayaraman and P. Gupta, “Efficient similarity search on multidimensional space of biometric databases,” Neurocomputing, vol. 452, pp. 623-652, Sep. 2021. DOI: 10.1016/j.neucom.2020.08.084.
  4. A. Barman and S. K. Shah, “A graph-based approach for making consensus-based decisions in image search and person reidentification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 3, pp. 753-765, Mar. 2021. DOI: 10.1109/TPAMI.2019.2944597.
  5. X. Li, J. Yang, and J. Ma, “Recent developments of content-based image retrieval (CBIR),” Neurocomputing, vol. 452, pp. 675-689, Sep. 2021. DOI: 10.1016/j.neucom.2020.07.139.
  6. M. A. Subhi and S. M. Ali, “A deep convolutional neural network for food detection and recognition,” in 2018 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), Sarawak, Malaysia, pp. 284-287, 2018. DOI: 10.1109/iecbes.2018.8626720.
  7. N. Singla, M. Kaur, and S. Sofat, “Latent fingerprint database using reflected ultra violet imaging system,” Procedia Computer Science, vol. 167, pp. 942-951, 2020. DOI: 10.1016/j.procs.2020.03.393.
  8. K. Cao, D. L. Nguyen, C. Tymoszek, and A. K. Jain, “End-to-end latent fingerprint search,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 880-894, 2020. DOI: 10.1109/tifs.2019.2930487.
  9. S. Pandey, P. Khanna, and H. Yokota, “A semantics and image retrieval system for hierarchical image databases,” Information Processing and Management, vol. 52, no. 4, pp. 57-591, Jul. 2016. DOI: 10.1016/j.ipm.2015.12.005.
  10. J. H. Seo, “Metadata processing technique for similar image search of mobile platform,” Journal of Information and Communication Convergence Engineering, vol. 19, no. 1, pp. 36-41, Mar. 2021. DOI: 10.6109/jicce.2021.19.1.36.
  11. Z. Wang, Q. Zheng, and J. Sun, “An effective content-based web image searching engine algorithm,” in 2010 IEEE International Conference on Management of Innovation & Technology, Singapore, pp. 1008-1012, 2010. DOI: 10.1109/ICMIT.2010.5492878.
  12. C. F. Tsai and W. C. Lin, “A comparative study of global and local feature representations in image database categorization,” in 2009 Fifth International Joint Conference on INC, IMS and IDC, Seoul, Korea, pp. 1563-1566, 2009. DOI: 10.1109/ncm.2009.83.
  13. L. Zhu, W. Yu, C. Zhang, Z. Zhang, F. Huang, and H. Yu, “SVSJOIN: Efficient spatial visual similarity join for geo-multimedia,” IEEE Access, vol. 7, pp. 158389-158408, Oct. 2019. DOI: 10.1109/ACCESS.2019.2948388.
  14. W. Zhang, G. Liu, and G. Tian, “A coarse to fine indoor visual localization method using environmental semantic information,” IEEE Access, vol. 7, pp. 21963-21970, 2019. DOI: 10.1109/access.2019.2899049.
  15. S. Pandey, P. Khanna, and H. Yokota, “An effective use of adaptive combination of visual features to retrieve image semantics from a hierarchical image database,” Journal of Visual Communication and Image Representation, vol. 30, pp. 136-152, Jul. 2015. DOI: 10.1016/j.jvcir.2015.03.010.
  16. M. N. Munjal and S. Bhatia, “A novel technique for effective image gallery search using content based image retrieval system,” in 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (Com-IT-Con), Faridabad, India, pp. 25-29, 2019. DOI: 10.1109/COMITCon.2019.8862206.
  17. A. Javeed, S. Zhou, L. Yongjian, I. Qasim, A. Noor, and R. Nour, “An intelligent learning system based on random search algorithm and optimized random forest model for improved heart disease detection,” IEEE Access, vol. 7, pp. 180235-180243, Nov. 2019. DOI: 10.1109/ACCESS.2019.2952107.
  18. P. Punitha and D. S. Guru, “An effective and efficient exact match retrieval scheme for symbolic image database systems based on spatial reasoning: A logarithmic search time approach,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 10, pp. 1368-1381, Oct. 2006. DOI: 10.1109/tkde.2006.154.
  19. S. Cheng, L. Wang, and A. Du, “An adaptive and asymmetric residual hash for fast image retrieval,” IEEE Access, vol. 7, pp. 78942-78953, Jun. 2019. DOI: 10.1109/ACCESS.2019.2922738.
  20. P. Li, X. Zhu, X. Zhang, P. Ren, and L. Wan, “Hash code reconstruction for fast similarity search,” IEEE Signal Processing Letters, vol. 26, no. 5, pp. 695-699, May 2019. DOI: 10.1109/lsp.2019.2898772.
  21. Y. Weng, T. Zhou, L. Liu, and C. Xia, “Automatic convolutional neural architecture search for image classification under different scenes,” IEEE Access, vol. 7, pp. 38495-38506, Mar. 2019. DOI: 10.1109/ACCESS.2019.2906369.
  22. X. Wangming, W. Jin, L. Xinhai, Z. Lei, and S. Gang, “Application of image SIFT features to the context of CBIR,” in 2008 International Conference on Computer Science and Software Engineering, Wuhan, China, pp. 552-555, 2008. DOI: 10.1109/CSSE.2008.1230.
  23. S. Pandey, P. Khanna, and H. Yokota, “Clustering of hierarchical image database to reduce inter- and intra-semantic gaps in visual space for finding specific image semantics,” Journal of Visual Communication and Image Representation, vol. 38, pp. 704-720, Jul. 2016. DOI: 10.1016/j.jvcir.2016.04.013.
  24. S. Sugawara, K. Yamaoka, and Y. Sakai, “A study on image searching method in super distributed database,” in GLOBECOM 97, IEEE Global Telecommunications Conference, Conference Record, Phoenix, AZ, vol. 2, pp. 736-740, 1997. DOI: 10.1109/glocom.1997.638427.

Jung-Hee Seo

She received a B.S. degree in Computer Science from Silla University in 1994, an M.S. degree in Computer Science and Statistics from Kyungsung University in 1997, and a Ph.D. degree in Electronic Commerce System from Pukyong National University in 2006. She has been an assistant professor with the Department of Computer Engineering, Tongmyong University, since 2000. Her research interests include remote education, multimedia, image processing, information protection, and mobile applications.

