Journal of information and communication convergence engineering 2024; 22(4): 288-295
Published online December 31, 2024
https://doi.org/10.56977/jicce.2024.22.4.288
© Korea Institute of Information and Communication Engineering
Khang Nhut Lam1*, My-Khanh Thi Nguyen1, Huu Trong Nguyen1, Vi Trieu Huynh2, Van Lam Le1, and Jugal Kalita3
1Department of Information Technology, Can Tho University, Can Tho 94100, Vietnam
2Research and Application Development Department, FPT University, Can Tho 94100, Vietnam
3Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, USA
Correspondence to: Khang Nhut Lam (E-mail: lnkhang@ctu.edu.vn)
Department of Information Technology, Can Tho University, Can Tho 94100, Vietnam
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recipe generation is an important task in both research and everyday life. In this study, we explore several pretrained language models that generate recipes from a list of text-based ingredients. Our recipe-generation models use the standard self-attention mechanism of the Transformer and integrate the re-attention mechanism from the Vision Transformer. The models are trained both with the common paradigm based on cross-entropy loss and with the BRIO paradigm, which combines contrastive and cross-entropy losses to reach the best performance faster and eliminate exposure bias. Specifically, we use a generation model to produce N recipe candidates from the ingredients. These initial candidates are used to train a BRIO-based recipe-generation model that produces N new candidates, which are in turn used to iteratively fine-tune the model and enhance recipe quality. We experimentally evaluated our models on the RecipeNLG and CookingVN-recipe datasets in English and Vietnamese, respectively. Our best model, which leverages BART with re-attention and is trained with BRIO, outperforms existing models.
Keywords: Attention mechanism, BART, Recipe-generation model, Transformer
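As a rough illustration of the iterative BRIO-style training described in the abstract, the sketch below generates N candidate recipes with a BART model and combines a cross-entropy loss on the reference recipe with a pairwise contrastive ranking loss over the candidates. This is not the authors' implementation: the checkpoint name, the helper functions generate_candidates and brio_step, the margin value, and the equal weighting of the two loss terms are all assumptions made for illustration.

```python
# Hypothetical sketch of BRIO-style training for recipe generation.
# Checkpoint, margin, and loss weighting are assumptions, not the paper's settings.
import torch
import torch.nn.functional as F
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def generate_candidates(ingredients: str, n: int = 4):
    """Produce N candidate recipes from a text-based ingredient list."""
    inputs = tokenizer(ingredients, return_tensors="pt", truncation=True)
    outputs = model.generate(
        **inputs, num_beams=n, num_return_sequences=n, max_length=256
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def brio_step(ingredients: str, reference: str, candidates: list, margin: float = 0.001):
    """One BRIO-style update: cross-entropy on the gold recipe plus a
    pairwise ranking (contrastive) loss over quality-sorted candidates."""
    inputs = tokenizer(ingredients, return_tensors="pt", truncation=True)

    # Cross-entropy (maximum-likelihood) loss on the reference recipe.
    labels = tokenizer(reference, return_tensors="pt", truncation=True).input_ids
    ce_loss = model(**inputs, labels=labels).loss

    # Score each candidate by its (length-normalized) log-probability under the model;
    # candidates are assumed to be pre-sorted from best to worst (e.g., by ROUGE).
    scores = []
    for cand in candidates:
        cand_ids = tokenizer(cand, return_tensors="pt", truncation=True).input_ids
        scores.append(-model(**inputs, labels=cand_ids).loss)  # higher = more probable

    # Pairwise margin ranking loss: better-ranked candidates should score higher.
    rank_loss = sum(
        F.relu(scores[j] - scores[i] + margin * (j - i))
        for i in range(len(scores)) for j in range(i + 1, len(scores))
    )
    return ce_loss + rank_loss  # equal weighting of the two terms is an assumption

# Example usage (hypothetical ingredient list and gold recipe):
# cands = generate_candidates("chicken, garlic, lemongrass, fish sauce")
# loss = brio_step("chicken, garlic, lemongrass, fish sauce", gold_recipe, cands)
# loss.backward()
```

In this sketch, the candidates produced by one round would be re-scored, re-sorted, and fed back into brio_step to approximate the iterative fine-tuning loop the abstract describes.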