
Regular paper

Journal of Information and Communication Convergence Engineering 2024; 22(4): 288-295

Published online December 31, 2024

https://doi.org/10.56977/jicce.2024.22.4.288

© Korea Institute of Information and Communication Engineering

Enhancing Transformer-based Cooking Recipe Generation Models from Text Ingredients

Khang Nhut Lam 1*, My-Khanh Thi Nguyen 1, Huu Trong Nguyen 1, Vi Trieu Huynh 2, Van Lam Le 1, and Jugal Kalita 3

1Department of Information Technology, Can Tho University, Can Tho 94100, Vietnam
2Research and Application Development Department, FPT University, Can Tho 94100, Vietnam
3Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, USA

Correspondence to: Khang Nhut Lam (E-mail: lnkhang@ctu.edu.vn)
Department of Information Technology, Can Tho University, Can Tho 94100, Vietnam

Received: July 9, 2024; Revised: September 16, 2024; Accepted: October 4, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Recipe generation is an important task in both research and everyday life. In this study, we explore several pretrained language models that generate cooking recipes from a list of text ingredients. Our recipe-generation models use the standard self-attention mechanism of the Transformer and additionally integrate the re-attention mechanism from the Vision Transformer. The models are trained both with the common paradigm based on cross-entropy loss and with the BRIO paradigm, which combines contrastive and cross-entropy losses to reach peak performance faster and reduce exposure bias. Specifically, we use a generation model to produce N recipe candidates from the ingredients; these initial candidates are used to train a BRIO-based recipe-generation model, which produces N new candidates that are in turn used to iteratively fine-tune the model and improve recipe quality. We experimentally evaluate our models on the RecipeNLG and CookingVN-recipe datasets, in English and Vietnamese, respectively. Our best model, which augments BART with re-attention and is trained with BRIO, outperforms existing models.

Keywords: Attention mechanism, BART, Recipe-generation model, Transformer
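To make the attention design in the abstract concrete, below is a minimal PyTorch sketch of a re-attention layer in the spirit of the Vision Transformer variant cited above: the per-head attention maps are recombined across heads by a learnable transformation before being applied to the values. The module name, the 1x1-convolution parameterization of the head-mixing matrix, and the normalization choice are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class ReAttention(nn.Module):
    """Multi-head self-attention with re-attention: per-head attention
    maps are linearly mixed across heads (a learnable H x H matrix,
    implemented as a 1x1 convolution) before being applied to the values."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        # Learnable head-mixing matrix Theta, plus normalization over heads.
        self.theta = nn.Conv2d(num_heads, num_heads, kernel_size=1, bias=False)
        self.norm = nn.BatchNorm2d(num_heads)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        H = self.num_heads
        qkv = self.qkv(x).reshape(B, T, 3, H, C // H).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]               # each: [B, H, T, C//H]
        attn = (q @ k.transpose(-2, -1)) * self.scale  # [B, H, T, T]
        attn = attn.softmax(dim=-1)
        attn = self.norm(self.theta(attn))             # re-attention across heads
        out = (attn @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(out)

For example, ReAttention(dim=512, num_heads=8) applied to a tensor of shape [2, 16, 512] returns a tensor of the same shape, so the layer can drop in wherever a standard self-attention block is used.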
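Likewise, the following is a minimal sketch of a BRIO-style training objective as summarized in the abstract: the usual cross-entropy (MLE) loss is combined with a contrastive ranking loss over the N generated candidates, so that candidates judged better by an external metric (e.g., BLEU against the reference recipe) receive higher length-normalized model scores. Function names and hyperparameter values here are illustrative assumptions.

import torch
import torch.nn.functional as F

def candidate_score(logits, labels, pad_id, alpha=1.0):
    """Length-normalized log-probability the model assigns to one candidate.
    logits: [T, V] decoder outputs; labels: [T] candidate token ids."""
    logp = F.log_softmax(logits, dim=-1)
    tok_logp = logp.gather(-1, labels.unsqueeze(-1)).squeeze(-1)  # [T]
    mask = labels.ne(pad_id).float()
    return (tok_logp * mask).sum() / mask.sum().pow(alpha)

def brio_loss(cand_scores, ce_loss, margin=0.001, gamma=100.0):
    """cand_scores: [N] model scores of candidates, pre-sorted from best
    to worst by the external metric. Adds a pairwise margin ranking loss
    to the cross-entropy loss so better candidates outscore worse ones."""
    rank_loss = cand_scores.new_zeros(())
    n = cand_scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # A worse-ranked candidate j should not outscore a better one i.
            rank_loss = rank_loss + torch.clamp(
                cand_scores[j] - cand_scores[i] + margin * (j - i), min=0.0)
    return ce_loss + gamma * rank_loss

Each fine-tuning round would then regenerate N candidates with the updated model and repeat this step, matching the iterative procedure the abstract describes.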

