Towards automated recipe genre classification using semi-supervised learning
Sharing cooking recipes is a great way to exchange culinary ideas and provide instructions for food preparation. However, categorizing raw recipes found online into appropriate food genres is challenging because adequately labeled data is scarce. In this study, we present the “Assorted, Archetypal, and Annotated Two Million Extended (3A2M+) Cooking Recipe Dataset”, which contains two million culinary recipes labeled with their respective categories, together with extended named entities extracted from the recipe descriptions. The dataset includes features such as title, NER, directions, and extended NER, as well as nine genre labels: bakery, drinks, non-veg, vegetables, fast food, cereals, meals, sides, and fusions. The proposed pipeline, also named 3A2M+, uses two NER extraction tools to extend the Named Entity Recognition (NER) list with named entities that were previously missing, such as heat, time, or process, taken from the recipe directions. The 3A2M+ dataset supports a range of challenging recipe-related tasks, including classification, named entity recognition, and recipe generation. Furthermore, we applied traditional machine learning, deep learning, and pre-trained language models to classify the recipes into their corresponding genres and achieved an overall accuracy of 98.6%. Our investigation indicates that the title feature played a more significant role in classifying the genre.
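To make the idea of title-based genre classification concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a multinomial Naive Bayes classifier with add-one smoothing as a stand-in for the "traditional machine learning" models mentioned above. The training titles are invented for illustration and are not drawn from 3A2M+, and the names `TRAIN`, `tokenize`, and `TitleNaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy, made-up (title, genre) pairs; NOT taken from the 3A2M+ dataset.
# Only three of the nine genre labels are used to keep the sketch short.
TRAIN = [
    ("chocolate chip cookies", "bakery"),
    ("banana bread loaf", "bakery"),
    ("iced lemon tea", "drinks"),
    ("strawberry banana smoothie", "drinks"),
    ("grilled chicken curry", "non-veg"),
    ("spicy beef stew", "non-veg"),
]


def tokenize(title):
    """Lowercase a title and split it on whitespace."""
    return title.lower().split()


class TitleNaiveBayes:
    """Multinomial Naive Bayes over title tokens (hypothetical helper)."""

    def fit(self, examples):
        self.label_counts = Counter()            # titles seen per genre
        self.word_counts = defaultdict(Counter)  # token counts per genre
        self.vocab = set()
        for title, label in examples:
            self.label_counts[label] += 1
            for tok in tokenize(title):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, title):
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus add-one-smoothed log likelihood of each token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + vocab_size
            for tok in tokenize(title):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = TitleNaiveBayes().fit(TRAIN)
for query in ["chocolate bread", "iced tea", "beef curry"]:
    print(query, "->", model.predict(query))
```

On real data, the same interface could be trained on the dataset's title column and compared against a variant trained on the directions or NER features, which is one way to probe the claim that titles carry most of the genre signal.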
Public Library of Science (PLoS)

