Duplicated Cooking Recipe Determination Using Multimodal Information

March 19, 2020
●Nguyen The Tung (1), Yuki Nakayama (2)
(1) Nara Institute of Science and Technology
(2) Rakuten Institute of Technology, Rakuten, Inc.

2
Duplicate Recipe Detection
■ Task description:
  ➢ Given a new recipe, decide whether it duplicates a recipe already in the database.
■ Method:
  ➢ Step 1: From the database $R_{DB}$ and the new recipe $r$, generate the set of duplicate candidates $R_c \subset R_{DB}$.
    • We use the work of a previous intern [Oguni+ 2018].
  ➢ Step 2: For each pair $(r, r')$ with $r' \in R_c$, decide whether the pair is a duplicate or not.
    • Our work in this paper: duplicated recipe determination.
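The two-step method above can be sketched as a small Python function. This is only an illustration of the control flow: `detect_duplicates`, `share_a_word`, and `same_word_set` are hypothetical names, and the toy stand-ins replace the nearest neighbor search of Step 1 and the classifier of Step 2.

```python
from typing import Callable, List, Sequence


def detect_duplicates(
    new_recipe: str,
    database: Sequence[str],
    generate_candidates: Callable[[str, Sequence[str]], List[str]],
    is_duplicate: Callable[[str, str], bool],
) -> List[str]:
    """Return the database recipes judged to be duplicates of new_recipe."""
    # Step 1: narrow the database down to a small candidate set.
    candidates = generate_candidates(new_recipe, database)
    # Step 2: classify each (new_recipe, candidate) pair.
    return [c for c in candidates if is_duplicate(new_recipe, c)]


# Toy stand-ins (hypothetical, for illustration only).
def share_a_word(recipe: str, db: Sequence[str]) -> List[str]:
    words = set(recipe.lower().split())
    return [x for x in db if words & set(x.lower().split())]


def same_word_set(a: str, b: str) -> bool:
    return set(a.lower().split()) == set(b.lower().split())


database = ["miso soup clam", "grilled mackerel salt", "clam chowder"]
result = detect_duplicates("Clam miso soup", database, share_a_word, same_word_set)
print(result)
```

Keeping the two steps as separate callables means the candidate generator and the pair classifier can be swapped independently.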
[Oguni+ 2018] Masaki Oguni, Lasguido Nio, Yu Hirate, and Yohei Seki. Method for Detecting Near-duplicate
Recipes Based on Nearest Neighbor Search for Features of Cooking Instructions and Food Images (in Japanese)

3
Related work and Proposal
■ Deriving a Recipe Similarity Measure for Recommending Healthful Meals [van Pinxteren+ 2011]
  ➢ Features: cooking instructions + ingredients
  ➢ A human makes the decision based on the features
■ Clustering for Closely Similar Recipes to Extract Spam Recipes in User-generated Recipe Sites [Hanai+ 2015]
  ➢ Features: ingredients
  ➢ Clusters recipes into groups
■ Our proposal: treat the task as classification
  ➢ Detect duplicate recipes with a Multi-Layer Perceptron (MLP)
  ➢ The classifier uses similarity scores for the cooking instructions (text), ingredients (text), user ID, and the result photo (image)

4
Previous work’s pipeline (Step 1)
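[Figure: pipeline of the previous work [Oguni+ 2018]. A new recipe supplies three inputs: ingredients, cooking instruction, and food image. The cooking instruction goes through "Extract Text Vector" and the food image through "Extract Image Vector"; each vector is used for a Nearest Neighbor Search (NGT) over the database. The remaining boxes are "Extract candidates of original recipes", "Extract ingredients of original recipes", "Calculate ingredient similarity", and the output "Duplicate candidates".]
NGT: Neighborhood Graph and Tree [Iwasaki 2010]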

5
How to extract text vector: SCDV [Mekala+ 2017]
■ Sparse Composite Document Vectors (SCDV)
■ The output vector (dimension = 10,000) represents the cooking instructions of each recipe.
■ The vector dimension is too high → we use Principal Component Analysis (PCA) to reduce it to 2,000.

6
How to extract image vector:
Inception-V3 [Szegedy+ 2016]
■ A Convolutional Neural Network model for recognizing generic objects
■ We extract 2,048-dimensional vectors from images
■ We use the Inception-v3 model pre-trained for the ImageNet competition
[Figure: input image → Inception-v3 → output 2,048-dimensional vector]

7
How to calculate ingredients similarity
Ing_A* = {にんにく, 米, ねぎ, いくら, 砂糖} (garlic, rice, green onion, salmon roe, sugar)
Ing_B* = {ニンニク, ライス, ネギ, いくら, 砂糖} (garlic, rice, green onion, salmon roe, sugar)
* A is a near-duplicate recipe candidate; B is an original recipe candidate.
1. Find the ingredients common to Ing_A and Ing_B (the intersection set).
   Intersection = Ing_A ∩ Ing_B = {いくら, 砂糖}
2. Create new sets by removing the common ingredients.
   Ing_A' = {にんにく, 米, ねぎ}, Ing_B' = {ニンニク, ライス, ネギ}
3. Convert each ingredient to katakana.
   Ing_A'_k = {ニンニク, コメ, ネギ}, Ing_B'_k = {ニンニク, ライス, ネギ}
4. Find the ingredients common to Ing_A'_k and Ing_B'_k and add them to Intersection.
   Ing_A'_k ∩ Ing_B'_k = {ニンニク, ネギ}
5. Create new sets by removing the common ingredients.
   Ing_A'' = {米}, Ing_B'' = {ライス}
   Intersection = {いくら, 砂糖, ニンニク, ネギ}
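Steps 1 to 5 can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: the function names are made up for illustration, hiragana is mapped to katakana by shifting Unicode code points, and `KANJI_READINGS` is a hypothetical one-entry table standing in for whatever dictionary or morphological analyzer supplies kanji readings such as 米 → コメ.

```python
from typing import Dict, Set, Tuple

# Hypothetical reading table for kanji ingredients (assumption: a real system
# would obtain readings from a dictionary or morphological analyzer).
KANJI_READINGS: Dict[str, str] = {"米": "コメ"}


def to_katakana(word: str) -> str:
    """Convert hiragana to katakana; look up kanji readings in the table."""
    if word in KANJI_READINGS:
        return KANJI_READINGS[word]
    # Hiragana U+3041..U+3096 sit exactly 0x60 below their katakana forms.
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch for ch in word
    )


def match_exact_then_katakana(
    ing_a: Set[str], ing_b: Set[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (intersection, remaining A, remaining B) after steps 1 to 5."""
    # Steps 1-2: exact matches, then remove them from both sets.
    exact = ing_a & ing_b
    rest_a, rest_b = ing_a - exact, ing_b - exact
    # Steps 3-4: compare the katakana forms of what is left.
    kana_a = {to_katakana(w): w for w in rest_a}
    kana_b = {to_katakana(w): w for w in rest_b}
    common = set(kana_a) & set(kana_b)
    # Step 5: drop the katakana matches, keeping the original spellings.
    rest_a = {w for k, w in kana_a.items() if k not in common}
    rest_b = {w for k, w in kana_b.items() if k not in common}
    return exact | common, rest_a, rest_b


ing_a = {"にんにく", "米", "ねぎ", "いくら", "砂糖"}
ing_b = {"ニンニク", "ライス", "ネギ", "いくら", "砂糖"}
intersection, rest_a, rest_b = match_exact_then_katakana(ing_a, ing_b)
print(intersection, rest_a, rest_b)
```

On the slide's example this leaves 米 and ライス unmatched, which is the case the word-embedding lookup in step 6 is meant to handle.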

8
How to calculate ingredients similarity
6. For each ingredient in Ing_A'', search for similar ingredients using a word2vec model trained on 1.16 million recipes (the training data).
   If an ingredient of Ing_B'' appears in the top 3 search results, we treat the two as the same ingredient and add it to Intersection.
   ・Words similar to "米": Similar ingredients = {こめ, コメ, ライス, レンジ, 五穀米}
   Ing_A'' ∩ Ing_B'' = {米 (ライス)}
   Intersection = {いくら, 砂糖, ニンニク, ネギ, 米}
7. Create new sets by removing the common ingredients.
   Ing_A''' = ∅, Ing_B''' = ∅, ingredient difference = |Ing_A'''| + |Ing_B'''|
8. Form the union set and calculate the Jaccard similarity.
   Union = Ing_A''' ∪ Ing_B''' ∪ Intersection
   $\text{ingredient similarity} = \dfrac{|\text{Intersection}|}{|\text{Union}|}$
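Steps 6 to 8 can be sketched as follows. The word2vec lookup is replaced by a hypothetical `most_similar` callable backed by a small dictionary, so the example runs without a trained model; returning 0.0 for an empty union is also an assumption, since the slides do not cover that case.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def ingredient_similarity(
    intersection: Iterable[str],
    rest_a: Iterable[str],
    rest_b: Iterable[str],
    most_similar: Callable[[str], List[str]],
    top_n: int = 3,
) -> Tuple[float, int]:
    """Return (Jaccard similarity, ingredient difference) after steps 6 to 8."""
    matched = set(intersection)
    left_a, left_b = set(rest_a), set(rest_b)
    # Step 6: match an A ingredient to a B ingredient found in its top-n neighbours.
    for ingredient in sorted(left_a):  # sorted() copies, so left_a may shrink
        neighbours = most_similar(ingredient)[:top_n]
        hit = next((w for w in neighbours if w in left_b), None)
        if hit is not None:
            matched.add(ingredient)
            left_a.discard(ingredient)
            left_b.discard(hit)
    # Step 7: whatever is left counts as the ingredient difference.
    difference = len(left_a) + len(left_b)
    # Step 8: Jaccard similarity = |Intersection| / |Union|.
    union = matched | left_a | left_b
    similarity = len(matched) / len(union) if union else 0.0  # assumption for empty input
    return similarity, difference


# Hypothetical neighbour table standing in for the trained word2vec model.
SIMILAR: Dict[str, List[str]] = {"米": ["こめ", "コメ", "ライス", "レンジ", "五穀米"]}


def lookup(word: str) -> List[str]:
    return SIMILAR.get(word, [])


similarity, difference = ingredient_similarity(
    {"いくら", "砂糖", "ニンニク", "ネギ"}, {"米"}, {"ライス"}, lookup
)
print(similarity, difference)
```

For the slide's example, ライス is the third neighbour of 米, so all five ingredients end up in Intersection and the similarity is 5/5.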

9
Detection Model using Multi-Layer Perceptron (MLP)
■ MLP is a simple yet powerful classification model used in a variety of tasks.
■ Duplicate detection can be viewed as binary classification with two classes: duplicate and non-duplicate.
[Figure: an MLP that takes the input features and outputs one of two classes, duplicate or non-duplicate]
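To make the classification view concrete, here is a minimal forward pass of a one-hidden-layer MLP in plain Python. Everything specific in it is an assumption for illustration: the slides do not give the layer sizes, activations, or trained weights, so the ReLU/sigmoid choice and the hand-picked weights below are hypothetical.

```python
import math
from typing import List, Sequence


def relu(x: float) -> float:
    return max(0.0, x)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mlp_duplicate_probability(
    features: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> float:
    """Forward pass: input -> hidden layer (ReLU) -> one sigmoid output unit."""
    hidden: List[float] = [
        relu(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Hypothetical weights for four distance-like input features
# (small values mean the two recipes are similar).
W_HIDDEN = [[-1.0, -1.0, -1.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
B_HIDDEN = [2.0, 0.0]
W_OUT = [2.0, -2.0]
B_OUT = 0.0

p_close = mlp_duplicate_probability([0.1, 0.2, 0.0, 0.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
p_far = mlp_duplicate_probability([3.0, 4.0, 1.0, 1.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print("duplicate" if p_close > 0.5 else "non-duplicate")
print("duplicate" if p_far > 0.5 else "non-duplicate")
```

In practice the weights would be learned from the labeled recipe pairs rather than set by hand.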

10
Experiment: Dataset
■ 1,847 recipe pairs extracted from the database
  ➢ Each pair is annotated with duplication labels for instruction, ingredients, and image.
  ➢ If any one of those labels is "duplicate" → the pair is regarded as a duplicate.
■ Training dataset
  ➢ 1,547 recipe pairs (1,255 duplicate / 292 non-duplicate)
■ Development dataset
■ Test dataset

11
Features extraction
■ Instruction vector: SCDV → PCA to reduce the dimension to 2,000.
■ Image vector: dimension 2,048 (Inception-v3).
■ The features in the table below are used as the input of the MLP; we trained a model that classifies the input as duplicate or non-duplicate.

Feature type | Description
Instruction | Euclidean distance between the two vectors
Image | Euclidean distance between the two vectors
Ingredient | 1.0 − ingredient similarity (p.8)
User | Identity function: $score = \begin{cases} 0, & uid_1 = uid_2 \\ 1, & uid_1 \neq uid_2 \end{cases}$
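The four input features in the table can be computed with a few lines of Python. This is a sketch under stated assumptions: `pair_features` and its argument names are hypothetical, the vectors are short toy lists rather than 2,000- or 2,048-dimensional ones, and the ingredient similarity is passed in as a precomputed number.

```python
import math
from typing import List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_features(
    instr_a: Sequence[float],
    instr_b: Sequence[float],
    image_a: Sequence[float],
    image_b: Sequence[float],
    ingredient_similarity: float,
    uid_a: str,
    uid_b: str,
) -> List[float]:
    """Build the 4-dimensional feature vector for one recipe pair."""
    return [
        euclidean(instr_a, instr_b),        # Instruction
        euclidean(image_a, image_b),        # Image
        1.0 - ingredient_similarity,        # Ingredient
        0.0 if uid_a == uid_b else 1.0,     # User (identity function)
    ]


features = pair_features(
    [0.0, 0.0], [3.0, 4.0],   # toy instruction vectors
    [1.0, 2.0], [1.0, 2.0],   # toy image vectors
    0.75,                     # toy ingredient similarity
    "user42", "user42",       # hypothetical user IDs
)
print(features)
```

All four features are distance-like: a value of 0 means the two recipes agree on that aspect.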

12
Results: Our method vs. Previous method
■ Our method

Recall (positive) | Precision (positive) | F1 (positive) | Recall (negative) | Precision (negative) | F1 (negative)
0.99 | 0.92 | 0.95 | 0.91 | 0.99 | 0.95

Confusion matrix (rows: prediction, columns: ground truth)
 | Duplicate | Non-duplicate
Duplicate | 99 | 9
Non-duplicate | 1 | 91

■ Previous method [Oguni+ 2018]
  ➢ Pick the top k pairs with the highest similarity scores and label them "duplicate"; the remaining pairs are labeled "non-duplicate".
  ➢ For all values of k, the proposed method outperforms the previous method.

k | Recall@k (positive) | Precision@k (positive) | F1@k (positive) | Recall@k (negative) | Precision@k (negative) | F1@k (negative)
20 | 0.15 | 0.75 | 0.25 | 0.53 | 0.95 | 0.68
40 | 0.34 | 0.85 | 0.49 | 0.59 | 0.94 | 0.72
60 | 0.47 | 0.78 | 0.59 | 0.62 | 0.87 | 0.73
80 | 0.54 | 0.68 | 0.60 | 0.60 | 0.74 | 0.67
100 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64
120 | 0.67 | 0.56 | 0.61 | 0.59 | 0.47 | 0.52
140 | 0.73 | 0.52 | 0.61 | 0.55 | 0.33 | 0.41
160 | 0.82 | 0.51 | 0.63 | 0.55 | 0.22 | 0.31
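The reported scores of our method can be re-derived from the confusion matrix counts. The sketch below assumes the reading that is consistent with those scores: 99 true positives, 9 false positives, 1 false negative, and 91 true negatives, with "duplicate" as the positive class.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts read from the confusion matrix (assumed reading, see lead-in).
tp, fp, fn, tn = 99, 9, 1, 91

positive = precision_recall_f1(tp, fp, fn)
# For the negative class the roles swap: TN acts as TP, FN as FP, FP as FN.
negative = precision_recall_f1(tn, fn, fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print([round(x, 2) for x in positive])
print([round(x, 2) for x in negative])
print(round(accuracy, 2))
```

The same helper applies to the top-k baseline once the k highest-scoring pairs are labeled "duplicate" and the rest "non-duplicate".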

13
Results: Effectiveness of features
■ Both User and Image information improve performance compared with the Instruction + Ingredient feature set.
■ The Jaccard coefficient represents ingredient similarity better than the ingredient difference does.
■ We achieved the best result when using all the features.

Features | Recall (positive) | Precision (positive) | F1 (positive) | Recall (negative) | Precision (negative) | F1 (negative)
Instruction + Ingredient (ingredient difference at p.8) | 0.89 | 0.60 | 0.71 | 0.40 | 0.78 | 0.53
Instruction + Ingredient | 0.85 | 0.66 | 0.74 | 0.56 | 0.79 | 0.66
Instruction + Ingredient + User | 0.94 | 0.90 | 0.92 | 0.90 | 0.94 | 0.92
Instruction + Ingredient + Image | 0.97 | 0.91 | 0.94 | 0.90 | 0.97 | 0.93
Instruction + Ingredient + User + Image | 0.99 | 0.92 | 0.95 | 0.91 | 0.99 | 0.95

14
Correct prediction examples
Recipe 1
Ingredients: 酒粕, 砂糖, 珈琲焼酎, 牛乳
(Sake lees, sugar, coffee shochu, milk.)
Instruction: 鍋にすべての材料と水100g(分量外)を入れ、あたためながら10分ほどとかしまぜてできあがり。
(Put all the ingredients and 100 g of water (not counted in the ingredient list) into a pot, and stir for about 10 minutes while heating until dissolved.)
Recipe 2
Ingredients: 酒粕, タイム, 砂糖
(Sake lees, thyme, sugar.)
Instruction: 鍋にすべての材料と水200g(分量外)を入れ、あたためながら10分ほどとかしまぜてできあがり。
(Put all the ingredients and 200 g of water (not counted in the ingredient list) into a pot, and stir for about 10 minutes while heating until dissolved.)
Prediction: Duplicate
Ground truth: Duplicate

15
Correct prediction examples
Recipe 1
Ingredients: ブロッコリー, ☆マヨネーズ, ☆プレーンヨーグルト, ☆塩コショウ
(Broccoli, ☆ mayonnaise, ☆ plain yogurt, ☆ salt and pepper.)
Instruction: ブロッコリーは洗い、（水分は拭き取らずに）軸の太い部分は十字に切り込みを入れラップでふんわり包みます。レンジに約３分かけて取り出し少し冷めてから切り分けます（冷凍保存も可）☆を混ぜ低カロリーマヨネーズを作り、付けていただきます。
(Wash the broccoli and, without wiping off the moisture, cut a cross into the thick part of the stem and wrap it loosely in plastic wrap. Microwave it for about 3 minutes, take it out, let it cool slightly, and cut it into pieces (it can also be frozen). Mix the ☆ ingredients to make a low-calorie mayonnaise and serve the broccoli with it.)
Recipe 2
Ingredients: ブロッコリー, 塩, *だし汁, *醤油, *辛子
(Broccoli, salt, * dashi stock, * soy sauce, * Japanese mustard.)
Instruction: ブロッコリーは小房に切って、塩を加えた熱湯で茹でてザルにあげる。*は合わせておく。1のブロッコリーと*を混ぜ合わせ、器に盛る。完成！
(Cut the broccoli into florets, boil them in salted water, and drain them in a colander. Combine the * ingredients. Mix the broccoli from step 1 with the * mixture and serve in a dish. Done!)
Prediction: Non-duplicate
Ground truth: Non-duplicate

16
Wrong prediction example
Recipe 1
Ingredients: コーラス, バナナ小, プレーンヨーグルト, 水羊羹, きな粉
(Chorus, small banana, plain yogurt, mizu-yokan (soft sweet bean jelly), kinako (roasted soybean flour).)
Instruction: バナナは小さくちぎり、上記材料と一緒に全てミキサーにかけ、ジュース状になったら出来上がりです。
(Tear the banana into small pieces, put it in a blender together with all the ingredients above, and blend until it turns into juice.)
Recipe 2
Ingredients: バナナ, プレーンヨーグルト, みかん, オリゴ糖, りんごジュース
(Banana, plain yogurt, mandarin orange, oligosaccharide, apple juice.)
Instruction: バナナは小さくちぎり、上記材料と一緒に全てミキサーにかけ、ジュース状になったら出来上がりです。
(Tear the banana into small pieces, put it in a blender together with all the ingredients above, and blend until it turns into juice.)
Prediction: Duplicate
Ground truth: Non-duplicate

17
Summary
■ Conclusion
  ➢ We implemented a duplicate recipe detection system based on an MLP.
  ➢ Our proposal significantly outperforms the previous work, reaching 95% accuracy.
  ➢ Image and user information contribute to predicting duplicate recipe pairs.
■ Future work
  ➢ Investigate other kinds of features.
  ➢ Expand our dataset.

19
Appendix: Architecture with Threshold
[Figure: flowchart of the threshold-based architecture. The posted recipe (near-duplicate recipe candidate) supplies ingredients, a cooking instruction, and a food image. Boxes: "Extract Text Vector" and "Extract Image Vector", each followed by "Nearest Neighbor Search (NGT)" over the database; decision "Text similarity > threshold" (Yes); "Extract candidates of original recipes"; "Extract ingredients of original recipes"; decision "Image similarity > threshold"; "Calculate ingredient similarity"; decision "Ingredient similarity > threshold"; result "System judges posted recipe as near-duplicate recipe". Threshold annotations: α = 0.9 and β = 0.94.]

20
Experiment on Tsukuba dataset
■ The paper is not yet published → the results are not confirmed.
■ The Tsukuba dataset contains many conflicting samples (examples below).

Method | Features | Recall | Precision | F1
Tsukuba team (Random Forest) | Instruction (n-gram mover distance) + ingredients (ingredient difference) | 0.90 | 0.77 | 0.83
Our method | Instruction + Ingredient | 0.64 | 0.31 | 0.42
Pair 1 (label: Duplicate)
Recipe 1
Ingredients: Clam, water, miso.
Instruction: Remove the salt in the pan, add well-washed clams and water, and set it on medium heat. When the clam opens, it will come out and throw it away. When you put out the fire and melt the miso, it's done.
https://recipe.rakuten.co.jp/recipe/1190013605
Recipe 2
Ingredients: Clam, water, miso.
Instruction: The clams are sanded out, rub the shells and wash well. Put the clam and water in the pan and bring it to a boil. When the clam shell opens, the scoop will come out. Stop the fire, melt the miso and let it stand for a while.

Pair 2 (label: Non-duplicate)
Recipe 1
Ingredients: Mackerel, salt.
Instruction: Shake the mackerel and let it sit for 10 minutes. Wipe off moisture. Bake for 10 minutes on the grill.
263
Recipe 2
Ingredients: Mackerel, salt.
Instruction: Lower 2 persimmons and finish 2. Sprinkle salt in the bowl and leave in the refrigerator for 30 minutes. Wipe the water from the mackerel and bake for about 9 minutes on the grilled fish.

21
Using raw subtraction vector as input features
■ Instruction (raw): the element-wise absolute difference between the text vectors of the two recipes in a pair.
■ Image (raw): the same as above, but using the image vectors.
■ The resulting vectors were used as input features.
■ We observed no significant difference compared with the feature set using Euclidean distance. On the other hand, computation time and storage requirements increased substantially.
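The difference in input size between the two feature sets can be seen in a short sketch. The function names are hypothetical, and the vectors are filled with constant toy values; only the dimensions (2,000 for the instruction vector, 2,048 for the image vector) come from the slides.

```python
from typing import List, Sequence


def abs_difference(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Element-wise absolute difference |u - v| of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [abs(a - b) for a, b in zip(u, v)]


def raw_pair_features(
    instr_a: Sequence[float],
    instr_b: Sequence[float],
    image_a: Sequence[float],
    image_b: Sequence[float],
    ingredient_feature: float,
    user_feature: float,
) -> List[float]:
    """Concatenate the raw difference vectors with the two scalar features."""
    return (
        abs_difference(instr_a, instr_b)
        + abs_difference(image_a, image_b)
        + [ingredient_feature, user_feature]
    )


# Toy vectors with the dimensions given in the slides.
instr_a, instr_b = [0.5] * 2000, [0.25] * 2000
image_a, image_b = [1.0] * 2048, [3.0] * 2048

raw = raw_pair_features(instr_a, instr_b, image_a, image_b, 0.25, 0.0)
print(len(raw))
```

With these dimensions each pair carries 2,000 + 2,048 + 2 input values instead of 4, which is in line with the increase in computation time and storage noted above.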
Features | Accuracy | Recall (positive) | Precision (positive) | F1 (positive) | Recall (negative) | Precision (negative) | F1 (negative)
Instruction + Ingredient + User + Image | 95% | 0.99 | 0.92 | 0.95 | 0.91 | 0.99 | 0.95
Instruction (raw) + Ingredient + User + Image (raw) | 95% | 1.00 | 0.91 | 0.95 | 0.90 | 1.00 | 0.95
