Open access peer-reviewed chapter

A Food Recommender Based on Frequent Sets of Food Mining Using Image Recognition

Written By

Thunchanok Tangpong, Somkiet Leanghirun, Aran Hansuebsai and Kosuke Takano

Submitted: October 7th, 2020 Reviewed: March 12th, 2021 Published: April 19th, 2021

DOI: 10.5772/intechopen.97186



Food recommendation is an important service in daily life. To build such a system, we collected food images that were shared or reviewed on social networks and the web, since they carry information about what people actually choose to eat. In the field of representation learning, we propose a scalable architecture that integrates different deep neural networks (DNNs) using a reliability score for each DNN. The score allows the integrated DNN to select a suitable recognition result from among DNNs that were constructed independently. The food sets recognized in the images are then mined for frequent item sets with the Apriori algorithm to drive the food recommendation process. In this study, we evaluate the feasibility of the proposed method.


  • food recommender
  • food data mining
  • image recognition
  • deep neural network
  • data mining algorithm

1. Introduction

People are now consuming more high-energy foods, fats and meat, and most do not eat enough fruit, vegetables and other dietary fiber. The make-up of a diversified, combined diet varies with individual characteristics such as age, gender, lifestyle and degree of physical activity, as well as cultural context, locally available foods and dietary customs. As food services grow, they play an increasingly important role in the food market, and the food service industry is a vital part of the economy [1, 2, 3]. The business relies on its management to control costs, keep customers happy, and ensure smooth daily operations. There are many types of food service, but the major categories are buffet and family-style service.

As an industry that serves human needs, food service is always at the forefront of innovation. Even though food safety practices have been continuously updated along with legislation, the industry still faces a number of issues related to food technologies and consumer trends. For example, a customer may want food information in order to choose a set of dishes for the table. Foreigners who are not familiar with local foods may want to eat in the style that is common in the country they are visiting. In addition, a food designer may be seeking new ideas for decorating food on the plate. Food recommendation is therefore an important tool for enriching our lives. A food recommender can be defined as a system that recommends items to users or customers within an environment based on their past activities.

Digital imaging has been shown to estimate food information in many environments, with many advantages over other methods [4, 5]. However, deriving food information such as food type, food combination and portion size from food images remains uncertain.

Accordingly, to achieve better food recommendation, it would be useful to analyze the foods that people actually eat in daily life. POS (Point of Sale) systems produce large-scale transaction data that reflect customers’ purchase tendencies [6]. However, the data are used only by the individual store and are not open to the public, so we cannot analyze food purchase data across different stores, restaurants, canteens, and so on. Amazon Go is a smart store in which purchase transactions are detected by cameras. Compared with a POS system, Amazon Go automatically manages information about the foods that people buy, including the items associated with those products and the appearance of each item together with individual preferences [7]. In addition, the system can predict market expectations. Note that an Amazon-Go-like system faces a similar constraint when collecting big data: the databases obtained from different sources vary. Integrating purchase transactions across different databases is therefore limited, as shown in Figure 1.

Figure 1.

Limitation of integrating purchase transactions across different databases.

Given this limitation, it is meaningful to create a system that analyzes big data of food images from various communities, including companies, restaurants, and groups on social network systems, and applies image recognition technology to extract people’s preferences in food combination, food design, and food appearance. In the field of representation learning, there are many established models such as the Artificial Neural Network (ANN), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN).

ANN is a broad term that encompasses any form of neural network model; a network can be shallow or deep depending on the number of hidden layers. CNNs are designed specifically for computer vision. They differ from standard ANN layers in that they are constructed to receive and process pixel data. RNNs are the “time series version” of ANNs: they are meant to process sequences of data and form the basis of forecasting models and language models. The most common kinds of recurrent layer are LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit). They contain a series of small ANNs (gates) that choose which existing information to let flow through the model. That is how they establish “memory”.

However, these models may run into problems in food recognition because there is a tremendous number of foods. In addition, recognition remains challenging due to the complexity of emotional expressions arising from food variety, gender differences, cross-cultural differences and age-related differences [8, 9]. Thus, it is hard to find a single model that can recognize all of them.


2. Related work

Food image recognition is one of the promising applications of visual object recognition, as it helps estimate food characteristics and analyze people’s daily eating choices. Many studies have made food recognition more practical by using the convolutional neural network (CNN) model [10, 11, 12]. CNNs were applied to the tasks of food detection and recognition through parameter optimization. A dataset of the most frequent food items was constructed from a publicly available food-logging system, and the CNN showed significantly higher accuracy than a conventional method. In addition, a comparison of two groups of controlled trials showed that color features are not always helpful for improving accuracy. The reported accuracy of the CNN model was 70–80% on a single-food dataset and 60% on the multi-food dataset. Improvements could be expected from collecting more images and optimizing the network architecture and hyper-parameters.

For example, a Deep Convolutional Neural Network (DCNN) was introduced for food recognition based on a combination of CNN-related techniques such as pre-training with the large-scale ImageNet data, fine-tuning, and activation features extracted from the pre-trained CNN [13, 14]. Another approach was based on two main steps: first, producing a food activation map on the input image (i.e. a heat map of probabilities) to generate bounding box proposals and, second, recognizing each of the food types or food-related objects present in each bounding box [15]. In another study, a max-pooling function was applied to the data, and the features extracted by this function were used to train the network; an accuracy of 86.97% was achieved for the classes of the FOOD-101 dataset [16]. It was found that image classification could be extended using prominent features that categorize food images. Note that the feature-based approach and the multi-level (hierarchical) classification approach were useful for avoiding misclassification as the number of classes increased. However, these methodologies required high computational time.

2.1 Concept of convolutional neural network (CNN)

A convolutional neural network is a network that employs a mathematical operation called convolution. There are two main processes in the CNN architecture – Learning extraction and Classification [17].

Step-1: Learning extraction

This process extracts features from images through the following three layers:

  1. Convolution layer: this is the first layer that extracts features from an input image. Matrix filters (kernels) are multiplied with patches of the image to produce feature maps that capture features such as edges, blur, and color.

  2. Activation layer: this layer increases the non-linearity of the network without affecting the receptive fields of the convolution layer. A common choice is the rectified linear unit (ReLU), whose output is ƒ(x) = max(0, x).

  3. Pooling layer: this layer reduces the number of parameters when the images are too large. It reduces the dimensionality of each feature map while keeping the important information.
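The three layers above can be illustrated with a minimal NumPy sketch. The 6×6 input, the 3×3 horizontal-gradient kernel, and the function names `conv2d`, `relu` and `max_pool` are hypothetical choices made for illustration; they are not taken from the chapter's MATLAB prototype.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D sliding-window product (what CNN libraries call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with one image patch and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation layer: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def max_pool(x, size=2):
    """Pooling layer: keep the maximum of each non-overlapping size x size block."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Hypothetical 6x6 grayscale image whose intensity grows from left to right.
image = np.arange(36, dtype=float).reshape(6, 6)
# Hypothetical kernel that responds to horizontal intensity changes.
gradient_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d(image, gradient_kernel)  # shape (4, 4)
activated = relu(feature_map)                 # negative responses set to 0
pooled = max_pool(activated)                  # shape (2, 2)
```

Each stage shrinks the representation (6×6 to 4×4 to 2×2) while keeping the strongest filter responses, which is the dimensionality reduction the pooling layer is described as providing.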

Step-2: Classification

This process executes image classification. There are two main layers as follows:

  1. Fully connected layer (FC layer): this layer connects every neuron in one layer to every neuron in the next layer. The feature matrix is flattened into a vector and fed into the fully connected layer, as in an ordinary neural network.

  2. Softmax layer: a special kind of activation layer, usually placed at the end of the FC layer outputs. It produces a discrete probability distribution vector.

At this stage, pretrained models such as AlexNet, VGGNet, GoogLeNet, and ResNet can be applied in order to integrate the CNNs. The conventional procedure merges a fixed number of CNNs and trains them after merging.
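As a rough illustration of the classification step, the sketch below flattens a small feature matrix, passes it through one fully connected layer, and applies softmax. The feature values, the random weights, and the choice of three classes are assumptions made only for this example.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into a discrete probability distribution."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical 2x2 pooled feature matrix from the feature extraction step.
feature_matrix = np.array([[0.5, 1.0],
                           [0.0, 2.0]])
flat = feature_matrix.flatten()        # vector of length 4

# Hypothetical fully connected layer with 3 output classes.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, flat.size))
bias = np.zeros(3)

logits = weights @ flat + bias         # every input feeds every output neuron
probs = softmax(logits)                # one probability per class
predicted_class = int(np.argmax(probs))
```

The entries of `probs` are positive and sum to 1, so the index of the largest entry can be read as the predicted class.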

2.2 Association rule

Association rule mining is used in the data mining phase. It is a machine learning method that finds relationships between items based on the frequency of item sets [18, 19]. An algorithm is needed to enumerate the frequent item sets, from which the association rules for food sets are extracted.

Three measures are used in association rule mining, as shown in Figure 2:

  1. Support: It indicates how often the items appear in the dataset.

  2. Confidence: It indicates how often a rule is found to be true.

  3. Lift: It indicates how the antecedent and the consequent are related to one another. If the Lift value is 1, the antecedent and the consequent are independent. If the Lift value is less than 1, the antecedent has a negative effect on the occurrence of the consequent. If the Lift value is more than 1, the antecedent and the consequent are positively dependent.

Figure 2.

Basic terminologies of association rule measure.
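The three measures can be computed directly from a list of transactions. The sketch below uses the standard definitions of support, confidence and lift; the five food transactions are hypothetical and serve only as an example.

```python
# Hypothetical food transactions, one set of recognized foods per image.
transactions = [
    {"rice", "miso soup", "fish"},
    {"rice", "miso soup"},
    {"rice", "fish"},
    {"bread", "salad"},
    {"rice", "miso soup", "salad"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """How often the consequent appears in transactions with the antecedent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)

rule_support = support({"rice", "miso soup"})          # 3 of 5 transactions
rule_confidence = confidence({"rice"}, {"miso soup"})  # 0.6 / 0.8
rule_lift = lift({"rice"}, {"miso soup"})              # 0.75 / 0.6
```

For the rule rice → miso soup in this toy data, the lift is above 1, so the two items appear together more often than they would if they were independent.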


3. Proposed method

The conventional convolutional architecture suffers from uncertainty, because the number of CNNs is fixed, and from limited interpretability after the CNNs are merged [20]. A simpler and more scalable method for integrating multiple networks is therefore necessary, with three features in mind: a reliability score for the system, no further training after integration, and high scalability. A deep neural network (DNN) may be an alternative choice for solving this problem [21]. It provides a model of the neural basis of efficient visual object recognition in humans. Its architecture comprises more than three layers. Each layer contains a combination of convolution, max-pooling and normalization stages, and the layers are fully connected. It inherently fuses feature extraction and classification into a single learning process, using a Fuzzy Support Vector Machine (FSVM), and enables decision making [22, 23]. Many efforts have been made to reduce both the training time and the inference time of DNNs. Since flexibility and performance scalability across various types of networks are crucial requirements in DNN accelerator design [24], we propose a scalable architecture that integrates different deep neural networks with a reliability score to increase the probability of returning the correct class in food recognition. The reliability score allows the integrated DNN to select a suitable recognition result from among DNNs that were constructed independently. In this study, we evaluate the feasibility of the proposed method.

The frequent set of foods extracted from food images was applied to a data mining algorithm such as the Apriori algorithm for the food recommendation process.
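A minimal sketch of the Apriori idea is given below: frequent item sets are grown one item at a time, and a candidate is kept only if all of its smaller subsets are already frequent. The transactions, the `min_support` threshold of 0.4, and the function name `apriori` are illustrative assumptions and do not reproduce the chapter's implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for every item set meeting min_support."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate item set.
        level = {}
        for c in candidates:
            s = sum(c <= t for t in transactions) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        # Join step: merge frequent k-item sets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its k-1 subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

# Hypothetical food transactions, one set of recognized foods per image.
food_sets = [
    {"rice", "miso soup", "fish"},
    {"rice", "miso soup"},
    {"rice", "fish"},
    {"bread", "salad"},
    {"rice", "miso soup", "salad"},
]
frequent_sets = apriori(food_sets, min_support=0.4)
```

With this toy data, the frequent pairs are {rice, miso soup} and {rice, fish}; the triple {rice, miso soup, fish} is pruned because {miso soup, fish} is not frequent.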

Figure 3 shows the schematic diagram of the food recommender. The system recommends a set of foods suitable for the user based on association rules, which are extracted by food set mining from food images.

Figure 3.

Schematic diagram of food recommender.

The process of food recognition and association rule extraction is given in Figure 4. First, food set images were collected from the database. Second, the reliability score was calculated by Eq. (1). Then, food recognition and segmentation were performed on the images using the DNNs. Finally, association rules were extracted from the frequent food sets.

Figure 4.

Process of recognition of food set images and extraction of association rule.


In Eq. (1), A, C, I, and D represent the accuracy of the learning process, the number of classes, the number of learning images, and the difference between the learning data and the test data, respectively.

We combined several DNNs and calculated a new probability with the reliability scores, as shown in Figure 5. Each DNN has a different reliability score depending on the following factors:

  • accuracy of the learning process

  • number of classes

  • number of learning images

  • difference between learning data and test data

Figure 5.

The feature of proposed method.

If the learning data and the test data are similar, the measured accuracy will be high, but it may not be reliable: when the DNN is tested with unseen data, it may predict the wrong answer. Therefore, the difference between the learning data and the test data must be taken into account.

Using Eq. (2), we combined the reliability score with the weighted output (W) of the fully connected layer to obtain a new probability. The label associated with the new probability was returned as the predicted result.


Accordingly, the food set could be recognized from food images using different DNNs with their reliability scores, as shown in Figure 6. Based on the recognition results, food transactions were formed and passed to conventional association rule mining in the data mining phase.
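The selection step described above can be sketched as follows. Eq. (2) is not reproduced in this text, so the combination rule used here (each class probability is multiplied by the reliability score of its DNN, and the label with the highest product is returned) is an assumption, not the authors' formula. The reliability values, labels, and probabilities are likewise hypothetical.

```python
def integrate(dnn_outputs):
    """Pick one label from several independently built DNNs.

    dnn_outputs is a list of (reliability, {label: probability}) pairs,
    one pair per DNN. ASSUMPTION: new probability = reliability * probability.
    """
    best_label, best_score = None, float("-inf")
    for reliability, probs in dnn_outputs:
        for label, p in probs.items():
            score = reliability * p  # assumed stand-in for Eq. (2)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

# Hypothetical outputs for one image region from two DNNs.
general_food_dnn = (0.8, {"steak": 0.9, "spaghetti": 0.1})
fruit_dnn = (0.9, {"kiwi": 0.55, "orange": 0.45})

label, score = integrate([general_food_dnn, fruit_dnn])
```

Because each DNN is only queried and never retrained, another network can be added by appending one more pair to the list, which matches the stated goals of no further training after integration and high scalability.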

Figure 6.

Workflow of proposed food recommender.


4. Experiment and evaluation

We implemented a prototype that merges the results from different DNNs using MATLAB. We evaluated the recognition accuracy and the relationship between the recognition accuracy and the number of classes. The food image datasets used to create and evaluate the DNNs are shown in Table 1. For each dataset, three DNNs with different numbers of recognition classes were created, as given in Table 2. In addition, each DNN was built in two ways: from scratch and by transfer learning using GoogLeNet. In total, we had 18 DNNs in this experiment.

Type of dataset | Dataset | Number of classes | Number of images

Table 1.

Dataset of food images.

Dataset | Number of classes | Scratch | GoogLeNet

Table 2.

Deep neural network with different number of classes.

Recognition accuracy was calculated using the following equation:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

where TP, TN, FP, and FN are the numbers of True Positives, True Negatives, False Positives, and False Negatives, respectively.
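Assuming Eq. (3) is the standard accuracy definition built from these four counts, it can be computed as in the short sketch below. The counts in the example are made up for illustration and are not results from this study.

```python
def recognition_accuracy(tp, tn, fp, fn):
    """Share of correct decisions among all decisions (standard accuracy)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total

# Hypothetical counts: 80 correct decisions out of 100.
example_accuracy = recognition_accuracy(tp=40, tn=40, fp=10, fn=10)
```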

To evaluate the performance of the integrated deep neural network, we created two single DNNs and four integrated DNNs, as shown in Table 3. The UNIMIB2016 dataset was used for testing, and recognition accuracy was calculated using Eq. (3).

No. | Number of DNNs | Type of DNN | Dataset
2 | 2 | GoogLeNet | Food-101 + UEC-256
3 | 3 | GoogLeNet | Food-101 + UEC-256 + Fruit-360
5 | 2 | Scratch | Food-101 + UEC-256
6 | 3 | Scratch | Food-101 + UEC-256 + Fruit-360

Table 3.

Integrated deep neural network.

To evaluate the accuracy of the extracted frequent food sets, we employed the three types of food label shown in Table 4 for extracting the association rules.

No. | Type of food label
1 | Predicted label by integrated DNN using GoogLeNet
2 | Predicted label by integrated DNN made from Scratch
3 | Corrected label (no prediction)

Table 4.

Food labels used for association rule.


5. Results and discussion

Figure 7 shows the recognition accuracy of the DNNs for the Food-101, UEC-256, and Fruit-360 datasets, respectively. For each dataset, the recognition accuracy was related to the number of classes: it decreased slightly as the number of recognition classes increased. In addition, the DNNs built with GoogLeNet performed better than the DNNs made from scratch.

Figure 7.

Recognition results of Food-101, UEC-256 and Fruit-360.

Figure 8 shows the results of the integrated deep neural networks. They imply that the recognition accuracy of the integrated DNNs improves as the number of networks increases. This is because the proposed reliability score allowed the integrated DNNs to select a suitable recognition result from the three different networks, which were constructed from the Food-101, UEC-256 and Fruit-360 datasets. In addition, the DNNs built with GoogLeNet again performed better than the DNNs made from scratch. Note that although the DNN constructed from the Fruit-360 dataset showed a high accuracy of 97.72% for 10-class recognition, it could not return a suitable result for an image of a general food other than fruit. This is the disadvantage of a DNN constructed from a domain-specific image dataset.

Figure 8.

Result of integrated deep neural network.

Tables 5–7 show the results for each type of food label, with association rules sampled from the whole results. For “Kiwi, Doughnut” in Table 5, the numbers of bad and good results were almost equal, whereas “Pineapple mini, Ice-cream” in Table 6 gave better results. This confirmed that the integrated networks using GoogLeNet produced a preferable number of good results. In Table 7, “Steak, French fries” showed good results relevant to the test data.

Kiwi, Doughnut | Pineapple, Orange | Good
Kiwi, Doughnut | Orange, Miso soup | Bad
Kiwi, Doughnut | Spaghetti Bolognese, Orange | Good
Kiwi, Doughnut | Orange, Spam musubi | Bad
Kiwi, Doughnut | Pineapple, Miso soup | Bad
Kiwi, Doughnut | Pineapple, Spaghetti Bolognese | Good
Kiwi, Doughnut | Pineapple, Spam musubi | Bad
Kiwi, Doughnut | Spaghetti Bolognese, Miso soup | Bad
Kiwi, Doughnut | Spam musubi, Miso soup | Good
Kiwi, Doughnut | Spaghetti Bolognese, Spam musubi | Bad

Table 5.

Results of integrated DNNs made from Scratch.

Pineapple mini, Ice-cream | Apple red/yellow | Good
Pineapple mini, Ice-cream | Cantaloupe | Good
Pineapple mini, Ice-cream | Cocos | Good
Pineapple mini, Ice-cream | French loaf | Good
Pineapple mini, Ice-cream | Granadilla | Bad
Pineapple mini, Ice-cream | Grapefruit pink | Good
Pineapple mini, Ice-cream | Kiwi | Good
Pineapple mini, Ice-cream | Peach flat | Good
Pineapple mini, Ice-cream | Peach abate | Good
Pineapple mini, Ice-cream | Pitahaya Red | Good

Table 6.

Results of integrated DNN using GoogLeNet.

Steak, French fries | Banana, Roll bread | Good
Steak, French fries | Baklava, Orange | Good
Steak, French fries | Roll bread, Orange | Good
Steak, French fries | Spaghetti, Orange | Good
Steak, French fries | Baklava, Roll bread | Bad
Steak, French fries | Spaghetti, Baklava | Good
Steak, French fries | Spaghetti, Roll bread | Good
Steak, French fries | Yogurt, Roll bread | Good
Steak, French fries | Spaghetti, Yogurt | Good
Steak, French fries | Spaghetti | Good

Table 7.

Results of corrected labels.


6. Conclusion

In the food recognition phase, the integrated networks (DNNs) showed higher recognition accuracy (80%) than a single network, because the proposed reliability score allowed the integrated networks to select a suitable recognition result from networks trained on different domains. The networks built with GoogLeNet gave higher recognition accuracy. In addition, we found that when the test dataset differed from the training dataset, we could not obtain suitable results.

In the data mining phase, we could extract some meaningful rules by applying the Apriori algorithm to the recognition results of the canteen image dataset. In future work, we will modify the recognition phase of the system to increase the performance of the networks and evaluate the modified system on a larger food dataset. In addition, we are developing visual food mining using a mixed model of DNN and RNN (recurrent neural network) to continue this research.



This research is part of an academic cooperation between Chulalongkorn University, Thailand and Kanagawa Institute of Technology (KAIT), Japan. The authors would like to thank KAIT for providing a scholarship and deeply appreciate Professor Kosuke Takano for his advice and for the use of the facilities in his laboratory at KAIT.


Conflict of interest

The authors declare no potential conflict of interest.


  1. Johns N., Pine R. Consumer behavior in the food service industry: a review. Int. J. of Hospitality Management. 2002;21(2):119-134
  2. Gwo-Hshiung T., Hung-Fan C. Applying Importance-Performance Analysis as a Service Quality Measure in Food Service Industry. J. of Technology Management & Innovation. 2011;6(3). DOI: 10.4067/S0718-27242011000300008
  3. Abdullah F., Abdurahman A., Hamali J. Managing Customer Preference for the Foodservice Industry. Int. J. of Innovation, Management and Technology. 2011;2(6):525-533
  4. Zubair Hasan H.M., Khan H., Asif T., Hashmi S., Rafi M. Towards a transfer learning approach to food recommendations through food images. In: Proceedings of the 3rd International Conference on Machine Learning and Soft Computing. January 2019. Da Lat, Viet Nam. p. 99-105
  5. Liu C., Cao Y., Luo Y., Chen G., Vokkarane V., Ma Y. DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In: Proceedings of Int. Conference on Smart Homes and Health Telematics. May 2016. p. 37-48
  6. Inman J.J., Nikolova H. Healthy Choice: The Effect of Simplified Point-of-Sale Nutritional Information on Consumer Food Choice Behavior. Journal of Marketing Research. 2015;52(6):817-835
  7. Polacco A., Backes K. The Amazon Go Concept: Implications, Applications and Sustainability. Journal of Business and Management. 2018;24(1):79-92
  8. Ding N., Sethu V., Epps J., Ambikairajah E. Speaker variability in emotion recognition-an adaptation based approach. In: Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE 2012. p. 5101-5104
  9. Mill A., Allik J., Realo A., Valk R. Age-related differences in emotion recognition ability: a cross-sectional study. Journal of Emotion. 2009;9(5):619
  10. Zhang W., Zhao D., Gong W., Li Z., Lu Q., Yang S. Food Image Recognition with Convolutional Neural Networks. In: Proceedings of 2015 IEEE 12th Intl Conference on Ubiquitous Intelligence and Computing. Beijing, China. 10-14 August 2015
  11. Kagaya H., Aizawa K., Ogawa M. Food Detection and Recognition Using Convolutional Neural Network. In: Proceedings of the 22nd ACM international conference on Multimedia. November 2014. p. 1085-1088
  12. Ng Y.S., Xue W., Wang W., Qi P. Convolutional Neural Networks for Food Image Recognition: An Experimental Study. In: Proceedings of the 5th International Workshop on Multimedia Assisted Dietary Management. October 2019. p. 33-41
  13. Kawano Y., Yanai K. Food image recognition with deep convolutional features. In: Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication. September 2014. p. 589-593
  14. Yanai K., Kawano Y. Food image recognition using deep convolutional network with pre-training and fine-tuning. In: Proceedings of 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). June 2015
  15. Bolanos M., Radeva P. Simultaneous Food Localization and Recognition. In: Proceedings of 23rd International Conference on Pattern Recognition (ICPR). December 2016
  16. Attokaren D.J., Fernandes I.G., Sriram A., Srinivasa Murthy Y.V., Koolagudi S.G. Food classification from images using convolutional neural networks. In: Proceedings of 2017 IEEE Region 10 Conference (TENCON). Malaysia. November 2017. p. 2801-2806
  17. Saha S. A Comprehensive Guide to Convolutional Neural Networks – the ELI5 way. Available from: http// [Accessed: 2020-06-12]
  18. Ziauddin Z., Kammal S., Khan K.Z., Khan M.I. Research on Association Rule Mining. Journal of Advances in Computational Mathematics and its Applications. 2012;2(1):226-236
  19. Traore B.B., Kamsu-Foguen B., Tangara F. Deep Convolution Neural Network for image recognition. Journal of Ecological Informatics. 2018;48:257-268
  20. Wickstrøm K., Kampffmeyer M., Jenssen R. Uncertainty and interpretability in convolutional neural networks for semantic segmentation of colorectal polyps. Journal of Medical Image Analysis. 2020;60:101619
  21. Zhou L., Zhang C., Liu F., Qiu Z., He Y. Application of Deep Learning Food: A review. Journal of Comprehensive Reviews in Food Science and Food Safety. 2019;18(6):1793-1811
  22. Cichy R.M., Khosla A., Pantazis D., Torralba A., Oliva A. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Journal of Scientific Reports. 2016;6(1):27755
  23. Verma J.K., Paul S., Johr P. In: Book - Computational Intelligence and Its Applications in Healthcare. Academic Press. 2020. ISBN 978-0-12-820604-1
  24. Sakamoto R., Takata R., Ishii J., Kondo M., Nakamura H., Ohkubo T., Kojima T., Amano H. The Design and Implementation of Scalable Deep Neural Network Accelerator Core. In: Proceedings of 2017 IEEE 11th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC). Seoul, Korea. September 2017
