Part 0.
Part 1.

So, by the end of the previous post on Mathematica and Graph Theory, we had managed to take our single string of text and turn it into a graph of ingredients where nodes linked by edges go well together. Now we actually want to do something useful with this data – ie. come up with some recipes!

We will refer in the following to foodgraph as the graph of all of the food pairings we started with.

We will define a recipe very loosely as a set of ingredients all of which go well together. This is clearly far from a real recipe, but it’s a pretty good starting point for one.

Within the graph, a set of ingredients which all go well together form what is known as a connected subgraph – each node is linked to every other node. It’s no good having a recipe where A goes well with B and B goes well with C, but A doesn’t go well with C.

A connected subgraph is known in graph theory language as a clique, and luckily Mathematica has a very simple syntax for us, however, we need to be a little careful. In fact when we run the command:

FindClique[foodgraph, {n}, All]

Will find all the maximal cliques of size n in the graph foodgraph. Note however that this will find only maximal cliques. This means that it will not find any cliques which themselves are subgraphs of larger cliques, but in fact these would be rather useful too.

It turns out that the largest clique in our graphs are of size 8, and there are a total of four of these:

  • basil, fennel, marjoram, onion, oregano, parsley, rosemary, thyme
  • basil, fennel, garlic, marjoram, oregano, parsley, rosemary, thyme
  • allspice, cardamom, cinnamon, clove, coriander, cumin, ginger, nutmeg
  • allspice, cinnamon, clove, coriander, cumin, ginger, mace, nutmeg

We can see immediately from this that there are some pretty clear missing food pairings in our original string. For instance we can see that onion and garlic have not been paired, else there would have been the combined clique containing the union of the first two sets. Similarly mace and ginger have not been paired as the same would have been true for the second two cliques.

For any cooks of European-based cookery out there none of these ingredient lists would be very surprising. We need to dig a bit further to find some more unusual combinations. For instance, digging down we can find a clique of size 4 which is:

cilantro, orange, basil, mint – an unusual but intriguing combination.

To find all the cliques of size 7, running the command

FindClique[foodgraph, {7}, All]

Isn’t quite right as this will not include subgraphs of the cliques of size 8.

The size seven cliques are thus given by:

size7cliques=Join[FindClique[gg, {7}, All], Flatten[Table[Delete[#, n] // Sort,{n, 8}]&/@FindClique[gg, {8}, All], 1] // Union] // Union

Where we have added to the maximal cliques of size 7, all size 7 subgraphs of the cliques of size 8. Getting down to smaller cliques requires simply an iteration of the above line. Plotting the (log of the) numbers of cliques of each size gives an intriguing plot:
cliquenumbers

This perhaps raises the question as to what we can learn about a graph structure by studying the form of this curve. I shall leave this as an interesting avenue for the reader…

Last time we also asked the question about the furthest distances between foods. These should (if the graph is really complete in terms of food pairings) tell us about foods which go badly together.

In order to find the foods that have the largest path between them in the graph, we actually need to ask for the shortest path between all pairs of ingredients.

In order to do this we first need to define all the ingredients in the graph:

vl = VertexList[foodgraph]

we can then create a table of shortestpaths between all ingredients:

l1=Length[vl]

sp=Flatten[Table[{{vl[[m]], vl[[n]]}, Length[FindShortestPath[foodgraph, vl[[m]], vl[[n]]]]}, {m, l1}, {n, l1}], 1]

The output of this is of the form:

{{{apple, sesame}, 3},

{{apple, red cabbage}, 3},

{{apple, passion fruit}, 3},

{{apple, fenugreek}, 4},

{{apple, lemon balm}, 4},

{{apple, pork}, 4}}

this says that the shortest path between apple and sesame is 3 and between apple and fenugreek is 4. In fact we can see immediately that the food pairings list is woefully incomplete as apple and pork definitely go well together…anyway, we shall experiment nonetheless and get a general idea of what this particular food list can come up with.

To find the longest path between ingredients we simply need:

longest = Sort[sp, #1[[2]] > #2[[2]] &][[1, 2]]

The answer to this is in fact 6 – ie. the longest path between any two ingredients is 6 nodes (including the two ingredients themselves.

We can then ask for all of those pairs in the list sp that are 6 nodes apart:

furthest=Select[sp, #[[2]] == longest &][[;; , 1]] // Sort // Union

there are a total of 266 food pairs which are 6 ingredients apart. Here’s a random selection of these:

RandomSample[furthest, 5]

gives:

{{corn, huckleberry}, {white chocolate, olives}, {grains, jasmine}, {legumes, white chocolate}, {huckleberry, other citrus}}

We can ask to see the chain of ingredients which include these. For instance:

PathGraph[FindShortestPath[foodgraph, “white chocolate”, “olives”], VertexLabels -> “Name”, ImagePadding -> 100, ImageSize -> 600]

gives

chainAs pointed out by Robert Spencer in a comment in the first post on this topic, one could use such a longest path to create a multi-course menu.

Using

findcliquewith[ingredients_] :=
       Module[{nit, allcliques, recipe, cliqueswithfood1},
       allcliques = FindClique[gg, 8, All];
       DeleteCases[If[Intersection[#, Sort[{ingredients} // Flatten]] == Sort[{ingredients} // Flatten], #] & /@ allcliques, Null]

]

we can ask for:

findcliquewith[#][[1]] & /@ With[{x = FindShortestPath[gg, “white chocolate”, “olives”]}, Partition[Riffle[x, x][[2 ;;]], 2]] // Reverse // MatrixForm

which gives as an output:

first course: {fennel, basil, chives, olives},

second course: {cilantro, clove, coriander, cumin, ginger, fennel},

third course: {cardamom, cilantro, clove, coriander, cumin, ginger, citrus},

fourth course: {citrus, gooseberry, lychee},

fifth course: {gooseberry, white chocolate}

where each course includes two consecutive foods in the path between olives and white chocolate, thus taking the diner between two foods which are apparently far apart in terms of pairings.

In the next post we will ask about finding the minimum dominating subgraph for the whole of the food graph.

How clear is this post?