by Elliott Morss, Morss Global Finance
In the six previous blind tastings of the Lenox Wine Club, a box wine has come in either first or second. Chianti was the focus of our 7th blind tasting, and once again a 3-liter box wine got the highest score. Running through all of this is a raging controversy about rating vs. scoring: scoring allows tasters to register intensity, but it introduces a degree of arbitrariness and allows tasters to “play” the system.
Ratings and scoring came out the same. The most discriminating tasters (those with a spread of 4 points or less between the two glasses poured from the same bottle) also rated the Piccini box the highest.
Table 1, Comparative Results
Testing Tasters’ Competency
While there were 5 wines, there were six glasses: one wine was tasted in two separate glasses. What is this all about? As I described in an earlier posting, Robert Hodgson has his own winery and has been troubled by the erratic ratings his wines have received from judges at tastings. So he came up with a way to rate potential judges. The key to his method? Have the candidates do blind tastings that include more than one glass of the same wine. If the candidates do not score glasses of the same wine nearly the same, they are not competent to judge wines. Hodgson’s suggested overall scheme is quite rigorous: candidates must do four blind tastings of ten glasses each. We used his methodology in a less rigorous way, with two glasses poured from the same bottle. We cannot be as sure this will single out incompetent tasters, but the results are “indicative.”
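The duplicate-glass check described above can be sketched in a few lines. This is only an illustration of the idea, not Hodgson’s actual protocol: the taster names and scores are made up, and the spread threshold of 4 is borrowed from the “discriminating tasters” cutoff used later in this post.

```python
# Hypothetical sketch of a duplicate-glass consistency check.
# Each taster's two scores are for the same wine poured into two glasses.
duplicate_scores = {
    "Taster A": (14, 12),
    "Taster B": (18, 9),
    "Taster C": (11, 11),
}

SPREAD_THRESHOLD = 4  # maximum allowed difference to count as consistent

def consistent_tasters(scores, threshold=SPREAD_THRESHOLD):
    """Return the tasters whose two scores for the same wine differ by <= threshold."""
    return [name for name, (a, b) in scores.items() if abs(a - b) <= threshold]

print(consistent_tasters(duplicate_scores))  # Taster A and Taster C pass
```

In a full Hodgson-style screening, this check would be repeated across several tastings before declaring a taster consistent or not.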
I have recently argued that most wines taste the same. I went on to suggest that, consequently, the only really interesting wines are those we either really like or really dislike. One way to see how the wines did on this basis is to count the number of times each glass got the top or bottom rank (6 or 1, counting ties as 5.5 and 1.5). This is done in Table 2.
Table 2 – The “Tails” Test
Notable about Table 2 are the number of high tails received by one of the two Piccinis and the absence of any high or low tails for our most expensive wine, the Castello.
What do we make of these findings, and how definitive are they? Findings are more definitive if all tasters agree. Perhaps the best statistic for measuring agreement among tasters is Kendall’s Tau: a higher number indicates greater uniformity among tasters. The Tau for our tasting was only 0.079, suggesting very little agreement among tasters.
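For readers curious how a single agreement figure like 0.079 can be produced, one common approach is to compute Kendall’s Tau for every pair of tasters and average over all pairs. The sketch below does this in plain Python for rankings without ties; the rankings are invented for illustration, and the club’s actual calculation may have differed.

```python
from itertools import combinations

def kendall_tau(a, b):
    """Kendall's Tau for two tie-free rankings:
    (concordant pairs - discordant pairs) / total pairs."""
    pairs = list(combinations(range(len(a)), 2))
    concordant = sum((a[i] - a[j]) * (b[i] - b[j]) > 0 for i, j in pairs)
    discordant = sum((a[i] - a[j]) * (b[i] - b[j]) < 0 for i, j in pairs)
    return (concordant - discordant) / len(pairs)

# Each row: one taster's ranks for the six glasses, in the same glass order.
rankings = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 3, 6, 5],
    [6, 5, 4, 3, 2, 1],
]

pair_taus = [kendall_tau(a, b) for a, b in combinations(rankings, 2)]
print(round(sum(pair_taus) / len(pair_taus), 3))
```

Values near +1 mean the tasters ranked the glasses almost identically; values near 0 (like our 0.079) mean the rankings look close to random relative to one another.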
But there is another way to look at this: what is the probability that, at all 7 tastings, one of the five wines tasted would come in either first or second purely by chance? That probability is very low: (2/5)^7 = 0.16%. So if you think of a box wine as a type of wine, having it come in first or second in seven consecutive tastings by chance alone is extremely unlikely. By extension, with a few logical leaps, one might conclude: if you want to buy wine but don’t know what to get, buy a 3-liter box!
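The arithmetic behind that figure is a one-liner, shown here for anyone who wants to check it. The only assumption is that, absent any real quality difference, each of the five wines is equally likely to land in any position at each tasting.

```python
# Chance that one particular wine out of five finishes first or second
# in seven independent tastings, if all orderings are equally likely.
p_single = 2 / 5          # first or second among five wines
p_streak = p_single ** 7  # seven consecutive tastings
print(f"{p_streak:.2%}")  # prints 0.16%
```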
That having been said, two findings from the tastings are nevertheless quite striking:
- The consistently excellent performance of the box wines, and
- The consistently poor performance of the most expensive wines.
© Elliott R. Morss, November 2013 – All Rights Reserved