Sunday, February 6, 2011

A Statistician Compares the Fab Five

“I have never begun a novel with such misgivings,” Somerset Maugham penned as the first line of his celebrated Razor’s Edge. And as I key my processed thoughts into the blog, I echo his sentiments to reverberation.

The point to cover blog has had more than 21500 visitors even as I write. Without the banner, blessing or banality associated with modern monolithic cricket sites. As an independent blogger I can conclude that it has built itself a quaint boutique of popularity amidst the flying debris of media mayhem that represents present day cricket writing. Yet, what I am on the verge of inflicting on readers may drive away several, goading them into clicking their getaways to  conventional semi sensible reportage till this particular post is relegated to the depths of forgotten cache.

I am about to unleash statistics. Of a pretty severe variety.  

And by doing so, I am going to compare the fab five legends of modern Indian batsmanship. Sachin, Sourav, Rahul, Sehwag and Laxman.

Beware, when I say statistics, it is something more than the representation of figures in various categories as dished out by Cricinfo and others. I promise, or rather threaten, to be rigorously scientific.

For me, the lures of cricket are not limited to the heady mix of the flannels on green, the shiny red ball, the sonorous contact of the willow on the leather. The game also spills out of the arena, in the form of figures with the heavenly storehouse of data it carries in its wake, painstakingly tabulated scores and numbers. Even as I glance through the scorecards, averages, centuries and fifties of players who have responded to the final call and long passed beyond into the Garden of Eden where no over is called, numerous encounters come alive through the combination of dates, digits and ditties.

And here my post graduate degree hankers for more. My profile says that I have trained as a statistician from the Indian Statistical Institute. Well, the teachers did try their best and some of the wisdom did pass on to me in spite of my rather reluctant receptiveness.
Hence, to me statistics and data are not synonymous. With all the numbers that have been associated with the game and stored for nearly a century and a half, it seems criminal that statistical analysis of the performers is limited to average, strike rate, economy rate and categorisation.
Although, again according to my profile, I tend to cleanse my soul through writing, some severe sanitisation seems necessary in the statistical corners as well.

Along with Cricinfo’s Numbers Game column, there are similar and, once in a considerable while, innovative calculations carried out in cricket periodicals around the globe. However, I have never seen any debate being put to rest through robust use of Statistical Techniques. Sadly, S. Rajesh and company are too limited, either in knowledge or by editorial dictates catering to populist demands (my strong hunch is a combination of both), to bring in any statistical tests that conclusively prove or disprove assertions. All we see in pages of supposed statistical analysis are figures straight out of Statsguru, averages and trends. Data categorization by year, opponent, ground condition and batting position is just a form of representation. Believe me, Statistics offers much more than that in the form of inference.

For five long years, and through more than 50 painful examinations, I have had to deal with the pyro-techniques that can be performed with available data. After that, the sham that passes for analysis is painful to tolerate, but not absolutely unexpected. 
Rajesh and company at least take a couple of legitimate formulae – although seldom more than mean and strike rate – out of what can be best described as the most elementary statistics handbook for middle school. In contrast, in the business industry around me, I witness an abuse and plunder of the figures that amounts to violation, torturing them to confess falsehoods using misguided, misinformed, inappropriate methods – making my Statistician’s conscience suffer as much as the data that is disfigured.

I understand that many papers are sold and page hits generated by stimulating debates among a largely emotional fan following. Using statistical tools to lay such arguments to rest can be tantamount to self mutilation.
The comparison between the merits and performance of a Lara and a Sachin, for example, is once in a while fanned by rational arguments, but almost always abundantly laced or replaced by preconceived notions, yellow taint of the influencing media, respective allegiance, patriotism or the lack thereof, and often a heavy dose of individual impotence and frustration. Sometimes averages are compared, but very soon the parameters for judgement veer off dangerously into areas far removed from the twenty two yards, with endorsements and auctions coming into the equation, with moral, material, metaphysical questions creeping up to assess batsmanship! A rational judgement is almost as difficult to obtain as a remedy to the Arab-Israel conflict. So, one expects heated arguments, mountains of evidence and counter evidence, expert opinions, heaps of abuse – often passing without moderation in the web forums and Facebook pages. It is not absolutely impossible for all this to end up in fisticuffs and ruined relationships.

Columns resort to comparing averages or strike rates in various conditions, trying to convey important inferences. Unfortunately, for a trained statistician, a difference in average is not sufficient to prove a distinction between two performances. A lot of the behaviour of data depends on the randomness and variation inherent in the nature of the game, and unless there is statistically significant evidence that the difference goes beyond the fluctuations that can be expected, we cannot come to any conclusions.

What I have noted in the paragraph above comes from a major technique in Statistics called Hypothesis Testing.
For a statistician, the problem of comparing Lara and Sachin boils down to a hypothesis to be tested which is articulated as follows.
The null hypothesis is that both the batsmen contribute equally, and the alternative hypothesis is that one batsman is better than the other.

 The data is analysed and based on the characteristics of the data sets available, a measurement is derived.
 If this measurement generates a value which is very unlikely if Tendulkar and Lara are of the same quality, then the null hypothesis is rejected. The probability of the measurement being at least as extreme as the generated value, under the supposition of the null hypothesis, is called the p-value. So, for a very low p-value, we reject the equality of batsmen and say that there is enough evidence to say that Tendulkar and Lara are different.

If the measurement returns a figure which is not extremely unlikely if the two batsmen are considered to be equal, then the conclusion is that the difference is not significant enough. It can be attributed to variations inherent in the data. It is surmised that there is insufficient evidence to reject the hypothesis that the batsmen are of similar calibre and performance.
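The logic of the last few paragraphs can be sketched with a shuffle-based (permutation) test. This is only an illustration of how a p-value arises, not the test used later in this post; the function name and the scores are invented for the example and are not real career data.

```python
import random

def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """One-sided p-value for the claim 'a tends to score more than b'.

    Null hypothesis: both sets of scores come from the same source,
    so the labels 'a' and 'b' are interchangeable.
    """
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = sum(left) / len(left) - sum(right) / len(right)
        if diff >= observed:
            at_least_as_extreme += 1
    # The +1 counts the observed arrangement itself, so p is never exactly 0.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Invented scores for two imaginary batsmen (assumption, NOT real data).
batsman_x = [12, 45, 103, 0, 67, 88, 23, 54, 150, 31]
batsman_y = [8, 39, 71, 4, 15, 92, 27, 0, 46, 19]

p = permutation_p_value(batsman_x, batsman_y)
print(f"p-value: {p:.3f}")
```

A small p-value says that a gap this large rarely appears when the labels are shuffled at random, which is the sense in which the difference would be called significant.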

In normal life, if averages are quoted, the argument gets heated and the follower of the hero with the lesser average generally points out the respective performances in crunch situations, most often quoting instances when Lara demolished Murali, Sachin decimated Warne and so on.

The popular cricket ‘statistician’ employed by the media pulls up all the performances in crunch matches and compares averages. As mentioned earlier, this hardly ever impresses the trained statistician. A simple difference of averages, while assuming erudite structure in a hastily written analysis section of a website, will generally be laughed off by the statistically savvy.

The trained statistician, on the other hand, restricts the data to crunch matches and again tests whether the measurement on that data points to a statistically significant difference between the merits of the batsmen.

Having tried, almost certainly in vain, to give a painless and figureless introduction to Statistical Hypothesis Testing, let me move to the problem that I have said I will try to resolve.

That problem is the comparative merits of the fab five, who have been the pillars of Indian batting for the past fifteen or more years. It has been the source of sustenance, support and spite for an entire generation and a half of Indian and many non-Indian followers of the game.

Such a lot has been written about them, a major chunk in opinionated and biased vernacular dailies and their increasingly tabloid-like English counterparts, that if piled up in a stack, the size and colour may induce the Chinese poets to mistake it as symbolic rendition of the Yellow Mountain.

And now I say that I can assuage all debate and lay doubts to rest based on the figures.
Let us see how it works out.

As discussed above, we can use statistical hypothesis testing to test whether Rahul Dravid is a significantly better batsman than Sourav Ganguly, or whether VVS Laxman is significantly better than Virender Sehwag, while taking the chance variations of the figures into consideration.  The key word here is significance. Even if the averages are different, do they say with significant certainty that the performances are different, that one is better than the other?

We can look at the career statistics and find that Laxman averages 47 while Sourav notches up 42. The question to be answered is what can we conclude from this? Does this difference in the averages prove that Laxman is better as a wielder of the willow than Sourav? How confidently can we say so? Statistical Testing of Hypothesis answers these questions.

If the data under consideration is normally distributed, a simple technique called the t test can determine whether a particular data set (say the scores of Laxman) is significantly different from another data set (scores of Ganguly). The result of the test states whether one is better than the other, with a percentage figure which denotes the measure of confidence. We can say that batsman A is better than batsman B with x% confidence. Generally, in a statistical test, we consider 95% confidence to be enough evidence to reject the null hypothesis (in this case, that the two batsmen are of equal quality).

Anyone who has undergone the process of performance appraisal understands how normal distributions are abundantly cited everywhere. A casual observation of this vice of modern industry will unearth how little people understand of this distribution, trying to fit bell curves on categorical data in  a population of less than five and move on merrily unpunished after many such murders of statistical theory.
 A lot of data under the sun, in fact, do not follow the normal distribution. Cricket scores almost certainly do not. So what do we do?

We perform a non-parametric test named Mann-Whitney. This technique takes two sets of data of unknown underlying distributions, arranges them together by ordering them from highest to lowest, and computes a score based on the ranks of each score in the combined data set. Ultimately, it tells us whether we can say that one data set is greater than the other with any statistical significance.
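For the curious, here is a minimal sketch of the rank-based calculation, written from the textbook definition rather than taken from whatever software produced the figures in this post. It ranks from lowest to highest (the mirror image of the ordering described above, which leads to the same conclusion), averages the ranks of tied scores, and uses the large-sample normal approximation without a continuity correction. The function name and the scores are assumptions made for illustration.

```python
import math

def mann_whitney_greater(a, b):
    """One-sided Mann-Whitney test of 'a tends to be larger than b'.

    Returns (U, p). U counts the (a, b) pairs in which the a-score is
    larger, with ties counted as half. p comes from the normal
    approximation with a tie correction, so it suits large samples.
    Assumes the pooled scores are not all identical.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    rank_sum_a = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this group of tied scores
        mid_rank = (i + 1 + j) / 2   # average of ranks i+1 .. j
        from_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += mid_rank * from_a
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return u, p

# Invented scores for two imaginary batsmen (assumption, NOT real data).
batsman_x = [12, 45, 103, 0, 67, 88, 23, 54, 150, 31]
batsman_y = [8, 39, 71, 4, 15, 92, 27, 0, 46, 19]
u, p = mann_whitney_greater(batsman_x, batsman_y)
print(f"U = {u}, one-sided p = {p:.3f}")
```

Because only the ranks enter the calculation, a single freak score of 300 counts for no more than any other top-ranked innings, which is why the test does not need the scores to follow a bell curve.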

If the statistically formidable paragraphs have not taken the fight out of you, let us proceed to test our hypotheses.

We are going to test the relative merits of Sachin Tendulkar, Rahul Dravid, Sourav Ganguly, VVS Laxman and Virender Sehwag over their career through pairwise comparisons.
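As a quick count of how many pairwise comparisons five batsmen generate, here is a small sketch; the list name is my own label, and the ordering within each pair carries no meaning.

```python
from itertools import combinations

fab_five = ["Sachin", "Dravid", "Sourav", "Laxman", "Sehwag"]

# Every unordered pair of batsmen is one head-to-head comparison.
pairs = list(combinations(fab_five, 2))
for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs), "pairwise comparisons")
```

Five batsmen give ten pairs. Since the question asked of each pair is one-sided (is the first better than the second?), each pair can be tested in either direction.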

Additionally, we are going to test who is a greater contributor in matches won by the team, putting to rest the long drawn out arguments about match winning abilities which have been aired too long by semi illiterate scribes eking out a living out of kindling and fanning misguided passions.

From Cricinfo’s Statsguru, which is an excellent data repository while not being a statistical tool, I tabulated the innings by innings scores of the five stalwarts, along with the result of the matches.

To refine the data, I decided to delete all the unbeaten innings of the batsmen where they have scored 39 or less. A score of 10 not out is hardly significant, but it skews the data since 10 is considered as a low score.
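The refinement rule can be written down in a few lines. This is a sketch of one reading of the rule (drop an innings only if it is unbeaten and worth 39 or fewer); the `Innings` record and the sample values are assumptions invented for the example.

```python
from typing import NamedTuple

class Innings(NamedTuple):
    runs: int
    not_out: bool

def drop_short_not_outs(innings, cutoff=39):
    """Keep an innings unless it is unbeaten AND scored `cutoff` or fewer."""
    return [i for i in innings if not (i.not_out and i.runs <= cutoff)]

# Invented innings (assumption, NOT real data).
sample = [
    Innings(10, True),    # dropped: unbeaten and small
    Innings(39, True),    # dropped: unbeaten and exactly at the cutoff
    Innings(40, True),    # kept: unbeaten but above the cutoff
    Innings(0, False),    # kept: a genuine dismissal for a duck
    Innings(112, False),  # kept
]
kept = drop_short_not_outs(sample)
print([i.runs for i in kept])  # [40, 0, 112]
```

Dismissals for low scores stay in, since those are genuine failures; only the innings cut short by circumstance are removed.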

After this I performed Mann-Whitney pair-wise comparison of the innings by innings scores of each batsman with the others, and tested if the contribution with the bat of one was greater than the other throughout their test careers.

Testing Sachin against Dravid produced a p value of 35%.
In robust statistical jargon, this means that under the assumption that Sachin and Dravid are equal in their contributions with the bat, as opposed to Sachin being the better batsman, the probability of their career data favouring Sachin at least as strongly as it actually does amounts to 35%.
 This can be loosely interpreted as the statement that we can say that Sachin is a greater contributor with the bat than Dravid with 65% confidence.

Well, even if this seems favourable to Sachin, most of the rigorous statisticians would conclude that there is insufficient evidence to conclude that Sachin is better than Dravid. The generally accepted p value is 5%, which means a 95% confidence level. In some cases we can choose to test at 90% level as well.

However, when comparing Sachin with Sourav, the p value is 3%, which translates to a 97% confidence level with which we can say that the contribution of Sachin is greater than that of Sourav.

So, indeed, we can say with a statistical significance generally accepted across the world that Sachin has been a better contributor than Sourav.
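The decision rule applied to the two p values quoted so far can be spelt out as follows. The function name is my own; the thresholds are the 5% and 10% levels mentioned above.

```python
def verdict(p_value, alpha=0.05):
    """Reject the null hypothesis of equality only if p falls below alpha."""
    return "significant" if p_value < alpha else "insufficient evidence"

# p values as reported in this post.
reported = {("Sachin", "Dravid"): 0.35, ("Sachin", "Sourav"): 0.03}

for (better, other), p in reported.items():
    print(f"{better} vs {other}: p = {p:.2f}, "
          f"{verdict(p)} at 5%, {verdict(p, alpha=0.10)} at 10%")
```

Note that "insufficient evidence" is not the same as proof of equality; it only means the data cannot separate the two at the chosen level.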

The only other comparison which comes to a similar confidence level is when we compare Rahul Dravid with Sourav. The details of the tests and the resulting confidence levels are given below, with red and amber colours denoting the significance of the confidence levels.



Debates, arguments, comparison of averages, scores, centuries can be pushed aside. Robust statistical tests conclude whatever the table above says. 

However, having looked at their overall contribution, let me turn to the oft debated criterion of match-winning.

 If we consider all cricket facts and figures that are painted yellow by journalists, this is by far the most jaundiced feature shaped by pulp and print. Forever used to pull down the monumental contributions of Sachin Tendulkar, his so called lack of match winning ability is an urban legend brought out again and again by the significant number of disturbed individuals in denial who argue that the man with almost 30000 international runs and 100 centuries has not yet done enough for a nation of cricket followers made up in significant proportion of underachievers.

Here are some of the several facts about Sachin Tendulkar and match winning that I have pointed out in endless arguments.

He has 5431 runs in 61 won test-matches with 20 hundreds.  

There are several other facts that are conveniently ignored by critics. 

That he scores a hundred in every 3rd won test match - a rate bettered by only Bradman, Inzamam, Hammond and Sobers in the history of the game.  
That he has featured in 61 of the 109 total wins of India (that includes all the wins in the 57 years of cricket India played before his début). That the W/L ratio when he has played for India is 1.32. Before he started playing it had been 0.48 with 43 wins and 89 losses, now it has been hauled up to 0.78. 
That of the 14 matches he has missed during his career, India has won 5 and lost 4 with 2 of those wins coming against Zimbabwe. That he has more runs, centuries and a better average than any of his team mates in won matches - including all who have often been proclaimed and paraded as the real match winners by irresponsible reportage.
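The arithmetic behind these claims is easy to re-check from the numbers quoted above; the variable names below are mine, and nothing here goes beyond figures already stated in this post.

```python
wins_before_debut, losses_before_debut = 43, 89
wins_with_sachin, wins_without_sachin = 61, 5
total_wins = 109
runs_in_wins, hundreds_in_wins = 5431, 20

# The three groups of wins should add up to India's total.
print(wins_before_debut + wins_with_sachin + wins_without_sachin == total_wins)  # True

print(f"W/L before his debut: {wins_before_debut / losses_before_debut:.2f}")    # 0.48
print(f"Share of all wins he featured in: {wins_with_sachin / total_wins:.2f}")  # 0.56
print(f"Won Tests per hundred: {wins_with_sachin / hundreds_in_wins:.2f}")       # 3.05
print(f"Runs per won Test: {runs_in_wins / wins_with_sachin:.2f}")               # 89.03
```

The 43 wins before his début, the 61 he featured in and the 5 he missed add up to the 109 total, so the quoted figures are at least consistent with one another.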

The figures are for all to see:

In Won Matches



However, all these figures I have quoted are very much in the Cricinfo mode. The data speaks a lot, but there is no hint of statistical significance in the argument to be considered scientifically valid.
So, the next step I took was to perform the same Mann Whitney tests, but this time only considering matches in which India ended as winners.

The results are given below.



As can be seen, Sachin Tendulkar leads, just some inches over Dravid (statistically insignificant difference) while head and shoulders above the rest, as far as match winning is concerned. At 75% Confidence, we can say that he is a better contributor in won matches than everyone except Dravid.  

 All the five have been great contributors to India’s cause, but there is a distinct difference in class, contribution and consistency. The above tables just denote the scientific proof of the same.
A lone hand of 136 in a losing cause or a brilliant 73 not out in a one wicket victory can go a long way in influencing mass reaction. People can raise a new hero in a day while tarnishing a contributor of two decades based on a solitary performance. “Make him another Sachin” can be the mass reaction when a newcomer blasts a quickfire 40. However, the figures, when tested for statistical robustness, reveal whatever has been put in the table above.

There is also a definite statistical difference between Sourav Ganguly and the others as far as prowess as a batsman is concerned. I have written about his sterling contribution to Indian cricket in an earlier post. However, purely as a batsman, he is definitely the weak link among the five.

Statistically, with these results, the arguments about the relative merits of the fab five and the match-winning abilities can be logically put to rest. And I know that all this analysis achieves a monumental nothing.


Logic Cordoned Off
Speaking realistically, a rational end to arguments of this kind is an unachievable dream. If mankind – Indian cricket followers being a defining and populous sample of the more irrational of the species – was unequivocally rational, would we have had several of the landmarks that etch the grotesque scars in human history? The dark ages, the crusades, the wars, the Nazis, ethnic cleansings and George Bush? Major world news is still dominated by religious and ideological fanaticism. Given half a chance, a sport played in the oval is transformed into a gladiatorial combat between worshipped gods, all sense and sensibilities ultimately sacrificed on the altar of the eternal king of the deities, namely filthy lucre. Reason does not have a statistically significant chance of survival.

To demonstrate this, let me recount the outcomes of the last few times I discussed Sachin Tendulkar and match-winning with four supposedly educated members of the cricket following community.

The first was one of those journalists of an Indian vernacular daily who scavenge on the scattered fragments of rumour outside the offices of Jagmohan Dalmiya in order to earn his daily bread. His opinion was that an Adidas logo on one’s ass does not demonstrate greatness. When I conjectured that given a similar offer from Adidas to paste a logo on his not-so-famous butt he would have probably turned the other cheek, he did not take it very sportingly.

The second was morose that every time he drank Pepsi, Sachin got a cut. That a similar cut went into the pocket of the CEO of Pepsi did not bother him, but because of the riches that had accumulated in Tendulkar’s coffers, Yuvraj and Pujara were in his opinion better performers with the bat!!

The third raised metaphysical objections, saying that a mortal man cannot be elevated to the position of god in a team game.

At the other end of the spectrum, a Sachin devotee rejoiced at his pioneering double hundred in the one day format, gleefully noting the Wittgensteinian Tautology that it was surely the fastest 200 in ODIs.

I would not dare to dream that Messrs Mann and Whitney can resolve heated debates in this battlefield of emotions with their test of statistical significance.

Try Statistical Arguments