Thursday, 10 July 2014

Correcting OPS

Often, we fall into the trap of thinking that OBP = plate discipline and walks, while SLG% = power. In reality, both OBP and SLG% are largely made up of batting average. For a league-average player hitting, say, .260 with a .330 on-base, the batting average makes up nearly four fifths of the OBP. Even for an extreme on-base guy, hitting say .220 with a .350 on-base, the average still accounts for nearly two thirds of the OBP. As for slugging, a player’s average contributes over 50% of their SLG% (apart from extremely unusual examples, such as Jose Abreu). In other words, the bread and butter of a player’s OBP and SLG% is batting average.

The reason that analysts idolise OBP is not because ‘taking a walk’ is of specific value over and against hitting a single. It is base runners that are so valuable. Therefore, as OBP measures the percentage of times you become a base runner (whether by hit, walk or HBP), it becomes a valuable calculation. A player is not valuable simply because he takes a walk; he is valuable because he reaches base. Of course, a player with a high batting average will have a high on-base, whether or not he walks at a high clip. And such players are still valuable.

So if OBP and SLG% both largely consist of batting average, then there is tremendous overlap between what they measure. They differ in some of what they record, but most of what they measure is the same. Indeed, for some players, the overlap is nearly total. A typical ‘singles hitter’ will find that their OBP and SLG% are similar (in some cases, OBP is higher). Their OBP is naturally measuring their average plus walks, whilst their SLG% is merely measuring their average plus occasional extra base hits. So for that breed of player, the SLG% is really just a slightly inflated batting average. SLG% only becomes a significant measure for players who have the power to produce a decent separation within their slash-line.

All this produces an issue with OPS.

If you accept these 2 premises:
  1. That around 75% of a player’s OBP is typically comprised of their average.
  2. And that over 50% of a player’s SLG% is comprised of their average.

Then, OPS records a player’s batting average twice. Every time a player gets a hit, it contributes to both their OBP and their SLG%. Thus each hit will be represented twice in the player’s OPS! Each walk will be recorded once, as will each base beyond 1st. But the reaching of 1st base via a hit (single, double, triple or homer) will be recorded twice. This means that OPS weights batting average as twice as valuable as any of the other elements that contribute to the stat.
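To make the double counting concrete, here is a minimal sketch using the standard OBP and SLG% formulas. The stat line is made up purely for illustration (it is not any real player), and the `Line` class and helper names are my own. It turns one out into a single, and separately one out into a walk, and compares the effect on OPS.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Line:
    """A hypothetical season stat line (illustrative numbers only)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hbp: int = 0
    sac_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        return (self.singles + 2 * self.doubles
                + 3 * self.triples + 4 * self.home_runs)


def obp(line: Line) -> float:
    # Standard on-base percentage: times on base / (AB + BB + HBP + SF)
    reached = line.hits + line.walks + line.hbp
    chances = line.at_bats + line.walks + line.hbp + line.sac_flies
    return reached / chances


def slg(line: Line) -> float:
    # Standard slugging percentage: total bases / at-bats
    return line.total_bases / line.at_bats


def ops(line: Line) -> float:
    return obp(line) + slg(line)


base = Line(at_bats=500, singles=90, doubles=25, triples=5,
            home_runs=10, walks=50)

# One out becomes a single: same at-bats, one more hit.
extra_single = replace(base, singles=base.singles + 1)

# One out becomes a walk: a walk is not an at-bat, so AB drops by one.
extra_walk = replace(base, at_bats=base.at_bats - 1, walks=base.walks + 1)

print(f"OPS gain from the single: {ops(extra_single) - ops(base):.5f}")
print(f"OPS gain from the walk:   {ops(extra_walk) - ops(base):.5f}")
```

Both changes lift OBP by exactly the same amount, but the single also adds a full base to SLG%, so it moves OPS further even though both events put the batter on 1st.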

Take a player whose offensive contribution rests primarily in their average: players who don’t walk much or hit for much power, such as Juan Lagares, Ben Revere, Darwin Barney, Jon Jay, Zack Cozart or Adeiny Hechavarria (hardly MVP material, I realize, but everyday major league players nonetheless). The OPS of such players is little more than their batting average doubled.

Revere is hitting .290 with a .662 OPS. Of that OPS, .580 is batting average, counted twice. That means that only .082, or about 12%, of Ben Revere’s OPS comes from outside of his average. Admittedly, OPS was not formulated with Ben Revere in mind. But with power on the decline, the typical major league player is increasingly deriving his OPS from the doubling of his average. This uncovers an increasing issue with the use of OPS.
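The Revere arithmetic can be checked with a short sketch. It follows the post's simplification of treating doubled batting average as the part of OPS that comes from average (ignoring the slightly different denominators of OBP and SLG%); the function name is my own.

```python
def ops_share_outside_average(avg: float, ops: float) -> float:
    """Fraction of OPS left over once doubled batting average is removed.

    Simplification: counts average once for OBP and once for SLG%,
    ignoring that OBP is measured per plate appearance, not per at-bat.
    """
    return (ops - 2 * avg) / ops


revere_avg, revere_ops = 0.290, 0.662
leftover_points = revere_ops - 2 * revere_avg
share = ops_share_outside_average(revere_avg, revere_ops)

print(f"Points of OPS outside average: {leftover_points:.3f}")
print(f"Share of OPS outside average:  {share:.1%}")
```

The leftover is .082 points of OPS, which works out to roughly 12% of the total.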

However, OPS has a place. If only it could be re-thought, it would more accurately measure what it attempts to value. The value of OPS was that it attempted to combine multiple aspects of offence into one number. More detailed stats such as WAR and wRC+ have since taken this to another level. But the aim of OPS is honourable. It takes more into account than the individual elements of the traditional slash line. OPS acknowledges that these individual elements are lacking, and therefore have weaknesses. Let me explain.

  • Average: the weakness of batting average is that it only accounts for one means of reaching base (albeit the most important one). Also, an infield single and a grand slam are weighted equally within a player’s average.

  • OBP: On-base, unlike average, records all base-reaching skills. Measuring ‘base reaching’ is valuable. However, like average, it does not distinguish between the value of the bases reached, whether singles, doubles etc.

  • SLG%: Slugging is, of course, a record of the number of bases reached per at-bat. However, the picture is incomplete. It does record singles, but not walks, despite the fact that both result in a player reaching 1st.

All these stats paint a picture of a player’s contribution. But in each case, the picture is lacking. In the case of OPS, the picture is overcrowded, recording average twice. But what if you could produce a simple metric where each base was included, yet only once?

Here are all the elements needed:

  • Walks 
  • Hits
  • Extra base hits
  • HBP

Also, stolen bases should be included. There is a reason that doubles are more important than singles. Extra base hits increase the chances of producing a run, moving the runner into scoring position. Yet a stolen base functions this way also. In most cases, the value of a double is equal to the value of a single plus a steal (just as, in most cases, a single is equal to a walk in value). Therefore, if we are going to reward extra bases as more valuable than singles, then stolen bases ought to be recognized.

Adding all these elements together would give us the number of bases reached per at-bat, whether those bases were reached by walk, single, double, homer or steal. Each base is valuable, thus each base should be measured. The more the merrier. This would be like adding steals, walks and HBP to a player’s total bases.

We might call this a measure of ‘bases per bat’ (or BPB).

Let’s use an example.
A player has 550 at-bats. He produces:
  • 100 singles
  • 30 doubles (60 bases)
  • 10 triples (30 bases)
  • 20 home runs (80 bases)
  • 40 walks
  • 10 HBP
  • 40 steals
(These are Carlos Gomez’ impressive 2013 stats, rounded for convenience).

The number of bases reached by his own merit is 360 (I realize that going 1st to 3rd and scoring from 2nd are also valuable skills in advancing through the bases, yet these are dependent upon the contribution of another batter). You could also use net steals (steals minus caught stealing), as a single followed by a caught stealing is roughly equivalent to making an out at the plate. If you divide those 360 bases by his 550 at-bats, you get .655. This number is significantly different to Gomez’ OPS of nearly .850, but it more evenly and consistently reflects the total value of the bases he merited through a season’s work.
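For anyone who wants to run the numbers on other players, here is a minimal sketch of the calculation. The function name and parameter names are my own, and the `caught_stealing` argument is an optional extra reflecting the net-steals idea (it defaults to zero, matching the worked example).

```python
def bases_per_bat(at_bats: int, singles: int, doubles: int, triples: int,
                  home_runs: int, walks: int, hbp: int, steals: int,
                  caught_stealing: int = 0) -> float:
    """Bases reached by the batter's own merit, divided by at-bats."""
    # Total bases from hits: each base counted once.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    # One base each for walks, hit-by-pitches and (net) steals.
    other_bases = walks + hbp + (steals - caught_stealing)
    return (total_bases + other_bases) / at_bats


# The rounded Carlos Gomez 2013 line from the example above.
bpb = bases_per_bat(at_bats=550, singles=100, doubles=30, triples=10,
                    home_runs=20, walks=40, hbp=10, steals=40)

print(f"BPB: {bpb:.3f}")  # 360 bases / 550 at-bats
```

Passing a non-zero `caught_stealing` switches the calculation to net steals without changing anything else.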

Friday, 30 August 2013

Hunter vs Kenny

Everyone loves Torii Hunter, and rightly so. He is an engaging, lively and articulate person off the field, and a productive player on the field. Yet his recent interaction with Brian Kenny put paid to any hope of a future career in analytics for Torii. In less than 45 seconds, Torii Hunter exposed a lack of understanding that is widespread amongst players and fans.

This provides an interesting example of the rhetoric given by people to defend their misunderstandings. While the exchange was brief, his comments were symptomatic of the problem. Two of these comments were particularly interesting.

  1. "The numbers are good but they lie a lot…" "…the numbers lie sometimes"

This is a common fallacy. Numbers don’t lie. They cannot lie. They are simply pieces of information. They may be misused, handled poorly or interpreted in deceptive ways, but a number does not lie.

Sometimes, you have to read between the lines. Does Torii think that Max Scherzer’s numbers lie? Or that Miguel Cabrera’s numbers lie? What about Miggy’s 173 hits, 25 doubles, 43 home runs and .359/.460/.683 slash line? Do any of those numbers lie? Of course not, because they support Torii’s team-mate. Apparently, the only ‘lying numbers’ are the ones that negatively portray our friends. All positive numbers are truthful; all negative numbers are liars.

Several questions are left unanswered: which numbers are lying? How can you tell a lying number? Who judges which numbers are truthful? The short answer is: the truthful numbers are the ones that conveniently support my case. The same players that are quick to jump on Scherzer’s 19-1 record as evidence of his ability, swiftly pooh-pooh Verlander’s 12-10 record as deceptive. 

Double standards are easily exposed.

  2. "You never played the game"

Hint: when an argument is not going your way, make it personal. And one of the best ways to personalise an argument is by criticising your opponent’s credentials rather than engaging with their arguments.

For athletes, this often materialises in the “you never played the game” retort. This is simply a logical fallacy. There is no connection between a person’s ability to play a sport and their ability to make judgements about a sport. Take for example Joe Maddon. He is widely regarded as the best manager in baseball, yet his playing career amounts to just four partial seasons at ‘A’ ball. Or consider the modern GM. They are charged with making the judgements and decisions regarding the quality of the players inside and outside of their franchise. Consider Theo Epstein or Ben Cherington for example, who never “played the game”.

Worse still, Hunter’s own GM, Dave Dombrowski, never played the game. Does this call into question Dombrowski’s ability to do baseball analysis and make decisions on baseball players [including his decision to sign Torii Hunter]? Of course not. What about the scouts that signed Hunter? Does their opinion sink or swim on the validity of their professional baseball experience? Evidently, no.

Next time Torii Hunter makes a remark about a political issue, perhaps someone should respond “hey Torii, since when were you a politician? How’s your political career going? How many years did you spend in office?”

I realise that Torii’s comments were made in good humour, yet they are largely reflective of the way in which many people still reason. Once we strip away the sound-bite style of argumentation, we are left with little more than an amusing and entertaining, yet shallow line of reasoning.