Rather than maintain its typical singularity of focus, this week's Dissect-O-Stat will run through several baseball-related statistics and then attempt to answer the question, "Can one measure the soul of a baseball game?" For those of you who do not particularly like acronyms: proceed at your own peril.
*
The hippest method of quantifying performance on the baseball diamond is through a body of statistics known collectively as sabermetrics. Founded in 1971 and standing in as the first of our acronyms, the Society for American Baseball Research (SABR) did much of the early development and organization of these measures (which is why they got to name them). SABR's most famous proponent is a man named Bill James, whose Baseball Abstracts helped popularize the Society's statistical research. In 2003, Michael Lewis published a book called Moneyball, which truly brought sabermetrics to the masses (though the hero of that book, Billy Beane, has never publicly endorsed the saber-method).
All these saber-statistics attempt to do one or more of three things. First, they attempt to control for a number of variables - such as the baseball stadium in which a batter most frequently bats or how good the other players on his team are - that better-known baseball statistics don't account for. Second, they don't look at players in absolute terms, but rather in comparison to the "average" player or to what would happen if just any old guy were playing in his stead. Third, these statistics consider the effect one can reasonably expect a narrow set of statistics to have on a different, broader set of statistics, given historical precedent.
So, let's get down to a few stats. We'll start with an easy one: OPS, or On-Base Plus Slugging percentage. This stat has made significant inroads into the mainstream of statistical baseball, so it may look familiar. Saber-junkies rely on it because typical measures of hitter production either do not include all the ways a hitter can reach base (batting average does not consider how often a hitter walks) or depend too much on the batters hitting before and after a given hitter (a high RBI total requires that a hitter come up with runners on base).
OPS provides a quick and easy way to see how well a hitter is doing his two main jobs, which are: 1) not getting out and 2) getting himself and the people in front of him as many bases ahead as he can. The first part of OPS (on-base percentage) one-ups batting average because it counts all the ways a player gets on base, not just the number of hits he gets. The second part (slugging percentage) one-ups RBI because it tells us how many bases the batter (and whatever hypothetical baserunners might or might not have been on base when he came to the plate) typically gains per at-bat. In this way, the measure of the batter's production is not dependent on what other hitters have or have not done for him lately.
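For the spreadsheet-inclined, here is a minimal sketch of how the two halves of OPS add up. It uses the standard definitions of on-base percentage (times on base divided by at-bats plus walks, hit-by-pitches, and sacrifice flies) and slugging percentage (total bases divided by at-bats). The function names and the season line at the bottom are made up purely for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Share of qualifying plate appearances in which the hitter reaches base."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / opportunities


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases earned on hits, per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(singles, doubles, triples, home_runs,
        walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage plus slugging percentage."""
    hits = singles + doubles + triples + home_runs
    obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
    slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
    return obp + slg


# Hypothetical season line, invented for illustration only.
example_ops = ops(singles=100, doubles=30, triples=5, home_runs=15,
                  walks=50, hit_by_pitch=5, at_bats=500, sac_flies=5)
print(f"{example_ops:.3f}")  # prints 0.836
```

Note that the walks only show up in the first half of the sum, which is exactly the part batting average ignores.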
So what's a good OPS? Well, when it comes to production at the plate, there are two gold standard traditional stats: a batting average of .300 and an RBI total of 100. So what's the ".300" and the "100" of OPS? The easiest way to answer that question is as follows: Thirty-eight hitters hit .300 or better during 2006. Thirty-three hit .300 in 2005, 36 in 2004, 40 in 2003, 35 in 2002, 46 in 2001, and 53 in 2000. So, on average over the last seven years, about 40 hitters have hit .300 every year. Now, in any given year, a few guys are going to get lucky, hit .300, and then never be heard from again. What makes a hitter great is his ability to hit .300 on a consistent basis. Over those seven years I just named, only about 30 different players made the list three times. So let's say for a minute the top 30 hitters in any given year are truly ".300"-caliber hitters. Well, the 30th highest OPS in 2006 belonged to everybody's favorite Yankee, Derek Jeter, who had an OPS of .900. Over the last seven years, the OPS of the 30th place hitter in the league has varied between about .890 and .910. As such, it seems fair to say that .900 is the new .300.
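For anyone who wants to check the arithmetic, here is a quick sketch that averages the seven season counts of .300 hitters quoted above. The variable names are mine; the numbers are the ones in the text.

```python
# Number of .300 hitters per season, as quoted in the paragraph above.
hitters_at_300 = {
    2006: 38, 2005: 33, 2004: 36, 2003: 40,
    2002: 35, 2001: 46, 2000: 53,
}

total = sum(hitters_at_300.values())
average_per_season = total / len(hitters_at_300)

print(total)                         # 281
print(round(average_per_season, 1))  # 40.1
```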
We can do the same analysis for RBI. In 2006, 38 hitters drove in 100 runs. Over the last seven years, anywhere from 27 to 46 batters have driven in 100 runs in a given year, but the average over that period is, once again, about 30, confirming the notion that .900 is the gold standard OPS total.
As an interesting aside, this year was widely regarded to have been one of the best statistical seasons of bottom-of-the-gold-standard Derek Jeter's career, and he is the front-runner for the American League's Most Valuable Player Award. Jeter's much-maligned teammate, Alex Rodriguez, on the other hand, was deemed to have had a "year-long slump" by the New York sporting press and nearly booed off the field by the New York faithful. At the end of the season, though, A-Rod's OPS was 14 points higher than Jeter's.
Equivalent Average (EQA) is another statistic that fits into our first category of sabermetrics. The basic idea of this statistic is that there are two factors that will distort a player's batting average: the field on which he plays his home games and the league in which he plays. In other words, some players get screwed because they have to play their home games in a stadium where everybody has trouble getting a hit, while other players get lucky because they get to play their home games in a stadium where the ball just flies off the bat. EQA accounts for these biases by adjusting a player's Average (which itself has been adjusted in true sabermetric style) to account for the ability of average hitters to get hits in the baseball stadium where the player plays most of his games.
Value over Replacement Player (VORP) is the best known of the second category statistics. This stat reports what you might think it reports - in essence it counts the marginal utility of a player. It begins by assuming that some player has to play at every position in the field (which is true). It then looks at what level of talent is "freely available" at a given position and compares the offensive production of the freely available player with the production of the player in question. There is no actual person who is used to represent "freely available" talent. Rather, "freely available" talent is based on a composite of the average player production across the league. The "production" we are talking about can be measured according to any of several type one saber mega-statistics, including Runs Created, linear weights, or equivalent runs. Those are all more complicated than we really need to deal with right now, though.
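The comparison described above can be sketched in a few lines. This is only a rough illustration, not the official VORP calculation: it assumes production is already expressed in runs (by whichever of the mega-statistics you prefer), and it assumes the freely available player gets the same number of plate appearances as the player in question. The function name, parameter names, and numbers are all hypothetical.

```python
def value_over_replacement(player_runs, player_plate_appearances,
                           replacement_runs_per_pa):
    """Runs a player produced beyond what freely available talent would
    be expected to produce in the same number of plate appearances."""
    # Assumption: the replacement's production scales linearly with playing time.
    replacement_runs = replacement_runs_per_pa * player_plate_appearances
    return player_runs - replacement_runs


# Hypothetical numbers, invented for illustration only.
example_vorp = value_over_replacement(
    player_runs=100.0,
    player_plate_appearances=600,
    replacement_runs_per_pa=0.10,
)
print(round(example_vorp, 1))  # 40.0
```

A positive number means the player is worth more than just any old guy; a negative number means just any old guy would have been an upgrade.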
Finally, the Pythagorean Expectation is a good representative of the third type of saber statistic mentioned above, but it is also much like OPS in that it is a quick and easy way to look at the overall performance of a baseball entity (in this case it's a team as opposed to a player, though). The Pythagorean Expectation states that a team's winning percentage in a given year is determined by the number of runs it scores and the number of runs opponents score against it. Pretty straightforward idea. Pythagoras steps in to help us relate runs scored and runs allowed to winning percentage mathematically:

$$\text{Winning Percentage} = \frac{(\text{Runs Scored})^2}{(\text{Runs Scored})^2 + (\text{Runs Allowed})^2}$$

Look familiar? Remember $A^2 + B^2 = C^2$? Well, it turns out that, historically speaking, squaring runs scored and runs allowed (as opposed to just comparing the two terms directly) makes for much more accurate predictions. In truth, the most accurate exponent is 1.8 (and the most, most accurate exponent involves more equations), but 2 is just so much more elegant.
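Here is a minimal sketch of the Pythagorean Expectation as described above: runs scored raised to an exponent, divided by the sum of runs scored and runs allowed each raised to that same exponent. The exponent defaults to the elegant 2 and can be swapped for the 1.8 mentioned above. The function name and the team's run totals are hypothetical.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team, invented for illustration: 800 runs scored, 700 allowed.
elegant = pythagorean_expectation(800, 700)
print(f"{elegant:.3f}")      # prints 0.566
print(round(elegant * 162))  # expected wins over a 162-game season: 92

# Same team with the exponent of 1.8 mentioned above.
accurate = pythagorean_expectation(800, 700, exponent=1.8)
print(f"{accurate:.3f}")
```

The smaller exponent pulls the expectation a little closer to .500, so the choice matters most for teams with lopsided run totals.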
*
So, now the real question: what about our baseball souls? What does all this controlling for variables and comparing to "replacement players" and looking at things historically really tell us about what's going to happen on the next pitch? It tells us something. But not nearly everything. And that's why baseball is so beautiful. It's like life, and it's like America: the perfect combination of statistical likelihood and lucky (or unlucky) timing. You can do everything right and think it's going to work out, but in the end it's often just not up to you.
On November 4th, 2001, in the last game of the season, journeyman Luis Gonzalez hit a bottom-of-the-ninth bloop single off the great Mariano Rivera. Rivera did everything right in that at-bat. He was ahead in the count. Historical numbers were in his favor. He even managed to break Gonzalez's bat in half. But the ball bouncing off that broken bat fell just right, dropping over the head of a drawn-in Derek Jeter and barely rolling to the outfield grass. With that hit, the squad from Arizona, only four years in existence, defeated the venerable 26-time World Champions.
And what makes teams like Arizona and like last year's Chicago White Sox so exciting is that this non-statistical part of the game can seem, at times, to be even more dependable than the numbers. It almost feeds off itself. When it starts to click, your heart can tell that the bloop single is going to fall in just the right place, though your mind is telling you that Luis Gonzalez is an easy out.
Walt Whitman said it best, "I see great things in baseball. It is our game - the American game."