Baseball Hacks: Tips & Tools for Analyzing and Winning with Statistics by Joseph Adler
Spring has sprung, and with it comes the smell of hot dogs, the yell of the concessionaire, and the crack of the bat. Baseball season is here again, and with all the traditions that come along with it comes another annual rite, the gathering of baseball statistics. Of all the sports, probably the one most amenable to being picked apart by mathematical methods is baseball. There are a number of reasons for that, namely that the play is regimented by the numbers of strikes, balls, players, and innings. That leads to the ability to break down plays into a quantifiable set of numbers that can be used not just to settle bar bets, but to identify trends that reflect the game, the players, and the era they played in.
There are a number of books out there that give you every record and every statistic on every player you can imagine. Those lists of numbers are long and, mostly, boring to look through. But if you're handy with a computer, and don't mind getting your hands dirty with some SQL and Perl, then you can slice and dice the statistics with the best of them.
The book that shows you how is Baseball Hacks. It gives you everything you need to become a closet sabermetrician (a sabermetrician is a person who studies the statistics behind the game and comes up with measures for players and teams).
Published by O'Reilly Books, this book has 75 great hacks that you can use to prove your own pet hypotheses. Could Babe Ruth out-hit Barry Bonds if he had lived today? Could someone like Randy Johnson have been successful in the early 1920s?
This book will show you how to use open source programs such as MySQL, R, and Perl to parse freely available data on the game and store it and display it to your heart's content. Starting with a section on what the basics of baseball are, it tells you how to score the game and where to look online for resources pertaining to the game.
Once you know what you're getting into, the next part of the book lets you know how to get data on past games and career statistics. Did you know that data on just about every game ever played since the 1870s is available? The book gives you ideas on what programs you need to start working with that data.
Section 3 is where you learn to get information on current games. ESPN shows you only part of what's going on; it takes some work to download that data to your own personal database and display only the stats you're interested in. This book shows you how to use the Internet to get everything you'll need and more.
Of course, the data is no good unless there's some way to visualize it. Section 4 details how to do this. Sometimes, a picture is worth a thousand words, especially if you're dealing with rows of numbers that would drive just about anyone crazy!
The next section is concerned with the formulas behind the game. Did you know that batting average is the most commonly used and, generally speaking, probably one of the least useful statistics a player can have? This book tells you about much more meaningful statistics like Runs Created, which is a measurement of the runs that a player contributes to a team over a season. The formula looks like:
RC = ((H+BB+HBP-CS-GIDP)*(TB+0.26*(BB-IBB+HBP)))/(AB+BB+HBP+SH+SF)
Wow! There are 13 different versions of this formula created by Bill James, the guru of sabermetrics. Many of the more esoteric measurements that are bandied about today came about as a result of his work on the subject.
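To make the formula a little more concrete, here is a minimal Python sketch of it. It is only an illustration, not code from the book: the function name and the season numbers are made up, and it assumes the whole sum AB+BB+HBP+SH+SF is the denominator, which is the usual reading of the formula.

```python
def runs_created(h, bb, hbp, cs, gidp, tb, ibb, ab, sh, sf):
    """Runs Created, following the version of the formula quoted in this review.

    Assumption: the entire sum AB+BB+HBP+SH+SF is the denominator.
    """
    # Times on base, minus outs made on the bases
    on_base = h + bb + hbp - cs - gidp
    # Bases advanced: total bases plus partial credit for unintentional walks and HBP
    advancement = tb + 0.26 * (bb - ibb + hbp)
    # Plate-appearance-style opportunities
    opportunities = ab + bb + hbp + sh + sf
    if opportunities == 0:
        return 0.0  # no opportunities, so no runs to credit
    return on_base * advancement / opportunities


# Hypothetical season line, invented purely for illustration
season = dict(h=180, bb=70, hbp=5, cs=3, gidp=12,
              tb=300, ibb=10, ab=550, sh=2, sf=6)

print(round(runs_created(**season), 1))
```

With real numbers pulled from a database, the same function could be run over every player in a season to rank them by runs contributed.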
The next section talks about some of the problems and questions that have been addressed by applying statistical methods. For example, how do you define a "clutch player"? Or how much does playing at high altitude (Coors Field in Denver is the classic example) actually affect the game?
Finally, there is a section on how to apply what you've worked on when you want to join a fantasy league, or write your own personalized scoreboard to keep track of your favorite stats. The book shows you all that and more. Not only is it a great introduction and education in the game itself, it shows you a great deal about working with both SQL and Perl, as well as Microsoft Access and Excel. Now how often can you say you've used those programs for something as fun as baseball?
It's good, though not essential, to know something about computers and the various programs. Baseball Hacks provides some hand-holding along the way (though not as much as you may want, in which case you should get an introductory Perl and MySQL book to go along with it!). Now you can start really understanding the great American pastime and become a total computer nerd while you're at it!