WAR! HUH! What is it Good For? Well... It Depends

By Teige Mullin

Note: To save some time, this article assumes you have a pretty good grasp of sabermetrics and of what WAR is. If you don’t, I encourage you to look it up, along with the differences between bWAR and fWAR.

If you’ve been around the game of baseball at all for the past 10-15 years, you’ve probably heard of the sabermetric stat WAR (Wins Above Replacement). The stat is an attempt to give you an all-encompassing look at a player’s value over a season or a career. More specifically, it tries to find how many more wins a player is worth than a replacement-level player. A replacement player is “basically a projection of what the next guy up from AAA would do should player A go down” (thank you, Nick DiCola). If a player finishes the year over 6 fWAR, they’ve had a darn good year. For a while, more and more outlets were trying to work the stat into just about anything. Broadcasts would show it, bloggers would use it to make their latest sleeper MVP case, and much more.

Ok Teige, so why are you writing it and why am I here?

Well, both are great questions, but I can answer the first one a bit easier: I have never seen a stat die in baseball nerd circles almost as quickly as it reached widespread attention. So, sit back, grab some popcorn, and let me explain what I mean.

What I Like About WAR

Before I go off on the issues with WAR, and what I think we can do to help it, I want to say that I’m not super anti-WAR. As a matter of fact, I do think it has its purpose. It’s just not the end-all, be-all stat it tries to be. That being said, what do I like about it?

  1. In terms of an all-in-one stat, it’s not horrible. Look, considering all the factors, it’s really hard to sabermetrically quantify any sport. The fact that we have so many advanced stats and tools at our disposal in baseball, and that they work so well, is awesome. Sure, I dislike how WAR handles defense and pitching a little bit, but at the end of the day, this statistic does a way better job of putting it all into one number than I COULD EVER DO.

  2. It has more mainstream attention than ever before. It’s good that more baseball fans can now see that you don’t need a high BA to be valuable. Getting on base and producing runs, while stopping run production on the defensive side, is what it’s all about. Oh really, buddy, the game is about runs? Who would’ve thought? Yes, who would’ve thought! You’d be surprised how often I still see people evaluate players based on traditional stats and counting numbers. Come on down to the 21st century, my friends. I like that you or I can mention WAR and know what we’re talking about. Plus, the number is sexy. Nice, clean, a digit and a decimal.

And that’s it. Now let’s get into the bad stuff.

The Problems with WAR

So right off the bat we can find a few issues with WAR.

  1. It’s not standardized. That’s right, the stat doesn’t even have a standard definition. Why is that? Well, the two most popular baseball stats sites (Baseball Reference and FanGraphs) use different formulas to compute it. The stat also relies on weighting coefficients because it builds on wOBA, and if that sounds annoying, it’s because it is. You can’t just take a player’s stats and calculate his WAR on your own. If you wanted to do it as a high school coach, well, good luck. Now, for what it’s worth, the numbers usually end up pretty similar between the two sites, but not identical. In theory, a broadcast could say Player A has a WAR of 5.0 while Player B has a WAR of 4.5. Well, not so fast. Now you have to ask yourself: are they using fWAR or bWAR, and are they using one site for one player and then switching for the other to build a narrative? I don’t think it’s common practice for a broadcast to use both in an attempt to skew your opinion of a player, but the fact remains that they COULD. And I’d wager a lot of fans simply couldn’t be paid to care enough to find the difference.

  2. WAR doesn’t take the situation into account. Not all home runs are created equal. A walk-off homer is a lot different than a home run when you’re down 8 runs. Sure, your OPS and contract bonus don’t care, but the game situation does. If you want situations, you need to look at Win Probability Added (WPA) or a similar stat (oooo foreshadowing). Also, relief pitchers are consistently undervalued by WAR. Kenley Jansen recently got his 400th career save, has a career 2.49 ERA, and has over 1,000 strikeouts. Kenley Jansen also has a career bWAR of only 19.8. Mariano Rivera, the career leader among closers in bWAR with 56.3 and also the owner of the most career saves, sits at 229th overall. Second place goes to Goose Gossage (on a very quick glance; I did not comb the leaderboard super thoroughly) with 41.2 bWAR. He sits at 507th. Let’s just imagine all 652 of Mariano’s saves came in super stressful, bases loaded, no one out, 1-run situations (lol, can you imagine if this was the case? I’d call him the greatest player of all eternity if this happened). This is obviously pretty much impossible, but Mariano’s WAR would be a whopping, you guessed it, 56.3 still. The reason relievers don’t get valued as well is Issue 3.

  3. WAR is, at its core, just another counting stat. If you take nothing else away from this rant, it’s that we need to stop using WAR in May. Counting stats usually need a much larger sample size to be reliable. It’s a fun stat for reflecting on a season, maybe even a career, but stop comparing midseason form using WAR. It’s just not reliable. It’s like when a guy only has 15 RBIs heading into mid-May. If he’s a consistent slow starter but always finishes around 100, then don’t worry about it. Just don’t use it for some weird hot take argument is all. This kinda dips into Issue 4, but first I also want to point out that WAR benefits players with lengthy careers, because it compares them to a theoretical replacement player and every extra season adds to the pile. Again, it’s a counting stat. It will also hurt career relievers, as I pointed out with the last issue, because they will simply never match the same IP output.

  4. WAR doesn’t do a great job of adjusting for different eras of baseball. Put simply, Babe Ruth will never be caught in the WAR race because the Great Bambino was so much better than his competition for so long that his WAR is skewed. I’m sorry, but the Bambino ain’t handling 103 mph sinkers from Dustin May on his diet of steak, whiskey, and cigars. No amount of era adjusting fixes WAR as well as you’d hope.


  5. The defensive adjustment. WAR gives defensive bonuses for playing a harder position like SS, and penalties for playing 1B or just being a DH. Why is that? Because the stat attempts to be an all-in-one number for a player’s value.

Isn’t defense really hard to quantify analytically? 

Yep, which is part of the issue. Earlier this month on Twitter, Sergio M Quintero compared Ha-Seong Kim and Yordan Alvarez, and said he’d rather have Alvarez on the Padres than Kim, despite Kim having a higher bWAR. I don’t blame him either. Alvarez has a 168 wRC+, while Kim has a 98. On the flip side, another user pointed out that their fWAR was 0.8 (Kim) and 1.5 (Alvarez) and argued fWAR had better defensive measurements. See how we circled back to Issue 1? By one version of the stat, Ha-Seong Kim was, at the time, THE 4TH MOST VALUABLE PLAYER IN THE NL ACCORDING TO bWAR. With fWAR, Kim isn’t close to any top 10 overall list. Maybe his defense really has won a game by itself, but we can’t even agree on how to evaluate it.
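To make Issue 1 a little more concrete, here is a minimal Python sketch of wOBA, the weighted stat mentioned above. Fair warning: the weights are illustrative placeholders in the ballpark of published values, not official constants for any season, and the stat line is made up. The point is only that the same box score gives a different answer as soon as the coefficients change, which is why you can’t compute this stuff from a scorebook alone.

```python
# Illustrative linear weights only. Real weights are recalculated every
# season, so treat these as placeholders rather than official constants.
ILLUSTRATIVE_WEIGHTS = {
    "ubb": 0.69,     # unintentional walks
    "hbp": 0.72,     # hit by pitch
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "hr": 2.10,
}

# A second, equally hypothetical set of weights, standing in for
# "a different season" or "a different site".
ALTERNATE_WEIGHTS = {
    "ubb": 0.70,
    "hbp": 0.73,
    "single": 0.88,
    "double": 1.25,
    "triple": 1.58,
    "hr": 2.03,
}


def woba(stats, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average: weighted events over plate-appearance-like chances."""
    numerator = sum(weights[event] * stats[event] for event in weights)
    denominator = stats["ab"] + stats["ubb"] + stats["sf"] + stats["hbp"]
    return numerator / denominator


# A made-up season line for a hypothetical hitter.
hitter = {
    "ab": 500, "ubb": 50, "hbp": 5, "sf": 5,
    "single": 90, "double": 30, "triple": 3, "hr": 25,
}

print(f"wOBA with first weight set:  {woba(hitter):.3f}")
print(f"wOBA with second weight set: {woba(hitter, ALTERNATE_WEIGHTS):.3f}")
```

Same hitter, same season, two different numbers. Now stack a defensive metric and a positional adjustment on top of that and you can see how bWAR and fWAR drift apart.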

What Do We WAR Then?

First things first, get the stat standardized across the board. It’ll never happen because these are independent sites with very hardworking staff members who are proud of their work, but in my opinion the stat should have a standard definition. BA is BA no matter where you go, and WAR should be too. Second, lock in the fielding metric. As fielding metrics get better, so can WAR. No one uses OAA yet in their calculations and I think that’s a travesty. OAA has its own problems, but it’s a valuable tool that no one uses, presumably because, again, OAA lives on Baseball Savant, a different site. Third, we have to figure out what to do about the game situation. WPA has its own problems, as it is not predictive and is a cumulative stat. Still, in my opinion, we should reward players who cash in as many opportunities as possible. After all, WAR is attempting to find a player’s relative value (and heck, it’s cumulative too, so) and I’d rather give more WAR to a player who always seems to make a difference in those high leverage moments. There’s also a clutch stat. I don’t know, throw a dart and add one into WAR. Make it fun. Why not, right?
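For anyone wondering what “rewarding the game situation” looks like in practice, here is a rough sketch of the idea behind WPA: add up how much each play moved the team’s chance of winning. The win expectancy numbers are made up for illustration (real ones come from historical tables), and `win_probability_added` is just a name I picked for this sketch.

```python
def win_probability_added(plays):
    """Sum the change in win expectancy across a player's plays.

    Each play is a (before, after) pair of win expectancies for the
    player's team, both given as probabilities between 0 and 1.
    """
    return sum(after - before for before, after in plays)


# Made-up win expectancies, for illustration only.
walk_off_homer = [(0.60, 1.00)]       # late, close game: the homer ends it
garbage_time_homer = [(0.01, 0.02)]   # down 8 runs late: barely moves the needle

print(f"Walk-off homer WPA:     {win_probability_added(walk_off_homer):+.2f}")
print(f"Garbage-time homer WPA: {win_probability_added(garbage_time_homer):+.2f}")
```

Both swings count as exactly one home run in a context-neutral stat. Here they are worth wildly different amounts, which is both the appeal of WPA and the reason it swings so much from year to year.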

Can I Just Use Something Else Instead of WAR?

Yep! WAA (Wins Above Average). In my opinion you should be using WAA for just about anything player value related. In fact, the people at the SABR website itself say you should use it. When you add up the WAA of every player on a team, the total matches how far the team’s record sits from .500 almost perfectly. So if your pitchers add up to -5 WAA and your non-pitchers to +5, you should end up right around 81-81, give or take a few. See what I mean? It actually counts the player’s contribution in wins. And you can use it as the season progresses because the numbers should still match the team’s win totals.
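Here is that arithmetic as a back-of-the-envelope Python sketch. It assumes a 162-game season, where a perfectly average team goes 81-81, and it assumes team WAA simply stacks on top of that baseline. The player values are made up, and `expected_record` is a name I invented for the example.

```python
def expected_record(player_waa, games=162):
    """Estimate a team's record from the WAA of its players.

    An average team wins half its games, so the estimate is
    games / 2 plus the team's total WAA.
    """
    total_waa = sum(player_waa)
    wins = games / 2 + total_waa
    losses = games - wins
    return wins, losses


# Made-up WAA values for a hypothetical roster.
pitchers = [-2.0, -1.5, -1.5]   # adds up to -5
hitters = [3.0, 1.5, 0.5]       # adds up to +5

wins, losses = expected_record(pitchers + hitters)
print(f"Expected record: {wins:.0f}-{losses:.0f}")
```

Swap in a roster that totals +10 WAA and the same function lands on 91-71. That direct line from player numbers to the standings is the whole appeal.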

Well then why did you write so much about WAR, jerk? 

Because Nick said if I don’t write more, he’ll unleash the alligator man on me. No, but really, because I wanted to write about it and give you something to read if you were bored. I’m bored now, can I go? No, you’ve made it this far, so you may as well keep going.

Hey wait a second I just saw Kim had a WAA of 1.2 and Alvarez only 0.7. 

Yep! WAA still takes defense into account, and Kim’s defense is valuable, but WAA also depends on the context of your team once you add everyone together. Point is, neither player has hurt his team overall yet (Kim, however, has hurt his team offensively, which you can track separately, whereas Yordan has one job and excels at it). Another point is, there’s only one WAA number. None of this mix and match BS to fit your narrative.

Wrapping It All Up

One of the things that inspired me to write about WAR was noticing that a lot of fellow baseball nerds I follow on Twitter have started to openly shrug off WAR. I think the reason for that is twofold. First, these nerds (I mean that so affectionately, by the way. I consider myself one) need something that isn’t mainstream. Ya know, now that everyone knows WAR they’re like, “Oh, you’re still on WAR? Well I’m on wRC+ 9000”. Second, because of the problems with WAR that I mentioned, analysts would rather show a player’s value with multiple stats like wRC+, wOBA, Barrels, and expected stats, and mention defense separately, instead of leaning on one number that can seem arbitrary to many.