2020 Off-Season Free Agent Signing AI Projections: Position Players
Predicting Big $ Contracts
The 2020 off-season and free agency period in MLB have begun. While we probably won’t have multiple $300M+ contracts like last year’s Harper-Machado-Trout bonanza, we do have several interesting stories to follow with Anthony Rendon, Stephen Strasburg, and Gerrit Cole, all coming off World Series heroics.
As an exercise in curiosity and perhaps a public service to the financial planners for this year’s free agents*, I wanted to develop an analytical method for predicting free agency outcomes: who will sign for how much and for how long and with whom?
*I’m talking about the financial planners for guys like Robinson Chirinos and others, not Gerrit Cole’s financial planner. That guy has it easy.
To do this, I took advanced analytics season results for all players from FanGraphs going back several seasons, free-agent signings from Cot’s Baseball Contracts, and AI modeling tools from DataRobot to develop predictions for this year’s free-agent class. Effectively, the AI models I built looked at several years of player performance, then compared that against the free-agent contracts signed after those seasons to learn how player performance and other factors relate to free-agent value.
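The pairing of past performance with later contracts can be pictured roughly as follows. This is only a minimal sketch, not the DataRobot workflow itself: the record layouts, field names, three-season lookback, and sample values are all hypothetical, and the one idea taken from the text is that each signing is matched only with seasons played before it.

```python
from dataclasses import dataclass


@dataclass
class Season:
    player: str
    year: int
    war: float  # hypothetical performance column


@dataclass
class Contract:
    player: str
    first_season: int  # first season covered by the new deal (assumed field)
    total_value: float


def seasons_before(contract, seasons, lookback=3):
    """Return the player's seasons from the `lookback` years before the deal starts."""
    earliest = contract.first_season - lookback
    matching = [
        s for s in seasons
        if s.player == contract.player
        and earliest <= s.year < contract.first_season
    ]
    return sorted(matching, key=lambda s: s.year)


# Made-up rows, purely for illustration
seasons = [
    Season("Player A", 2016, 2.0),
    Season("Player A", 2017, 3.0),
    Season("Player A", 2018, 4.0),
    Season("Player A", 2019, 1.0),  # played under the new deal, so excluded
    Season("Player B", 2018, 5.0),  # different player, so excluded
]
contract = Contract("Player A", 2019, 30_000_000)
history = seasons_before(contract, seasons)
```

Filtering on `year < first_season` keeps the model from seeing performance that happened after the contract it is trying to predict.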
The other factors I included in the predictions were relative value and macro market conditions. In all, there were three categories of variables that I used to predict contracts:
- Absolute Player Performance: how the player performed in a vacuum (most stats and analytics we attribute to each player – a proxy for
- Relative Player Performance: how the player performed relative to others at his position (this is meant to be a crude proxy for market supply in this supply-demand market).
- Market Conditions: How much teams are spending in this offseason, with committed spend and spending trends as proxies (this is the
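As a rough illustration of how the three categories might sit side by side in one row of model inputs, here is a minimal sketch. The function name, dictionary keys, and the specific relative measures (rank and gap to the positional average) are assumptions made for the example, not the variables actually used.

```python
from statistics import mean


def build_features(player, position_pool, market):
    """Assemble one row of inputs from the three variable categories.

    `position_pool` is assumed to hold every free agent at the player's
    position, including the player himself.
    """
    pool_wars = [p["war"] for p in position_pool]
    better = sum(1 for w in pool_wars if w > player["war"])
    return {
        # Absolute player performance: how he did in a vacuum
        "war": player["war"],
        "age": player["age"],
        # Relative player performance: standing among peers at his position
        "position_rank": better + 1,
        "war_vs_position_avg": player["war"] - mean(pool_wars),
        # Market conditions: how much money teams have already committed
        "committed_spend": market["committed_spend"],
    }


# Hypothetical inputs
player = {"name": "Player A", "war": 3.0, "age": 29}
pool = [{"war": 5.0}, {"war": 3.0}, {"war": 1.0}]
market = {"committed_spend": 3.5e9}
row = build_features(player, pool, market)
```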
What this is NOT: A Forecast of Value
These models were built to predict market transaction prices, NOT player value. Value will inevitably be created and destroyed by players that over- or under-perform their expectations, but that’s not what this model is looking for.
Position Player Contracts & Future Posts
This post will just look at expected contract terms (years and dollars) for position players. Future posts will look at pitchers, as well as predictions on destinations and how that could affect final contract value.
I will also continue to update these predictions, as each signing in this off-season will offer valuable information to help inform predictions for other players, so stay tuned!
2020 Position Player Contract Predictions
|Player|Average Annual Value|Years|Total Contract Value|
|---|---|---|---|
|Anthony Rendon|$21,063,964|7|$147,447,750|
|Josh Donaldson|$18,105,112|6|$108,630,669|
|Nicholas Castellanos|$15,550,854|3|$46,652,562|
|Marcell Ozuna|$21,155,203|2|$42,310,406|
|Jose Abreu|$10,275,342|3|$30,826,025|
|Howie Kendrick|$10,112,487|2|$20,224,974|
|Brett Gardner|$10,000,634|2|$20,001,269|
|Kole Calhoun|$9,865,025|2|$19,730,050|
|Yasiel Puig|$18,321,945|1|$18,321,945|
|Eric Thames|$8,854,942|2|$17,709,884|
|Corey Dickerson|$7,785,726|2|$15,571,452|
|Avisail Garcia|$15,368,255|1|$15,368,255|
|Asdrubal Cabrera|$6,831,499|2|$13,662,997|
|Jonathan Schoop|$6,741,736|2|$13,483,472|
|Brian Dozier|$12,386,253|1|$12,386,253|
|Justin Smoak|$11,724,017|1|$11,724,017|
|Robinson Chirinos|$5,724,032|2|$11,448,064|
|Eric Sogard|$5,536,218|2|$11,072,435|
|Didi Gregorius|$10,937,351|1|$10,937,351|
|Lonnie Chisenhall|$9,974,120|1|$9,974,120|
|Todd Frazier|$9,833,767|1|$9,833,767|
|Travis d’Arnaud|$8,982,666|1|$8,982,666|
|Mitch Moreland|$8,781,348|1|$8,781,348|
|Jose Iglesias|$4,284,485|2|$8,568,969|
|Logan Morrison|$7,870,225|1|$7,870,225|
|Brock Holt|$7,725,734|1|$7,725,734|
|Brad Miller|$7,450,786|1|$7,450,786|
|Steve Pearce|$7,439,719|1|$7,439,719|
|Gordon Beckham|$3,707,617|2|$7,415,233|
|Matt Joyce|$7,145,916|1|$7,145,916|
|Hunter Pence|$7,090,897|1|$7,090,897|
|Francisco Cervelli|$6,884,713|1|$6,884,713|
|Neil Walker|$6,380,041|1|$6,380,041|
|Cameron Maybin|$6,333,086|1|$6,333,086|
|Jason Castro|$5,973,642|1|$5,973,642|
|Yonder Alonso|$5,745,456|1|$5,745,456|
|Stephen Vogt|$5,627,733|1|$5,627,733|
|Rajai Davis|$5,038,365|1|$5,038,365|
|Alex Avila|$5,023,928|1|$5,023,928|
|Rene Rivera|$5,009,883|1|$5,009,883|
|Curtis Granderson|$5,003,374|1|$5,003,374|
|Gerardo Parra|$4,993,965|1|$4,993,965|
|Adam Jones|$4,897,382|1|$4,897,382|
|Martin Maldonado|$4,508,043|1|$4,508,043|
|Ben Zobrist|$4,431,368|1|$4,431,368|
|Melky Cabrera|$4,309,284|1|$4,309,284|
|Mark Trumbo|$4,303,055|1|$4,303,055|
|Russell Martin|$3,946,136|1|$3,946,136|
|Logan Forsythe|$3,787,685|1|$3,787,685|
|Jordy Mercer|$3,765,403|1|$3,765,403|
|Adeiny Hechavarria|$3,566,447|1|$3,566,447|
|Jarrod Dyson|$3,375,988|1|$3,375,988|
|Austin Romine|$3,071,744|1|$3,071,744|
|Matt Wieters|$2,997,167|1|$2,997,167|
|Jon Jay|$2,768,360|1|$2,768,360|
|Ryan Flaherty|$2,549,306|1|$2,549,306|
|Jonathan Lucroy|$2,539,387|1|$2,539,387|
|Sean Rodriguez|$2,140,430|1|$2,140,430|
|Martin Prado|$2,100,235|1|$2,100,235|
|Drew Butera|$1,972,815|1|$1,972,815|
Last update: November 5, 2019
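The three numeric columns in the table above appear to be linked by simple arithmetic: dividing Total Contract Value by Years reproduces Average Annual Value to within rounding. The quick check below uses three rows copied from the table; the function name and the one-dollar tolerance are illustrative choices.

```python
# (player, average annual value, years, total contract value) from the table
rows = [
    ("Anthony Rendon", 21_063_964, 7, 147_447_750),
    ("Josh Donaldson", 18_105_112, 6, 108_630_669),
    ("Nicholas Castellanos", 15_550_854, 3, 46_652_562),
]


def aav_matches(aav, years, total, tolerance=1.0):
    """True when total / years equals the listed AAV within rounding."""
    return abs(total / years - aav) <= tolerance


checks = {name: aav_matches(aav, years, total) for name, aav, years, total in rows}
```

For one-year deals the two dollar columns are identical, which is why so many rows repeat the same figure.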
Help me understand how you calculated this:
As noted above, I combined multiple sources of data and performed some additional analysis to create better signals in the data for true player market value. I then ran this data through DataRobot, an AutoML platform, to help me construct the optimal AI models for prediction. The resulting Total Contract Value model looked to the following variables for greatest prediction value:
Stay tuned as we also calculate pitcher contracts, including the massive (and risky) contracts coming to Gerrit Cole and Stephen Strasburg.
What Drove The Predictions
First, age was a major factor in predicting Total Contract Value. The chart below shows, as would be expected, older free agents get smaller contracts.
Prior 3-Years WAR proved to be very important to predictions, which makes sense – players who produced more over a sustained period of time should earn bigger contracts.
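The exact definition of the Prior 3-Years WAR variable is not spelled out here. One plausible reading, assumed for this sketch, is a plain sum of WAR over the three seasons before the new contract begins, with a missed season counted as zero. The function name and the sample values are hypothetical.

```python
def prior_three_year_war(war_by_year, first_contract_year):
    """Sum WAR over the three seasons before the contract starts.

    Seasons missing from `war_by_year` contribute 0.0 (an assumption).
    """
    return sum(
        war_by_year.get(first_contract_year - offset, 0.0)
        for offset in (1, 2, 3)
    )


# Hypothetical player history
war_by_year = {2016: 1.5, 2017: 2.0, 2018: 3.5, 2019: 4.0}
war3 = prior_three_year_war(war_by_year, 2020)  # uses 2017, 2018, 2019
```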
The 3-Years Prior WAR metric also reveals an expected but important truth about the model: it does NOT perform well at the upper extremes. In the chart below, the orange marks are actual contract values from the past, and the blue marks are what the model would’ve predicted. As you can see, the actual values start exceeding the predicted values by significant amounts for most players above 10 3-Years Prior WAR. For contracts like the 2019 paydays, the model would’ve performed poorly.
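The gap between actual and predicted values at the top end can also be summarized numerically by averaging the misses on either side of the 10-WAR mark. The sketch below shows that kind of check; the function name and every number in `rows` are made up for illustration and are not the model's real outputs.

```python
from statistics import mean


def mean_residual_by_bucket(rows, threshold=10.0):
    """Average (actual - predicted) for players up to and above a WAR threshold."""
    buckets = {"up_to": [], "above": []}
    for war3, actual, predicted in rows:
        key = "above" if war3 > threshold else "up_to"
        buckets[key].append(actual - predicted)
    # Skip empty buckets so mean() is never called on an empty list
    return {key: mean(vals) for key, vals in buckets.items() if vals}


# Hypothetical (3-year WAR, actual $M, predicted $M)
rows = [
    (2.0, 8.0, 9.0),
    (5.0, 30.0, 28.0),
    (12.0, 200.0, 150.0),
    (15.0, 300.0, 180.0),
]
residuals = mean_residual_by_bucket(rows)
```

A large positive average in the `above` bucket would be the numeric counterpart of actual contracts outrunning predictions for elite players.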
This could mean that Bryce Harper and Manny Machado got over-paid significantly due to an irrational market, or it could mean there is something else going on with these players that the model doesn’t account for, such as marketing and revenue benefits from adding these marquee players that justify the additional cost.
Lastly, if a player rejected a qualifying offer, the model attributes significantly more value to the player. This is intuitive and obvious to a person, but is very important context to the AI.
These variables are combined with many more to drive the AI models. The models I eventually relied upon were the ones that made the most sense of the 80+ variables available to them to predict contracts.