
NOTE: This post was written before any details were released on the Cardinals infections of July 31. However, those numbers should not change the final conclusion of this post.
Major League Baseball has come under scrutiny in the last week as it figures out how to deal with a COVID-19 outbreak in the Miami Marlins clubhouse that may (though not likely) have spread to the Philadelphia Phillies. The outbreak has yielded 19 positive tests thus far among Marlins players and coaches, raising the question: is it safe and responsible to be playing baseball right now? I dug into the data, and the answer is ‘yes’.
When players first reported to “Summer Camp” in the first week of July, there was a thorough (though not perfectly executed) testing process for all players before they were allowed to engage in full team activities. That intake process identified 59 players positive for COVID-19. Since that intake ended, an additional 18 players have tested positive, as of July 30. Those numbers are all alarmingly high when you consider the Summer Camp size was 1,800 players and the MLB roster size since July 23 has cut that population to 900. These 77 player-positives are causing people to question the moral rightness of playing baseball amidst a pandemic in the way the MLB is attempting to, as the presumption is that playing baseball is leading to more infections and perpetuating the spread of the virus.
I wanted to test this hypothesis with a data-driven experiment and figure out if proceeding with the regular season under MLB’s player protection protocol is actually the ‘right’ thing to do through the lens of ‘stopping the spread of COVID-19’. We lack a perfect control group of “MLB players going about their lives as normal” to establish a baseline of infections for this population, but I do have a lot of public data on COVID-19 infections. Additionally, I’m not trying to explore or opine on any of the specific measures they are or aren’t taking; I’m just looking at the data and what it tells us about the spread of the virus amongst the MLB player population. So, here’s what I did:
- I gathered the complete list of MLB player positive tests, along with the date each positive was reported through July 30
- I gathered the daily ‘Implied Infections’ for each state with an MLB team (and the city of Toronto), as calculated by rt.live, which uses reported infections as well as other signals to estimate how many actual infections occurred on a given day.
- For simplicity, I only took the daily infections for July 20 – a mid-point during this period of analysis. I’m going on the assumption that this rate is roughly representative of the infection rate in each state during the month of July
- Using the infection rates and populations for each state, I estimated how likely a ‘normal’ person in that state’s general population is to contract the virus for each day they are in the state, then used this rate to estimate how many infections we’d expect on an MLB Summer Camp population of 60 people or regular-season roster of 30 people
- I compared this ‘Expected Infections’ total to ‘Actual Infections’ to see how many additional (or fewer) infections may have been ‘caused’ by the MLB season
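The steps above can be sketched in a few lines of Python. This is only a minimal illustration of the approach, not the actual spreadsheet behind this post: the function names are hypothetical, the example rate of 0.01% per day is the general-population figure quoted further down rather than any one state's value, and the 22-day window is used purely as an example. It also assumes expected infections scale linearly with rate, group size, and days, which is a reasonable approximation when daily rates are this small.

```python
def daily_infection_rate(implied_infections, population):
    """Chance that one resident is newly infected on a given day."""
    return implied_infections / population


def expected_infections(daily_rate, group_size, days):
    """Expected new infections in a group exposed at daily_rate for `days` days.

    Linear approximation: ignores the (tiny) shrinkage of the
    still-uninfected pool over the period.
    """
    return daily_rate * group_size * days


# Illustrative inputs only (assumptions, not per-state data):
GENERAL_DAILY_RATE = 0.0001   # 0.01% per day
CAMP_ROSTER = 60              # one team's Summer Camp player pool
DAYS = 22                     # example exposure window

camp_expected = expected_infections(GENERAL_DAILY_RATE, CAMP_ROSTER, DAYS)
print(f"Expected infections for one 60-player camp: {camp_expected:.3f}")
```

Summing this per-team figure across all 30 teams, each with its own state's rate, would give the league-wide ‘Expected Infections’ total.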
The results of this experiment weren’t shocking. Across a Summer Camp population of 1,800 and Regular Season Population of 900, we would expect ~4 infections in a general population sample over the course of July after adjusting for the infection rates in each home state. Instead, we had 77 positive tests. So, this comparison would lead us to believe that “playing baseball is responsible for 73 COVID-19 infections”, or playing baseball has infected 5% of the player population in July.

But I don’t think it’s that simple. Remember, 59 players showed up to Summer Camp (likely) already infected, as they had confirmed, reported positive tests by July 11. Given the virus has an incubation period of up to 5 days before it can even register on a test, and given the precautions MLB took to minimize interactions before the first round of testing was done, it’s more likely than not that these 59 players contracted the virus in their home environments and brought it to Summer Camp. This is important because it is evidence that baseball players aren’t the general population and have different risk profiles for infection.
If 59 players out of a population of 1,800 were infected as of the Summer Camp report date, or the beginning of intake, then that is 3.3% of the population. We don’t know when those players were infected, so we have to assume the infections were evenly distributed over the period before reporting. That is, these 59 infections occurred evenly over the 14-day period (the generally accepted infection duration) prior to intake, so 4.2 players were infected per day before reporting to Summer Camp. In a population of 1,800 baseball players, this is a 0.23% daily infection rate, roughly 23 times the general population rate of 0.01%. So, while 1,800 baseball players were getting infected at a rate of 4.2 people per day, 1,800 people in the general population were getting infected at a rate of 0.18 people per day.
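For anyone who wants to check the arithmetic in this paragraph, here is a short Python sketch. The variable names are mine, and every input is a figure already stated above; the even spread over 14 days is the same assumption made in the text.

```python
INTAKE_POSITIVES = 59          # players positive at intake
CAMP_POPULATION = 1800         # Summer Camp player pool
INFECTION_WINDOW_DAYS = 14     # assumed window over which infections were spread evenly
GENERAL_DAILY_RATE = 0.0001    # 0.01% per day for the general population

# Players infected per day before reporting (about 4.2)
players_per_day = INTAKE_POSITIVES / INFECTION_WINDOW_DAYS

# Daily infection rate among players (about 0.23%)
player_daily_rate = players_per_day / CAMP_POPULATION

# Infections per day in a general-population sample of the same size (0.18)
general_per_day = GENERAL_DAILY_RATE * CAMP_POPULATION

# How many times higher the player rate is than the general rate
ratio = player_daily_rate / GENERAL_DAILY_RATE

print(f"Players infected per day: {players_per_day:.1f}")
print(f"Player daily rate:        {player_daily_rate:.2%}")
print(f"General pop. per day:     {general_per_day:.2f}")
print(f"Player rate vs. general:  {ratio:.0f}x")
```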

Over the rest of July (the 22 days post-intake), this would translate into 74 expected infections amongst the player population if they were not in the MLB season and protocol, compared to the 18 actual infections we’ve seen through July 30. That means the MLB season and protocol have reduced COVID-19 infections in the player population by 56 cases since intake was completed, which amounts to protecting 3.9% of the average population size (accounting for 60-player rosters in Summer Camp up to July 23, and 30-man rosters since). This is a major reduction in infections and one that is hard to ignore.
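The counterfactual above can be reproduced with the rough Python sketch below. One input is an assumption on my part: the post does not spell out how the 22 days divide between the 1,800-player and 900-player populations, so the sketch uses a 13-day / 9-day split, chosen only because it reproduces the 74 and 3.9% figures. A different split would shift the totals slightly.

```python
# Pre-intake player infection rate: 59 positives spread over 14 days among 1,800 players
PLAYER_DAILY_RATE = 59 / 14 / 1800

# ASSUMED split of the 22 post-intake days (not stated in the post)
CAMP_DAYS, CAMP_POPULATION = 13, 1800      # 60-player pools
SEASON_DAYS, SEASON_POPULATION = 9, 900    # 30-man rosters

ACTUAL_INFECTIONS = 18  # post-intake positives through July 30

# Total exposure in player-days across both phases
player_days = CAMP_DAYS * CAMP_POPULATION + SEASON_DAYS * SEASON_POPULATION

# Infections expected had players kept getting infected at the pre-intake rate
expected = PLAYER_DAILY_RATE * player_days

# Infections avoided relative to that counterfactual
avoided = round(expected) - ACTUAL_INFECTIONS

# Average population over the period, and the share of it "protected"
average_population = player_days / (CAMP_DAYS + SEASON_DAYS)
share_protected = avoided / average_population

print(f"Expected without protocol: {expected:.1f}")
print(f"Avoided infections:        {avoided}")
print(f"Share of avg. population:  {share_protected:.1%}")
```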

I am by no means a proponent of paternalism and the nanny state. However, I think we have also witnessed in this pandemic that some people make bad choices that endanger others. Also, baseball players (and major sports pro athletes in general) are not normal people and often live very different lifestyles from the general population. Thus, it’s reasonable to expect they would have different risk factors for COVID-19 infection from the general population. Getting into the actual differences would just be speculation on my part and will vary widely from individual to individual*. However, the data from intake are compelling and lead me to believe strongly that getting players into the focus and structure of the MLB season and abiding by COVID-19 protocols is keeping them away from other viral dangers, dangers that predispose them more to exposure to COVID-19 than the general population. We can look at how many actual infections we see today, but we must also look at how many infections we would have otherwise. Keeping players engaged and in the protocol is a net positive as far as reducing the spread of the virus is concerned.
*Some guesses: physical training requirements, entourages, or picking up dinner at strip clubs.