Mortality Research & Consulting

Simpson’s Paradox and the MLB

SM Day

Well, it’s October, and the World Series is under way. Good time to talk baseball, I’d say. So here’s a blast from my own past on the subject: Simpson’s Paradox and Major League Baseball’s Hall of Fame.[1]

In 1994 I wrote an article with this title [1] for the AMATYC Journal, at the time the official journal of record of the American Mathematical Association of Two-Year Colleges (AMATYC). The article highlighted a relatively rare statistical phenomenon that nonetheless turns up in many arenas. Simpson himself referred to it as a danger of amalgamating two-by-two tables.[2] For example, suppose two drugs, A and B, are compared on how well they treat a certain disease, and the only possible results are “cure” or “no cure”. In each of two separate comparisons drug A may appear to have the better chance of curing the disease, yet when all the data are combined the opposite may turn out to be true.

Run 1 – Drug A is better               Cure   No cure   % Cure
  Drug A                                 90        10      90%
  Drug B                                800       200      80%

Run 2 – Drug A still better            Cure   No cure   % Cure
  Drug A                                700       300      70%
  Drug B                                 65        35      65%

Runs 1+2 – Suddenly Drug B is better   Cure   No cure   % Cure
  Drug A                                790       310      72%
  Drug B                                865       235      79%
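
The arithmetic behind the three tables is easy to check by hand, or with a few lines of code. What follows is a minimal Python sketch that uses only the counts from the drug example; the names `runs`, `cure_rate`, `pooled`, and `better` are illustrative choices made here, not anything from the original article.

```python
# Counts from the drug example: (cured, not cured) for each drug in each run.
runs = {
    "Run 1": {"Drug A": (90, 10), "Drug B": (800, 200)},
    "Run 2": {"Drug A": (700, 300), "Drug B": (65, 35)},
}


def cure_rate(cured, not_cured):
    """Fraction of patients cured."""
    return cured / (cured + not_cured)


def pooled(tables):
    """Add up the cured / not-cured counts for each drug across all runs."""
    totals = {}
    for table in tables.values():
        for drug, (cured, not_cured) in table.items():
            total_cured, total_not = totals.get(drug, (0, 0))
            totals[drug] = (total_cured + cured, total_not + not_cured)
    return totals


def better(table):
    """Name of the drug with the higher cure rate in one table."""
    return max(table, key=lambda drug: cure_rate(*table[drug]))


if __name__ == "__main__":
    for name, table in runs.items():
        print(name, "winner:", better(table))
    combined = pooled(runs)
    for drug, counts in combined.items():
        print(drug, counts, f"{cure_rate(*counts):.1%}")
    print("Runs 1+2 winner:", better(combined))
```

The reversal comes from the unequal group sizes: Drug B's pooled rate is dominated by its 1,000 patients in Run 1, while Drug A's is dominated by its 1,000 patients in Run 2, where both drugs did worse.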

In baseball, the apparent paradox (not a true paradox, as it will make perfect sense upon careful consideration of the numbers involved) can arise when comparing batting averages across seasons, or even across games. In my 1994 article I identified four examples from the annals of MLB (three involving Hall-of-Famers). For full details, see the original article, available in the combined online volume 15 of the journal via ERIC here, or in isolation extracted from that original source here. Briefly, the examples were:

  1. Jim Lefebvre and Ron Fairly played for the Los Angeles Dodgers during the 1965 and 1966 World Series. During each of the two Series, Fairly’s batting average was better than Lefebvre’s. Yet when at-bats and hits were tallied across both Series, Lefebvre’s average was higher.
  2. Lou Gehrig had a better batting average than his teammate Babe Ruth in each of Gehrig’s first three years with the Yankees. Yet when at-bats and hits are tallied across the three years, Ruth had the better average.
  3. “Ruth was at it again in 1934 and 1935. Fellow Hall of Famer Rogers Hornsby beat him in each of those years, but again Ruth was better when the years were combined.” – Quoted from the original article.[1]
  4. Stan Musial had a better batting average than Joe DiMaggio in 1941 and in 1942, but for the two years combined, DiMaggio had the better batting average.

It’s an odd thing when one first contemplates it, but examining the details of these and the other examples in the article makes clear what is going on. One way this can occur in baseball is if both players have a really high batting average one year, during which one player has many at-bats and the other has only a few, and then a relatively poor batting average the next year, during which both players have a lot of at-bats. The weighting of each individual average is the key: a player’s combined average is pulled toward whichever year supplied most of his at-bats. It is nearly impossible for this “paradox” to occur if both players have nearly the same number of at-bats in each year. The paradox does occur in real baseball, though, and makes for good trivia. Challenge your baseball fan friends at the bar with that Gehrig-Ruth example.
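
To make the weighting concrete, here is a short worked derivation. The two players, X and Y, and all of their hit and at-bat totals are hypothetical numbers invented for illustration; they are not taken from the article or from any real season. The symbols H_i (hits in year i), AB_i (at-bats in year i), and w_i (share of total at-bats) are likewise just notation introduced here.

```latex
% The combined average is an at-bat-weighted mean of the yearly averages.
\[
\frac{H_1 + H_2}{AB_1 + AB_2}
  = w_1 \frac{H_1}{AB_1} + w_2 \frac{H_2}{AB_2},
\qquad
w_i = \frac{AB_i}{AB_1 + AB_2}.
\]

% Hypothetical Player X: 4-for-10 (.400) in year 1, 150-for-500 (.300) in year 2.
\[
\tfrac{10}{510}(0.400) + \tfrac{500}{510}(0.300) = \tfrac{154}{510} \approx 0.302
\]

% Hypothetical Player Y: 190-for-500 (.380) in year 1, 145-for-500 (.290) in year 2.
\[
\tfrac{500}{1000}(0.380) + \tfrac{500}{1000}(0.290) = \tfrac{335}{1000} = 0.335
\]
```

In this made-up case X out-hits Y in both years (.400 to .380, then .300 to .290), yet Y comes out ahead overall (.335 to .302), because X’s big year carries almost no weight.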


  1. Day SM. Simpson’s paradox and Major League Baseball’s Hall of Fame. AMATYC J. 1994;15(2):26-35.
  2. Simpson EH. The interpretation of interaction in contingency tables. J R Stat Soc Series B Stat Methodol. 1951;13:238-41.

