18 Fairness Tradeoffs

In the last assignment we measured fairness criteria on a random forest trained to predict whether a defendant would reoffend. The model violated Independence (it flagged African American defendants at higher rates) and violated Separation (its false positive rate was higher for African Americans and its false negative rate was higher for Caucasians).

In this classwork, we’ll discuss the Fairness Impossibility Theorem: a nontrivial model cannot satisfy both Independence and Separation when base rates differ (that is, when one group actually reoffends at a higher rate than another).

Part 1: The Fairness Impossibility Theorem Proof

First, some terminology:

Let \(p_s = P(Y = 1 | S = s)\) be the base rate: the true reoffending rate in group \(s\).
Let \(FPR_s = P(\hat{Y} = 1| Y = 0, S = s)\) be the false positive rate: the probability a person in group \(s\) is flagged given they will not reoffend.
Let \(TPR_s = P(\hat{Y} = 1| Y = 1, S = s)\) be the true positive rate: the probability a person in group \(s\) is flagged given they will reoffend.
Note that \(1 - TPR_s = FNR_s\), because a person who will reoffend is either flagged (a true positive) or not flagged (a false negative). So another way to define separation is that \(TPR\) and \(FPR\) are equal across groups.
Let \(q_s = P(\hat{Y} = 1 | S = s)\) be the flagging rate in group \(s\). Then independence says \(q\) is equal across groups.

Let’s show that if separation holds (that is, \(TPR_B = TPR_W = t\) and \(FPR_B = FPR_W = f\)) and base rates differ, independence cannot hold (\(q_B \neq q_W\)).

Step 1: Start with the “law of total probability”: \(q_s = TPR_s \cdot p_s + FPR_s \times (1 - p_s)\). That is, the flagging rate (\(q_s\)) is the true positive rate times the true reoffending rate plus the false positive rate times the non-reoffending rate. Apply to each group (black and white), and then subtract equations:

\[q_B = TPR_B \cdot p_B + FPR_B \times (1 - p_B)\] \[q_W = TPR_W \cdot p_W + FPR_W \times (1 - p_W)\]

\[q_B - q_W = TPR_B \cdot p_B + FPR_B \times (1 - p_B) - (TPR_W \cdot p_W + FPR_W \times (1 - p_W))\]
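The identity in Step 1 can be checked numerically. The sketch below simulates a single hypothetical group; the sample size, the base rate of 0.4, and the flagging probabilities of 0.7 and 0.3 are made-up values chosen only for illustration, and none of the variable names come from the assignment data.

```r
# Simulated data for one hypothetical group (all numbers are assumptions).
set.seed(1)
n     <- 10000
y     <- rbinom(n, size = 1, prob = 0.4)                        # 1 = reoffends
y_hat <- rbinom(n, size = 1, prob = ifelse(y == 1, 0.7, 0.3))   # 1 = flagged

p   <- mean(y)               # base rate
tpr <- mean(y_hat[y == 1])   # true positive rate
fpr <- mean(y_hat[y == 0])   # false positive rate
q   <- mean(y_hat)           # flagging rate

# The two numbers agree up to floating-point error.
c(flag_rate = q, total_probability = tpr * p + fpr * (1 - p))
```

The match is exact within the sample because every flagged person is either a reoffender or a non-reoffender, so the flagged count splits cleanly into those two pieces.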

Question 1: Let separation hold (\(TPR_B = TPR_W = t\) and \(FPR_B = FPR_W = f\)) and show that \(q_B - q_W = (t - f)(p_B - p_W)\).

Now we’ll interpret the equation \(q_B - q_W = (t - f)(p_B - p_W)\):

Question 2: \((t - f)\) is a measure of how good the classifier is. We want to argue that a “nontrivial model” will always have \(t - f > 0\). What would a model look like if \(t - f\) were zero?

Question 3: \((p_B - p_W)\) is the difference in reoffending rates. If base rates were equal (the black reoffending rate is equal to the white reoffending rate), are separation and independence at odds? Why or why not?

The intuition of the Fairness Impossibility Theorem proof is that if you run the same net through two different ponds, you’ll catch more fish in the fuller pond. That is, if your model has TPR = 70% and FPR = 30%, you’ll catch 70% of black reoffenders and 70% of white reoffenders. But if there are more black reoffenders to catch, you’ll flag more black defendants than white defendants: your flagging rates will not be equal. So when base rates aren’t equal, separation and independence can’t coexist. You have to pick one or the other to prioritize.
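To put numbers on the pond analogy, here is a small sketch using the TPR and FPR from the paragraph above. The base rates of 0.50 and 0.40 are hypothetical values picked for the arithmetic, not estimates from the data.

```r
tpr <- 0.70                      # same net for both groups
fpr <- 0.30
p   <- c(B = 0.50, W = 0.40)     # hypothetical base rates (assumption)

# Flagging rate for each group from the law of total probability.
q <- tpr * p + fpr * (1 - p)
q                                # B = 0.50, W = 0.46

# The gap in flagging rates equals (t - f)(p_B - p_W).
q[["B"]] - q[["W"]]              # 0.04
(tpr - fpr) * (p[["B"]] - p[["W"]])   # 0.04
```

Setting the two base rates equal in this sketch makes the gap vanish, which is the situation Part 2 constructs.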

Part 2: Equal Base Rates

Now let’s show that if we scramble reoffended so that base rates become equal, our random forest model can satisfy both independence and separation.
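As a toy illustration of why scrambling equalizes base rates, the sketch below builds a made-up data frame with two groups whose outcome rates differ, then shuffles the outcome column. The groups, sample sizes, and rates are all assumptions for illustration and are unrelated to the `crime` data.

```r
set.seed(42)
toy <- data.frame(
  group   = rep(c("A", "B"), each = 5000),
  outcome = c(rbinom(5000, 1, 0.6), rbinom(5000, 1, 0.3))  # unequal rates
)

# Before shuffling: group rates are far apart (roughly 0.6 vs 0.3).
before <- tapply(toy$outcome, toy$group, mean)

# Shuffling breaks the link between outcome and group, so each group's
# rate lands near the overall mean.
toy$outcome_shuffled <- sample(toy$outcome)
after <- tapply(toy$outcome_shuffled, toy$group, mean)

before
after
mean(toy$outcome)
```

Shuffling keeps the overall rate unchanged because the same labels are only reassigned to different rows.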

# Here I use `sample()` to scramble `reoffended`: it shuffles the existing
# 0/1 labels across observations, so they no longer depend on race.

set.seed(123)
crime_equal_base <- crime %>%
  filter(race %in% c("Caucasian", "African_American")) %>%
  mutate(reoffended = sample(reoffended, size = nrow(.))) %>%
  mutate(train = sample(0:1, size = nrow(.), prob = c(.2, .8), replace = TRUE))

# Question 4: Verify that base rates are (approximately) equal:
# black defendants reoffend at the same rate that white defendants reoffend.
crime_equal_base %>%
  group_by(______) %>%
  summarize(______)

# Here we fit a new random forest using `crime_equal_base`:
crime_train_equal_base <- crime_equal_base %>% filter(train == 1)
crime_test_equal_base  <- crime_equal_base %>% filter(train == 0)

crime_rf_equal_base <- randomForest(
  factor(reoffended) ~ . - train,
  data  = crime_train_equal_base,
  ntree = 100,
  mtry  = 3
)

crime_predictions_equal_base <- crime_test_equal_base %>%
  select(-train) %>%
  mutate(
    prediction = predict(crime_rf_equal_base, newdata = crime_test_equal_base, type = "prob")[, 2],
    classifier = if_else(prediction >= 0.3, 1, 0)
  )

# Question 5: Test whether independence holds: do we flag black and white
# defendants at (approximately) the same rate? Which group do we flag a little more?
crime_predictions_equal_base %>% 
  group_by(_____) %>% 
  summarize(_____)

# Question 6: Test whether separation holds: are FPR and FNR 
# (approximately) equal across racial groups? Which group do
# we incorrectly flag a little more frequently?
crime_predictions_equal_base %>% 
  group_by(_____) %>% 
  summarize(
    FPR = sum(reoffended == __ & classifier == __) / sum(reoffended == 0), 
    FNR = sum(reoffended == __ & classifier == __) / sum(reoffended == 1)
    )

Part 3: What if we had the perfect classifier?

It’s not necessarily a realistic case, but what if our model made no mistakes?

Question 7: If our model made no mistakes, what would FPR and FNR be for everyone? Would separation be satisfied?

Question 8: Under unequal base rates, would independence be satisfied?

Question 9: Argue why, in the case of the perfect classifier, independence becomes less of a concern, and the policy debate should shift away from whether the algorithm is fair and toward the broader question about what consequences society believes should follow from accurate predictions about future behavior (the amount of bail, sentence severity, whether parole is granted, and the level of supervision after release).

Part 4: Coin Flip Classifier

Suppose instead of using a prediction model, a judge flipped a fair coin: heads means “flag,” tails means “do not flag,” and the result is completely random.

Question 10: Does the coin flip satisfy Independence? Explain.

Question 11: Does the coin flip satisfy Separation? Explain.

Question 12: Does performing better on fairness metrics automatically make the coin flip a good decision tool? What is missing from a purely fairness-based evaluation?

Part 5: Developing a Fairer Model

The threshold of 0.5 is just a default. We can lower or raise it to shift the balance between false positives and false negatives. Lowering the threshold flags more people, which catches more true reoffenders but also wrongly flags more defendants who would not reoffend. Raising it does the opposite.
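A small sketch of this tradeoff, using ten made-up predicted probabilities and outcomes rather than the assignment’s model; `prob`, `reoffended`, and `rates_at()` here are hypothetical names that exist only in this toy example.

```r
# Hypothetical predicted probabilities and true outcomes (assumptions).
prob       <- c(0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95)
reoffended <- c(0,    0,    0,    1,    0,    1,    0,    1,    1,    1)

# Flag everyone at or above the threshold, then compute the error rates.
rates_at <- function(threshold) {
  flagged <- as.integer(prob >= threshold)
  c(
    threshold = threshold,
    flag_rate = mean(flagged),
    FPR = sum(reoffended == 0 & flagged == 1) / sum(reoffended == 0),
    FNR = sum(reoffended == 1 & flagged == 0) / sum(reoffended == 1)
  )
}

# One row per threshold.
t(sapply(c(0.3, 0.5, 0.7), rates_at))
```

On this toy data, moving the threshold from 0.3 to 0.7 lowers the FPR from 0.4 to 0 while the FNR rises from 0 to 0.4.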

The code below computes fairness metrics at different thresholds.

fairness_at_threshold <- function(t) {
  crime_test %>%
    mutate(
      pred_prob  = predict(crime_rf, newdata = crime_test, type = "prob")[, 2],
      classifier = if_else(pred_prob >= t, 1, 0)
    ) %>%
    filter(race %in% c("Caucasian", "African_American")) %>%
    group_by(race) %>%
    summarize(
      threshold = t,
      accuracy = mean(classifier == reoffended),
      flag_rate = mean(classifier),
      FPR = sum(reoffended == 0 & classifier == 1) / sum(reoffended == 0),
      FNR = sum(reoffended == 1 & classifier == 0) / sum(reoffended == 1)
    )
}

threshold_table <- bind_rows(
  fairness_at_threshold(0.1),
  fairness_at_threshold(0.2),
  fairness_at_threshold(0.3),
  fairness_at_threshold(0.4),
  fairness_at_threshold(0.5),
  fairness_at_threshold(0.6),
  fairness_at_threshold(0.7),
  fairness_at_threshold(0.8),
  fairness_at_threshold(0.9)
)

threshold_table

Question 13: What threshold does best for accuracy?

Question 14: What threshold does best for independence? Why do you think that is?

Question 15: What threshold does best for minimizing the difference in FPR between groups? Why do you think that is?

Question 16: What threshold does best for minimizing the difference in FNR between groups? Why do you think that is?

Question 17: Given your answers to the previous questions, what threshold would you recommend and why?

Question 18: What if you chose different thresholds for each racial group? If you chose a threshold of 0.4 for Caucasians, what threshold for African Americans would minimize the differences in flagging rate, FPR, and FNR?

The problem with applying different thresholds to different groups is that it raises ethical and legal concerns: the algorithm would explicitly treat defendants differently based on race. Two people with identical criminal histories and predicted probabilities could receive different outcomes solely because they belong to different racial groups. The policy would amount to racial discrimination, even if the intent was fairness.

Defenders of race-conscious policies point to affirmative action in college admissions, where race has sometimes been considered in an attempt to expand educational opportunities for historically disadvantaged groups. But criminal justice raises different concerns because the stakes involve punishment, incarceration, and restrictions on liberty rather than access to opportunities. This creates a difficult philosophical question about what fairness should mean in practice. Should fairness require identical rules for everyone, regardless of group outcomes? Or should fairness focus on balancing mistakes and harms across groups, even if that requires explicitly different treatment?

Question 19: Write a short paragraph summarizing the main lesson of this assignment. Your paragraph should address: different ways we measure fairness, why independence and separation matter, what the impossibility theorem says, and what it means for people who design models used in high-stakes settings like criminal justice.
