Infographic: effect of historical disadvantage on algorithmic selections.
A hypothetical AI system accepts or rejects loan applications. Its decisions are informed by training data that reflects a historical gender inequality, which was large in the past but has been narrowing over time. As a result, a model trained on data from 2015 to 2020 will disproportionately reject female applicants. The problem can be reduced by restricting the model to more recent data, and in this infographic a slider adjusts how far back the training data goes. There is a limit to how effective this strategy can be, however, because narrowing the window also reduces the amount of data available to learn from. If we accept only data from the narrow 2019 to 2020 window, the model starts making new errors, affecting both men and women.
Legend: Suitable man; Unsuitable man; Suitable woman; Unsuitable woman; Selected; Rejected due to historical bias; Rejected due to insufficient training data.
Slider ticks: 2015, 2017, 2019.
Training Data Interval: 2015 - 2020
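
The trade-off behind the slider can be sketched numerically. The example below is only an illustration under stated assumptions: the per-year bias values, the number of records per year, and the 1/sqrt(n) error rule are all hypothetical and are not taken from the infographic. The window is also assumed to include both its start year and 2020. It shows that moving the start year forward lowers the bias the model inherits, while raising the error caused by having less data.

```python
from math import sqrt

# All numbers here are illustrative assumptions, not values from the infographic.
LAST_YEAR = 2020
RECORDS_PER_YEAR = 200  # hypothetical number of past loan decisions per year
# Hypothetical extra rejection rate applied to women in each year's data,
# large in 2015 and shrinking over time.
BIAS_BY_YEAR = {2015: 0.30, 2016: 0.24, 2017: 0.18,
                2018: 0.12, 2019: 0.06, 2020: 0.02}


def learned_bias(start_year):
    """Average historical bias across the training window (inclusive)."""
    years = range(start_year, LAST_YEAR + 1)
    return sum(BIAS_BY_YEAR[y] for y in years) / len(years)


def estimation_error(start_year):
    """Rough sampling error, assumed to shrink as 1/sqrt(number of records)."""
    n_records = RECORDS_PER_YEAR * (LAST_YEAR - start_year + 1)
    return 1.0 / sqrt(n_records)


if __name__ == "__main__":
    print("start  learned_bias  estimation_error")
    for start in range(2015, LAST_YEAR + 1):
        print(f"{start}   {learned_bias(start):.3f}         "
              f"{estimation_error(start):.3f}")
```

Under these assumptions, a start year of 2015 gives the largest inherited bias and the smallest data-shortage error, and a start year of 2019 gives the reverse. Neither end of the slider removes both kinds of wrongful rejection.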