Why it’s totally unsurprising that Amazon’s recruitment AI was biased against women

Amazon admitted this week that it experimented with using machine learning to build a recruitment tool. The trouble is, it didn’t exactly produce fantastic results and it was later abandoned.

According to Reuters, Amazon engineers found that besides churning out totally unsuitable candidates, the so-called AI project showed a bias against women.

To Oxford University researcher Dr Sandra Wachter, the news that an artificially intelligent system had taught itself to discriminate against women was nothing new.

“From a technical perspective it’s not very surprising, it’s what we call ‘garbage in and garbage out,’” she told Business Insider.

Garbage in, garbage out

The problem boils down to the data Amazon fed its algorithm, Wachter speculated.

“What you would do is you go back and look at historical data from the past and look at successful candidates and feed the algorithm with that data and try to find patterns or similarities,” said Wachter.

“You ask the question who has been the most successful candidates in the past […] and the common trait will be somebody that is more likely to be a man and white.”

Reuters reported that the engineers building the program used résumés from a 10-year period, which came predominantly from men. Amazon did not provide Business Insider with the gender split in its engineering department but sent us a link to its diversity pages. Its global workforce is 60% men, and men hold 74% of managerial roles.

“So if then somebody applies who doesn’t fit that profile, it’s likely that that person gets filtered out just because the algorithm learned from historical data,” said Wachter. “That happens in recruitment, and that happens in basically everywhere where we use historical data and this data is biased.”
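The mechanism Wachter describes can be illustrated with a toy sketch. To be clear, this is not Amazon's system: the history below is invented, the keywords `group_a` and `group_b` are hypothetical stand-ins for any résumé detail that happens to correlate with a group, and the scoring rule is deliberately naive. The only point is that a scorer trained on skewed past decisions hands that skew back.

```python
from collections import defaultdict

# Hypothetical, made-up history: (resume keywords, was the candidate hired?).
# The skew is deliberate: most past hires carry the "group_a" keyword.
HISTORY = (
    [({"python", "group_a"}, True)] * 8
    + [({"python", "group_a"}, False)] * 2
    + [({"python", "group_b"}, True)] * 1
    + [({"python", "group_b"}, False)] * 4
)


def learn_keyword_rates(history):
    """For each keyword, return the share of past resumes containing it that led to a hire."""
    seen = defaultdict(int)
    hired = defaultdict(int)
    for keywords, was_hired in history:
        for kw in keywords:
            seen[kw] += 1
            hired[kw] += int(was_hired)
    return {kw: hired[kw] / seen[kw] for kw in seen}


def score(resume, rates):
    """Score a resume as the average learned hire rate of its known keywords."""
    known = [rates[kw] for kw in resume if kw in rates]
    return sum(known) / len(known) if known else 0.0


rates = learn_keyword_rates(HISTORY)

# Two candidates with the same skill, differing only in the group-correlated keyword.
score_a = score({"python", "group_a"}, rates)
score_b = score({"python", "group_b"}, rates)
print(f"group_a resume: {score_a:.2f}, group_b resume: {score_b:.2f}")
```

With this invented history, the two résumés score 0.70 and 0.40 even though the skill listed is identical. Nothing in the code mentions gender or any other protected trait; the gap comes entirely from the past decisions it was fed.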

Garbage in, garbage out (sometimes abbreviated to “GIGO”) simply means that bad input produces bad output, and the same holds for bias: biased data produces biased decisions. The difficulty is that algorithmic bias is incredibly hard to filter out, because the algorithms we build pick up on human prejudices.

“What is the algorithm supposed to do? It can only learn from our semantics and our data and how we interact with humans, and [at] the moment there is no gender parity yet, unfortunately,” said Wachter.

Machine learning can produce self-fulfilling prophecies

This is far from the first time a computer program has displayed human bias. “It’s just yet another example of how algorithmic decision-making and AI in general can actually reinforce existing stereotypes that we have in our society,” said Wachter.

In 2016, a ProPublica investigation found that a computer program called COMPAS, designed to assess the risk of criminals re-offending, was discriminating against black people. As an example, the program deemed an 18-year-old black girl who briefly stole a child’s scooter to be more likely to re-offend than a 41-year-old white man with two prior convictions for shoplifting power tools.

Wachter points out that COMPAS asked questions that led to individuals being judged by their social environment, such as “Was one of your parents ever sent to jail or prison?” or “How many of your friends/acquaintances are taking drugs illegally?”

“This is not about the individual anymore, that is about your social environment, and being judged based on other people,” said Wachter. “If you apply that to every single person, that’s a self-fulfilling prophecy.”

Scanning for bias

That isn’t to say there’s no use in perfecting our algorithms in the meantime. The first thing we can do is come up with effective methods for spotting bias inside them.