Case Studies: When AI and CV Screening Goes Wrong

Plus what to consider when using AI in your hiring processes

📆 Join us in person at our first event! Tue 20th August 2024, 7pm, Newspeak House (near Shoreditch High Street in London). See the event page for full event details and sign-up information.

Imagine applying for your dream job, only for your CV to be dismissed by an algorithm before it even reaches human eyes. This is the unsettling reality of AI in recruitment.


72% of CVs are never seen by human eyes. Computer programs flip through them, pulling out skills and experiences, scoring each one as a match for the job opening. The more candidates they eliminate with this first screening, the fewer human-hours they'll have to spend processing the top matches.

Cathy O'Neil, Weapons of Math Destruction (2016)23

Algorithms can unknowingly perpetuate unfair bias, which delays job opportunities and pushes candidates to settle for less. It's a silent and demoralising blow to hopeful applicants2. The Institute for the Future of Work has highlighted the risks and impacts of algorithmic bias3, noting conflicts with the ten "Good Work" principles4 designed to protect workplace rights.

Distinguishing between "direct" and "indirect" algorithmic bias is not always straightforward. Algorithms cannot recognise their own bias, and our understanding of the inner workings of complex systems such as Large Language Models (LLMs) remains limited.

Understanding direct vs indirect discrimination in algorithms5

Direct discrimination occurs when someone is treated unfairly due to a protected characteristic (e.g. age, gender, race)6. iTutorGroup, an online education platform, was found to have automatically rejected more than 200 qualified U.S.-based tutor applications from women over 55 and men over 607, leading to a lawsuit under the Age Discrimination in Employment Act8. iTutorGroup agreed to pay $365,000 to settle the suit brought by the U.S. Equal Employment Opportunity Commission (EEOC).

Indirect discrimination occurs when a policy, though seemingly neutral, disadvantages certain groups due to a protected characteristic. For example, Google's word embedding model word2vec linked "man" with "computer programmer" and "woman" with "homemaker"9, reinforcing gender biases. Algorithms analyse language patterns and draw inferences from the proximity of words10, which can unintentionally perpetuate societal biases.
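To make this concrete, the snippet below runs the analogy query behind that finding against the publicly available Google News word2vec vectors. It is a minimal sketch, assuming the gensim library and its pre-trained "word2vec-google-news-300" download; it is not the code from the cited paper.

```python
# A minimal sketch of the "man : computer programmer :: woman : ?" analogy query,
# using the pre-trained Google News word2vec vectors via gensim (an assumption;
# this is not the original authors' code).
import gensim.downloader as api

# Downloads roughly 1.6 GB of pre-trained vectors on first use.
vectors = api.load("word2vec-google-news-300")

# Vector arithmetic: computer_programmer - man + woman ~= ?
results = vectors.most_similar(
    positive=["computer_programmer", "woman"],
    negative=["man"],
    topn=5,
)
for word, similarity in results:
    print(f"{word}: {similarity:.3f}")
```

Queries like this are how researchers surfaced the gendered associations baked into the embedding space.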

Whether you're seeking a job, looking for your next hire, or advocating for equality in the workplace, it's imperative to understand how AI-based CV screening tools can hinder applicants' chances of a fair process. Let's explore three case studies highlighting the need for greater safety measures within such tools.

🤖 Case study #1: ChatGPT ranks CVs differently based on candidates' names11

  • Discrimination Type: Direct

  • Key Takeaway: ChatGPT discriminates against candidates based on their names.

  • Summary: For many companies, AI tools like ChatGPT are becoming an integral yet largely unmonitored part of their hiring process. Inspired by a similar field experiment,12 Bloomberg prompted the GPT-3.5 model to rank equally qualified resumes 1,000 times. They found that for a financial analyst role, names typical of Asian women were top-ranked twice as often as those of Black men. For a software engineering role, Black women were selected 36% less often than the best-performing group. This is a classic case of direct discrimination, where GPT "replicated historical biases embedded in the workplace" and simply "matched patterns that already exist" rather than predicting new information, as noted by Ifeoma Ajunwa, a law professor at the University of North Carolina. A simplified sketch of how such a name-swap audit can be set up follows this list.
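The sketch below shows one way a name-swap ranking audit might be structured with the OpenAI Python client. The names, CV template, and prompt are invented for illustration; this is not Bloomberg's actual methodology, and it assumes an OPENAI_API_KEY is configured in the environment.

```python
# A simplified name-swap audit: identical CVs, only the names differ, and the model
# is repeatedly asked to pick the "best" candidate. Names and prompt are illustrative.
import random
from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CV_TEMPLATE = "Name: {name}\nExperience: 5 years as a financial analyst at a large bank.\n"
NAMES = ["Tamika Washington", "Brad Anderson", "Mei Chen", "Darnell Robinson"]  # hypothetical

def top_pick() -> str:
    # Shuffle the order each time to control for position bias in the prompt.
    candidates = random.sample(NAMES, len(NAMES))
    prompt = (
        "Rank the following equally qualified candidates for a financial analyst role. "
        "Reply with the best candidate's name only.\n\n"
        + "\n".join(CV_TEMPLATE.format(name=n) for n in candidates)
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

# If the CVs are truly interchangeable, each name should win roughly equally often.
tally = Counter(top_pick() for _ in range(100))
print(tally)
```

If the model were unbiased, each name would come out on top at roughly the same rate; skewed tallies are the signal Bloomberg measured at a much larger scale.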

šŸ™…ā€ā™€ļø Case study #2: Amazon deprecates its ā€œsexistā€ AI tool13

  • Discrimination Type: Direct + Indirect

  • Key Takeaway: Biased training data leads to biased AI systems.

  • Summary: A woman with a CV highlighting numerous achievements and valuable experiences applies for an executive role at a top company. When the hiring system automatically rejects her CV, is this on merit, or because she attended a "women's leadership conference"? According to a Reuters report,14 Amazon's automated resume screening system downgraded resumes that mentioned the word "women" or indicated that the applicant attended an all-women's college. The algorithm, trained on male-dominated data, favoured male candidates for technical roles. Whilst the automated resume screening system was "never used by Amazon recruiters to evaluate candidates", this incident highlights the risk and reality of indirect discrimination. Echoing Professor Ajunwa's observation from Case Study #1, the system replicated historical biases embedded in Amazon's training data. According to Nihar Shah, Associate Professor in the Machine Learning and Computer Science Departments at Carnegie Mellon University, we are still far away from ensuring AI algorithms are "fair, interpretable and explainable". Meanwhile, 55% of managers anticipate regular use of AI in HR within five years.

🔍 Case study #3: The screening tool that favoured "Jared" and lacrosse players

  • Discrimination Type: Indirect

  • Key Takeaway: Explainable AI (XAI) practices can reveal opportunities to de-bias algorithms and improve commercial outcomes.

  • Summary: Your chances of getting hired can be affected because the algorithm was taught to prioritise the wrong role requirements. In one case, a technical audit of a resume screening tool revealed that the algorithm favoured candidates named "Jared" and those who played high school lacrosse15. This bias was not directly related to protected characteristics like age, gender, or race, making it harder to identify, classify, and correct. As with the above examples, predictions from an algorithm are only as good as its training data. If an algorithm's training data is skewed, it can lead to unfair advantages for some candidates. Certain names or sports were likely correlated with business success in the algorithm's training data, yet it is hard to argue that these should be criteria for hiring applicants for unrelated job roles. Mark J. Girouard, an employment attorney at Nilan Johnson Lewis who audited this resume screening tool, has stressed the need for accountability16, advising to "open the hood and see what the machine is actually doing." A sketch of what such an audit can look like follows the quote below.


"There was probably a hugely statistically significant correlation between those two data points and performance, but you'd be hard-pressed to argue that those were actually important to performance."

Mark J. Girouard15, employment attorney at Nilan Johnson Lewis
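As a rough illustration of "opening the hood", the sketch below fits a simple screening model on invented resume features and inspects its coefficients. The features, data, and model are hypothetical, assuming pandas and scikit-learn; real audits examine the vendor's actual system.

```python
# A sketch of an explainability audit: fit a simple screening model on hypothetical
# resume features, then inspect which features drive its decisions. All data and
# feature names here are invented for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({
    "years_experience": rng.integers(0, 15, n),
    "relevant_degree": rng.integers(0, 2, n),
    "name_is_jared": rng.integers(0, 2, n),
    "played_lacrosse": rng.integers(0, 2, n),
})
# Simulate historical hiring labels that happen to correlate with the spurious features.
logit = (0.3 * X["years_experience"] + 1.0 * X["relevant_degree"]
         + 1.5 * X["name_is_jared"] + 1.2 * X["played_lacrosse"] - 4)
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression(max_iter=1000).fit(X, y)

# Large weights on "name_is_jared" and "played_lacrosse" are the red flag an auditor
# looks for: proxies with no plausible link to job performance.
for feature, coef in sorted(zip(X.columns, model.coef_[0]), key=lambda t: -abs(t[1])):
    print(f"{feature:>16}: {coef:+.2f}")
```

The point is not the particular model but the habit: any screening system should be interrogated for which inputs actually drive its scores.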

Current efforts to mitigate bias in AI recruitment

The global AI recruitment market is forecast to grow at a compound annual growth rate of 6.9% (from 2024 to 2032)17, raising urgent questions about fairness and bias. Despite various recommendations from companies and regulatory organisations, there is currently no one-size-fits-all solution for mitigating unfair bias.

  • OpenAI recommends fine-tuning responses and redacting names from resumes to prevent misuse of their LLMs. OpenAI uses blog posts and system cards to inform developers about what their models can and cannot do. However, this is a complex topic, and educational efforts have left executives confused18.

  • As Michael Kearns discusses in The Ethical Algorithm19, simply excluding direct indicators such as race and gender is not enough to ensure a bias-free model. LLMs can use other available data like a person's postcode or the car they drive, which often correlates with protected characteristics. LLMs can "skirt preferences by finding and using proxies" for the information we try to omit, thereby maintaining or even worsening the bias.

  • U.S. Federal agencies use adverse impact measures to identify hiring discrimination against protected characteristics. The Four-Fifths Rule states that "the selection rate of a certain group is substantially different than another if their ratio is less than 80%." For example, assume a scenario where identically qualified male and female candidates apply for a job. If the AI screening tool selects 30% of the women and 60% of the men for interview, the "disparate impact ratio" is 30/60 = 50%. This is much lower than the 80% benchmark, indicating that the tool discriminates against female applicants. However, this rule only reveals bias without providing a solution to prevent CV screening tools from unfairly filtering out applicants. A worked sketch after this list illustrates both the proxy problem above and this check.

  • Complying with different hiring and data privacy laws between the US and the UK can "obscure rather than improve systemic discrimination in the workplace"20 when using Automated Hiring Systems (AHS). Companies are left to self-regulate in the absence of dedicated regulation on the fair use of these AI-based systems, which introduces hidden biases and implementation risks. For example, US-centric societal values and laws relating to bias in hiring could inappropriately influence UK workplaces.20

  • The rapid development of AI technology is outpacing our understanding of its implications. AI developers must understand how their algorithms work "under the hood" and adopt ethical best practices for measuring unfair bias. Regulators like the Equality and Human Rights Commission (EHRC) and the Information Commissioner's Office (ICO) are called upon by the Responsible Technology Adoption Unit (part of the Department for Science, Innovation and Technology) to redefine their guidance and address the nuances of algorithmic recruitment21. The Institute for the Future Of Work advocates for equality as a "guiding principle in deploying AI and auditing systems, alongside fairness, accountability, sustainability, transparency, and data protection"22 for companies, policymakers and developers.
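The sketch below ties together two of the points above on synthetic data: a model trained without the protected attribute still discriminates through a correlated proxy, and the Four-Fifths Rule surfaces the disparity. Every variable here (gender, a "postcode group" proxy, a skill score) is invented for illustration, assuming pandas and scikit-learn.

```python
# Synthetic illustration: dropping the protected attribute does not remove bias when
# a correlated proxy remains, and the Four-Fifths Rule can surface the disparity.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
gender = rng.integers(0, 2, n)                          # 1 = female (protected attribute)
postcode_group = (gender + (rng.random(n) < 0.2)) % 2   # proxy, ~80% aligned with gender
skill = rng.normal(0, 1, n)

# Historical decisions were biased against women, independent of skill.
hired = (skill + 1.0 * (gender == 0) + rng.normal(0, 1, n)) > 1.0

# Train WITHOUT the protected attribute, but WITH the proxy.
X = pd.DataFrame({"skill": skill, "postcode_group": postcode_group})
selected = LogisticRegression().fit(X, hired).predict(X)

# Four-Fifths Rule: the ratio of selection rates should not fall below 80%.
rate_women = selected[gender == 1].mean()
rate_men = selected[gender == 0].mean()
print(f"Selection rate (women): {rate_women:.1%}")
print(f"Selection rate (men):   {rate_men:.1%}")
print(f"Disparate impact ratio: {rate_women / rate_men:.0%} (below 80% signals adverse impact)")
```

In runs like this, the ratio typically falls well below the 80% benchmark even though gender was never given to the model, which is exactly why auditing outcomes matters more than simply deleting sensitive columns.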

The stark reality is that we often learn about biases in systems like ChatGPT and Amazon's in-house recruitment tool only after thorough investigations. We must remain vigilant about AI-enhanced hiring tools and ensure they are systematically examined. Unchecked biases can cause widespread harm to individuals and entire industries.

Join The Dialogue For Change

ā¬†ļø Share this newsletter with industry professionals in your network.

šŸ—£ļø Attend our virtual and in-person round table discussions. Weā€™re running our first event on 20th August 2024, 7pm, at Newspeak House (near Shoreditch High Street in London). See the event page for full event details and sign-up information. The event is free to attend (another reason to forward this email). There will be fish, chips & refreshments provided. šŸŸšŸŸ

💬 Talk to us about your valuable perspectives and domain expertise. You can book a call with us here. We'd love to know if you're doing work in this space!

📚 Resources

  1. Centre for Data Ethics and Innovation (2020). Review into bias in algorithmic decision-making, Page 42. GOV.UK.

  2. European Agency for Safety and Health at Work: EU-OSHA (2021). Impact of artificial intelligence on occupational safety and health.

  3. Dr Gilbert A., Thomas A., Sheir S., Barnard G. (2023). Good Work Algorithmic Impact Assessment Version 1: An approach for worker involvement, Page 12, Table 1: Example risks, impacts and opportunities of algorithmic systems at work. Institute for the Future Of Work.

  4. The Institute for the Future Of Work, Credits: Information Commissioner's Office (ICO), The Joseph Rowntree Charitable Trust (2024). The Good Work Charter Toolkit.

  5. Equality and Human Rights Commission (2019). Direct and indirect discrimination.

  6. Equality and Human Rights Commission (2021). Protected characteristics.

  7. U.S. Equal Employment Opportunity Commission Press Release (2023). iTutorGroup to Pay $365,000 to Settle EEOC Discriminatory Hiring Suit.

  8. Equality and Human Rights Commission (2020). Age discrimination.

  9. Bolukbasi T., Chang K., Zou J., Saligrama V., Kalai A. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. arXiv

  10. Kearns M., Roth A. (2019). The ethical algorithm: The science of socially aware algorithm design, Chapter 2 Algorithmic Fairness: From Parity to Pareto, Page 57. Google Books.

  11. Yin L., Alba D., Nicoletti L. (2024). OpenAI's GPT is a Recruiter's Dream Tool. Tests Show There's Racial Bias. Bloomberg.

  12. Bertrand M., Mullainathan S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94 (4): 991–1013.

  13. BBC News (2018). Amazon scrapped 'sexist AI' tool.

  14. Dastin J. (2018). Insight - Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.

  15. Gershgorn D. (2018). Companies are on the hook if their hiring algorithms are biased. Quartz.

  16. Schellmann H., Strong J., Cillekens E., Green A. (2021). Podcast: Hired by an algorithm. MIT Technology Review Transcript.

  17. Dhapte A. (2024). AI Recruitment Market Research Report Information. Market Research Future.

  18. Truog D. (2024). Even GenAI-Trained Execs Are Confused About It (Our Survey Shows) - So Give Them Better Training. Forrester.

  19. Kearns M., Roth A. (2019). The ethical algorithm: The science of socially aware algorithm design, Chapter 2 Algorithmic Fairness: From Parity to Pareto, Pages 66-68. Google Books.

  20. Sanchez-Monedero J., Dencik L., Edwards L. (2019). What Does It Mean to 'Solve' the Problem of Discrimination in Hiring? SSRN.

  21. Centre for Data Ethics and Innovation (2020). Review into bias in algorithmic decision-making, Page 40. GOV.UK.

  22. Graham L., Dr Gilbert A., Simons J., Thomas A., Mountfield H. (2020). AI in hiring: Assessing impacts on equality, Page 4. Institute for the Future Of Work.

  23. O'Neil C. (2016). Weapons of Math Destruction. Wikipedia.