An independent test found flaws in AI systems designed to evaluate job applicants.
What’s new: MyInterview and Curious Thing, which automate job interviews, gave a candidate who spoke only in German high marks on English proficiency, according to MIT Technology Review.
The test: Reporters created a fake job posting for an office administrator/researcher on both companies’ platforms. They used the tools provided to select questions for applicants to answer and define their ideal candidate. Then one of them applied for the position, completing interviews by reading aloud from a Wikipedia article written in German.
- MyInterview typically conducts a video interview and analyzes a candidate’s speech and body language, then grades their suitability for a given job. MyInterview interpreted the German-speaking reporter’s responses as nonsensical English (“So humidity is desk a beat-up. Sociology, does it iron?”) but graded her as a 73 percent match for the job. A MyInterview spokesperson said the algorithm inferred personality traits from the interviewee’s voice rather than the content of her answers.
- Curious Thing analyzes phone interview responses. Its algorithm gave the reporter 6 out of 9 points for English-language competency after she responded exclusively in German. The company’s cofounder said the bogus application was an “extremely valuable data point.”
Behind the news: A 2019 survey found that 40 percent of companies worldwide use AI to help screen job candidates, but outside investigators have found such systems lacking.
- In February, Bavarian Public Broadcasting showed that accessories such as glasses and headscarves, as well as background objects such as pictures and bookcases, dramatically changed a German video-interview platform’s automated assessments.
- In 2018, LinkedIn discovered that a candidate recommendation algorithm preferred male applicants. The company replaced it with a new system intended to counteract that bias.
- A recent study from NYU, CUNY, and Twitter proposed a matrix for rating automated hiring systems to counteract the prevalence of algorithms that rely on dubious features like voice intonation and subtle facial expressions.
Why it matters: Matching prospective employers and employees is a nuanced process, and any attempt to automate it requires the utmost rigor. Applicants subject to a flawed algorithm could be barred from jobs they’re eminently qualified for, while prospective employers who rely on it could miss ideal candidates.
We’re thinking: An AI system that gives high marks to someone who replies to an English-language interview in German, confidently rendering incorrect predictions in response to data that’s dramatically different from its training set, is not equipped to handle data drift. Such concepts are not purely academic. They have a huge impact on such systems and on critical decisions like who gets a job.
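One common guard against this failure is to check whether an input resembles the training data before scoring it, and to abstain when it does not. The sketch below is illustrative only and does not describe how MyInterview or Curious Thing work. The feature (a speech recognizer's average word confidence), the sample values, the threshold of 3 standard deviations, and all function names are assumptions.

```python
from statistics import mean, stdev

def fit_reference(values):
    """Summarize a feature's training distribution as (mean, standard deviation)."""
    return mean(values), stdev(values)

def is_out_of_distribution(value, reference, max_z=3.0):
    """Flag a value that lies more than max_z standard deviations from the training mean."""
    mu, sigma = reference
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > max_z

def score_or_abstain(value, reference, scorer):
    """Return the scorer's output, or None when the input looks unlike the training data."""
    if is_out_of_distribution(value, reference):
        return None  # abstain and route to a human reviewer instead of guessing
    return scorer(value)

# Hypothetical training data: average recognizer confidence on English interviews.
train_confidences = [0.91, 0.88, 0.93, 0.90, 0.87, 0.92, 0.89]
reference = fit_reference(train_confidences)

# A hypothetical German-language response transcribed by an English recognizer
# might yield far lower confidence than anything seen in training.
print(is_out_of_distribution(0.35, reference))
print(is_out_of_distribution(0.89, reference))
```

A single-feature z-score is a deliberately simple stand-in. A production system would likely monitor several features and use a dedicated language-identification step, but the principle is the same: decline to score inputs the model was never trained to handle.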