For years, activists and academics have raised concerns that facial analysis software claiming to identify a person’s age, gender, and emotional state can be biased, unreliable, or invasive, and should not be sold.
Acknowledging some of those criticisms, Microsoft said Tuesday that it planned to remove those features from its artificial intelligence service for detecting, analyzing, and recognizing faces. They will stop being available to new users this week and will be phased out for existing users within the year.
The changes are part of a push by Microsoft for tighter controls on its artificial intelligence products. After a two-year review, a team at Microsoft has developed a “Responsible AI Standard,” a 27-page document that sets out requirements for AI systems to ensure they do not have a harmful impact on society.
The requirements include ensuring that systems provide “valid solutions for the problems they are designed to solve” and “a similar quality of service for identified demographic groups, including marginalized groups.”
Before they are released, technologies that would be used to make important decisions about a person’s access to employment, education, health care, financial services, or a life opportunity are subject to a review by a team led by Natasha Crampton, Microsoft’s chief responsible AI officer.
There were heightened concerns at Microsoft around the emotion recognition tool, which labeled someone’s expression as anger, contempt, disgust, fear, happiness, neutral, sadness, or surprise.
“There’s a huge amount of cultural and geographic and individual variation in the way in which we express ourselves,” Crampton said. That led to reliability concerns, along with the bigger questions of whether “facial expression is a reliable indicator of your internal emotional state,” she said.
The age and gender analysis tools being eliminated — along with other tools to detect facial attributes such as hair and smile — could be useful to interpret visual images for blind or low-vision people, for example, but the company decided it was problematic to make the profiling tools generally available to the public, Crampton said.
In particular, she added, the system’s so-called gender classifier was binary, “and that’s not consistent with our values.”
Microsoft will also put new controls on its face recognition feature, which can be used to perform identity checks or search for a particular person. Uber, for example, uses the software in its app to verify that a driver’s face matches the ID on file for that driver’s account. Software developers who want to use Microsoft’s facial recognition tool will need to apply for access and explain how they plan to deploy it.
Users will also be required to apply and explain how they will use other AI systems that could be abused, such as Custom Neural Voice. The service can generate a human voice print, based on a sample of someone’s speech, so that authors, for example, can create synthetic versions of their voice to read their audiobooks in languages they do not speak.
Because of the possible misuse of the tool — to create the impression that people have said things they have not — speakers must go through a series of steps to confirm that the use of their voice is authorized, and the recordings include watermarks detectable by Microsoft.
In 2020, researchers discovered that speech-to-text tools developed by Microsoft, Apple, Google, IBM, and Amazon worked less well for Black people. Microsoft’s system was the best of the bunch but misidentified 15 percent of words for white people, compared with 27 percent for Black people.
The company had collected diverse speech data to train its AI system but had not understood just how diverse language could be. So it hired a sociolinguistics expert from the University of Washington to explain the language varieties that Microsoft needed to know about. The work went beyond demographics and regional variety into how people speak in formal and informal settings.
“Thinking about race as a determining factor of how someone speaks is actually a bit misleading,” Crampton said. “What we’ve learned in consultation with the expert is that actually a huge range of factors affect linguistic variety.”
Crampton said the journey to fix that speech-to-text disparity had helped inform the guidance set out in the company’s new standards.
“This is a critical norm-setting period for AI,” she said, pointing to Europe’s proposed regulations setting rules and limits on the use of artificial intelligence. “We hope to be able to use our standard to try and contribute to the bright, necessary discussion that needs to be had about the standards that technology companies should be held to.”