Facial Recognition Software Needs Human Subject Experiments


Contact

Tess Malone, Communications Officer

tess.malone@cc.gatech.edu

Media
  • Facial Recognition (image/png)

Facial recognition software is becoming the go-to security measure for businesses, but it can be inaccurate and racially biased. Although many companies have proposed adding human intervention to mitigate this, a Georgia Tech researcher says human-subject experiments must be a priority before human intervention is considered a one-size-fits-all solution.

“Humans are biased themselves, so how can you resolve an issue of bias with a human?” School of Computer Science Ph.D. alumna Samira Samadi said. “It might even make it worse.”

The limits of facial recognition

Facial recognition software supposedly automates building security. The software takes photos as people enter a building, then cross-references them with an employee database. If the software finds a match, the person can enter the building.
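
As a rough illustration of that matching step, the sketch below frames it as a nearest-neighbor lookup over face embeddings with an acceptance threshold. This is a minimal sketch, not any vendor's actual pipeline; the embedding size, threshold value, and database format are all assumptions.

```python
import numpy as np

# Hypothetical employee database: name -> unit-norm face embedding.
employee_db = {
    "alice": np.random.randn(128),
    "bob": np.random.randn(128),
}
employee_db = {k: v / np.linalg.norm(v) for k, v in employee_db.items()}

def match_face(embedding: np.ndarray, threshold: float = 0.6):
    """Return the best-matching employee if similarity clears the threshold."""
    embedding = embedding / np.linalg.norm(embedding)
    best_name, best_score = None, -1.0
    for name, ref in employee_db.items():
        score = float(ref @ embedding)  # cosine similarity of unit vectors
        if score > best_score:
            best_name, best_score = name, score
    # Below the threshold, the system reports no match and denies entry.
    return (best_name, best_score) if best_score >= threshold else (None, best_score)
```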

Despite many advances in image recognition and artificial intelligence, these systems are often more accurate for men with lighter skin tones and less accurate for women with darker skin tones. Companies have proposed adding a human evaluator to compensate for the software’s limitations.
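
The accuracy gap described above is what audits of these systems measure by disaggregating error rates by demographic group rather than reporting one overall number. Here is a minimal sketch of that bookkeeping, assuming illustrative group labels and made-up records:

```python
from collections import defaultdict

# Illustrative records: (demographic group, was the prediction correct?).
results = [
    ("lighter-skinned man", True), ("lighter-skinned man", True),
    ("darker-skinned woman", True), ("darker-skinned woman", False),
]

def accuracy_by_group(records):
    """Disaggregate accuracy so per-group gaps are visible, not averaged away."""
    totals, correct = defaultdict(int), defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        correct[group] += ok
    return {group: correct[group] / totals[group] for group in totals}

print(accuracy_by_group(results))
# {'lighter-skinned man': 1.0, 'darker-skinned woman': 0.5}
```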

Yet Samadi, who researches algorithmic fairness, immediately recognized the potential for more bias. She wanted to know whether adding a human evaluator to the process increases fairness or bias.

Experimental design

Designing such a user study is challenging, as Samadi and her colleagues at Microsoft Research realized. Working with actual security guards or receptionists would have been ideal, but it was not feasible in practice.

Samadi turned to recruiting participants through Amazon Mechanical Turk, as she had done in the past. These workers offered volume, but they were not trained in recognizing faces. She first studied how people compare faces, then worked out how to teach Mechanical Turk users about facial recognition systems, how to make decisions about the system’s accuracy, and how to be confident in those decisions.

After this groundwork, Samadi developed a user study and piloted it with friends to make sure it was clear and understandable. She then ran the study with 300 users on Mechanical Turk.

Each user was trained to distinguish faces and to evaluate the software. Next, the user saw two images along with how the software scored them. Samadi expected the human evaluators to show bias when comparing lighter- versus darker-skinned people, but the results were quite different.
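
To make the protocol concrete, one trial's record might look like the sketch below. The field names and the 1-5 confidence scale are assumptions for illustration, not the study's actual instrument:

```python
from dataclasses import dataclass

@dataclass
class Trial:
    # What the participant is shown.
    image_a: str            # ID of the first face image
    image_b: str            # ID of the second face image
    model_score: float      # the software's similarity score for the pair
    # What the trained participant reports.
    human_says_match: bool  # does the participant think the faces match?
    confidence: int         # self-reported confidence, e.g., on a 1-5 scale

example = Trial("img_017", "img_242", 0.83, True, 4)
```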

Future studies

“We really tried to imitate a real world scenario, but that actually made it more complicated for the users,” Samadi said.

The researchers were unsure whether the problem was that users didn’t understand the study or that their behavior was genuinely biased, and they ultimately decided not to publish the results. However, Samadi did publish a position paper, A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition, with Microsoft Research’s Farough Poursabzi-Sangdeh, Jennifer Wortman Vaughan, and Hanna Wallach. Samadi presented the work at the ACM Conference on Human Factors in Computing Systems (CHI) in April.

The paper argued for both the necessity of and the difficulties with studies like these, identifying four main challenges to the efficacy and generalizability of a human-subject study like the one they conducted:

-Datasets: Finding an appropriate dataset is difficult for several reasons. Sourcing images ethically is challenging because past research has relied on images of celebrities or politicians, who are easily recognizable and thus bias the study. Many datasets are also already skewed, containing more lighter-skinned faces than darker-skinned ones. And many datasets are of higher quality than real camera footage, making them a poor real-world comparison (see the sketch after this list).

-Participants: Many of the participants available for studies like these are students or Mechanical Turk workers, who are inexperienced in facial recognition.

-Context: Recognizing faces in an experiment is not comparable to doing so on the job, where an unfamiliar person may be a threat.

-User Interface: Companies do not release the user interfaces of their facial recognition software, leaving researchers to design something that may not reflect what is used in real-world systems.
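
To give a concrete sense of the dataset-quality mismatch flagged above, one common workaround is to degrade clean portraits toward security-camera conditions before running a study. Below is a minimal sketch using Pillow and NumPy; the resolution, blur, and noise settings are arbitrary assumptions, not values from the paper:

```python
import numpy as np
from PIL import Image, ImageFilter

def degrade_to_camera_quality(path: str, out_path: str, width: int = 160) -> None:
    """Roughly approximate low-grade camera footage from a clean portrait."""
    img = Image.open(path).convert("RGB")
    # Downsample hard, then upsample back: this destroys fine facial detail.
    small_h = int(img.height * width / img.width)
    img = img.resize((width, small_h)).resize(img.size)
    # Mild blur plus sensor-style noise.
    img = img.filter(ImageFilter.GaussianBlur(radius=1.5))
    arr = np.asarray(img).astype(np.int16)
    arr += np.random.randint(-12, 13, size=arr.shape, dtype=np.int16)
    Image.fromarray(arr.clip(0, 255).astype(np.uint8)).save(out_path)
```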

“If someone wants to attack this problem in the future, they should know the challenges they have ahead of them,” Samadi said.

Additional Information

Groups

College of Computing, School of Computer Science

Status
  • Created By: Tess Malone
  • Workflow Status: Published
  • Created On: Aug 4, 2020 - 2:57pm
  • Last Updated: Aug 4, 2020 - 3:02pm